Scale AI

company

Verified

https://scale.com/


AI & ML interests

None defined yet.

Recent Activity

tu-trinh-scale authored a paper about 4 hours ago

HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?

tu-trinh-scale updated a bucket 13 days ago

ScaleAI/hil-bench-swe-images

agosai new activity 20 days ago

ScaleAI/audiomc: Judge prompt for ARS


Papers

SciPredict: Can LLMs Predict the Outcomes of Scientific Experiments in Natural Sciences?

Agentic Rubrics as Contextual Verifiers for SWE Agents


tu-trinh-scale authored a paper about 4 hours ago

HiL-Bench (Human-in-Loop Benchmark): Do Agents Know When to Ask for Help?

Paper • 2604.09408 • Published 7 days ago

tu-trinh-scale updated a bucket 13 days ago

ScaleAI/hil-bench-swe-images

agosai new activity in ScaleAI/audiomc 20 days ago

Judge prompt for ARS

#4 opened 20 days ago by

submitted a paper to Daily Papers 22 days ago

SciPredict: Can LLMs Predict the Outcomes of Scientific Experiments in Natural Sciences?

Paper • 2604.10718 • Published 24 days ago • 4

published a bucket 29 days ago

ScaleAI/hil-bench-swe-databases

updated a bucket about 1 month ago

ScaleAI/hil-bench-sql-artifacts

updated a dataset about 1 month ago

ScaleAI/hil-bench

Viewer • Updated Mar 31 • 200 • 69 • 1

published a dataset about 1 month ago

ScaleAI/MultiChallenge

Viewer • Updated Mar 31 • 266 • 378 • 1

updated a dataset about 1 month ago

ScaleAI/MultiChallenge

Viewer • Updated Mar 31 • 266 • 378 • 1

mohit-raghavendra

in ScaleAI/SWE-Atlas-QnA about 1 month ago

Criterion Granularity Mismatch in Rubric (Example: Task 6905333b74f22949d97ba9f1, Criterion 1.11)

#2 opened about 2 months ago by

andrewpark-scaleai

updated a dataset about 1 month ago

ScaleAI/SWE-Atlas-QnA

Viewer • Updated Mar 31 • 124 • 244 • 14

andrewpark-scaleai

in ScaleAI/SWE-Atlas-QnA about 1 month ago

cannot pull images from today

#4 opened about 1 month ago by

mohit-raghavendra

updated a dataset about 1 month ago

ScaleAI/SWE-Atlas-QnA

Viewer • Updated Mar 31 • 124 • 244 • 14

updated a bucket about 1 month ago

ScaleAI/hil-bench-swe-databases

published a bucket about 1 month ago

ScaleAI/hil-bench-sql-artifacts

published a dataset about 1 month ago

ScaleAI/hil-bench

Viewer • Updated Mar 31 • 200 • 69 • 1

updated a dataset about 2 months ago

ScaleAI/lhaw

Viewer • Updated Mar 20 • 285 • 367 • 6

mohit-raghavendra

published a dataset 2 months ago

ScaleAI/SWE-Atlas-QnA

Viewer • Updated Mar 31 • 124 • 244 • 14

jda

in ScaleAI/SWE-bench_Pro 2 months ago

image-name mismatch

#3 opened 7 months ago by

updated a dataset 2 months ago

ScaleAI/RaR-Science

Viewer • Updated Feb 24 • 22.9k • 52 • 1