Gig · Remote OK · $50–$150/hour depending on domain expertise (specialist rates higher)

AI Evaluations Specialist — Contract

Surge AI

Surge AI hires domain experts as contract evaluators for frontier-lab training pipelines. Your job is to design, write, or judge outputs against rubrics in your specialty (law, medicine, finance, academic research, software engineering, creative writing, math, etc.). The labs use the resulting data for RLHF, evals, and red-team curricula.

How the engagement works: paid hourly, project-based, remote, with flexible hours (typically 10–30 hours/week while a project runs). Projects last weeks to months. Specialist rates apply if you have credentialed expertise (e.g. licensed attorney, MD, working software engineer at a senior level). Surge has an established rate card that you'll see during onboarding.

Honest fit signals:
- Deep domain expertise that the labs can't crowd-source. Generalist applicants get routed to lower-rate tasks and may not be matched at all on narrow-domain projects.
- You can write clearly and follow detailed rubrics. The work IS judgment calls, but they're judgment calls within tight definitions.
- You're comfortable with NDA-bound contract work (you won't be allowed to talk publicly about specific projects or the labs you work with).

What's not a fit: anyone looking for full-time work (Surge is genuinely contract-only), anyone without specialist credentials hoping to make top rates, or anyone uncomfortable with their judgments being used to train commercial AI systems.