One API call. Real-world data from real, verified humans.
Post a job. Verified workers capture photos, audio, and video from their phones — delivered to your folder.
Works with Claude Code, Cursor, and any MCP-compatible agent.
For workers: earn with the iOS appWorks with the tools you already use
- Claude
- OpenAI
- Cursor
- Stripe
- AWS S3
How it works
Post a Job
One API call. Describe what you need, set your price, target a location.
Verified Workers Capture
Workers browse jobs on the FirstHand iOS app and upload photos, audio, or video from their phone.
AI Scores & Delivers
Auto-scored by an AI ensemble. Approved files land in your folder with structured annotations.
Pick your lane. Post a job. Get annotated files in days.
One API, three use cases — authentic UGC, ground-truth reference data, and multimodal training sets. The chips on each card are real auto-generated annotations from the API response.

User-Generated Content
Replace stock photos with authentic, geo-targeted content from real people — commission the exact neighborhood, venue, or scenario you need.
- • Storefront photos from specific neighborhoods (SoHo, Williamsburg)
- • Lifestyle videos filmed at landmark locations
- • Audio reviews from local restaurants and cafes
- • In-the-wild screenshots of your app in real user environments
Ground Truth for AI Eval
Benchmark your vision, OCR, or speech model against human-verified reference data — structured labels included, no separate labeling pipeline.
- • Street-sign photos with per-word OCR confidence for your OCR eval
- • Subway-platform audio with speaker count and full transcripts
- • Intersection video with frame-level scene + action labels
- • iOS screen captures for your agent eval, pre-annotated
Multimodal Training Data
Stream diverse, pre-annotated JSONL straight into your fine-tuning pipeline — images, audio, and video, geo-tagged and quality-scored.
- • Thousands of street-level photos with object + scene labels
- • Multilingual audio with full transcripts and speaker diarization
- • Short-form video with action tracking and keyframe descriptions
- • Every file: pre-annotated JSON, ML-ready, no labeling stop
What you actually get
Every approved file lands in your folder
pre-annotated and ML-ready.
No labeling pipeline to wire up. No manual review. Just structured JSON and a pre-signed URL — the same envelope for images, audio, and video.
AI-scored, 1–5 stars
Claude Vision + Whisper ensemble scores every file. 3+ stars auto-approve; 1–2 stars get rejection feedback and a retry.
Pre-annotated, no separate labeling stop
Object detection, OCR, scene classification, color palettes, speaker counts, transcripts — structured metadata, not just a file.
Geo-tagged at the source
Capture latitude / longitude plus accuracy. Target specific neighborhoods, venues, or landmarks via the job spec.
Pre-signed download URL
Stable file URL that expires in 7 days. Stream straight into S3, your pipeline, or your CMS.
Agent-native, webhook-ready
Call from curl, TypeScript / Python SDKs, or your Claude Code or Cursor agent via the official MCP server — create_job, poll, download, done. Subscribe to submission.approved webhooks to fan out to downstream jobs.
{
"id": "file_01JRV8KX4M9E2B7DQFGA3HN5WT",
"object": "file",
"job_id": "job_01JRV7MN2P8D1A6CREFZ4GK9YS",
"content_type": "image/jpeg",
"size_bytes": 1843200,
"width": 4032,
"height": 3024,
"download_url": "https://files.firsthandapi.com/file_01JRV8KX4M.jpg?token=eyJ...",
"ai_star_rating": 5,
"annotations": {
"objects": [
"street_sign",
"traffic_signal",
"building",
"skyscraper"
],
"scene": "urban intersection",
"text_extraction": "E 37 St | Park Av | ONE WAY",
"color_palette": [
"#4A6FA5",
"#8B4513",
"#F5C6CB",
"#D4D4D4",
"#2F4F4F"
]
},
"geolocation": {
"latitude": 40.7487,
"longitude": -73.9856,
"accuracy_meters": 16.4
},
"device": {
"make": "Apple",
"model": "iPhone 13 Pro Max"
},
"created_at": "2026-04-15T15:48:27Z"
}Example response. See the full schema in the API reference →
Earn money capturing real-world content
Browse jobs on the FirstHand iOS app. Capture photos, audio, and video. AI scores every submission in seconds. Cash out via Stripe in $10 minimum increments.
“I take photos during my commute and lunch breaks. Last month I earned $380 just from photos of street signs and storefronts. It is the most honest gig work I have found.”— Marcus W., FirstHand worker, Chicago
Simple, per-file pricing
You set the price. Only pay for approved files. No subscriptions, no minimums, no sales calls — just pre-paid credits.
Credits
Pre-paid credits, no subscription
- Set any price per file
- Only pay for approved files
- AI quality scoring (1–5 stars)
- Low-balance webhook alerts
- S3 folder delivery
- Webhook notifications
Enterprise
For large-scale collection
- Everything in Credits
- Dedicated worker pools
- Custom AI scoring models
- SSO / SAML
- SLA guarantees
- Dedicated account manager
Ready to collect your first files?
Post a job. Get annotated files in days. $2.50 in free credits, no contracts.