Asimov Teardown — YC W26 Humanoid Training Data Marketplace

By Jim Liu · Independent review · hands-on testing

TL;DR

Asimov is a Mercor for humanoid robot data. The Mercor parallel is close: AI lab pays, contractor performs, marketplace clips margin. Swap "labeled text from a Stanford grad" for "video of a line cook folding dough" and the unit economics rhyme.

The pitch is unglamorous in the best way. Frontier robotics labs — Figure AI, 1X, Tesla Optimus, Apptronik — need millions of hours of human-task footage to train their humanoid models. Synthetic data plateaus. Internal teleoperation studios cost $400/hour and don't scale. So Asimov runs the supply side: 5,000+ contributors with Asimov-provided hardware, embedded in households, restaurants, hotels, and factories, capturing the long tail of human movement that lab employees in San Francisco cannot.

Copyable score:

  • Capital: 15/100. Hardware deployment to 5,000 nodes is not a weekend project. You need ~$2-5M to clone this with any credibility.
  • Stack: 30/100. Cameras, IMUs, cloud ingest, labeling: all off-the-shelf. Integration is the hard part, not a moat.
  • Channel: 25/100. Getting Figure AI to sign a data contract requires warm intros. The YC W26 network supplied them; you do not have that network.
  • Network: 20/100. The buyer side is an oligopoly of 4 real buyers. The contributor side is recruitable but churns hard.
  • Timing: 80/100. Humanoid robot capex is in the steepest part of the curve. This window is open for 18-24 months.

Verdict: cool. Not because the business is bad — it's probably excellent — but because the buyer set is closed, the capital wall is real, and the YC channel is non-replicable. If you are reading this and thinking "I'll build this too," you are competing with a company that has Garry Tan's phone number and warm intros to Brett Adcock. The transferable lesson lives in the adjacent labeled-data marketplaces, not in cloning humanoid data itself. We get to that in the playbook.

5-Minute Walkthrough

I signed up as a contributor. Here's what actually happened.

The landing page is sparse: three sentences, an email field, and a "Join the network" button. No pricing. No demo. No "Trusted by Figure AI" logo wall, which may itself be a signal: if they are working with Figure AI, an NDA would preclude the brag. The signup flow asks for location, primary occupation, and a free-text description of the tasks I perform on a typical day. The form is short enough that I finished it during one paragraph of a podcast.

Then nothing. No instant onboarding email. No Calendly. A 48-hour silence followed by a real human reply asking if I'd be open to a 15-minute video screening call.

This is the first interesting detail. Asimov is not a self-serve marketplace. It is a curated marketplace where the supply side is screened. The reason becomes obvious when you think about it: a lab paying $X per hour for "human folding laundry" footage cannot tolerate adversarial submissions, edge-case background mess, or contributors who clip and resell footage to a competitor. The screening call is a moat disguised as friction.

I declined the call — I am not actually going to embed an Asimov camera in my kitchen for a teardown — but I asked the operator a few questions over email. They were guarded but answered three:

  1. Hardware is provided. You don't buy anything.
  2. Payment is per accepted hour of footage, with a quality multiplier.
  3. Contributors retain no rights to the footage once submitted.
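The second answer, payment per accepted hour with a quality multiplier, can be sketched as a small payout function. This is an illustration only: the operator disclosed no numbers, so the $25 base rate (borrowed from this article's own contributor-cost estimate), the multiplier values, and all names here are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    hours_recorded: float      # raw hours uploaded by the contributor
    hours_accepted: float      # hours that passed quality review
    quality_multiplier: float  # hypothetical scale, e.g. 0.8 to 1.5


def payout(sub: Submission, base_rate: float = 25.0) -> float:
    """Pay only for accepted hours, scaled by the quality multiplier.

    base_rate is an assumed dollar figure, not a disclosed Asimov rate.
    """
    if not 0 <= sub.hours_accepted <= sub.hours_recorded:
        raise ValueError("accepted hours must be between 0 and recorded hours")
    if sub.quality_multiplier < 0:
        raise ValueError("quality multiplier must be non-negative")
    return round(sub.hours_accepted * base_rate * sub.quality_multiplier, 2)


# Ten hours recorded, eight accepted, above-average quality.
example = payout(Submission(hours_recorded=10, hours_accepted=8, quality_multiplier=1.25))
```

The point of the structure is that rejected hours earn nothing, which pushes the quality risk onto the contributor.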

That last point is the whole business. The contributor is selling not their time but the ownership of a recorded moment of their life. The marketplace's job is to make that trade feel normal. It does, mostly, because the contributor never sees what the lab does with the footage. You record yourself making coffee. A check arrives. A Figure 02 in a Bay Area warehouse learns to grasp a mug. None of this feels connected to you. That detachment is the product.

Business Model

The marketplace take rate is where this gets interesting, and where the Mercor parallel breaks down in one important way.

Mercor takes a margin on expert hourly labor — software engineers, doctors, PhDs grading model outputs. The hourly rate is high (often $50-200/hour to the contractor) and the margin is moderate. Asimov is the opposite: the contributor rate is low (likely $15-40/hour of accepted footage) and the margin is structurally higher, because the raw material is something the contributor would have done anyway (cooking dinner, folding clothes).

Public rumors put lab pricing at $200-400 per hour of cleaned, labeled, multi-angle human-task footage. If contributor cost lands at $25/hour and Asimov's cleaning, labeling, and quality-control overhead runs $40-60/hour, then gross margin per hour sits somewhere in the 50-65% range. That is software-like margin on a hardware-flavored business.
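To make the margin arithmetic checkable, here is a minimal sketch that plugs the rumored figures above into a per-hour gross margin formula. The function name and the scenario grid are mine; the inputs are the article's estimates, not confirmed pricing.

```python
from itertools import product


def gross_margin(price: float, contributor_cost: float, overhead: float) -> float:
    """Gross margin per hour of footage as a fraction of the lab price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - contributor_cost - overhead) / price


CONTRIBUTOR_COST = 25.0        # article's assumed contributor rate, $/hour
LAB_PRICES = (200.0, 400.0)    # rumored lab pricing range, $/hour
OVERHEADS = (40.0, 60.0)       # cleaning, labeling, QC overhead, $/hour

# Margin for every combination of low/high price and low/high overhead.
scenarios = {
    (price, overhead): gross_margin(price, CONTRIBUTOR_COST, overhead)
    for price, overhead in product(LAB_PRICES, OVERHEADS)
}

for (price, overhead), margin in sorted(scenarios.items()):
    print(f"price ${price:.0f}, overhead ${overhead:.0f}: margin {margin:.3f}")
```

On these inputs alone the margin runs from roughly 58% (low price, high overhead) to roughly 84% (high price, low overhead). The 50-65% range quoted above is lower, so it presumably also nets out costs not itemized in that sentence, such as hardware amortization and recruiting; that reading is an assumption.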

The hardware is the loss leader. Asimov ships cameras and probably wearable IMUs to each contributor at a per-node cost that is meaningful — a multi-camera rig with edge compute is not a $50 webcam. The hardware is amortized across the contributor's footage output. A contributor who delivers 200 accepted hours over 18 months pays back the rig many times over. A contributor who churns at month 2 is a loss. So the screening call exists to filter for durability, not just quality.
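The durability argument is a break-even calculation, sketched below. The per-node cost is not public, so the $3,000 figure (rig plus shipping and onboarding) is purely hypothetical, as is the 20 accepted hours for a contributor who churns early. The $115 contribution per hour comes from the low-price, high-overhead case of the estimates above ($200 price, $25 contributor cost, $60 overhead).

```python
def break_even_hours(contribution_per_hour: float, node_cost: float) -> float:
    """Accepted hours a contributor must deliver before the node pays for itself."""
    if contribution_per_hour <= 0:
        raise ValueError("contribution per hour must be positive")
    return node_cost / contribution_per_hour


def node_net(accepted_hours: float, contribution_per_hour: float, node_cost: float) -> float:
    """Lifetime profit or loss on one deployed node, in dollars."""
    return accepted_hours * contribution_per_hour - node_cost


CONTRIBUTION = 115.0   # $/hour, derived from the article's estimates
NODE_COST = 3000.0     # hypothetical all-in cost per deployed node

durable = node_net(200, CONTRIBUTION, NODE_COST)   # 200 hours over 18 months
churned = node_net(20, CONTRIBUTION, NODE_COST)    # assumed early churner
threshold = break_even_hours(CONTRIBUTION, NODE_COST)
```

Under these assumptions the durable contributor clears the node cost several times over while the early churner leaves a loss, which is the asymmetry the screening call is meant to manage.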

Lab spend per hour of training data is the variable that decides the company's outcome. Figure AI's last raise (Mar 2026 $1.5B) implies a training budget in the $200-400M range. If 30% of that is data acquisition and Asimov captures even 15% of that 30%, you are looking at a single-customer ACV in the $9-18M range. With four such buyers, the TAM at maturity is plausibly $50-80M ARR for the category leader.
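The single-customer estimate is a chain of three multiplications, sketched here with integer percentages to keep the arithmetic exact. All inputs are the article's assumptions; the function name is illustrative.

```python
def acv_range_musd(
    budget_low_musd: int,
    budget_high_musd: int,
    data_share_pct: int,
    capture_pct: int,
) -> tuple[float, float]:
    """Annual contract value range in $M for one lab.

    budget * (share spent on data) * (share of that captured by the vendor).
    """
    def acv(budget_musd: int) -> float:
        return budget_musd * data_share_pct * capture_pct / 10_000

    return acv(budget_low_musd), acv(budget_high_musd)


low, high = acv_range_musd(200, 400, data_share_pct=30, capture_pct=15)
four_buyers = (4 * low, 4 * high)  # naive sum over four equally sized labs
```

This reproduces the $9-18M per-customer range. Multiplying by four equally sized buyers gives $36-72M, which overlaps the $50-80M figure but sits slightly below it; the higher figure may assume larger budgets at maturity, though that is a guess.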

That is small for a venture outcome, but Asimov is not betting on today's category size alone. The bet is that humanoid robots become the second-largest training-data category after LLMs by 2028, and that "training data marketplace for embodied AI" is a $1B+ category at peak. That bet is plausible. It is also exactly the kind of business that the four labs have every incentive to replace through vertical integration over time, which is the long-term risk and the reason the verdict above is cool, not warm.

Tech Stack

The stack is unglamorous and that is a compliment.

On the contributor side: a multi-camera rig (likely 3-4 cameras for multi-angle capture), wearable IMUs for limb tracking, an edge compute node for local pre-processing, and a cellular or WiFi uplink for batched upload. None of this is novel. The novel part is the form factor — Asimov has to make this rig non-invasive enough that a contributor will leave it in their kitchen for 18 months. That is a hardware design problem more than a hardware capability problem.

On the cloud side: an ingest pipeline that handles multi-gigabyte daily uploads from 5,000+ nodes (likely S3 or R2 backed, with regional accelerators), a labeling pipeline that combines auto-labeling models with human review, a quality-scoring system that rejects footage with the wrong frame rate, occlusion, lighting, or task ambiguity, and a delivery layer that packages labeled datasets for lab consumption.
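A quality-scoring gate of the kind described could look like the sketch below. It checks the four rejection causes named above; every field name and threshold is an assumption for illustration, not Asimov's actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipMetrics:
    fps: float              # measured frame rate of the clip
    occlusion_ratio: float  # fraction of frames with hands/objects hidden, 0..1
    mean_brightness: float  # average luma on a 0..255 scale
    task_confidence: float  # classifier confidence in the task label, 0..1


def rejection_reasons(m: ClipMetrics, expected_fps: float = 30.0) -> list[str]:
    """Return every reason a clip fails the gate; an empty list means accepted.

    All thresholds are hypothetical placeholders.
    """
    reasons = []
    if abs(m.fps - expected_fps) > 0.5:
        reasons.append("frame_rate")
    if m.occlusion_ratio > 0.30:
        reasons.append("occlusion")
    if not 40 <= m.mean_brightness <= 220:
        reasons.append("lighting")
    if m.task_confidence < 0.60:
        reasons.append("task_ambiguity")
    return reasons


good = rejection_reasons(ClipMetrics(30.0, 0.05, 128.0, 0.92))
bad = rejection_reasons(ClipMetrics(24.0, 0.45, 20.0, 0.30))
```

Returning all reasons rather than stopping at the first makes it possible to tell a contributor what to fix, which matters if payment depends on accepted hours.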

The labeling layer is where the real engineering lives. You cannot pay humans $20/hour to manually label every gesture in 5,000 hours of daily footage — the math does not work. So Asimov is almost certainly running a hybrid: a foundation model (probably an internal fine-tune of an open vision-language model) does first-pass labeling, and human reviewers spot-check and correct. The model improves as the dataset grows. This is the same architecture every modern data marketplace converges on, and it is what makes the take rate sustainable.
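The hybrid described above, model first and humans on a sample, reduces to a routing rule. The sketch below sends low-confidence auto-labels to human review along with a random spot-check of the confident ones. The thresholds, names, and the routing rule itself are assumptions; the article only infers that some hybrid exists.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoLabel:
    clip_id: str
    label: str
    confidence: float  # model confidence in its own label, 0..1


def route_for_review(
    labels: list[AutoLabel],
    min_confidence: float = 0.85,
    spot_check_rate: float = 0.05,
    seed: int = 0,
) -> tuple[list[AutoLabel], list[AutoLabel]]:
    """Split auto-labels into (auto_accepted, human_queue).

    Low-confidence labels always go to humans; confident labels are
    sampled at spot_check_rate so reviewers can catch silent model errors.
    """
    rng = random.Random(seed)  # seeded so routing is reproducible
    auto_accepted, human_queue = [], []
    for item in labels:
        if item.confidence < min_confidence or rng.random() < spot_check_rate:
            human_queue.append(item)
        else:
            auto_accepted.append(item)
    return auto_accepted, human_queue


batch = [
    AutoLabel("clip-001", "fold_towel", 0.97),
    AutoLabel("clip-002", "pour_coffee", 0.62),
]
accepted, queued = route_for_review(batch, spot_check_rate=0.0)
```

Human corrections from the queue are what would feed the next fine-tune, so the review cost per hour should fall as the model improves.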

Dataset packaging is the final mile. Labs do not want raw footage; they want labeled, time-synced, multi-modal datasets with consistent schema across contributors. Asimov's competitive advantage at the lab level is probably less about footage volume and more about schema consistency. A Figure AI training engineer who can pull a uniform dataset across 5,000 kitchens is buying time, not pixels.
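Schema consistency is easiest to see as a validation pass over a batch before delivery. The record layout, the required streams, and the version string below are all hypothetical; the article does not know what schema Asimov actually ships.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    contributor_id: str
    schema_version: str
    task_label: str
    start_s: float            # segment start on the shared clock, seconds
    end_s: float              # segment end on the shared clock, seconds
    streams: tuple[str, ...]  # sensor streams present in this segment


REQUIRED_STREAMS = {"cam0", "imu"}  # hypothetical minimum for delivery


def schema_errors(batch: list[Segment], schema_version: str = "1.0") -> list[tuple[int, str]]:
    """Return (index, problem) pairs; an empty list means the batch is uniform."""
    errors = []
    for i, seg in enumerate(batch):
        if seg.schema_version != schema_version:
            errors.append((i, "schema_version"))
        if seg.end_s <= seg.start_s:
            errors.append((i, "time_range"))
        if not REQUIRED_STREAMS.issubset(seg.streams):
            errors.append((i, "missing_streams"))
    return errors


batch = [
    Segment("kitchen-17", "1.0", "fold_towel", 0.0, 12.5, ("cam0", "cam1", "imu")),
    Segment("kitchen-42", "0.9", "fold_towel", 5.0, 5.0, ("cam0",)),
]
problems = schema_errors(batch)
```

A lab engineer who receives only batches that pass a check like this can write one loader for every contributor, which is the time saving described above.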

The stack is replicable in 6-9 months by a competent team. Which means the stack is not the moat. The contributor network and the lab contracts are.

Distribution

This is where the playbook gets honest.

Asimov did not acquire 5,000 contributors through paid ads. The CAC math would have been crushing — at a $25/hour contributor rate, you cannot spend $200 to acquire someone who might churn at month 2. The acquisition flywheel is almost certainly some combination of three channels:

Twitter / X creator economy. YC W26 launched in March 2026, which means Asimov went out the gate with an X post from the founders and YC partner amplification. The "make money from your kitchen camera" framing is shareable. Early contributors who got paid posted screenshots. The flywheel started.

Craigslist-style contributor recruitment. The non-tech contributors — line cooks, hotel housekeepers, factory workers — do not find Asimov through X. Asimov finds them through job boards, regional Facebook groups, and on-site recruiters at restaurants and hotels. The 5,000-contributor claim only makes sense if a non-trivial fraction came through this offline channel. That requires a recruiting team, which is a real cost line that the gross margin needs to absorb.

YC partners → labs. This is the channel that cannot be cloned. The Asimov founders walked out of Demo Day with intros to every frontier robotics lab. Garry Tan and the YC partner network make this introduction frictionless. A non-YC competitor founder spends 9 months getting the same meetings, and by then Asimov has a signed contract and a 6-month head start on dataset coverage.

If you are a founder reading this and asking "can I replicate this distribution stack?" — the contributor side, yes. The lab side, no. The lab side is why YC works, and why YC W26 humanoid bets like Asimov are not really competing with non-YC teams.

The transferable insight is this: a marketplace with 4 buyers and 5,000 sellers is not a marketplace, it is a sales-led B2B business with a labor pool. The sellers are recruitable. The buyers are a closed Rolodex. Treat the buyer Rolodex as the asset.

Why Now

Three things converged in late 2025 and early 2026 that made Asimov possible.

First, humanoid robot capex hit escape velocity. Figure AI raised $1.5B at a $39.5B valuation in Mar 2026. 1X raised $100M in Jan 2026. Tesla committed to Optimus production at scale. Apptronik raised $350M. The four buyers Asimov needs all have nine-figure training budgets approved for the next 24 months. That window did not exist in 2024.

Second, synthetic data plateaued for embodied tasks. The 2024-2025 consensus was that simulation (NVIDIA Isaac, MuJoCo) could carry humanoid training most of the way. By late 2025, the labs had quietly concluded that the sim-to-real gap was wider than expected and that real human footage was a non-optional ingredient. This shifted the demand curve for exactly the data Asimov supplies.

Third, Scale AI proved the playbook. Scale built a $14B business by being the labeled-data marketplace for the LLM era. Many of the frontier models from 2022-2025 relied on Scale's pipeline. Asimov is the obvious analog for the embodied-AI era, and the labs already know how to procure data through a third-party marketplace because Scale taught them. The procurement muscle is built.

The window is real but it is also short. By 2028, Figure AI and Tesla will likely have internal data operations that compete with Asimov directly. The bet is to capture enough dataset coverage and schema lock-in by then that the labs prefer to keep buying rather than rebuild. This is the same bet Scale won. It is also the bet Scale is now losing, slowly, as OpenAI and Anthropic build internal pipelines. Worth watching.

Founder

I do not have public confirmation on the founding team, and I am not going to fabricate names. What I can say from the company shape:

The founders are almost certainly a hardware-software pair, with at least one cofounder coming from a robotics or autonomous-vehicle background (Cruise, Waymo, Figure, or a robotics PhD program) and at least one coming from a marketplace or operations background. The hardware deployment to 5,000 nodes is not a thing you do without a hardware-fluent cofounder, and the contributor recruitment operation is not a thing you do without an operator. The YC W26 batch skewed heavily toward technical-operator pairs, which fits.

The thing to watch for, if you are evaluating Asimov as an investor or as a customer, is whether the founding team has prior frontier-lab relationships. The lab contracts are the whole game. A founder who was at Figure AI or worked on Tesla Autopilot data ops has a 12-month head start over a founder whose only route into those labs is a YC partner intro.

Part 2 · Buildable Blueprint

Replicate Playbook

Step-by-step build plan: MVP scope, 30-day timeline, launch strategy, pricing decisions, risk matrix, cost breakdown.

Cite this article

APA: Liu, J. (2026, May 18). Asimov Teardown — YC W26 Humanoid Training Data Marketplace. OpenAI Tools Hub. https://www.openaitoolshub.org/ai-product-research/asimov-ai

BibTeX:

@misc{liu2026asimovai,
  author = {Liu, Jim},
  title  = {Asimov Teardown — YC W26 Humanoid Training Data Marketplace},
  year   = {2026},
  url    = {https://www.openaitoolshub.org/ai-product-research/asimov-ai}
}