Submagic Teardown — David Zitoun's $800K MRR Solo Bootstrap AI Video Captions

TL;DR

David Zitoun, a French solo founder, took Submagic from zero to roughly $800K MRR (~$10M ARR) inside 18 months without raising a dollar of venture capital. The product itself is unglamorous on paper: you upload a vertical video, Submagic transcribes it with a Whisper-class model, drops animated captions on top in one of a dozen creator-popular styles, and exports a TikTok-ready MP4. That's it. No moat in the AI. No moat in the rendering. The moat is David's Twitter feed (@davidzitoun), where he posted MRR screenshots, churn numbers, server costs, and product demos for 18 months straight while building in public.

This teardown walks through what's actually copyable and what isn't. The product is copyable in a weekend. The distribution is copyable but takes 6-12 months of consistent posting. The timing window (early Whisper API + creator-economy explosion of 2023-2024) is mostly closed — but a niched-down version aimed at one creator format (podcast clip captions, sports highlight captions, gaming clip captions) is still wide open.

Quick Facts

Metric | Value | Source
Founder | David Zitoun (solo) | Twitter @davidzitoun, multiple BIP threads
Country | France | Founder's public bio
Launch | Mid-2023 | Twitter timeline
Reported MRR | ~$800K (Q3-Q4 2024) | Founder Twitter screenshots
Estimated ARR | ~$10M | Derived from MRR
Team size | ~3-5 (Zitoun + small team late 2024) | Job posts + interviews
Pricing | $16/mo Essential, $33/mo Pro, $69/mo Business | submagic.co/pricing
Funding | Bootstrapped (zero VC) | Repeated founder statements
Primary channel | TikTok demos + Twitter BIP + creator affiliates | Observable
Core tech | Whisper-class transcription + animated overlay rendering | Standard stack

The Product

I watched 12 Submagic demos on TikTok before opening the dashboard. Every single demo follows the same 11-second loop: founder uploads raw vertical clip, hits one button, jump-cuts to a stylized output with yellow-and-white kinetic text bouncing in sync with each spoken word. That's the entire product.

Inside the actual app:

  1. Upload a video (drag-drop, or paste a YouTube URL).
  2. Auto-transcribe runs in the background. Submagic appears to use a Whisper derivative; based on speed and accuracy, I'd guess whisper-large-v3 or a fine-tune served via Replicate or Groq.
  3. Pick a caption template from a gallery (~30 templates: "Beast" mimicking MrBeast yellow-glow, "Hormozi" mimicking Alex Hormozi white-on-black, "Iman" mimicking Iman Gadzhi, and so on). The templates are explicitly modeled on top YouTube creators. This is the entire UX hook.
  4. Edit captions word-by-word, fix transcription errors, change colors, adjust timing. The editor is the part most users never touch — they pick a template and export.
  5. Add B-roll, sound effects, emoji overlays (these were added in 2024 to defend against Opus Clip).
  6. Export to MP4 with burned-in captions. Render takes 30-90 seconds for a 60-second clip.
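
To make steps 2 to 4 concrete, here is a minimal sketch of how word-level timestamps from a Whisper-class transcript could be grouped into short on-screen caption cues. It is not Submagic's code. The Word and Cue types, the group_words helper, and the max_words and max_gap defaults are all hypothetical, and the input shape (text, start, end in seconds) is an assumption about what the transcription step returns.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    text: str
    start: float  # seconds from the start of the clip
    end: float    # seconds from the start of the clip


@dataclass
class Cue:
    words: List[Word]

    @property
    def start(self) -> float:
        return self.words[0].start

    @property
    def end(self) -> float:
        return self.words[-1].end

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


def group_words(words: List[Word], max_words: int = 3, max_gap: float = 0.6) -> List[Cue]:
    """Split a word stream into cues of at most max_words words.

    A new cue also starts whenever the speaker pauses for longer than
    max_gap seconds, so a caption never spans a long silence.
    """
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        too_long = len(current) >= max_words
        paused = bool(current) and word.start - current[-1].end > max_gap
        if current and (too_long or paused):
            cues.append(Cue(current))
            current = []
        current.append(word)
    if current:
        cues.append(Cue(current))
    return cues


if __name__ == "__main__":
    demo = [
        Word("this", 0.0, 0.2), Word("is", 0.25, 0.4), Word("the", 0.45, 0.6),
        Word("entire", 0.65, 1.0), Word("product", 2.0, 2.5),
    ]
    for cue in group_words(demo):
        print(f"{cue.start:.2f}-{cue.end:.2f}  {cue.text}")
```

Short cues of two or three words are what give this caption style its punchy rhythm. Manual word-by-word editing (step 4) would then operate on the Cue list before rendering.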

The technical work behind "burning captions onto a video" is FFmpeg + a custom overlay renderer (likely Remotion or a Canvas/WebGL pipeline). None of this is novel. The novelty is the gallery of templates that look exactly like what MrBeast / Hormozi / Iman use, which lets a small creator visually claim creator-economy status before they have it.
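
As a rough illustration of the FFmpeg half of that stack, the sketch below writes word-highlighted captions as an ASS subtitle script and builds an FFmpeg command that burns them in with the ass video filter (which needs an FFmpeg build with libass). This is a swapped-in simplification, not the Remotion or Canvas/WebGL renderer guessed at above, and not Submagic's pipeline. The helper names, font, sizes, and colors are hypothetical, and the file paths are assumed to be simple names that need no filter-graph escaping.

```python
# Each cue is a list of (text, start_seconds, end_seconds) tuples.
ASS_HEADER = """[Script Info]
ScriptType: v4.00+
PlayResX: 1080
PlayResY: 1920

[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Default,Arial,96,&H00FFFFFF,&H000000FF,&H00000000,&H00000000,-1,0,0,0,100,100,0,0,1,6,0,2,60,60,400,1

[Events]
Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
"""


def ass_time(seconds):
    """Format seconds as the H:MM:SS.cc timestamp that ASS events use."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"


def highlight_events(cue, active="&H00FFFF&", base="&HFFFFFF&"):
    """One Dialogue line per word: full cue text, current word recolored.

    ASS colors are written blue-green-red, so &H00FFFF& is yellow.
    """
    lines = []
    for i, (_, start, end) in enumerate(cue):
        # Hold the highlight until the next word starts to avoid flicker.
        stop = cue[i + 1][1] if i + 1 < len(cue) else end
        parts = [
            f"{{\\c{active}}}{text}{{\\c{base}}}" if j == i else text
            for j, (text, _, _) in enumerate(cue)
        ]
        body = " ".join(parts)
        lines.append(
            f"Dialogue: 0,{ass_time(start)},{ass_time(stop)},Default,,0,0,0,,{body}"
        )
    return lines


def build_ass(cues):
    """Return a complete ASS script for a list of cues."""
    events = [line for cue in cues for line in highlight_events(cue)]
    return ASS_HEADER + "\n".join(events) + "\n"


def ffmpeg_burn_cmd(src, ass_path, dst):
    """Build (not run) an FFmpeg command that burns the captions into the video."""
    return ["ffmpeg", "-y", "-i", src, "-vf", f"ass={ass_path}", "-c:a", "copy", dst]


if __name__ == "__main__":
    cue = [("That's", 0.0, 0.3), ("it", 0.35, 0.6)]
    print(build_ass([cue]))
    print(" ".join(ffmpeg_burn_cmd("clip.mp4", "captions.ass", "out.mp4")))
```

A production renderer would add the bounce and scale animations that define the templates. Plain ASS recoloring only covers the word-by-word highlight, which is why a custom overlay renderer is the more likely choice for the real product.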

Pricing tiers (as of late 2024 / early 2025):

  • Essential: $16/mo ($192/yr billed monthly, ~$120/yr billed annually): 25 videos/month, 720p export
  • Pro: $33/mo ($396/yr billed monthly): 100 videos/month, 1080p export, all templates, B-roll
  • Business: $69/mo ($828/yr billed monthly): 400 videos/month, 4K export, team seats

Free tier: 3 videos to try, watermarked. This is the conversion funnel.
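
The arithmetic behind these numbers is easy to check. The sketch below uses only figures quoted in this teardown (the ~$800K reported MRR and the three list prices). The function names are hypothetical, and the subscriber counts are bounds that assume every customer sits on a single tier at the monthly list price, since the real tier mix is not public.

```python
import math

REPORTED_MRR = 800_000  # USD, founder-reported figure cited in this teardown
TIERS = {"Essential": 16, "Pro": 33, "Business": 69}  # USD per month, list price


def annual_run_rate(mrr):
    """ARR is simply twelve times MRR."""
    return mrr * 12


def subscribers_needed(mrr, monthly_price):
    """Paying subscribers required if everyone paid this one monthly price."""
    return math.ceil(mrr / monthly_price)


def annual_discount(monthly_price, annual_price):
    """Fractional discount of an annual plan versus twelve monthly payments."""
    return 1 - annual_price / (monthly_price * 12)


if __name__ == "__main__":
    print(f"ARR: ${annual_run_rate(REPORTED_MRR):,}")
    for name, price in TIERS.items():
        count = subscribers_needed(REPORTED_MRR, price)
        print(f"{name}: {count:,} subscribers at ${price}/mo")
    print(f"Essential annual discount: {annual_discount(16, 120):.1%}")
```

Under those assumptions, $800K MRR works out to $9.6M ARR (rounded to ~$10M elsewhere in this report) and implies somewhere between roughly 11,600 all-Business and 50,000 all-Essential paying subscribers. The ~$120/yr Essential annual price is about a 37.5% discount on monthly billing.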

The David Zitoun Solo Story

David Zitoun is unusual in this market because he's French, he's loud on Twitter, and he refused VC money even at $400K MRR. His Twitter feed is the marketing engine. A representative tweet pattern: he posts a screenshot of Stripe MRR, comments "from $40K to $200K MRR in 4 months — here's what worked," and threads through 12 product decisions. The thread goes viral. New users sign up. MRR ticks up. He screenshots the new MRR. Cycle repeats.

This is Build In Public as a customer acquisition channel, run on hard mode (solo, in English as a second language, against US-based competitors with bigger teams).

What he revealed across 2023-2024:

  • First $10K MRR took ~3 months, mostly from a single viral TikTok where he showed the product side-by-side with CapCut.
  • From $10K to $100K MRR took another 4 months, driven by affiliate creators on TikTok plus a partnership pattern where he'd DM 50
