DataPizza · B2C · AI Agent · 2025 — Now

Moodio — All‑in‑One Video Agent

A web‑based agent that helps anyone brainstorm, generate and edit targeted AI video outputs — with a behavior‑driven data flywheel improving the model with every session. Work in progress — beta launched March 2026.

Role
Solo Designer + Vibe‑Code Co‑Builder

Timeline
Jan 2025 – Now

Team
1 designer (me) · 6 research & engineering · 3 product & data · 5 filmmakers · 3 advisors

Tools
Figma · Cursor · Claude · Next.js · LLM APIs

Build Method · Vibe‑Coded with Engineering

This product didn't go through traditional handoff. I designed every flow in Figma and vibe‑coded each screen shoulder‑to‑shoulder with engineering in Cursor + Claude — taking ideas from sketch to production in the same afternoon. The result: design and code stayed one conversation, not two artifacts.

Section 01

Where professional craft compounds.

Turn every project into an AI asset. Accelerate professional film creation.

One agent that runs your studio's whole workflow. Built by a CMU foundation-model research team working on expert data curation, agent evaluation, and data-flywheel post-training. The team's top-conference research has been adopted by 100+ AI labs, including Google DeepMind, ByteDance, and xAI.

Section 02

Three Core Capabilities

One agent for the whole studio — built on three complementary pillars.

  • Industry-leading visual retrieval. Search across millions of frames with cinematic precision. On Moodio's benchmarks, it outperforms both Google Gemini Embedding 2 and Alibaba Qwen-3-VL Embedding.
  • AI workflow spanning the full film pipeline. From brief to mood, shot list to cut: one chat surface that switches between retrieval, generation, and inline edit without leaving the canvas.
  • Personalized agent shaped by your team. A data flywheel built from real user interactions: the agent learns each studio's taste, references, and conventions session by session.

Section 03

Moodio is a research-driven company.

Our research has become a benchmark in video understanding and cinematic generation evaluation. It is published at CVPR, NeurIPS, and ECCV, and adopted by frontier labs including Google DeepMind, ByteDance, xAI, Kling, and Midjourney.

NeurIPS 2025
Spotlight · Top 3%

CameraBench

Understanding camera motion in any video. Adopted by Google DeepMind, xAI, Kling, and frontier video labs.

CVPR 2026
Highlight · Top 3%

CHAI

Building precise video language with human-AI oversight. The framework behind Moodio's caption quality and retrieval precision.

ECCV 2024
Adopted in Google Imagen-3

VQAScore

State-of-the-art metric for evaluating text-to-visual generation. Google DeepMind named it the strongest replacement for CLIPScore.

CVPR 2024
Best Paper · SynData Workshop

GenAI-Bench

Benchmark for compositional text-to-visual generation. Uniquely adopted in the official Google Imagen-4 technical report.

Whitepaper
Cinematic Video Search · SOTA

Moodio Retrieval

The first to bring visual reference into the video generation workflow. Retrieval outperforms both Google Gemini Embedding 2 and Alibaba Qwen-3-VL Embedding.

Whitepaper · Forthcoming
Agent Post-Training

Moodio Agent

First end-to-end video production agent with reward learning from real user interactions. The data flywheel behind studio-grade workflow.

Section 04

Frontier labs and studios use our datasets and models.

Two numbers tell the story.

2M+
Hugging Face downloads

100+
industry labs using our research

In use at

Google DeepMind
ByteDance
xAI
Kling
Midjourney
Runway
Luma
Adobe
Tencent
Microsoft
SenseTime
Xiaomi