The first platform where developers test their vibe coding skills.

Solve real coding challenges on your own machine with your own AI agent: Claude Code, Cursor, or Codex. Kodwai scores how well you direct the agent, not what you memorized, and ranks you on a public leaderboard.

fully free · bring your own agent · claude code, cursor or codex
// live demo

browse · solve with your agent · submit · score

Made for developers who want to work at
Google
Meta
Apple
Microsoft
Amazon
Stripe
Netflix
Vercel
// 01 · the premise

LeetCode doesn't prove much anymore.

The work changed. You point an agent at a real problem, catch it when it is confidently wrong, and check what it actually shipped. That judgment is the skill, and it is the thing kodwai scores.

// three reasons the old test fails

  1. the format is stale

    Built for a different era

    Whiteboard puzzles and LeetCode grinds were designed for engineers working alone with nothing but an editor. Point an agent at one and it clears the puzzle in seconds. You learn nothing about the engineer.

  2. green is not the same as good

    Passing tests proves little

    A single careless prompt can make the suite go green and still show no judgment at all. No verification, no decomposition, no recovery when the agent goes confidently wrong. The checkmark hides everything that matters.

  3. the real work is unmeasured

    Nothing measures how you work

    You spend your day directing an agent: writing the spec, catching hallucinations, checking what actually shipped. That is the skill that decides who is good now, and until kodwai, nothing put a number on it.

// 02 · how it works

From pick to scored, in five steps.

No sandbox, nothing to install that fights you. You work on your own machine with your own agent, and Kodwai scores the whole session.

  1. Pick a challenge

    Browse real, ticket-sized problems across every category you actually ship in. Filter by difficulty and pick one that looks like the work you actually do.

  2. Run the CLI

    Start it from your terminal and choose your agent. We download PROBLEM.md, starter files and tests, init a git repo, and start the timer.

    $ npx @kodwai/cli challenge <slug>
    Claude Code · Cursor · Codex
  3. Solve on your machine

    Work the problem with your own agent in your own editor. No sandbox to fight, no artificial constraints, just how you really build.

  4. Submit

    One command packages your code, git history, test runs, agent transcript, and the time you took, then ships it for scoring.

    $ npx @kodwai/cli submit
  5. Get your score

    Direction, Outcome, and Lift land with per-signal evidence, so you can see why each axis scored the way it did. Then you are on the leaderboard.

// step 02 · run the cli

One command pulls the problem, sets your agent, inits git, and starts the clock. Your agent opens right where you left off. Then you build it your way.

and you are scored: 94 · direction 48 · outcome 33 · lift 13
// 03 · the aha

See exactly how you vibe code.

Same challenge, two developers. A careless one-shot prompt can pass the tests. It still scores low, because passing tests is not skill. kodwai reads the whole session, so the score rewards how you drive.

Welcome to Claude Code (v2.1.34)
cwd: ~/kodwai/rate-limiter · model: claude-opus-4-7
session · one-shot prompt
Careless one-shot
algorithm: rate limiter · hard · 45 min
build a sliding-window rate limiter, make the tests pass
Done. Added RateLimiter with a deque per key.
12 passed
ship it
no verification · no edge probing · 1 turn
58 / 100
low score · tests green, judgment absent
Direction 21 / 50
Outcome 33 / 35
Lift 4 / 15

Tests are green, but there is no steering, no verification, and no recovery. Direction collapses the total.

Welcome to Claude Code (v2.1.34)
cwd: ~/kodwai/rate-limiter · model: claude-opus-4-7
session · driven session
Engineer who drives
algorithm: rate limiter · hard · 45 min
spec first: per-key window, monotonic clock, no memory leak
Plan: window store, eviction, concurrency guard.
9 passed, 3 failing
the burst test races. add a per-key lock, prove it.
pytest -k concurrency → 3 passed
verified · race fixed · 4 commits
92 / 100
high score · steered, checked, hardened
Direction 47 / 50
Outcome 33 / 35
Lift 12 / 15

Tests green and the agent was steered, verified, and hardened. Direction carries the score.

kodwai reads the whole session: the prompts, the recovery, the test runs, the commits. The score is dominated by Direction, the part a one-shot cannot fake.
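To make the arithmetic behind the two sessions concrete, here is a minimal Python sketch. It assumes the total is a plain sum of the three axis scores, which matches the other figures on this page (for example 48 + 33 + 13 = 94), but the `AxisScores` class and its field names are illustrative and not part of kodwai's actual scoring code.

```python
from dataclasses import dataclass

# Axis caps as stated on the page: Direction 50, Outcome 35, Lift 15.
AXIS_CAPS = {"direction": 50, "outcome": 35, "lift": 15}


@dataclass(frozen=True)
class AxisScores:
    """Hypothetical container for one scored session (names are illustrative)."""

    direction: int
    outcome: int
    lift: int

    def __post_init__(self) -> None:
        # Reject values outside the published cap for each axis.
        for axis, cap in AXIS_CAPS.items():
            value = getattr(self, axis)
            if not 0 <= value <= cap:
                raise ValueError(f"{axis} must be between 0 and {cap}, got {value}")

    @property
    def total(self) -> int:
        # Assumption: the 0 to 100 score is the plain sum of the three axes.
        return self.direction + self.outcome + self.lift


# Axis values copied from the two sessions shown above.
careless = AxisScores(direction=21, outcome=33, lift=4)
driven = AxisScores(direction=47, outcome=33, lift=12)

gap = driven.total - careless.total
direction_gap = driven.direction - careless.direction

print(f"careless one-shot: {careless.total} / 100")
print(f"driven session:    {driven.total} / 100")
print(f"gap: {gap} points, {direction_gap} of them from Direction")
```

Under that assumption the two sessions land at 58 and 92. Outcome is identical in both, so the whole gap comes from Direction and Lift, with Direction contributing most of it.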

// 04 · the challenges

Problems worth shipping.

15 live challenges across 10 categories and three difficulties. Each one is scoped like a real ticket, not a riddle. Pick the track that looks like the work you actually do.

easy · backend

Bookshelf REST API

Junior Backend Engineer interview. Build a small REST API from scratch with CRUD, filters, validation, persist...

~60 min · solve
hard · backend

Multi-Currency Wallet Ledger with Idempotent Transfers

Senior Backend Engineer interview. Build the core double-entry ledger for a multi-currency wallet: atomic tran...

~120 min · solve
hard · backend

Process / Task Orchestrator-Lite

Platform / Infra interview. Build a task orchestrator that runs a DAG with dependency ordering, a global concu...

~120 min · solve
// 05 · the score

What the score actually measures.

A one-shot “solve this” prompt can clear the tests, so passing tests is not enough. The score is dominated by how you direct the agent, the part a careless prompt cannot fake.

91 / 100

sample run · rate limiter

direction 45 / 50
outcome 34 / 35
lift 12 / 15
Direction · 50 pts

how you steer, verify, and decompose

Intent fidelity 96
Verification rigor 92
Spec precision 89
Decomposition 86
Recovery 84
Engagement 90
Outcome · 35 pts

what actually shipped, and whether it holds

Tests passed 100
Code quality 93
Complexity 88
Lift · 15 pts

the edges a one-shot prompt misses

Edge-case coverage 82
evidence · one signal
Verification rigor
axis · direction · +6 pts
you → transcript · turn 14
before we move on, write a test that fires 1k concurrent requests and assert no tokens leak past the window

Why it scored. You forced the agent to prove the concurrency claim instead of trusting it. Cited from turn 14, 41s before the first commit.

scored 0 to 100 · direction 50 · outcome 35 · lift 15 · every signal cites its evidence
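One way to picture "every signal cites its evidence" is as a small record that ties a signal to the transcript turn that earned it. The sketch below is only an illustration: `SignalEvidence` and all of its field names are hypothetical, and the values are copied from the Verification rigor example above rather than from any real kodwai data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalEvidence:
    """Hypothetical shape of one evidence entry (not kodwai's real schema)."""

    signal: str     # e.g. "Verification rigor"
    axis: str       # "direction", "outcome", or "lift"
    points: int     # points this piece of evidence contributed
    turn: int       # transcript turn the evidence is cited from
    quote: str      # what the developer actually wrote
    rationale: str  # the "why it scored" explanation

    def summary(self) -> str:
        # One-line citation, with an explicit sign on the points.
        return (
            f"{self.signal} ({self.axis}, {self.points:+d} pts), "
            f"cited from turn {self.turn}"
        )


evidence = SignalEvidence(
    signal="Verification rigor",
    axis="direction",
    points=6,
    turn=14,
    quote=(
        "before we move on, write a test that fires 1k concurrent requests "
        "and assert no tokens leak past the window"
    ),
    rationale="You forced the agent to prove the concurrency claim instead of trusting it.",
)

print(evidence.summary())
```

Keeping the quote and the turn number next to the points is what lets a reader check the score against the session instead of taking it on trust.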

// 06 · climb

Rank, earn, and prove it.

Every scored run moves you up the global leaderboard and builds a public profile you can send to anyone.

global leaderboard · difficulty-weighted
01 · Jamie Brooks · @jamie · CLAUDE CODE · 96 / 100
02 · Sarah Chen · @schen · CLAUDE CODE · 94 / 100
03 · Kenji Tanaka · @ktanaka · CURSOR · 93 / 100
04 · Alex Mendez · @amendez · CODEX · 91 / 100
05 · Priya Rao · @priyar · CLAUDE CODE · 90 / 100
your spot: unranked · run one to claim it · join
public profile
Jamie Brooks · @jamie · claude-code · RANK 1
96 · Direction 48 / 50 · Outcome 34 / 35 · Lift 14 / 15
earned · +9

Badges that stack up.

shareable to x & linkedin
First Blood
Five Down
Ten Strong
Quarter Century
On Fire
Week Warrior
Monthly Machine
Top 10%
Speed Demon
Perfectionist
Polyglot
Claude Master
Cursor Pro
Early Adopter

Milestones, streaks, skill and agent badges land automatically as you submit. Your profile at kodwai.com/developers/you shows your score, your rank, the badges you have earned, and the agents you drive. Built to send to anyone, including a hiring manager instead of a take-home.

// 07 · the proof

Numbers we are happy to stand behind.

No vanity metrics. Just what the platform is, what it costs you, and how honestly it measures the way you actually work.

the category · 1st
platform built to score how you drive AI agents, not what you memorized.
the setup · 100%
local. Your machine, your agent, your editor. No sandbox to fight, no fake constraints.
the price · $0
to play. Fully free, bring your own agent, keep your machine.
the score · 3 axes
Direction, Outcome, and Lift, every signal citing its own evidence.
measured, not marketed · every signal cites its own evidence
// 09 · questions

Frequently asked questions.

Everything worth knowing before your first run. Still curious? The answer is one message away.

What is vibe coding, and how do you score it? // scoring

Vibe coding is building real software by directing an AI agent instead of typing every line yourself. Kodwai scores the session across three axes: Direction (how you steer, verify, and decompose), Outcome (what actually shipped and whether it passes), and Lift (the edge cases a one-shot prompt misses). Every signal cites its own evidence from your transcript, commits, and test runs.

Which agents and languages are supported? // agents · langs

Bring your own agent. Claude Code and Cursor are first-class, and anything you run in your terminal works, including Codex CLI, Aider, Cline, and more. Challenges span every mainstream category and most mainstream languages, since you solve on your machine with your own setup.

Do I solve challenges locally or in a sandbox? // local

Locally, always. The CLI downloads the problem, starter files, and tests, inits a git repo, and starts the timer. You work in your own editor with your own agent. There is no browser sandbox to fight and no artificial constraints.

Is it really free? // pricing

Yes. Solving challenges, your score, your profile, and the leaderboard are free for developers. The hiring track is the paid product, for teams running interviews.

How can a score be fair if a one-shot prompt passes the tests? // fairness

Passing tests is necessary but not sufficient. The score is dominated by Direction, the part a careless prompt cannot fake. A solution that clears tests with no steering, no verification, and no decomposition scores poorly on the axis that matters most.

What does the public profile show? // profile

Your score, your rank, the badges you have earned, and the agents you drive, at kodwai.com/developers/you. It is built to send to anyone, including a hiring manager instead of a take-home.

Question that is not here? hakan@kodwai.com
// 10 · begin

Stop grinding puzzles.
Prove how you build.

Fully free. Your own agent, your own machine, your own editor. You pick your path on the way in.