Autonomous AI Research · March 2026 repo

The five-minute loop, explained

Autoresearch is Karpathy's five-minute loop for autonomous AI research.

An agent edits train.py, runs a short experiment, checks val_bpb, and keeps only the ideas that win. This page shows what the repo does, why the constraint matters, and how to get to a first run.

Creator

Andrej Karpathy

Launch

March 2026

Core loop

5-minute experiments

Loop snapshot (source-backed): autoresearch research loop artwork

Primary source

The README spells out the files, the five-minute training budget, and the rules that drive the loop.

Core mechanic

The agent edits train.py, runs the experiment, reads the metric, and keeps only the better idea.

Overview

What makes autoresearch different from a normal training repo

A real feedback loop, not just automation

The agent proposes a change, runs the experiment, reads the result, and keeps only the ideas that improve the metric.

One editable file keeps the system legible

Most research decisions stay inside train.py, which makes each iteration easy to inspect and compare.

A fixed budget keeps comparisons honest

Every run gets the same five-minute wall-clock window, so changes compete under the same constraint.
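The fixed window can be pictured as a wall-clock cutoff wrapped around the step loop. This is a sketch under stated assumptions: the repo's actual training code isn't shown on this page, and `train_with_budget`, `step_fn`, and `budget_seconds` are illustrative names, not the repo's API.

```python
import time

def train_with_budget(step_fn, budget_seconds=300.0):
    """Run training steps until a fixed wall-clock budget expires.

    step_fn: one training step, returning its loss.
    budget_seconds: the shared window (five minutes = 300 s by default).
    Illustrative sketch only; not code from the autoresearch repo.
    """
    start = time.monotonic()
    losses = []
    while time.monotonic() - start < budget_seconds:
        losses.append(step_fn())
    return losses
```

Because every candidate edit runs under the same cutoff, throughput gains (more steps per window) and quality gains (better loss per step) both show up in the final score.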

This page is built for fast understanding

If you want the definition, the loop, and the shortest path to a first run, the key details are all here.

How it works

The loop stays small so each improvement is easy to judge

One editable training file, one fixed wall-clock budget, and one metric keep the system narrow enough to inspect and strong enough to iterate.

01

Inspect the setup

The agent reads the instructions in program.md, understands the experiment target, and decides what to change in train.py.

02

Edit one file

Architecture, optimizer, hyperparameters, batch size, and training logic all live in the same training file.

03

Train for five minutes

Every experiment gets the same wall-clock budget, which makes short-run results more directly comparable.

04

Keep or discard

The run is scored on validation bits per byte. Better changes stay. Worse changes get thrown away.
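For reference, here is one standard way a bits-per-byte metric is computed from a model's summed validation loss; the function name and arguments are my own, not the repo's.

```python
import math

def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed validation negative log-likelihood (in nats)
    into bits per byte. Lower is better; a model assigning uniform
    probability to all 256 byte values scores exactly 8.0.
    """
    return total_nll_nats / (math.log(2) * total_bytes)
```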

Why it matters

It turns research judgment into something an agent can repeat.

Most training repos automate execution. Autoresearch automates the cycle of proposing an idea, testing it under the same five-minute constraint, and keeping the change only if the metric improves.
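That cycle can be sketched as a greedy accept/reject loop, where `propose` stands in for the agent editing train.py and `run_experiment` for one scored five-minute run. All names here are illustrative, not the repo's API.

```python
import copy

def research_loop(baseline, propose, run_experiment, iterations=10):
    """Greedy loop: propose an edit, score it under the fixed budget,
    and keep it only if the metric improves (lower is better)."""
    best_cfg = baseline
    best_score = run_experiment(baseline)
    for _ in range(iterations):
        # Mutate a copy so rejected ideas never touch the current best.
        candidate = propose(copy.deepcopy(best_cfg))
        score = run_experiment(candidate)
        if score < best_score:  # keep only strict improvements
            best_cfg, best_score = candidate, score
    return best_cfg, best_score
```

The design choice worth noticing is that the loop never keeps a neutral or worse change, so the metric is monotonically non-increasing across accepted edits.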

Fixed budget

Every run competes inside the same wall-clock window.

Reviewable diffs

The edit surface stays narrow enough to inspect.

Agent-native loop

The research process becomes the thing being optimized.

Deep dive

What Is Autoresearch? How Karpathy's Five-Minute Research Loop Works

A clear explanation of what autoresearch is, who built it, how the loop works, and how to run it.

Explainer · 3 min read
Read the full explainer

Quick start

How to go from reading about autoresearch to running it

The README assumes a single NVIDIA GPU, Python 3.10+, and the `uv` package manager. If that matches your setup, this is the shortest path to a first run.
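A quick pre-flight check against those assumptions might look like this; it is my own sketch, not part of the repo.

```python
import shutil
import sys

# Check the README's stated requirements: Python 3.10+, uv on PATH,
# and an NVIDIA driver (nvidia-smi). Sketch only; not from the repo.
checks = {
    "Python >= 3.10": sys.version_info >= (3, 10),
    "uv on PATH": shutil.which("uv") is not None,
    "nvidia-smi on PATH": shutil.which("nvidia-smi") is not None,
}
for name, ok in checks.items():
    print(("ok      " if ok else "MISSING ") + name)
```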

01

Install dependencies

The README uses `uv` for both dependency installation and run commands.

uv sync
02

Prepare data and tokenizer

This is the one-time prep stage: data download, tokenizer training, and utility setup.

uv run prepare.py
03

Run a baseline experiment

Before going autonomous, confirm the local environment can complete a normal training run.

uv run train.py
04

Hand the loop to the agent

Once the manual path works, the agent can start proposing and evaluating changes through program.md.

prompt the agent with program.md

FAQ

The fastest answers to the questions people ask first

Start here if you want the creator, the loop, the metric, or the hardware requirements without reading the whole repo first.

Autoresearch comes from Andrej Karpathy. The repo lives at karpathy/autoresearch on GitHub and frames the project as a compact autonomous research loop.