What format does OpenAI fine-tuning data need to be in?

A JSONL file: one JSON object per line. For chat models each line is {"messages": [ … ]}, where every message has a "role" (system, user, assistant, or tool) and normally a "content" string; an assistant message may carry tool_calls or function_call instead of content. Each example must contain at least one assistant message, which is the target the model learns to produce. The validator checks every line against these rules.
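A minimal sketch of producing such a file in Python, assuming the examples already sit in memory as lists of messages. The `to_jsonl` helper and the sample data are illustrative names, not part of the validator.

```python
import json

def to_jsonl(examples):
    """Serialize chat examples as JSONL: one compact JSON object per line."""
    lines = []
    for messages in examples:
        # With default settings json.dumps escapes newlines inside strings,
        # so each example stays on a single physical line.
        lines.append(json.dumps({"messages": messages}))
    return "\n".join(lines) + "\n"

examples = [
    [
        {"role": "system", "content": "You are terse."},
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": "Hello."},
    ],
    [
        {"role": "user", "content": "2+2?"},
        {"role": "assistant", "content": "4"},
    ],
]

jsonl_text = to_jsonl(examples)
```

Writing `jsonl_text` to a `.jsonl` file gives one example per line, which is the layout the validator expects.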

How many examples do I need to fine-tune a model?

The OpenAI API requires a minimum of 10 examples, but that floor rarely produces useful results. The docs recommend 50–100 well-chosen examples for a noticeable improvement on a task. The readiness checklist marks under 10 as a fail, 10–49 as a warning, and 50+ as good.
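The thresholds above can be sketched as a small function. The name `example_count_status` and the status strings are assumptions made for illustration; only the 10 and 50 cut-offs come from the text.

```python
def example_count_status(n_examples, api_minimum=10, recommended=50):
    """Map an example count to the checklist status described above."""
    if n_examples < api_minimum:
        return "fail"      # below the API floor
    if n_examples < recommended:
        return "warning"   # accepted by the API, but likely too few to help
    return "good"
```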

Why is my fine-tuning JSONL file invalid?

The most common reasons are: a line is not valid JSON (a stray comma, a missing bracket); a message uses a misspelled role like "asistant"; an example has no assistant message; "content" is missing on a normal message; or the whole file was saved as one big JSON array instead of one object per line. The per-line table names the exact line and problem for each.
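A rough sketch of how the first and last of these problems can be detected, using only Python's standard `json` module. The function name `parse_jsonl` and the message wording are hypothetical; the validator itself runs in the browser and may differ in detail.

```python
import json

def parse_jsonl(text):
    """Return (records, errors); both hold (line_number, value) tuples."""
    stripped = text.strip()
    # A file that parses as one JSON array is the "saved as JSON, not JSONL" mistake.
    if stripped.startswith("["):
        try:
            if isinstance(json.loads(stripped), list):
                return [], [(1, "whole file is a JSON array; write one object per line")]
        except json.JSONDecodeError:
            pass  # not a valid array either; fall through to per-line reporting

    records, errors = [], []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are skipped, not reported
        try:
            obj = json.loads(line)
        except json.JSONDecodeError as exc:
            errors.append((number, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(obj, dict):
            errors.append((number, "line is not a JSON object"))
            continue
        records.append((number, obj))
    return records, errors
```

Line numbers count physical lines, including blank ones, so a reported number matches what a text editor shows.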

How many tokens are in my fine-tuning dataset?

The token card sums an estimate for every example. It uses OpenAI's documented overhead — 3 tokens per message plus 3 priming tokens per example — on top of the message text. The text estimate is approximate (≈4 characters per token); for an exact cl100k or o200k count, use the dedicated token counter linked below. Training cost scales with total tokens × epochs.
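A minimal sketch of that estimate, assuming the 4-characters-per-token heuristic and assuming that both the role and the content strings count as text (which matches worked example C on this page). The function names are illustrative.

```python
import math

TOKENS_PER_MESSAGE = 3   # per-message overhead quoted above
PRIMING_TOKENS = 3       # per-example priming overhead quoted above

def estimate_text_tokens(text):
    """Rough heuristic: about 4 characters per token, rounded up."""
    return math.ceil(len(text) / 4)

def estimate_example_tokens(messages):
    """Estimate training tokens for one example (a list of messages)."""
    total = PRIMING_TOKENS
    for message in messages:
        total += TOKENS_PER_MESSAGE
        # Assumption: every string field (role and content) is counted as text.
        for value in message.values():
            if isinstance(value, str):
                total += estimate_text_tokens(value)
    return total

def estimate_trained_tokens(dataset, epochs):
    """Total tokens processed in training: dataset total times epochs."""
    return sum(estimate_example_tokens(m) for m in dataset) * epochs
```

For a dataset holding only the two-message example `2+2?` / `4`, this gives 15 tokens per epoch, so 45 over three epochs.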

Can each line in a fine-tuning file have a system message?

Yes. A system message is optional per example but allowed, and including a consistent system prompt that matches how you will call the model in production usually helps. The stats card shows how many of your examples include a system message so you can spot inconsistency. The "Count system messages" toggle controls whether system text is included in the token totals.
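One way to spot the inconsistency mentioned above is to count how many examples carry a system message and how many distinct system prompts appear. This is a sketch with hypothetical names, not the stats card's actual code.

```python
from collections import Counter

def system_prompt_stats(dataset):
    """Count examples with a system message and tally distinct system prompts."""
    prompts = Counter()
    with_system = 0
    for messages in dataset:
        system_texts = [m.get("content", "") for m in messages if m.get("role") == "system"]
        if system_texts:
            with_system += 1
            prompts[system_texts[0]] += 1  # only the first system message is compared
    return {
        "with_system": with_system,
        "without_system": len(dataset) - with_system,
        "distinct_prompts": len(prompts),
    }
```

A result with both `with_system` and `without_system` above zero, or with more than one distinct prompt, suggests the dataset mixes conventions.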

Is my dataset uploaded anywhere?

No. Parsing, validation, and token counting all run in your browser with JavaScript. The file you paste or upload never leaves your device — no server call, no API key, no logging. You can open the page, disconnect from the internet, and it still works. That matters because fine-tuning data often contains real customer conversations.

What happens if an example is longer than the token cap?

OpenAI truncates training examples that exceed the model's context length, dropping tokens from the end — so the model may never see the assistant reply it was supposed to learn. The validator flags any example above the per-example token cap you set (default 65,536) as "will be truncated" so you can shorten or split it before training.
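The cap check itself is a simple comparison. The sketch below assumes you already have one token estimate per example, in file order with no blank lines, so that position maps directly to line number; `over_cap` is a hypothetical name.

```python
DEFAULT_TOKEN_CAP = 65_536  # default per-example cap quoted above

def over_cap(token_counts, cap=DEFAULT_TOKEN_CAP):
    """Return (line_number, tokens) for examples whose estimate exceeds the cap."""
    return [
        (line_number, tokens)
        for line_number, tokens in enumerate(token_counts, start=1)
        if tokens > cap  # exactly at the cap is still allowed
    ]
```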

Does it support the legacy prompt/completion format?

Yes — switch the Format control to Legacy. Each line is then checked for a string "prompt" and a string "completion", with anything else flagged. Most new fine-tuning uses the chat format, so that is the default, but the older completion format is still validated for projects that need it.
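As an illustration of the legacy check, here is a short sketch that tests one line for string "prompt" and "completion" fields and reports anything else. The function name and message wording are assumptions.

```python
import json

def check_legacy_line(line):
    """Return a list of problems for one prompt/completion JSONL line."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError:
        return ["invalid JSON"]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]
    problems = []
    for key in ("prompt", "completion"):
        if not isinstance(obj.get(key), str):
            problems.append(f'"{key}" must be a string')
    for key in obj:
        if key not in ("prompt", "completion"):
            problems.append(f'unexpected key "{key}"')
    return problems
```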

AI Fine-Tuning Dataset Validator (JSONL)

Paste or upload an OpenAI chat fine-tuning .jsonl dataset and instantly catch the format errors that get an upload rejected — malformed lines, bad roles, missing assistant replies, unknown keys — then see your example count and an estimated training-token total. Runs entirely in your browser; the file never leaves your device.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 24, 2026

Validate your JSONL

5 non-blank lines. Nothing is uploaded — validation runs on this device.

2 errors found across 5 lines

3 of 5 lines are valid examples. Fix the rows below before uploading.

Valid examples

5 non-blank lines parsed

Messages

1 example with a system msg

Est. training tokens

106

max/example: 33 · median: 17

Over token cap

None exceed the per-example cap

Readiness checklist

At least 10 examples
5 examples — below the 10 the API requires.
Every example has an assistant reply
At least one example has no assistant message.
No fatal format errors
2 errors the API would reject.

Per-line issues

Line	Severity	Message
4	error	Line 4: no assistant message — the model has no target to learn from. Every example needs at least one assistant reply.
5	error	Line 5: invalid role "asistant" — must be one of system, user, assistant, tool.

Sources: OpenAI Cookbook — chat fine-tuning data prep · OpenAI Docs — Supervised fine-tuning & best practices. Token figures are estimates; for exact cl100k/o200k counts use the dedicated token counter. Linked under “Sources & references” below.

How it works

The validator mirrors the checks in OpenAI's reference Cookbook script chat_finetuning_data_prep, which is the same logic the platform applies when you upload a training file. It runs as a single deterministic pass over your input — identical input and settings always produce the same report — and nothing is sent over the network.

Line parsing. A JSONL file is one JSON object per line. Each non-blank line is parsed with JSON.parse; a failure is reported as invalid JSON for that line. A whole-file JSON array (a common paste mistake) is caught and explained rather than silently mis-parsed.
Top-level structure. In chat mode each object must hold a non-empty messages array. Keys other than messages, tools, parallel_tool_calls, or functions are flagged as warnings, exactly as the Cookbook flags unexpected keys.
Message checks. Every message needs a role in {system, user, assistant, tool} and, for ordinary messages, a non-empty content. Assistant messages may carry a function_call or tool_calls instead of content. Unrecognized message keys become warnings.
Assistant-presence. Each example must contain at least one assistant message — without a target reply the model has nothing to learn. When an example already has another error (say a misspelled role), the missing-assistant flag is suppressed so the root cause is reported once, not twice.
Token estimation. Following the Cookbook's num_tokens_from_messages, each example's tokens are the sum of its message text plus a fixed overhead — 3 tokens per message and 3 priming tokens per example. The text itself is estimated (≈4 characters per token, or a closer per-word approximation), since bundling a full tokenizer would bloat the page; for exact counts, the page links to the dedicated token counter.
Readiness rules. The example count is checked against the API floor of 10 and the recommended 50. Examples longer than your per-example token cap are flagged as truncated from the end, per OpenAI's best-practices guidance.
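The structure, message, and assistant-presence checks above can be sketched for a single parsed line as follows. This is an approximation under stated assumptions: the function name and message wording are hypothetical, and the warning for unrecognized message keys is left out because the page does not list which message keys are accepted.

```python
VALID_ROLES = {"system", "user", "assistant", "tool"}
TOP_LEVEL_KEYS = {"messages", "tools", "parallel_tool_calls", "functions"}

def validate_chat_example(obj):
    """Return (errors, warnings) for one parsed JSONL object in chat mode."""
    errors, warnings = [], []
    for key in obj:
        if key not in TOP_LEVEL_KEYS:
            warnings.append(f'unexpected top-level key "{key}"')

    messages = obj.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["messages must be a non-empty array"], warnings

    has_assistant = False
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            errors.append(f"message {index} is not an object")
            continue
        role = message.get("role")
        if not isinstance(role, str) or role not in VALID_ROLES:
            errors.append(f'message {index}: invalid role "{role}"')
            continue
        if role == "assistant":
            has_assistant = True
            # Assistant messages may carry a tool or function call instead of content.
            if message.get("function_call") or message.get("tool_calls"):
                continue
        content = message.get("content")
        if not isinstance(content, str) or not content:
            errors.append(f"message {index}: missing or empty content")

    # Report the missing-assistant error only when nothing else is wrong,
    # so a misspelled role is reported once rather than twice.
    if not has_assistant and not errors:
        errors.append("no assistant message")
    return errors, warnings
```

Message indexes here are zero-based, so `message 1` is the second message in the example.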

As a credibility cross-check, every token total is also computed a second, simpler way (a flat characters-÷-4 over the same counted text with no overhead) so you can compare the two estimates. For typical English text, the exact value from OpenAI's tokenizer usually falls between them, though this is not guaranteed for every input.
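The two figures can be computed side by side, as in this sketch. It assumes the counted text is every string field of every message (role and content) and that the flat figure is rounded up; both are assumptions, and the function names are illustrative.

```python
import math

def with_overhead(messages):
    """Per-field ceil(chars / 4), plus 3 per message and 3 per example."""
    text = sum(
        math.ceil(len(value) / 4)
        for message in messages
        for value in message.values()
        if isinstance(value, str)
    )
    return text + 3 * len(messages) + 3

def flat_chars_over_four(messages):
    """Simpler cross-check: all counted characters divided by 4, no overhead."""
    chars = sum(
        len(value)
        for message in messages
        for value in message.values()
        if isinstance(value, str)
    )
    return math.ceil(chars / 4)
```

For the two-message example `2+2?` / `4`, the flat figure is 5 and the overhead-based figure is 15. The flat figure can never exceed the other, because it drops the fixed overhead and rounds only once.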

Worked examples

A — a clean 3-example dataset

0 errors · 3 valid examples

Input JSONL

{"messages":[{"role":"system","content":"You are terse."},{"role":"user","content":"Hi"},{"role":"assistant","content":"Hello."}]}
{"messages":[{"role":"user","content":"2+2?"},{"role":"assistant","content":"4"}]}
{"messages":[{"role":"user","content":"Capital of Sri Lanka?"},{"role":"assistant","content":"Sri Jayawardenepura Kotte (commercial: Colombo)."}]}

All three lines are valid JSON objects, each with a non-empty messages array.
Every example has at least one assistant message, and every role is valid → 0 errors, 3 valid examples.
One example (line 1) includes a system message; the stats card reports that.
Readiness: example-count FAILS (3 < 10), assistant-presence PASSES, no-errors PASSES — so the format is correct but the dataset is too small to train on yet.

B — a broken dataset (edge case)

3 errors across 4 lines · 1 valid example

Input JSONL

{"messages":[{"role":"user","content":"Hi"},{"role":"assistant","content":"Hey"}]}
{"messages":[{"role":"user","content":"No answer here"}]}
{"messages":[{"role":"user","content":"Typo role"},{"role":"asistant","content":"oops"}]}
{"messages":[{"role":"user","content":"Bad json"]}

Line 1 is valid.
Line 2 has no assistant message → error.
Line 3 misspells the role as "asistant" → invalid-role error. Because that example already has an error, the missing-assistant flag is suppressed — you fix the typo once.
Line 4 is missing a closing brace → invalid-JSON error.
Total: 3 errors, 1 valid example — exactly what the OpenAI API would reject on upload.

C — token estimate for one example

Heuristic basis, system message counted

Input JSONL

{"messages":[
  {"role":"user","content":"2+2?"},
  {"role":"assistant","content":"4"}
]}

User message: 3 (per-message) + ceil(4/4)=1 for the role text + ceil(4/4)=1 for "2+2?" = 5 tokens.
Assistant message: 3 + ceil(9/4)=3 for "assistant" + ceil(1/4)=1 for "4" = 7 tokens.
Plus 3 priming tokens per example: 5 + 7 + 3 = 15 estimated training tokens.
Multiply the dataset total by your number of epochs to gauge training cost; switch the basis to Approx cl100k for a tighter estimate.

Sources & references

The format rules and token overhead on this page were last cross-checked against the OpenAI sources above on 2026-06-24. The token text estimate is approximate — for an exact cl100k/o200k count, use the AI token counter.
