induwara.lk

LLM Temperature & Top-p Sampling Visualizer

Drag the temperature, top-p, and top-k sliders and watch a language model's next-token probabilities sharpen, flatten, and get truncated in real time — with the exact softmax math, token by token. No API key, no cost, runs in your browser.

By Induwara Ashinsana · Updated Jun 11, 2026
Sampling visualizer
Probabilities sum to 1.0 ✓
Provider mode

No provider clamp — explore the full math, temperature 0–2 and top-k on.

Distribution preset

2–12 tokens. Logits range −20 to 20. These are illustrative scores, not a real model run.

Top-p and top-k: both off (no truncation, all tokens kept)

Most likely next token: cat (57.93%)
Effective choices: 4 (tokens with non-zero prob)
Entropy: 1.6 bits (spread out)

Final sampling probability

cat 57.93%
dog 21.31%
bird 12.93%
fish 7.84%

Step-by-step math

Token  Logit  Softmax @ T=1  @ current T  Top-k  Top-p  Final prob
cat    2      57.93%         57.93%       -      -      57.93%
dog    1      21.31%         21.31%       -      -      21.31%
bird   0.5    12.93%         12.93%       -      -      12.93%
fish   0      7.84%          7.84%        -      -      7.84%

Order: temperature → top-k → top-p → renormalize.

How it works

A language model does not pick the next word directly — it outputs a raw score (a logit) for every token in its vocabulary. Sampling parameters turn those logits into a probability distribution and then decide how adventurously to draw from it. This tool applies the three parameters people actually search for, in the common reference order used by Hugging Face's generation code: temperature → top-k → top-p → renormalize.

  1. Tempered softmax. Each logit is divided by the temperature T, then run through softmax: p_i = exp(z_i / T) / Σ_j exp(z_j / T). Lower T makes the top token more dominant (sharper, more confident); higher T flattens the distribution (more random). The formula is undefined at T = 0 (division by zero), but as T → 0 it converges to greedy decoding, so T = 0 is treated as that limit: the single highest token gets probability 1. We subtract the maximum scaled logit before exponentiating (the max-shift used in the log-sum-exp trick) so large logits or tiny temperatures never overflow.
  2. Top-k truncation. If k > 0, keep only the k highest-probability tokens and zero the rest (Fan et al., 2018). A fixed cutoff, regardless of how confident the model is.
  3. Top-p (nucleus) truncation. If p < 1, sort the surviving tokens by probability and keep the smallest prefix whose cumulative probability reaches p (Holtzman et al., 2019). Unlike top-k, the cutoff adapts to the distribution's shape.
  4. Renormalize. Divide the survivors by their sum so they total 1 again — these are the actual probabilities the sampler draws from. The badge in the calculator confirms they reconcile to 1.0.
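
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual source: the function name `sample_distribution` and its parameter names are hypothetical, ties are broken by original token order, and the top-p cumulative sum is taken over the tempered probabilities without renormalizing after top-k. That last point is an assumption; an implementation that renormalizes between the two filters can cut at a different token when both are active.

```python
import math

def sample_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Return final sampling probabilities, in the same order as `logits`."""
    n = len(logits)
    if temperature <= 0:
        # Greedy limit: all mass on the highest logit (first one wins ties here).
        best = max(range(n), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(n)]

    # 1. Tempered softmax, shifted by the max scaled logit for stability.
    scaled = [z / temperature for z in logits]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Token indices from most to least probable (stable for ties).
    order = sorted(range(n), key=lambda i: probs[i], reverse=True)

    # 2. Top-k: keep the k most probable tokens (k <= 0 means off).
    if top_k > 0:
        order = order[:top_k]

    # 3. Top-p: keep the smallest prefix whose cumulative probability reaches p.
    #    Assumption: the sum uses the tempered probabilities as-is.
    if top_p < 1.0:
        kept, cumulative = [], 0.0
        for i in order:
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        order = kept

    # 4. Renormalize the survivors so they total 1 again.
    mass = sum(probs[i] for i in order)
    final = [0.0] * n
    for i in order:
        final[i] = probs[i] / mass
    return final
```

With the logits cat=2.0, dog=1.0, bird=0.5, fish=0.0, calling `sample_distribution([2.0, 1.0, 0.5, 0.0], top_k=2)` returns roughly 73.11% for cat and 26.89% for dog, with bird and fish at zero.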

The summary chips add an entropy readout, H = −Σ p·log₂ p in bits, as a single "how random is this" number, and an effective choices count: how many tokens still carry non-zero probability after truncation. The provider toggle clamps the slider ranges to each API's documented limits (OpenAI temperature 0–2 with no public top_k; Anthropic temperature 0–1 with optional top_k). Providers can differ in edge-case clamping and tie-breaking, so this tool states its reference order rather than claiming to mirror any one backend exactly.
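
Both summary numbers are easy to compute from a final probability list. The sketch below is illustrative only; the helper names `entropy_bits` and `effective_choices` are hypothetical and not taken from the tool. Zero-probability tokens are skipped in the entropy sum, following the usual convention that 0·log 0 counts as 0.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H = -sum(p * log2 p), skipping zero-probability tokens."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def effective_choices(probs):
    """Number of tokens that still carry non-zero probability."""
    return sum(1 for p in probs if p > 0)

# Probabilities shown by the calculator for the default four-token preset.
default = [0.5793, 0.2131, 0.1293, 0.0784]
print(round(entropy_bits(default), 1), effective_choices(default))
```

For the default preset this prints an entropy of 1.6 bits and 4 effective choices, matching the summary chips.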

Worked examples

A · Temperature only

cat=2.0, dog=1.0, bird=0.5, fish=0.0

  1. exp(z) @ T=1 = [7.389, 2.718, 1.649, 1.000], Σ = 12.756
  2. T = 1.0 → cat 57.93%, dog 21.31%, bird 12.93%, fish 7.84%
  3. T = 0.5 (z/T = [4,2,1,0]) → cat 83.10%, dog 11.25%, bird 4.14%, fish 1.52%
  4. T = 0 (greedy) → cat 100%, the rest 0 — same answer every run
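
Step 3 skips the intermediate exponentials. Written out with the same rounding as step 1, the T = 0.5 case works out as follows:

```latex
\begin{aligned}
\exp(z/T) &= [e^{4},\ e^{2},\ e^{1},\ e^{0}] \approx [54.598,\ 7.389,\ 2.718,\ 1.000],
  \qquad \Sigma \approx 65.705 \\
p_{\text{cat}}  &= 54.598 / 65.705 \approx 0.8310 \\
p_{\text{dog}}  &= 7.389 / 65.705 \approx 0.1125 \\
p_{\text{bird}} &= 2.718 / 65.705 \approx 0.0414 \\
p_{\text{fish}} &= 1.000 / 65.705 \approx 0.0152
\end{aligned}
```

Halving T doubles every scaled logit, so the gap between cat and dog grows from 1 to 2 and their probability ratio from e ≈ 2.72 to e² ≈ 7.39.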

B · Top-p truncation

T = 1.0, p = 0.8

  1. Sorted cumulative: 0.5793 (<0.8) → +0.2131 = 0.7924 (<0.8) → +0.1293 = 0.9216 (≥0.8) stop
  2. Keep cat, dog, bird; drop fish
  3. Renormalize over 0.9216 → cat 62.85%, dog 23.12%, bird 14.02%, fish 0%
  4. Effective choices = 3
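
The sketch below reproduces this example and then reruns it at T = 0.5 to show the cutoff adapting to the distribution's shape. It is a rough illustration under stated assumptions: `softmax` and `top_p_filter` are hypothetical helper names, and ties are broken by original token order.

```python
import math

def softmax(logits, temperature=1.0):
    # Shift by the max scaled logit so exp() cannot overflow.
    scaled = [z / temperature for z in logits]
    shift = max(scaled)
    exps = [math.exp(s - shift) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, p):
    """Keep the smallest most-probable prefix whose mass reaches p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    # `cumulative` is now exactly the mass of the kept tokens.
    return [probs[i] / cumulative if i in kept else 0.0 for i in range(len(probs))]

tokens = ["cat", "dog", "bird", "fish"]
logits = [2.0, 1.0, 0.5, 0.0]
for t in (1.0, 0.5):
    final = top_p_filter(softmax(logits, t), p=0.8)
    survivors = [tok for tok, q in zip(tokens, final) if q > 0]
    print(f"T={t}: keeps {survivors}")
```

At T = 1.0 the nucleus holds cat, dog, and bird. At T = 0.5 cat alone already carries about 83% of the mass, so the same p = 0.8 keeps a single token.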

C · Top-k truncation

T = 1.0, k = 2

  1. Keep the two highest: cat, dog; drop bird, fish
  2. Renormalize over 0.5793 + 0.2131 = 0.7924
  3. → cat 73.11%, dog 26.89%
  4. Effective choices = 2, entropy ≈ 0.84 bits
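
The arithmetic behind steps 3 and 4 can be written out as a short check. Renormalizing over the two survivors is the same as a softmax over just their logits, so only the logit gap of 1 matters:

```latex
\begin{aligned}
p_{\text{cat}} &= \frac{e^{2}}{e^{2} + e^{1}} = \frac{1}{1 + e^{-1}} \approx 0.7311,
  \qquad p_{\text{dog}} = 1 - p_{\text{cat}} \approx 0.2689 \\
H &= -\left(0.7311 \log_2 0.7311 + 0.2689 \log_2 0.2689\right) \\
  &\approx 0.7311 \times 0.4519 + 0.2689 \times 1.8946 \\
  &\approx 0.3304 + 0.5095 \approx 0.84 \text{ bits}
\end{aligned}
```

The entropy falls from about 1.6 bits with all four tokens to 0.84 bits with two, below the 1-bit maximum for a two-way choice because cat remains much more likely than dog.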
