How do I check if a comment is toxic?

Paste the comment into the box above and press Check. The tool runs a profanity-word scan instantly and, when enabled, a toxicity model that scores the text across six categories. You get a single verdict — Clean, Flagged, or Strongly flagged — plus the exact words that triggered it.

What is a toxicity score?

Each category (toxic, obscene, threat, and so on) gets an independent probability between 0 and 1 from the toxic-bert model. A score of 0.9 means the model is 90% confident the text fits that category. The scores are independent, so they do not add up to 1 — a message can be high on several at once.

Is there a free content moderation tool?

Yes — this one. There is no signup, no payment, and no daily limit. The profanity scan runs entirely in your browser, and the toxicity model runs on the server at no cost to you. Nothing you paste is stored or logged.

What are the six Jigsaw toxicity categories?

They come from Google Jigsaw's Toxic Comment Classification Challenge: toxic, severe_toxic, obscene, threat, insult, and identity_hate. Each one targets a different kind of harmful language, and the model scores all six separately for every piece of text.

Can I detect hate speech in text automatically?

The model's identity_hate category targets content attacking a person's race, religion, gender, or similar identity. It is a useful signal, but no automated tool is perfect — treat a high identity_hate score as a strong prompt for a human to review, not a final ruling.

What does the flag threshold do?

It sets how confident the model must be before a category counts as flagged. Strict (0.3) catches borderline content but produces more false positives; Lenient (0.7) flags only clear-cut cases; Balanced (0.5) sits in between and is the default. The profanity scan is unaffected — any matched word always flags.

Does it work for Sinhala or Tamil?

Not in this version. The LDNOOBW word list and the toxic-bert model are both English-centric, so Sinhala and Tamil results would be unreliable. If you paste mostly non-Latin text the tool warns you. Translate to English first, or treat the scores as indicative only.
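The page does not say how the non-Latin warning is computed. One way it could work, as a minimal sketch: count the alphabetic characters and check how many belong to the Latin script. The function name `mostly_non_latin` and the 50% cut-off are assumptions for illustration, not the tool's actual values.

```python
import unicodedata


def mostly_non_latin(text: str, cutoff: float = 0.5) -> bool:
    """Return True when more than `cutoff` of the letters are outside the Latin script.

    Assumption: a letter counts as Latin when its Unicode name starts with "LATIN".
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False  # nothing to judge (empty text, digits, punctuation only)
    non_latin = sum(
        1 for ch in letters if not unicodedata.name(ch, "").startswith("LATIN")
    )
    return non_latin / len(letters) > cutoff
```

A check like this only detects the script. It says nothing about whether Latin-script text is actually English, so transliterated Sinhala or Tamil would pass without a warning.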

Is my text stored or used to train anything?

No. The profanity scan happens in your browser, so that check never sends your text anywhere. The model call sends the text once to score it and the server keeps nothing: no logging, no storage, no training. Close the tab and it is gone.


AI Content Moderation Checker — Free Toxicity Checker

Paste any comment, review, or message and check it for toxicity, profanity, threats, insults, and hate speech across the six Jigsaw categories. Offending words are highlighted, sources are cited, and nothing is stored. No signup.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 22, 2026

Check content for moderation

6 categories · English

Sources cited

Paste the text to check

The profanity scan runs in your browser; the model runs server-side. Nothing is stored.

Try a sample

Flag threshold

One click runs the profanity scan instantly and the toxic-bert classifier server-side as a second opinion.

What this does

Reads any English text and returns a single moderation verdict — Clean, Flagged, or Strongly flagged — backed by two checks: a transparent profanity-word scan that highlights every offending word, and the six Jigsaw toxicity scores (toxic, severe_toxic, obscene, threat, insult, identity_hate) from a server-side model. Pick a threshold and press Check.

Methodology: deterministic LDNOOBW profanity scan + BERT-base · 110M parameters · multi-label sigmoid head. Six independent sigmoid scores; a label flags at the chosen threshold (inclusive ≥). Sources linked under “Sources” below.

How it works

The checker runs two independent layers and combines them into one verdict — the same pattern as a profanity filter sitting next to a machine-learning classifier. Each layer is transparent, and either can flag a message on its own.

Layer 1 — deterministic profanity scan. The text is lowercased, split into word tokens, and each token is checked for an exact match against a curated 72-term subset of the LDNOOBW list (“List of Dirty, Naughty, Obscene and Otherwise Bad Words”), the open profanity list used by Shutterstock. This layer always runs in your browser, needs no network, and highlights every matched word. It also computes a transparent density figure:

severity = min(1, (matches ÷ words) × 5)

The 5× multiplier means a profanity density of 20% or more saturates to 100%, so a single bad word in a short message still registers while one in a long, otherwise clean paragraph scores low. This is a stated heuristic, not a vendor figure.
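The scan and the severity formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's own code: the tokenizer regex is a guess at "split into word tokens", and `PROFANITY_TERMS` is a hypothetical one-word stand-in for the 72-term subset (only "crap" is confirmed by the worked examples on this page).

```python
import re

# Hypothetical stand-in for the curated 72-term LDNOOBW subset.
PROFANITY_TERMS = {"crap"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens.

    Assumption: a token is a run of ASCII letters, digits, or apostrophes.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def profanity_scan(text: str) -> tuple[list[str], float]:
    """Return (matched words, severity), with severity = min(1, matches / words * 5)."""
    tokens = tokenize(text)
    matches = [t for t in tokens if t in PROFANITY_TERMS]  # exact match only
    if not tokens:
        return matches, 0.0  # avoid dividing by zero on empty input
    severity = min(1.0, len(matches) / len(tokens) * 5)
    return matches, severity
```

Because matching is exact per token, obfuscated spellings and multi-word phrases would not be caught by a scan written this way.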

Layer 2 — toxicity model. When configured, the text is sent once to the unitary/toxic-bert classifier through the Hugging Face Inference API on the server — no model weights are ever downloaded to your browser. It returns an independent sigmoid probability between 0 and 1 for each of the six categories. Because the head is multi-label rather than softmax, the six scores do not sum to 1; a message can be high on several categories at once. A category counts as flagged when its score is at or above your chosen threshold (Strict 0.3, Balanced 0.5, Lenient 0.7).
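To see why independent sigmoid scores need not sum to 1, and how the inclusive threshold behaves, here is a small sketch. It does not call the model or the Inference API; the logits are made-up numbers, and the helper names (`score_labels`, `flagged_labels`) are hypothetical.

```python
import math

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
THRESHOLDS = {"strict": 0.3, "balanced": 0.5, "lenient": 0.7}


def sigmoid(logit: float) -> float:
    """Map one raw logit to a probability in (0, 1), independently of the others."""
    return 1.0 / (1.0 + math.exp(-logit))


def score_labels(logits: list[float]) -> dict[str, float]:
    """Apply the sigmoid to each label's logit separately (multi-label, not softmax)."""
    return {label: sigmoid(z) for label, z in zip(LABELS, logits)}


def flagged_labels(scores: dict[str, float], mode: str = "balanced") -> list[str]:
    """Return the labels whose score is at or above the chosen threshold (inclusive)."""
    threshold = THRESHOLDS[mode]
    return [label for label, p in scores.items() if p >= threshold]


# Made-up logits for illustration only; a real run would use the model's output.
example_logits = [2.0, -3.0, 1.5, -4.0, 0.0, -2.5]
scores = score_labels(example_logits)
```

With these made-up logits, the insult score is exactly 0.5, so it is flagged under Balanced (because the comparison is inclusive) but not under Lenient.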

Combined verdict. The text is flagged when any model category reaches the threshold or any profanity word is matched. It escalates to strongly flagged when the model's top score reaches 0.85, when a high-harm category (severe_toxic, threat, or identity_hate) is flagged, or when the profanity severity (the density figure above, after the 5× multiplier) reaches 60%. The verdict maps to a plain action: Clean → “Likely safe to publish”, Flagged → “Review before publishing”, Strongly flagged → “Recommend removing”. No score is invented: model probabilities are shown verbatim, and the only computed numbers are the profanity ratio and the threshold comparisons.
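The verdict rules can be written as one small function. This is a sketch of the logic as described on this page, not the tool's source. The function name and signature are hypothetical, and the 60% cut-off is applied to the severity figure, which is how the worked examples use it.

```python
HIGH_HARM = {"severe_toxic", "threat", "identity_hate"}

ACTIONS = {
    "Clean": "Likely safe to publish",
    "Flagged": "Review before publishing",
    "Strongly flagged": "Recommend removing",
}


def combined_verdict(
    scores: dict[str, float],   # model probabilities; empty if the model is off
    profanity_matches: int,     # number of words matched by the profanity scan
    severity: float,            # min(1, matches / words * 5), in the range 0..1
    threshold: float = 0.5,     # Strict 0.3, Balanced 0.5, Lenient 0.7
) -> str:
    flagged = {label for label, p in scores.items() if p >= threshold}
    top_score = max(scores.values(), default=0.0)

    # Escalation: very confident model, a high-harm label, or dense profanity.
    if top_score >= 0.85 or (flagged & HIGH_HARM) or severity >= 0.60:
        return "Strongly flagged"
    # Either layer can flag on its own.
    if flagged or profanity_matches > 0:
        return "Flagged"
    return "Clean"
```

For example, `combined_verdict({}, 1, 0.5)` gives "Flagged" for a message with one matched word at 50% severity and no model scores, and `ACTIONS` maps that verdict to "Review before publishing".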

The six categories

Toxic

Rude, disrespectful, or unreasonable language likely to make someone leave a discussion.

Severe toxic

Very hateful, aggressive, or disrespectful content — toxicity at its most extreme.

Obscene

Vulgar, sexually explicit, or profane language.

Threat

A statement of intent to inflict physical or other harm on a person or group.

Insult

An inflammatory or negative comment directed at a person (a personal attack).

Identity hate

Hateful content targeting a person's race, religion, gender, sexual orientation, disability, or other identity.

Worked examples

The profanity layer is fully hand-checkable. These three reconcile exactly with the formula above and with the tool's built-in verifyWorkedExamples() check. (The neural scores are not hand-computable, so only the deterministic numbers are shown.)

Happy customer — Clean

“Thank you so much for the fast delivery, the product is great”

Tokens (words): 12
Profanity matches: 0
severity = min(1, 0 ÷ 12 × 5) = 0%
No word flagged → Verdict: Clean → Likely safe to publish

Short angry message — Strongly flagged

“this is crap”

Tokens (words): 3
Profanity matches: 1 ("crap")
severity = min(1, 1 ÷ 3 × 5) = min(1, 1.667) = 100%
severity ≥ 60% → Verdict: Strongly flagged → Recommend removing

Same word, longer text — Flagged

“the food was good but the service was crap honestly”

Tokens (words): 10
Profanity matches: 1 ("crap")
severity = min(1, 1 ÷ 10 × 5) = 50%
Word flagged but severity below 60% → Verdict: Flagged → Review before publishing

Frequently asked questions

Sources & references

The taxonomy, model, and profanity list were last cross-checked on 2026-06-22. This v1 is English-only; image, audio, and Sinhala/Tamil moderation are out of scope. A high score is a prompt for human review, not a final ruling.

