induwara.lk

Compress PDF

Shrink any PDF file entirely in your browser — no signup, no upload, no watermarks. Pick lossless, balanced, or strong compression. Text and vector graphics stay selectable and searchable; embedded JPEG images are re-encoded only when you choose to. Sources cited.

By Induwara Ashinsana · Updated May 11, 2026

Everything happens in your browser. Nothing is uploaded.

Built against the ISO 32000-2 PDF 2.0 specification, using pdf-lib for the page-tree pass and the browser's Canvas API for the JPEG re-encode. Header, EOF, and object-walk integrity checks verified 2026-05-11 — full source list in the "Sources & references" section below.

How it works

A PDF is, in spec terms, a body of numbered objects (pages, fonts, images, content streams) plus a cross-reference table that tells a reader the byte offset of each object. Most of a modern PDF's bulk lives in two places: the raster images embedded as Image XObjects (ISO 32000-2 §8.9.5) and the uncompressed metadata around the page tree. This tool targets both, in two independent passes — and never touches the vector text or graphics that make a PDF text-selectable.
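To make the "byte offset" idea concrete, here is a minimal sketch of how a reader finds where the cross-reference data starts: a PDF ends with the keyword startxref, a decimal byte offset, and %%EOF. The names readStartXref and decodeTail are illustrative, not taken from this tool, and the 2 KB tail window is an assumption borrowed from the %%EOF check described further down.

```typescript
// Decode a byte range as Latin-1 so string indices line up with byte offsets.
function decodeTail(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  const from = Math.max(0, start);
  const to = Math.min(bytes.length, end);
  for (let i = from; i < to; i++) {
    out += String.fromCharCode(bytes[i]);
  }
  return out;
}

// Return the byte offset that follows the last `startxref` keyword,
// or null when the keyword is not found in the tail window.
// tailSize is an assumed window, not a value mandated by the spec.
function readStartXref(bytes: Uint8Array, tailSize = 2048): number | null {
  const tail = decodeTail(bytes, bytes.length - tailSize, bytes.length);
  const at = tail.lastIndexOf("startxref");
  if (at === -1) return null;
  const match = /^startxref\s+(\d+)/.exec(tail.slice(at));
  return match ? Number(match[1]) : null;
}
```

The last startxref is the one that matters, because a file that has been incrementally updated can contain several.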

Pass one is lossless and runs at every level. The PDF is re-saved through pdf-lib with useObjectStreams: true. This enables object streams (ISO 32000-2 §7.5.7) and cross-reference streams (§7.5.8), both introduced in PDF 1.5, which together pack dictionaries and small objects into Flate-compressed containers. On a PDF that has never been optimised, the savings are typically 5–15%; on one already written with object streams they are minimal.
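A rough sketch of pass one follows. To keep it runnable without the library installed, it is written against a small structural type, SavablePdf, which is a stand-in invented for this example; the restructure function and its result shape are likewise illustrative and not this tool's actual code.

```typescript
// Stand-in for the one method used here. pdf-lib's PDFDocument offers a
// save() method that accepts a useObjectStreams option.
interface SavablePdf {
  save(options?: { useObjectStreams?: boolean }): Promise<Uint8Array>;
}

interface RestructureResult {
  bytes: Uint8Array;     // the re-saved document
  savedBytes: number;    // negative if the re-save came out larger
  savedPercent: number;  // savedBytes as a percentage of the original
}

// Re-save with object streams and report how much the lossless pass saved.
async function restructure(
  doc: SavablePdf,
  originalSize: number
): Promise<RestructureResult> {
  const bytes = await doc.save({ useObjectStreams: true });
  const savedBytes = originalSize - bytes.length;
  const savedPercent = originalSize > 0 ? (savedBytes / originalSize) * 100 : 0;
  return { bytes, savedBytes, savedPercent };
}
```

With pdf-lib, doc would typically come from await PDFDocument.load(inputBytes), and originalSize would be inputBytes.length.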

Pass two runs only at the Balanced and Strong levels. The compressor walks every indirect object via context.enumerateIndirectObjects(), finds every Image XObject (/Subtype /Image) whose filter is /DCTDecode (a JPEG byte stream), and re-encodes it through the browser's Canvas API at the quality factor for the chosen level: 0.78 for Balanced and 0.55 for Strong. Each new JPEG replaces the stream contents only when the re-encoded bytes are actually smaller; otherwise the original image is kept intact. Existing alpha masks (/SMask) and any private vendor dictionary keys are preserved across the swap.
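The "replace only when smaller" rule can be sketched as below. The encoder is injected as a function so the logic runs outside a browser; JpegEncoder, pickSmaller, and recompressAll are hypothetical names for this example, and in the real tool the encoder role is played by the Canvas re-encode.

```typescript
// Hypothetical encoder signature: JPEG bytes in, re-encoded JPEG bytes out.
type JpegEncoder = (jpeg: Uint8Array, quality: number) => Promise<Uint8Array>;

// Keep the re-encoded stream only when it is strictly smaller.
function pickSmaller(original: Uint8Array, candidate: Uint8Array): Uint8Array {
  return candidate.length < original.length ? candidate : original;
}

// Re-encode each JPEG stream and count how many were actually replaced.
async function recompressAll(
  images: Uint8Array[],
  quality: number,
  encode: JpegEncoder
): Promise<{ streams: Uint8Array[]; replaced: number }> {
  const streams: Uint8Array[] = [];
  let replaced = 0;
  for (const original of images) {
    const kept = pickSmaller(original, await encode(original, quality));
    if (kept !== original) replaced++;
    streams.push(kept);
  }
  return { streams, replaced };
}
```

Calling recompressAll(jpegStreams, 0.78, encode) would correspond to the Balanced level, and 0.55 to Strong.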

For each input the compressor does these steps in order:

  1. Validate. Confirms the MIME type, a non-zero size, and a per-file cap of 100.0 MB. The first 1 KB is scanned for the literal %PDF-n.m header (§7.5.2), and the last 2 KB for the %%EOF terminator (§7.5.5). Files missing either are rejected with an explanation.
  2. Estimate image count. A byte-stream scan counts occurrences of /Subtype /Image (excluding /ImageMask and related names). This is fast and works on every uncompressed PDF. The exact count is reported after the object walk, and the two are cross-checked: if they disagree by more than one, the page tells you which path was authoritative.
  3. Restructure. pdf-lib re-saves the document with object streams and a compressed cross-reference. This pass alone is lossless and always runs.
  4. Re-encode JPEGs (Balanced/Strong only). Every JPEG Image XObject above 4.0 KB is decoded via new Image().src = url, drawn onto a same-size offscreen canvas, and re-encoded via canvas.toBlob("image/jpeg", q). Images that don't actually shrink are left untouched.
  5. Honesty check. If the final byte size is greater than or equal to the source, the tool hands you back your original file unchanged and tells you so. Producing a larger "compressed" file is not a feature.
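Steps 1, 2, and 5 above are plain byte checks, so they can be sketched without any PDF library. The function names, the error strings, and the exact regular expressions are assumptions made for illustration; only the thresholds (100 MB cap, first 1 KB, last 2 KB) come from the description above.

```typescript
const MAX_BYTES = 100 * 1024 * 1024; // per-file cap from step 1

// Decode a byte range as Latin-1 text for simple pattern checks.
function decodeRange(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  const from = Math.max(0, start);
  const to = Math.min(bytes.length, end);
  for (let i = from; i < to; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

type Validation = { ok: true } | { ok: false; reason: string };

// Step 1: MIME type, size limits, header in the first 1 KB, %%EOF in the last 2 KB.
function validatePdf(bytes: Uint8Array, mimeType: string): Validation {
  if (mimeType !== "application/pdf") return { ok: false, reason: "not a PDF MIME type" };
  if (bytes.length === 0) return { ok: false, reason: "file is empty" };
  if (bytes.length > MAX_BYTES) return { ok: false, reason: "file exceeds 100 MB" };
  if (!/%PDF-\d\.\d/.test(decodeRange(bytes, 0, 1024))) {
    return { ok: false, reason: "missing %PDF-n.m header" };
  }
  if (!decodeRange(bytes, bytes.length - 2048, bytes.length).includes("%%EOF")) {
    return { ok: false, reason: "missing %%EOF terminator" };
  }
  return { ok: true };
}

// Step 2: rough count of "/Subtype /Image". The lookahead skips longer
// names such as /ImageMask. Decoding the whole file to one string is fine
// for a sketch; a real scan would more likely work on byte chunks.
function estimateImageCount(bytes: Uint8Array): number {
  const text = decodeRange(bytes, 0, bytes.length);
  const matches = text.match(/\/Subtype\s*\/Image(?![A-Za-z0-9])/g);
  return matches ? matches.length : 0;
}

// Step 5: hand back the source when the result is not strictly smaller.
function chooseOutput(
  source: Uint8Array,
  compressed: Uint8Array
): { bytes: Uint8Array; keptOriginal: boolean } {
  const keptOriginal = compressed.length >= source.length;
  return { bytes: keptOriginal ? source : compressed, keptOriginal };
}
```

Returning a reason string alongside the failure mirrors the "rejected with an explanation" behaviour described in step 1.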

Compression is performed entirely client-side: the PDF bytes never leave your browser tab, and the pdf-lib library (~270 KB) is dynamically imported only when you press Compress, so the initial page bundle stays small.
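One common way to defer a heavy dependency until first use is a memoised loader, sketched here. The lazy helper is a generic illustration and not necessarily how this page wires its import.

```typescript
// Wrap a loader so it runs at most once, on first call. Later calls
// (including concurrent ones) reuse the same pending promise.
function lazy<T>(load: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (!pending) pending = load();
    return pending;
  };
}
```

In a page like this one, the loader would be something along the lines of lazy(() => import("pdf-lib")), called from the Compress button's click handler so that nothing is fetched on initial page load.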

Worked examples

Scanned five-page receipt batch

A phone-scanner export of five receipts (PDF 1.5, each page a 1600×2400 JPEG at quality ~92). The Strong level decodes each JPEG, re-encodes at quality 55, and re-saves with object streams.

  1. Level: Strong
  2. Input: 5.0 MB → Output: 1.5 MB
  3. Saved: 3.5 MB (70% smaller)
  4. Images re-encoded: 5
  5. Preserved: Page count, page ordering, page dimensions. The re-encoded images carry no EXIF metadata.

Quarterly report with twelve illustrations

A 2.5 MB business report (PDF 1.7) with body text plus 12 JPEG illustrations. Balanced re-encodes the JPEGs at quality 78; vector text and tables stay as native PDF operators, so all text remains selectable and searchable.

  1. Level: Balanced
  2. Input: 2.5 MB → Output: 1.6 MB
  3. Saved: 896.0 KB (35% smaller)
  4. Images re-encoded: 12
  5. Preserved: Selectable text, vector tables, links, page references.

LaTeX thesis chapter (vector-only)

An 820 KB LaTeX-generated chapter with no embedded images. Lossless restructure re-saves with object streams and a compressed cross-reference; no pixel data is touched.

  1. Level: Lossless restructure
  2. Input: 820.0 KB → Output: 768.0 KB
  3. Saved: 52.0 KB (6.34% smaller)
  4. Images re-encoded: 0
  5. Preserved: Every page byte-identical to the source; bookmarks, fonts, hyperlinks intact.
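The "Saved" lines in the examples above are simple arithmetic on the input and output sizes. The sketch below reproduces them under the assumption that sizes use binary units (1 KB = 1024 bytes), which is consistent with the figures shown but not stated by the tool; formatBytes and summarize are illustrative names.

```typescript
const KB = 1024;
const MB = 1024 * 1024;

// Format a byte count with one decimal, switching to MB at 1 MB and above.
function formatBytes(n: number): string {
  return n >= MB ? `${(n / MB).toFixed(1)} MB` : `${(n / KB).toFixed(1)} KB`;
}

// Saved size and percentage (rounded to at most two decimals).
function summarize(
  inputBytes: number,
  outputBytes: number
): { saved: string; percent: number } {
  const savedBytes = inputBytes - outputBytes;
  const raw = inputBytes > 0 ? (savedBytes / inputBytes) * 100 : 0;
  return { saved: formatBytes(savedBytes), percent: Number(raw.toFixed(2)) };
}

// The LaTeX chapter example: 820 KB in, 768 KB out.
const thesis = summarize(820 * KB, 768 * KB);
console.log(`Saved: ${thesis.saved} (${thesis.percent}% smaller)`);
// prints "Saved: 52.0 KB (6.34% smaller)"
```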

Compression levels at a glance

Lossless restructure

Re-saves the file with object streams and compressed cross-references. Pages stay byte-identical to your source.

  • JPEG quality: n/a (images are not re-encoded)
  • Strips XMP / outline: no
  • Typical reduction: 1%–15%

Balanced (default)

Re-encodes embedded JPEG images at quality 78 and strips extra metadata. Best ratio of size vs. visible quality on most files.

  • JPEG quality: 78%
  • Strips XMP / outline: yes
  • Typical reduction: 20%–60%

Strong

Re-encodes embedded JPEG images at quality 55 and strips extra metadata. Best for screen-only sharing of scans and photo-heavy reports.

  • JPEG quality: 55%
  • Strips XMP / outline: yes
  • Typical reduction: 40%–80%
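The three levels can be summarised as one typed lookup table. This is a sketch of how such a configuration might look; the type and field names are invented for illustration, while the values are the ones listed above.

```typescript
type LevelId = "lossless" | "balanced" | "strong";

interface LevelConfig {
  label: string;
  jpegQuality: number | null;         // null means JPEGs are not re-encoded
  stripsMetadata: boolean;            // the "Strips XMP / outline" row
  typicalReduction: [number, number]; // percent, low to high
}

const LEVELS: Record<LevelId, LevelConfig> = {
  lossless: {
    label: "Lossless restructure",
    jpegQuality: null,
    stripsMetadata: false,
    typicalReduction: [1, 15],
  },
  balanced: {
    label: "Balanced",
    jpegQuality: 0.78,
    stripsMetadata: true,
    typicalReduction: [20, 60],
  },
  strong: {
    label: "Strong",
    jpegQuality: 0.55,
    stripsMetadata: true,
    typicalReduction: [40, 80],
  },
};

const DEFAULT_LEVEL: LevelId = "balanced";

// True when the level runs the JPEG re-encode pass.
function reencodesJpegs(level: LevelId): boolean {
  return LEVELS[level].jpegQuality !== null;
}
```

Keeping the quality as a 0 to 1 fraction matches what canvas.toBlob expects for its quality argument.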

Sources & references
