How to Redact a PDF So It Can't Be Undone

A black rectangle hides text from your eyes, not from a copy-paste — here's why most "redacted" PDFs leak, and how flattening to pixels makes the redaction permanent.

The black box is a lie: why most redactions leak

A PDF stores two things that feel like one: the picture you see and a separate text layer underneath it. When you draw a black rectangle in a typical editor, you cover the picture but leave the text layer untouched. Anyone can select the area, copy it, and paste plain readable text into a notes app — or, in 2025 and beyond, just drop the file into ChatGPT, Claude, or Gemini and ask it to read what's under the bars. The model reads the text layer directly and hands it back in seconds.
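To make the mechanics concrete, here is a minimal Python sketch of what a page's drawing instructions might look like after a cover-up. The sample stream and the extract_shown_text helper are illustrative assumptions, not the output of any real editor; real streams are usually compressed and use more elaborate string encodings.

```python
import re

# A hypothetical, uncompressed page content stream: the text is drawn first,
# then a filled black rectangle is painted over the same area.
CONTENT_STREAM = b"""
BT
/F1 12 Tf
72 700 Td
(Account number: 4417-0093) Tj
ET
0 0 0 rg
70 695 200 16 re
f
"""

def extract_shown_text(stream: bytes) -> list[str]:
    """Return the literal strings passed to the Tj (show text) operator.

    Simplified on purpose: ignores escapes, nested parentheses,
    TJ arrays, and hex strings.
    """
    matches = re.findall(rb"\(([^()]*)\)\s*Tj", stream)
    return [m.decode("latin-1") for m in matches]

recovered = extract_shown_text(CONTENT_STREAM)
print(recovered)  # ['Account number: 4417-0093']
```

The rectangle changes what gets painted on screen, but an extractor never looks at the paint order. It reads the string operands, and those are still there.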

This isn't theoretical — it's how real documents leaked

In 2019, lawyers for Paul Manafort filed a court document with sensitive passages covered by black rectangles; reporters copied the text out and it was national news the same day. The same failure resurfaced in the 2025 reporting around the Epstein file releases, where supposedly redacted text was recovered simply by pasting into Microsoft Word. Independent reviews have found that a majority of files claimed to be redacted still exposed hidden text — including documents redacted with name-brand professional tools. The black box looks final. The data underneath usually isn't.

Movable boxes and incomplete 'flatten' aren't enough either

Two subtler traps catch careful people. First, a black box added as an annotation is movable or deletable — open the file in a different reader and the rectangle can be dragged aside or removed to reveal what's beneath. Second, even 'flattening' the layers can be insufficient if the operation merges the visuals but doesn't excise the words from the PDF's content stream. The only thing that guarantees safety is removal: the sensitive text has to stop existing in the file, not just stop being visible.
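Both traps can be spot-checked by looking inside the file. The sketch below is a rough heuristic using only Python's standard library: it assumes streams are either uncompressed or Flate-compressed, and the function names (decoded_streams, audit) are made up for this example. It can miss text hidden behind other filters or encryption, and it can raise false alarms on binary image data, so treat a clean result as a hint, not proof.

```python
import re
import zlib

# Everything between "stream" and "endstream" is treated as one stream body.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj or TJ).
TEXT_OP_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")

def decoded_streams(pdf_bytes: bytes) -> list[bytes]:
    """Return each stream body, inflated when it is Flate-compressed."""
    streams = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            streams.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            streams.append(raw)  # not Flate data; keep the raw bytes
    return streams

def audit(pdf_bytes: bytes) -> dict[str, bool]:
    """Flag leftover text operators and annotation arrays."""
    streams = decoded_streams(pdf_bytes)
    return {
        "has_text_operators": any(TEXT_OP_RE.search(s) for s in streams),
        "has_annotations": b"/Annots" in pdf_bytes
        or any(b"/Annots" in s for s in streams),
    }

# A hand-built fragment standing in for a badly redacted file (hypothetical).
leaky = (
    b"1 0 obj\n<< /Type /Page /Annots [2 0 R] >>\nendobj\n"
    b"3 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(b"BT /F1 12 Tf (Secret) Tj ET")
    + b"\nendstream\nendobj\n"
)
print(audit(leaky))
```

If either flag comes back true on a file you believed was redacted, assume the covered text is still recoverable.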

The reliable fix: turn the page into pixels

The most reliable way to make a redaction irreversible is to rasterize the marked page into a flat image and rebuild the PDF as an image-only document. Once the page is a picture, there is no separate text layer left to copy, no annotation to drag away, and no text for an AI extractor to read. The black mark is now part of the pixels themselves: burned in. This is also why a redacted-then-flattened PDF holds up under any reader or extraction tool: the covered text is no longer in the file at all, so there is nothing hidden left to reveal.
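What "burned in" means is easiest to see on a toy raster. The sketch below assumes the page has already been rendered to a grid of grayscale pixels (that rendering step needs a PDF library and is not shown), and burn_in is a hypothetical helper, not any particular tool's implementation.

```python
def burn_in(pixels: list[list[int]], box: tuple[int, int, int, int]) -> None:
    """Overwrite every pixel inside box (left, top, right, bottom) with black (0).

    right and bottom are exclusive. The grid is modified in place, so the
    original ink values are gone once this returns.
    """
    left, top, right, bottom = box
    for y in range(top, bottom):
        for x in range(left, right):
            pixels[y][x] = 0

# A hypothetical 4x6 grayscale "page": 255 is white paper, lower values are ink.
page = [
    [255, 255, 255, 255, 255, 255],
    [255,  40,  90,  35, 120, 255],
    [255,  60,  30,  80,  45, 255],
    [255, 255, 255, 255, 255, 255],
]

burn_in(page, (1, 1, 5, 3))
for row in page:
    print(row)
```

After the call, the ink values under the box have been replaced by zeros. Unlike a rectangle drawn over a text layer, there is no second copy of the content sitting underneath.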

Don't forget the metadata trail

Even a perfectly flattened page can betray you through the file's edges. PDFs carry metadata — author names, software, sometimes revision history — and scanned documents may include OCR text layers or embedded thumbnails generated from the original. A complete redaction means stripping that hidden information too, so a forensic look with a metadata extractor turns up nothing. If your tool only blacks out the body and leaves the document properties intact, you've closed the front door and left a window open.
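A quick way to see what a file still says about itself is to scan it for common document-information keys. This is a simplified sketch, assuming the metadata sits uncompressed in the file as plain literal strings; find_metadata is a made-up helper, and it will miss metadata stored in compressed object streams or as hex or UTF-16 strings. A dedicated metadata extractor is the better final check.

```python
import re

# Standard keys of the PDF document information dictionary.
INFO_KEYS = (
    b"Title", b"Author", b"Subject", b"Keywords",
    b"Creator", b"Producer", b"CreationDate", b"ModDate",
)

def find_metadata(pdf_bytes: bytes) -> dict[str, str]:
    """Return the info-dictionary entries visible in the raw bytes.

    Also notes whether an XMP metadata packet appears to be present.
    """
    found = {}
    for key in INFO_KEYS:
        match = re.search(rb"/" + key + rb"\s*\(([^()]*)\)", pdf_bytes)
        if match:
            found[key.decode("ascii")] = match.group(1).decode("latin-1")
    if b"<x:xmpmeta" in pdf_bytes:
        found["XMP"] = "present"
    return found

# A hand-built fragment standing in for a real file (hypothetical values).
sample = b"5 0 obj\n<< /Author (J. Doe) /Producer (ExampleWriter 1.0) >>\nendobj\n"
print(find_metadata(sample))
```

An empty result from this scan does not prove the file is clean, but a non-empty one tells you right away that document properties survived the redaction.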

The compliance angle: 'never uploaded' is the safest redaction

If you handle medical records, legal filings, HR files, or anything under an NDA or HIPAA, where the file goes matters as much as how it's redacted. Uploading a sensitive document to a cloud redaction service means trusting a third party with the unredacted original — the exact data you're trying to protect. The cleanest path is to redact entirely on your own device, so the file with the secrets in it never travels anywhere. nimbril's Redact tool runs fully in your browser: the PDF never leaves your machine, the marked regions are flattened into the pixels, and the result is an image-only PDF with no recoverable text. Verify it yourself — turn on airplane mode and it still works, because nothing was ever uploaded.

Frequently asked

Why can people still read text under a black box in a redacted PDF?

Because the black box only covers the visible image layer. The PDF keeps a separate text layer underneath, which stays fully selectable, searchable, and copy-pasteable. Covering text is not the same as removing it — the words are still in the file's content stream until you actually delete them or flatten the page to an image.

Does flattening a PDF make redaction permanent?

Flattening to a true image makes it permanent, because rasterizing the page replaces the text layer with pixels — there's nothing left to copy or drag away. But be careful: some 'flatten' operations merge annotations visually without excising the underlying text. The safest result is an image-only PDF where the redacted page is genuinely a picture, with no text layer behind it.

Can AI tools like ChatGPT undo a bad redaction?

Yes, easily, if the redaction is just a black box. You can drop the file into an AI assistant and ask it to read what's under the bars, and it reads the intact text layer directly. That's exactly why proper redaction must remove the data rather than hide it — there's nothing for an AI to extract once the page is flattened to pixels.

Is it safe to use an online redaction tool for confidential files?

It depends entirely on whether the file gets uploaded. Most online tools send your document to a server, which means handing the unredacted original to a third party — a real concern for HIPAA, NDA, or legal material. On-device tools that run in your browser never transmit the file, so the sensitive original never leaves your machine. nimbril's Redact tool works this way and keeps functioning even in airplane mode.

What about metadata after I redact a PDF?

Redacting the visible text isn't the whole job. PDFs can carry author names, editing software, revision history, and OCR text layers from scans. A thorough redaction strips this hidden information too, so a metadata extractor finds nothing. If your tool leaves document properties intact, consider stripping metadata as a final step.