Batch Redaction Across Every Page with One API Call

The Problem

Large PDFs mix sensitive text, images, vector paths, and form fields across hundreds of pages. Finding and removing every type manually is slow and inconsistent.

Per-element API calls add network overhead, overlay boxes leave artifacts, and both approaches can miss content hidden in vector shapes, annotations, or off-pattern text.

You need document-level matching by content or coordinates, true removal (not masking), and the ability to send thousands of targets through one fast request while preserving the original layout.

Pattern Matching Across Documents

  • Redact text, images, vector paths, form fields, and annotations together in one operation
  • Match by content (regex or strings) or by position (coordinates and bounding boxes) across the document
  • Queue every target and redact across all pages with a single batch API call
  • True removal (no overlays or masking) while layout, pagination, and formatting stay intact
  • Built for scale: the same workflow handles one file or thousands
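
To make position-based matching concrete, here is a minimal Python sketch of selecting elements of any type by bounding box across pages. It is not the PDFDancer SDK: the `Box` and `Element` classes, the `select_by_position` helper, and the coordinates are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in page coordinates (hypothetical model)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Box") -> bool:
        # Two rectangles overlap when they overlap on both axes.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class Element:
    """Any redactable object: text, image, path, form field, annotation."""
    page: int
    kind: str
    box: Box
    text: str = ""


def select_by_position(elements, region, pages=None):
    """Return every element touching `region`, optionally limited to `pages`."""
    return [
        e for e in elements
        if (pages is None or e.page in pages) and e.box.intersects(region)
    ]


# Illustrative elements; the coordinates are made up for this sketch.
elements = [
    Element(1, "text", Box(72, 700, 200, 712), "SSN: 482-55-7891"),
    Element(1, "image", Box(400, 50, 500, 150)),
    Element(2, "path", Box(80, 705, 120, 710)),
]
region = Box(70, 695, 210, 715)

hits = select_by_position(elements, region)              # all pages
page_one_hits = select_by_position(elements, region, pages={1})
```

Because the selector only looks at page number and geometry, text, images, and vector paths inside the region are collected the same way.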

Sample Source PDF

PATIENT INTAKE FORM

Westside Medical Center | Form MR-2025-INT

Patient Information

Name: Sarah Johnson

Date of Birth: 03/15/1985

SSN: 482-55-7891

Phone: (555) 867-5309

Email: sarah.johnson@email.com

Address: 1847 Oak Avenue, Portland, OR 97201

Emergency Contact

Name: Michael Johnson

Relationship: Spouse

Phone: (555) 234-5678

Medical Information

Primary Diagnosis: Type 2 Diabetes

Medications: Metformin 500mg (twice daily)

Allergies: Penicillin, Sulfa drugs

Blood Type: O+

Insurance Information

Provider: BlueCross BlueShield

Policy Number: BC-449281-PPO

Group Number: GRP-78452

Consent: I authorize Westside Medical Center to use and disclose my health information for treatment, payment, and healthcare operations. I understand that I may revoke this authorization at any time by submitting a written request.

Sarah Johnson

Patient Signature

01/15/2025

Date

CONFIDENTIAL - Protected Health Information (PHI) under HIPAA

API Calls

Works instantly in guest mode - no API key required.
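
The queue-then-submit pattern behind a single batch call can be sketched in plain Python. This is an illustration under stated assumptions, not PDFDancer's actual API or wire format: `Target`, `RedactionBatch`, the `ref` identifiers, and the JSON payload shape are hypothetical, and no network request is made.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass(frozen=True)
class Target:
    """One object queued for redaction (hypothetical shape)."""
    page: int   # 1-based page number
    kind: str   # "text", "image", "path", "form_field", "annotation"
    ref: str    # hypothetical identifier for the matched object


@dataclass
class RedactionBatch:
    """Collects targets from every page so they can be sent together."""
    targets: list = field(default_factory=list)

    def add(self, target: Target) -> None:
        # Skip duplicates so an object matched twice is only queued once.
        if target not in self.targets:
            self.targets.append(target)

    def to_payload(self) -> str:
        # A single JSON body for the whole document (assumed format).
        return json.dumps({"targets": [asdict(t) for t in self.targets]})


batch = RedactionBatch()
for found in [
    Target(1, "text", "ssn-line"),
    Target(1, "image", "signature"),
    Target(1, "text", "ssn-line"),            # duplicate match, ignored
    Target(3, "form_field", "policy-number"),
]:
    batch.add(found)

payload = batch.to_payload()  # one request body instead of one call per object
```

The point of the design is that matching and submission are separate steps: every matcher adds to the same queue, and the request count stays at one regardless of how many objects are found.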

Why It Works

  • Document-level coverage: Search all pages at once for content matches or positional selectors
  • One-call throughput: Collect every element type and submit once; thousands of objects redact together
  • Any object type: Text lines, images, paths/shapes, form fields, and annotations delete the same way
  • True redaction: Content is permanently removed from the PDF structure - not overlays, not masking, actual deletion
  • Flexible patterns: One regex can match multiple field types - SSNs, phones, dates, policy numbers with similar formats
  • Layout preserved: Redaction removes objects from the PDF structure without shifting surrounding content
  • Scales to batches of files: Repeat the same batch request across folders or pipelines
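
As an example of the flexible-patterns point, the sketch below uses Python's standard `re` module to match several field types from the sample intake form with one compiled pattern. The pattern and its group names are assumptions written for this sample's formats; real documents will likely need broader expressions.

```python
import re

# One pattern, four alternatives; each named group labels the field type.
PII = re.compile(
    r"(?P<ssn>\b\d{3}-\d{2}-\d{4}\b)"
    r"|(?P<phone>\(\d{3}\) \d{3}-\d{4})"
    r"|(?P<date>\b\d{2}/\d{2}/\d{4}\b)"
    r"|(?P<policy>\b[A-Z]{2}-\d{6}-[A-Z]{3}\b)"
)

# Text lines taken from the sample form.
lines = [
    "SSN: 482-55-7891",
    "Phone: (555) 867-5309",
    "Date of Birth: 03/15/1985",
    "Policy Number: BC-449281-PPO",
    "Blood Type: O+",
]

# Collect (field type, matched text) for every hit across all lines.
matches = [
    (m.lastgroup, m.group())
    for line in lines
    for m in PII.finditer(line)
]
```

Each match carries the name of the alternative that fired, so the same pass that finds a value also records what kind of field it is. Lines such as the blood type produce no match and are left alone.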

Coming Soon: Automatic PII Detection

We're currently developing automatic PII detection and recognition using a machine learning model with Named Entity Recognition (NER).

This will eliminate the need for manual regex patterns - the system will automatically identify and flag sensitive information like names, SSNs, phone numbers, addresses, and more across your documents.

Start Using PDFDancer Today

Get started in seconds with our free tier. No credit card required.