Python Developer Needed: High-Performance PDF Redaction & Anonymization API (PyMuPDF)
Worldwide
We are a health-tech / neurotechnology SaaS platform looking for an experienced Python developer to build a lightweight, high-performance microservice that automates the anonymization (redaction) of medical reports in PDF format.

Our web application generates automated qEEG medical reports. Historical reports are currently stored in a secure backup vault. When a user or system requests a historical PDF, we need a middleware/microservice to intercept the file, digitally destroy specific Patient Identifiable Information (PII) on the fly (in memory), and stream the clean PDF to the client browser in milliseconds.

We previously attempted raw byte/string replacement with pdftk and regular expressions, but because of internal PDF font structures and the kerning arrays used by the text-showing operators (TJ / Tj), raw text replacement corrupts the files. We therefore require a robust, visual-coordinate-based redaction approach using a library such as PyMuPDF (fitz) or Apache PDFBox.

Key Responsibilities:
- Develop a Python script/microservice that searches for specific visual anchor labels (e.g., "Subject ID:", "Client ID:") within a PDF document.
- Dynamically compute the visual boundaries (bounding boxes) following these anchors to cover unknown patient codes or file names.
- Physically and irreversibly destroy/redact the underlying characters using proper PDF redaction methods (e.g., page.apply_redactions() in PyMuPDF), rendering the text completely unselectable and unsearchable.
- Apply an invisible mask (white fill) over the redacted area to preserve the original, professional template design.
- Wrap this functionality in a lightweight API framework (preferably FastAPI or Flask) so our web application back end can communicate with it via internal HTTP requests.
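The anchor search and bounding-box steps described above could be approached roughly as follows. This is only a sketch under stated assumptions, not the client's implementation: the helper names `value_box` and `redact_pdf`, the 40-point `max_gap` threshold, and the rule that the value sits on the same line to the right of its label are all hypothetical choices. The PyMuPDF methods used (`search_for`, `get_text("words")`, `add_redact_annot`, `apply_redactions`, `tobytes`) are existing library calls.

```python
from typing import Iterable, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points

ANCHORS = ("Subject ID:", "Client ID:")  # labels named in the posting


def value_box(anchor: Box, words: Iterable[Box], max_gap: float = 40.0) -> Optional[Box]:
    """Return the union of word boxes that follow `anchor` on the same line.

    Assumption: the value to hide is on the anchor's line, to its right.
    Walking left to right, collection stops at the first horizontal gap
    wider than `max_gap` so a neighbouring column is not swallowed.
    """
    _, ay0, ax1, ay1 = anchor
    # Keep words extending past the anchor whose vertical midpoint lies on the
    # anchor's line; clip x0 so a value glued to the label ("ID:123") is covered.
    line = sorted(
        (max(w[0], ax1), w[1], w[2], w[3])
        for w in words
        if w[2] > ax1 + 1.0 and ay0 <= (w[1] + w[3]) / 2 <= ay1
    )
    picked = []
    edge = ax1
    for w in line:
        if w[0] - edge > max_gap:
            break
        picked.append(w)
        edge = max(edge, w[2])
    if not picked:
        return None
    return (
        min(w[0] for w in picked),
        min(w[1] for w in picked),
        max(w[2] for w in picked),
        max(w[3] for w in picked),
    )


def redact_pdf(data: bytes, anchors: Iterable[str] = ANCHORS) -> bytes:
    """Redact the values following each anchor label, entirely in memory."""
    import fitz  # PyMuPDF; imported here so value_box works without it installed

    anchors = tuple(anchors)
    doc = fitz.open(stream=data, filetype="pdf")
    for page in doc:
        # Each "words" entry is (x0, y0, x1, y1, text, block, line, word_no).
        words = [w[:4] for w in page.get_text("words")]
        marked = False
        for label in anchors:
            for rect in page.search_for(label):
                box = value_box(tuple(rect), words)
                if box is not None:
                    # White fill keeps the template looking unchanged.
                    page.add_redact_annot(fitz.Rect(box), fill=(1, 1, 1))
                    marked = True
        if marked:
            # Removes the covered text from the page content itself.
            page.apply_redactions()
    # garbage=4 drops unreferenced objects left behind after redaction.
    out = doc.tobytes(garbage=4, deflate=True)
    doc.close()
    return out
```

A FastAPI or Flask endpoint would then be a thin layer that reads the stored PDF bytes, calls `redact_pdf`, and returns the result with an `application/pdf` content type. Multi-line values or labels placed above their values would need a different rule than the same-line assumption used here.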
$250.00
Fixed-price
- Experience Level: Expert
- Remote Job
- Project Type: One-time project
Activity on this job
- Proposals: 50+
- Last viewed by client: 3 days ago
- Interviewing: 0
- Invites sent: 0
- Unanswered invites: 0
About the client
- Netherlands, Eindhoven, 11:45 AM
- $47K total spent; 10 hires, 4 active
- 1,243 hours
- Tech & IT; small company (2-9 people)