Python Developer Needed: High-Performance PDF Redaction & Anonymization API (PyMuPDF)

Posted 3 days ago

Worldwide

Summary

We are a health-tech / neurotechnology SaaS platform looking for an experienced Python developer to build a lightweight, high-performance microservice that automates the anonymization (redaction) of medical reports in PDF format.

Our web application generates automated qEEG medical reports. Historical reports are currently stored in a secure backup vault. When a user or system requests a historical PDF, we need a middleware/microservice that intercepts the file, digitally destroys specific Patient Identifiable Information (PII) on the fly (in memory), and streams the clean PDF to the client browser within milliseconds.

We previously attempted raw byte/string replacement with pdftk and regex, but because of internal PDF font structures and kerning arrays (the TJ / Tj text-showing operators), raw text replacement corrupts the files. We therefore require a robust, visual-coordinate-based redaction approach using a library such as PyMuPDF (fitz) or Apache PDFBox.

Key Responsibilities:

  • Develop a Python script/microservice that searches for specific visual anchor labels (e.g., "Subject ID:", "Client ID:") within a PDF document.
  • Dynamically compute the visual boundaries (bounding boxes) following these anchors to cover unknown patient codes or file names.
  • Physically and irreversibly destroy/redact the underlying characters using proper PDF redaction methods (e.g., page.apply_redactions() in PyMuPDF), so the text becomes completely unselectable and unsearchable.
  • Apply a white fill over the redacted area so the mask is visually invisible and the original, professional template design is preserved.
  • Wrap this functionality in a lightweight API framework (preferably FastAPI or Flask) so our web application back end can communicate with it via internal HTTP requests.
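The anchor-based redaction described above could be sketched roughly as follows. This is a minimal illustration, not a finished implementation. The helper names value_box and redact_pdf, the 36-point right margin, and the assumption that each value sits on the same line to the right of its label are all hypothetical and would need checking against the real report template. The PyMuPDF import is done lazily so the geometry helper can be tried without the library installed.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in PDF points

ANCHORS = ("Subject ID:", "Client ID:")  # anchor labels named in the posting


def value_box(anchor: Box, page_right: float, margin: float = 36.0) -> Box:
    """Return the box to the right of an anchor label, on the same text line.

    Assumption (hypothetical layout): the value follows its label on the
    same line and may run up to the right page margin.
    """
    x0, y0, x1, y1 = anchor
    right = max(x1, page_right - margin)  # never produce a negative width
    return (x1, y0, right, y1)


def redact_pdf(pdf_bytes: bytes) -> bytes:
    """Redact the value after each anchor label, entirely in memory."""
    import fitz  # PyMuPDF; imported lazily so value_box stays dependency-free

    doc = fitz.open(stream=pdf_bytes, filetype="pdf")
    for page in doc:
        for label in ANCHORS:
            for hit in page.search_for(label):
                box = value_box((hit.x0, hit.y0, hit.x1, hit.y1), page.rect.x1)
                # White fill so the covered area blends into a white template.
                page.add_redact_annot(fitz.Rect(*box), fill=(1, 1, 1))
        # Removes the text under the redaction annotations on this page.
        page.apply_redactions()
    out = doc.tobytes(garbage=4, deflate=True)
    doc.close()
    return out
```

A FastAPI or Flask route could pass the vault file's bytes to redact_pdf and return the result with an application/pdf content type. The vertical extent of each box is taken directly from the anchor hit, so it may need tuning if values wrap onto a second line or if neighbouring lines sit very close together.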

  • $250.00
    Fixed-price
  • Expert
    Experience Level
  • Remote Job
  • One-time project
    Project Type
Skills and Expertise
Mandatory skills
Flask
RESTful API
Python
Activity on this job
  • Proposals: 50+
  • Last viewed by client: 3 days ago
  • Interviewing:
    0
  • Invites sent:
    0
  • Unanswered invites:
    0
About the client
Member since Jan 28, 2019
  • Netherlands
    Eindhoven, 11:45 AM
  • $47K total spent
    10 hires, 4 active
  • 1,243 hours
  • Tech & IT
    Small company (2-9 people)
