Fix Python NER pipeline for anonymising names in Excel files

Posted 5 days ago

Worldwide

Summary

I have a data privacy project involving Excel files (.xlsx/.xlsb) containing care log entries with people's names mixed into free-text narrative columns. I need help completing and debugging a Python pipeline that anonymises (encodes) these names into placeholder codes, and reverses (decodes) them back to the original names later.

The challenge: the same person's name appears in many different forms throughout a file (full name, first name only, initials, lowercase, and occasional typos/misspellings). The pipeline needs to recognise all these variants as the same person and assign them a single consistent code (e.g. CLIENT_001, STAFF_002), not a different code for each spelling.

What I need help with:

  • Debugging name-clustering logic so name variants reliably merge into one identity (currently some real names get split into multiple codes, and occasionally the model picks up garbled/incorrect text as a "name")
  • Improving performance (NER currently runs slower than it should on larger files)
  • General code review and robustness improvements to the existing Python notebook

Tech stack: Python, Jupyter, spaCy, Microsoft Presidio, openpyxl, pywin32 (Excel COM)

Looking for someone experienced with NLP/NER pipelines and Python data processing. Data being used is synthetic/sample data.
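To make the clustering requirement concrete, here is a minimal sketch of one way name variants could be merged into a single code and then encoded/decoded. It is not the existing notebook's logic: the function names (`normalise`, `same_person`, `build_codebook`, `encode`, `decode`), the `prefix` parameter, and the 0.85 similarity threshold are all illustrative assumptions, and it uses only the standard library (`difflib`) in place of spaCy/Presidio so it can run on its own. It assumes the NER step has already produced a list of candidate name strings.

```python
import re
from difflib import SequenceMatcher


def normalise(name):
    """Lowercase, turn punctuation into spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z\s]", " ", name.lower()).split())


def same_person(mention, canonical, threshold=0.85):
    """Heuristic: does a normalised mention refer to a canonical full name?

    The threshold is an assumed value and would need tuning on real data.
    """
    if mention == canonical:
        return True
    tokens = canonical.split()
    if mention in tokens:  # first name or surname on its own
        return True
    initials = "".join(tok[0] for tok in tokens)
    if len(tokens) > 1 and mention.replace(" ", "") == initials:
        return True
    # Fall back to fuzzy matching to catch typos such as "Jon Smith"
    return SequenceMatcher(None, mention, canonical).ratio() >= threshold


def build_codebook(mentions, prefix="CLIENT"):
    """Map every raw mention to one code per identity."""
    canon = {}     # canonical normalised name -> code
    codebook = {}  # raw mention -> code
    # Longest names first, so full names become the cluster anchors
    ordered = sorted(set(mentions),
                     key=lambda m: (-len(normalise(m)), normalise(m), m))
    for raw in ordered:
        norm = normalise(raw)
        if not norm:
            continue  # skip mentions with no letters at all
        match = next((c for c in canon if same_person(norm, c)), None)
        if match is None:
            match = norm
            canon[match] = f"{prefix}_{len(canon) + 1:03d}"
        codebook[raw] = canon[match]
    return codebook


def encode(text, codebook):
    """Replace each known mention with its code in a single regex pass."""
    if not codebook:
        return text
    # Longer alternatives first so "John Smith" wins over "John"
    names = sorted(codebook, key=len, reverse=True)
    pattern = re.compile(
        r"(?<!\w)(?:" + "|".join(map(re.escape, names)) + r")(?!\w)")
    return pattern.sub(lambda m: codebook[m.group(0)], text)


def decode(text, codebook):
    """Replace each code with the longest mention recorded for it."""
    canonical = {}
    for raw, code in codebook.items():
        if len(raw) > len(canonical.get(code, "")):
            canonical[code] = raw
    if not canonical:
        return text
    pattern = re.compile("|".join(map(re.escape, canonical)))
    return pattern.sub(lambda m: canonical[m.group(0)], text)


mentions = ["John Smith", "john smith", "John", "J.S.", "Jon Smith",
            "Mary Jones", "mary", "Mary Jonse"]
codebook = build_codebook(mentions)
encoded = encode("John Smith visited. John was calm; J.S. ate lunch with mary.",
                 codebook)
```

Two limitations of this sketch are worth noting. A first name shared by two people attaches to whichever cluster is found first, so a real pipeline would need to flag ambiguous mentions for review. `decode` restores one representative spelling per code, not the exact original variant; restoring exact spellings would require storing a per-cell replacement log.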

  • $30.00
    Fixed-price
  • Intermediate
    Experience Level
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
Python
Machine Learning
Activity on this job
  • Proposals: 15 to 20
  • Last viewed by client: 3 days ago
  • Hires: 1
  • Interviewing: 1
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Sep 1, 2019
  • Algeria
    Albida, 8:19 PM
  • $425 total spent
    19 hires, 1 active
  • Individual client
