Python Developer for Patent XML/PDF Data Ingestion Pipeline - Fixed Scope Trial
Worldwide
I need a Python data engineer to build a fixed-scope ingestion/parser milestone for patent data. This is not a full platform rebuild. The goal is to take provided sample patent data files and implement a clean, testable ingestion pipeline that can parse XML/PDF-source metadata into structured outputs. Sample is provided.

Initial milestone scope:
1. Inspect provided Chinese patent XML sample package structure.
2. Build a Python parser for the sample XML files.
3. Extract key patent fields including:
   - publication/application identifiers
   - claims
   - claim numbers
   - independent/dependent claim indicators where available
   - description sections
   - bibliographic metadata
   - legal/current-owner metadata where available
4. Output parsed data into clean structured tables or files suitable for PostgreSQL loading.
5. Provide clear source-to-target field mapping.
6. Add basic tests using the provided sample files.
7. Provide runnable setup instructions.

Possible follow-on work may include:
- Korean PDF description extraction
- Japanese bulk XML full-text parsing
- PostgreSQL integration
- translation pipeline integration

Important:
- This first milestone does not include dashboard work.
- This first milestone does not include production deployment.
- This first milestone does not include legal translation.
- Please do not estimate a large open-ended rebuild. I am looking for a practical fixed-scope parser/data pipeline milestone.

Ideal freelancer:
- Strong Python experience
- Comfortable with XML parsing
- Comfortable with messy real-world data files
- Experience with ETL/data pipelines
- PostgreSQL experience is helpful
- Patent data experience is a plus but not required

Please include in your proposal:
1. Similar XML/ETL parsing work you have done.
2. How you would structure the parser.
3. What you would deliver for the fixed-price milestone.
4. Confirmation that you understand this is a bounded trial milestone, not a full platform rebuild.
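
For illustration only, here is a rough sketch of the shape such a parser might take; it is not a specification. Everything in it is an assumption: the tag names (`patent-document`, `publication-number`, `claims`, `claim`, `claim-text`), the `num` attribute, the placeholder identifiers, and the helper names `parse_patent` and `rows_to_csv` are all hypothetical and would need to be replaced once the real sample package has been inspected. The independent/dependent flag uses a simple heuristic (a claim that refers back to another claim is treated as dependent), which may not hold for every file.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical sample. The real package layout and tag names must be
# confirmed against the provided files; these are placeholders.
SAMPLE_XML = """<patent-document>
  <publication-number>CN000000000A</publication-number>
  <application-number>CN0000000000</application-number>
  <claims>
    <claim num="1"><claim-text>A device comprising a housing.</claim-text></claim>
    <claim num="2"><claim-text>The device according to claim 1, further comprising a lid.</claim-text></claim>
  </claims>
</patent-document>"""

# Assumed heuristic: a back-reference such as "claim 1" (or the Chinese
# "权利要求1") marks a claim as dependent.
DEPENDENCY_RE = re.compile(r"(?:claim\s+\d+|权利要求\s*\d+)", re.IGNORECASE)


def parse_patent(xml_text):
    """Return one document-level row and a list of claim rows."""
    root = ET.fromstring(xml_text)
    pub = (root.findtext("publication-number") or "").strip()
    app = (root.findtext("application-number") or "").strip()
    doc_row = {"publication_number": pub, "application_number": app}

    claim_rows = []
    for claim in root.iterfind("./claims/claim"):
        # Join all nested text and collapse runs of whitespace.
        text = " ".join("".join(claim.itertext()).split())
        claim_rows.append({
            "publication_number": pub,
            "claim_number": int(claim.get("num")),
            "is_independent": DEPENDENCY_RE.search(text) is None,
            "claim_text": text,
        })
    return doc_row, claim_rows


def rows_to_csv(rows):
    """Serialize a non-empty list of dict rows to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


document, claims = parse_patent(SAMPLE_XML)
claims_csv = rows_to_csv(claims)
```

Keeping one row per document and one row per claim, joined on the publication number, is one possible layout for the "structured tables" in item 4; the source-to-target mapping in item 5 would then document which XML path feeds each column.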
- $500.00 (Fixed-price)
- Intermediate (Experience Level)
- Remote Job
- Ongoing project (Project Type)
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: last week
- Interviewing: 1
- Invites sent: 0
- Unanswered invites: 0
About the client
- IND, Hyderabad, 9:36 PM
- $542 total spent; 4 hires, 2 active
- 9 hours