Build a PDF Document Classification Script Using Python
Worldwide
We are looking for a Python developer to build a standalone script that automatically categorizes PDF documents into predefined categories. The script will extract text from each PDF, process the content, and output a structured result showing the category for each file. This is a small, self-contained task focused on clean, working code.

Job Responsibilities:
1. Develop a Python script that reads PDF files from a folder.
2. Extract text from each PDF using a Python library (e.g., `PyPDF2` or `pdfplumber`).
3. Implement text preprocessing such as cleaning, tokenization, and stop-word removal.
4. Apply rule-based logic or a lightweight ML model to classify PDFs into predefined categories.
5. Generate an output file (CSV or JSON) containing:
   - PDF file name
   - Assigned category
   - Optional confidence score
6. Handle encrypted, empty, or unreadable PDFs gracefully and provide meaningful error messages.

Requirements:
- Strong Python experience
- Familiarity with PDF text extraction libraries (`PyPDF2`, `pdfplumber`, or similar)
- Experience with text processing / NLP (Pandas, NLTK, or similar)
- Ability to implement clear, maintainable logic

Deliverables:
- Standalone Python script (`.py`)
- Sample input PDFs and output file demonstrating the classification
- Short instructions on how to run the script

Notes: This task is focused on quickly producing a working, reliable script that can classify PDF documents according to predefined rules or categories.
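
As a rough illustration of the responsibilities listed in this posting, the pipeline could be sketched as follows. This is a minimal sketch under stated assumptions, not the client's specification: the category names, keyword sets, stop-word list, CSV column names, and the function names (`extract_text`, `preprocess`, `classify`, `classify_folder`) are all hypothetical, and the extraction step assumes a recent `PyPDF2` release that provides `PdfReader`. The classifier is plain keyword counting, one possible form of the "rule-based logic" the posting mentions.

```python
import csv
import re
from pathlib import Path

# Hypothetical categories and keywords; the posting does not define them.
CATEGORY_KEYWORDS = {
    "invoice": {"invoice", "amount", "due", "payment", "total"},
    "contract": {"agreement", "party", "parties", "terms", "hereby"},
    "resume": {"experience", "education", "skills", "employment"},
}
# Tiny illustrative stop-word list; NLTK's list could be swapped in.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "for", "on", "this"}


def extract_text(pdf_path):
    """Return the text of a PDF, or raise ValueError with a readable message."""
    from PyPDF2 import PdfReader  # imported lazily so the rest runs without it

    try:
        reader = PdfReader(str(pdf_path))
        encrypted = reader.is_encrypted
        text = "" if encrypted else "\n".join(
            page.extract_text() or "" for page in reader.pages
        )
    except Exception as exc:  # corrupt, truncated, or otherwise unreadable file
        raise ValueError(f"unreadable PDF: {exc}") from exc
    if encrypted:
        raise ValueError("encrypted PDF")
    if not text.strip():
        raise ValueError("no extractable text (empty or scanned PDF)")
    return text


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def classify(tokens):
    """Return (category, confidence) from keyword counts.

    Confidence is the winning category's share of all keyword hits.
    """
    scores = {
        cat: sum(t in keywords for t in tokens)
        for cat, keywords in CATEGORY_KEYWORDS.items()
    }
    total = sum(scores.values())
    if total == 0:
        return "unknown", 0.0
    best = max(scores, key=scores.get)
    return best, round(scores[best] / total, 2)


def classify_folder(folder, out_csv):
    """Classify every *.pdf in folder and write one CSV row per file."""
    rows = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        try:
            category, confidence = classify(preprocess(extract_text(pdf_path)))
            error = ""
        except ValueError as exc:
            category, confidence, error = "error", 0.0, str(exc)
        rows.append({"file": pdf_path.name, "category": category,
                     "confidence": confidence, "error": error})
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(
            fh, fieldnames=["file", "category", "confidence", "error"]
        )
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `classify_folder("pdfs", "results.csv")` would process a folder named `pdfs` (a placeholder path). Recording failures in an `error` column rather than aborting is one way to satisfy the "handle gracefully" requirement, since a single bad file then does not stop the batch.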
- Hourly: Not Sure
- Duration: < 1 month
- Experience Level: Expert
- Remote Job
- Project Type: One-time project
Activity on this job
- Proposals: 50+
- Last viewed by client: 4 days ago
- Interviewing: 4
- Invites sent: 0
- Unanswered invites: 0
About the client
- United States, Ragland, 5:39 PM