Python Developer Needed for Legislative Data Pipeline, XML Parsing, and NLP-Based Bill Analysis

Posted 2 weeks ago

Worldwide

Summary

We are seeking an experienced Python developer to build a reproducible workflow for collecting, processing, and analyzing legislative bill data from a recent state legislative session. The project involves parsing legislative XML files, constructing a bill-level database, enriching records with legislator metadata, and performing initial text-based classification and exploratory analysis.

### Scope of Work

* Parse a master legislative XML index file containing all measures introduced during a legislative session
* Extract and structure bill-level metadata, including:
  * Bill number
  * Title
  * Sponsor information
  * Committee assignments
  * Status/history information
  * Related document links
* Apply filtering and data-cleaning procedures to create a research-ready dataset
* Merge bill records with legislator roster datasets to enrich sponsor information
* Automate retrieval of linked bill-history XML files and associated bill-text documents
* Build a reproducible data-processing pipeline that can be reused for future legislative sessions
* Perform initial NLP-based topic classification and content categorization of bill text
* Generate descriptive summaries and exploratory statistics across bills, sponsors, committees, and policy topics

### Technical Requirements

Required experience:

* Python
* pandas
* requests
* lxml
* BeautifulSoup
* Regular Expressions (regex)
* XML parsing and data extraction
* Data cleaning and transformation workflows
* Relational data merging and normalization

Preferred experience:

* Natural Language Processing (NLP)
* Topic modeling or text classification
* Document processing (PDF/XML)
* Exploratory data analysis and visualization
* Reproducible research workflows and project documentation

### Deliverables

* Fully documented Python workflow
* Clean bill-level analytical dataset
* Automated data collection and processing scripts
* Topic-classified bill dataset
* Summary statistics and exploratory analytical outputs
* Documentation explaining workflow execution and data structure

### Additional Information

To keep the initial posting concise, detailed source materials, sample files, data schemas, and project-specific documentation will be shared only with shortlisted candidates. Selected candidates will receive access to representative XML files, supporting datasets, and additional project requirements necessary for preparing an accurate implementation plan and estimate.

The solution should be modular, reproducible, and designed so that additional legislative sessions can be processed with minimal modifications.
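
### Illustrative Sketch

For orientation only, the parse, clean, enrich, and classify steps described in the scope of work might look roughly like the sketch below. Everything specific in it is an assumption: the XML element names (`measure`, `number`, `title`, `sponsor`, `committee`, `historyLink`), the roster keyed by sponsor surname, the `example.invalid` links, and the keyword lists are all hypothetical, because the actual schemas are shared only with shortlisted candidates. The keyword tagger is a simple stand-in for the NLP classification step, not a proposed method. The sketch uses only the standard library so it runs without dependencies; in the real workflow `lxml` and `pandas` would likely take over the parsing and merging.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical index layout; the real schema is not part of this posting.
SAMPLE_INDEX = """
<session>
  <measure>
    <number>HB 101</number>
    <title>  An act relating to   school nutrition programs </title>
    <sponsor>Rivera</sponsor>
    <committee>Education</committee>
    <historyLink>https://example.invalid/history/hb101.xml</historyLink>
  </measure>
  <measure>
    <number>SB 7</number>
    <title>An act relating to water quality permits</title>
    <sponsor>Okafor</sponsor>
    <committee>Environment</committee>
    <historyLink>https://example.invalid/history/sb7.xml</historyLink>
  </measure>
</session>
"""

# Hypothetical legislator roster keyed by sponsor surname.
ROSTER = {"Rivera": {"party": "D", "district": "12"}}

# Hypothetical keyword lists; a placeholder for a real topic model.
TOPIC_KEYWORDS = {
    "education": ("school", "student"),
    "environment": ("water", "emission"),
}


def clean(text):
    """Collapse runs of whitespace and trim; treat missing text as empty."""
    return re.sub(r"\s+", " ", text or "").strip()


def parse_index(xml_text):
    """Turn each <measure> element into one flat bill-level record."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "bill_number": clean(m.findtext("number")),
            "title": clean(m.findtext("title")),
            "sponsor": clean(m.findtext("sponsor")),
            "committee": clean(m.findtext("committee")),
            "history_url": clean(m.findtext("historyLink")),
        }
        for m in root.iter("measure")
    ]


def enrich(rows, roster):
    """Left-join roster fields onto bills; unmatched sponsors get None."""
    enriched = []
    for row in rows:
        info = roster.get(row["sponsor"], {})
        enriched.append(
            {**row, "party": info.get("party"), "district": info.get("district")}
        )
    return enriched


def tag_topic(title):
    """Return the first topic whose keyword appears in the title."""
    lowered = title.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return topic
    return "unclassified"


bills = enrich(parse_index(SAMPLE_INDEX), ROSTER)
for bill in bills:
    bill["topic"] = tag_topic(bill["title"])
    print(bill["bill_number"], bill["party"], bill["topic"])
```

Keeping each stage as a separate function that takes and returns plain records is one way to meet the modularity requirement: a new session would only need a different index file and roster, and the left-join behavior makes unmatched sponsors visible as missing values rather than silently dropping bills.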

  • More than 30 hrs/week
    Hourly
  • 1-3 months
    Duration
  • Expert
    Experience Level
  • $10.00 - $40.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type

Contract-to-hire opportunity

Skills and Expertise
Mandatory skills
Python
Data Scraping
Data Mining
Activity on this job
  • Proposals: 50+
  • Last viewed by client: 2 weeks ago
  • Interviewing: 9
  • Invites sent: 17
  • Unanswered invites: 8
About the client
Member since Apr 18, 2026
  • United States
    Washington Township, 4:50 AM local time
  • $332 total spent
    5 hires, 1 active
  • 5 hours
  • Health & Fitness
    Mid-sized company (10-99 people)
