Parse Florida Corporate Registry Files

Posted 2 weeks ago

Worldwide

Summary

I need 10 fixed-width Florida corporate registry text files parsed into a clean spreadsheet, with one calculated column added for business age.

ATTACHMENTS, READ BOTH BEFORE APPLYING

Two files are attached to this job. "Corporate File Definitions" is Florida's official record layout PDF; it shows the exact character position and length of every field in the data files. "Data Access & Field Reference" is a PDF I wrote covering how you'll receive the actual data files and the full extraction and output spec. Review both before writing your proposal.

THE FILES

The data is Florida's public corporate registry (Sunbiz), provided as 10 fixed-width ASCII text files of approximately 1.7GB each, with no delimiters. Every field sits at a fixed character position on a 1,440-character record. The files are too large to attach here; "Data Access & Field Reference" explains exactly how you'll get them on award, either via a shared Google Drive folder or by downloading them directly from Florida's public data server yourself. The layout PDF ("Corporate File Definitions") is the only source of truth for field positions. Do not guess.

WHAT TO EXTRACT

Pull only the fields listed in "Data Access & Field Reference", using the character positions in the layout PDF: Corporation Number, Corporation Name, Status, Filing Type, Address, City, State, Zip, File Date (formation date), and Officer 1 Name only. Ignore officer fields 2 through 6 entirely.

FILTER

Keep only rows where Status equals "A" (active). Drop everything else.

CALCULATE

Add one new column called business_age_years: today's date minus the File Date, expressed in whole years. If File Date is missing or invalid for a row, leave this blank rather than estimating.

OUTPUT

Deliver one file, CSV or .xlsx, with the columns listed in "Data Access & Field Reference". Trim trailing spaces from all text fields. Sort by business_age_years descending, oldest businesses first. Include a short note stating the row count before and after filtering.

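A minimal sketch of the extract, filter, calculate, and sort steps described above, using only the Python standard library. It is an illustration, not the required approach. The offsets in FIELDS and the MMDDYYYY date format are placeholders made up for this example (the real values must come from "Corporate File Definitions"), only four of the requested fields are shown, and the function names are hypothetical. Two further assumptions are labeled in the comments: a File Date in the future is treated as invalid, and rows with a blank age sort last.

```python
from datetime import date, datetime

# HYPOTHETICAL offsets (0-based start, end-exclusive), for illustration only.
# Replace every pair with the positions from "Corporate File Definitions".
FIELDS = {
    "corporation_number": (0, 12),
    "corporation_name": (12, 52),
    "status": (52, 53),
    "file_date": (53, 61),
}
DATE_FORMAT = "%m%d%Y"  # ASSUMPTION: confirm the date layout in the PDF


def parse_record(line):
    """Slice one fixed-width record into a dict of right-trimmed strings."""
    return {name: line[start:end].rstrip() for name, (start, end) in FIELDS.items()}


def business_age_years(file_date, today):
    """Whole years from file_date to today; None if missing or invalid."""
    try:
        formed = datetime.strptime(file_date, DATE_FORMAT).date()
    except ValueError:
        return None
    if formed > today:
        return None  # ASSUMPTION: a future formation date counts as invalid
    # Subtract one year if this year's anniversary has not been reached yet.
    before_anniversary = (today.month, today.day) < (formed.month, formed.day)
    return today.year - formed.year - before_anniversary


def extract_active(lines, today):
    """Return (rows_read, active_rows) with active rows sorted oldest first."""
    total = 0
    kept = []
    for line in lines:
        if not line.strip():
            continue  # skip empty lines
        total += 1
        row = parse_record(line)
        if row["status"] != "A":
            continue
        row["business_age_years"] = business_age_years(row["file_date"], today)
        kept.append(row)
    # Oldest first; rows with a blank age go last (ASSUMPTION, not in the spec).
    kept.sort(
        key=lambda r: (
            r["business_age_years"] is None,
            -(r["business_age_years"] or 0),
        )
    )
    return total, kept
```

An open file handle can be passed as `lines`, so each 1.7GB file is streamed one record at a time and only the active rows are held in memory for the final sort. This assumes records are newline-terminated. The two values returned cover the before and after row counts, and `csv.DictWriter` writes a `None` age as an empty cell. `pandas.read_fwf` with `colspecs` is an alternative way to do the slicing.
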
QUALITY CHECK

Before delivering, pick 10 rows at random from your output and verify them by hand on the free public lookup at search.sunbiz.org. Confirm the corporation name, formation date, and officer name match. Report how many of the 10 were correct. This is part of the deliverable, not optional.

PRICE AND TIMELINE

Fixed price $100. Deliverable: the formatted spreadsheet, your 10-row QC note, and the Python script you used. Target turnaround 3 business days from receipt of files.

TO APPLY

Answer both in your proposal, or it will not be read. First, have you parsed fixed-width or positional flat files in Python before, and what library do you use (pandas, struct, or other)? Second, roughly how long would parsing and filtering about 17GB across 10 files take on your setup?
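
One way to draw the 10 rows for the QUALITY CHECK step is a single-pass reservoir sample over the delivered CSV, sketched below. The helper name `sample_rows_for_qc` is hypothetical, and the sketch assumes the output is a UTF-8 CSV with a header row. The lookup on search.sunbiz.org still has to be done by hand.

```python
import csv
import random


def sample_rows_for_qc(csv_path, k=10, seed=None):
    """Pick k rows uniformly at random from a CSV in one pass (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle)):
            if index < k:
                # Fill the reservoir with the first k rows.
                reservoir.append(row)
            else:
                # Row number index replaces a random slot with probability k / (index + 1).
                slot = rng.randint(0, index)
                if slot < k:
                    reservoir[slot] = row
    return reservoir
```

Passing a fixed `seed` and recording it in the QC note makes the selection repeatable, so the same 10 rows can be re-checked later.
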

  • $100.00 (Fixed-price)
  • Experience Level: Intermediate
  • Remote Job
  • Project Type: One-time project
Skills and Expertise
Mandatory skills
Python
pandas
Data Extraction
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: last week
  • Hires: 1
  • Interviewing: 2
  • Invites sent: 3
  • Unanswered invites: 0
About the client
Member since Apr 12, 2026
  • USA
    Pompano Beach, 8:10 AM
  • $175 total spent
    2 hires, 0 active

