Search Freelance Jobs on Upwork

Posted yesterday

US-Based PHP Developer for B2B Site Automation Scripts

Hourly: $35.00 - $75.00
Expert
Est. time: 1 to 3 months, Less than 30 hrs/week

We run a B2B SaaS platform serving specialty running retailers. Part of what we do is pull inventory data from brand-side wholesale portals so our retailers can see live vendor availability alongside their own POS data. We have valid B2B accounts on these portals. We have around 15 of these scrapers running today and want to add more, plus we need someone reliable to fix existing ones when a vendor changes their site.

What you'll build
* New scrapers for brand B2B portals to extract inventory data (SKU, size/color/width, quantity, backorder dates). Output goes into a MySQL 8 schema we'll provide.
* Product image scrapers for both B2B portals and public-facing brand sites.
* Fixes and updates to existing scripts when a portal changes (auth flow updates, endpoint changes, HTML/JSON structure shifts).

Stack
PHP 8.4 is our primary language. Existing scrapers are based on PHP, Selenium and Python. Use whichever HTTP library you find cleanest. Selenium, Playwright, or Puppeteer are welcome when a portal has complex JS-driven rendering or session flows that don't work with straight HTTP. Node or Python is fine for those. Scripts run on an Ubuntu server on GCP.

Requirements
* Comfortable inspecting network traffic to figure out how an undocumented portal actually works behind the login page. Most brand portals have a JSON API driving their UI even when it isn't publicly documented.
* Willing to do both greenfield builds and maintenance on existing scripts.

Engagement
Hourly and ongoing. Volume varies week to week. Several existing scripts need to be reworked and occasionally a new script needs to be created for a new brand. When an existing script breaks, it's usually because a vendor changed their portal without notice, and we need to be able to reach you promptly. We're prioritizing quality and a successful outcome over cost. If you're the right fit, we're not looking to bid you down.

To apply
Share one or two examples of authenticated portal automation you've built, ideally something involving JWT or cookie/session auth.
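
As a rough illustration of the JWT side of this kind of work, here is a minimal Python sketch (the posting allows Python alongside PHP) that decides when a scraper should log in again by reading a token's `exp` claim. The function names are hypothetical and nothing here comes from the client's codebase. It assumes the portal issues a standard three-segment JWT, and it does not verify the signature, so it is only suitable for scheduling a re-login, not for trusting the token's contents.

```python
import base64
import json
import time
from typing import Optional


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    segment = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


def needs_refresh(token: str, margin_seconds: int = 60,
                  now: Optional[float] = None) -> bool:
    """Return True if the token lacks a numeric `exp` or expires within the margin."""
    exp = jwt_payload(token).get("exp")
    if not isinstance(exp, (int, float)):
        return True  # no usable expiry: safest to re-authenticate
    current = time.time() if now is None else now
    return exp - current <= margin_seconds


def make_unsigned_token(claims: dict) -> str:
    """Build a structurally valid token for local testing only (dummy signature)."""
    def b64(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{b64({'alg': 'none', 'typ': 'JWT'})}.{b64(claims)}.sig"
```

A scraper loop could call `needs_refresh(token)` before each request batch and repeat the login flow when it returns True, instead of waiting for the portal to answer with a 401.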

Posted last month

Scraping Project (Please Do Not Contact Outside of Upwork)

Hourly
Intermediate
Est. time: 1 to 3 months, Less than 30 hrs/week

Texas Municipality Data Collection Tool Built a custom web scraping and data aggregation system to collect and organize public information from Texas municipal websites. The tool automated data extraction across multiple sources, standardized the results, and exported structured datasets for analysis and reporting. Technologies: React, Node.js, Web Scraping, APIs, Data Processing, MongoDB/PostgreSQL

Posted 3 weeks ago

Website Development for Auto Auction Violations Data Collection

Fixed price
Intermediate
Est. budget: $500.00

We are seeking a skilled web developer to create a website focused on collecting data regarding violations, unethical treatment, and illegal conduct faced by dealers at auto auctions in the USA. The website will serve as a platform for users to report incidents, share experiences, and contribute to a database of information. The ideal candidate should have experience in building user-friendly interfaces, database management, and data security. Your expertise will help us bring awareness to these issues and support the dealer community.

Posted 3 weeks ago

Lead Research Automation — Company Data Enrichment & Contact Scoring

Hourly: $20.00 - $200.00
Expert
Est. time: 1 to 3 months, Less than 30 hrs/week

Here’s what I’m doing: I’m looking to make our lead research process better and get it fully automated. Here’s the process as it stands now:

1. We start with a list of companies and their websites.
2. For each one: scrape their site for shipping facility locations (warehouses, DCs, manufacturing plants). Check their locations/facilities pages and their careers page. Job postings for warehouse, forklift, shipping & receiving, production roles confirm an active facility and usually give the address. Careers pages are inconsistent: some are static HTML, some run through Workday, Greenhouse, iCIMS, or other ATS platforms that don’t always let scrapers in. I need someone who’s dealt with that before and knows how to handle it, not just scrape the easy ones and skip the rest.
3. Identify contacts matching these titles: Transportation Manager, Director of Transportation, Logistics Manager, Director of Logistics, Traffic Manager, Senior Transportation Manager, Senior Logistics Manager, Logistics Sourcing Manager, Logistics Procurement Manager, Transportation Procurement Manager, Transportation Sourcing Manager.
4. Score every contact for whether they’re actually still there, not just whether they show up in a database. Apollo and ZoomInfo are full of people who left or retired years ago but still show as active. The scorecard has to catch that before it goes any further.
5. Enrich the contacts that pass: email and phone.

Phones: enriched numbers are usually garbage with no way to verify them. Scrape the company website for the corporate number instead. No 800 numbers; those are dead-end customer service lines. I want a local number so I can call and ask whether the person is still there. I know distinguishing a corporate number from an 800 number on a scraped page isn’t always straightforward: some sites only list the 800 number, some bury the local number in a footer or contact page. Tell me how you’d approach that.

Emails: fine as-is. Enrich, then run through NeverBounce.

Non-negotiable: every contact comes out with a score and the specific reason behind it. No black box. I need to see why someone scored high and why someone else scored low.

End state: I drop in the list, it runs, output comes back scored and enriched with facility locations attached. Fully automated, repeatable.
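
One possible starting point for the phone question above is to pull every North American number off a scraped page and drop the toll-free ones by area code. The sketch below is an assumption about how an applicant might approach it, not the client's method. The function name and sample text are made up, and the regex only covers common US formatting. The toll-free prefixes listed (800, 833, 844, 855, 866, 877, 888) are the ones currently in use in the North American Numbering Plan.

```python
import re

# Toll-free area codes in the North American Numbering Plan.
TOLL_FREE_AREA_CODES = {"800", "833", "844", "855", "866", "877", "888"}

# Matches forms such as 1-800-555-0100, (214) 555-0142 and 214.555.0199.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})(?!\d)"
)


def extract_local_numbers(text: str) -> list[str]:
    """Return unique non-toll-free numbers in page order, normalized to (AAA) EEE-LLLL."""
    found: list[str] = []
    for area, exchange, line in PHONE_RE.findall(text):
        if area in TOLL_FREE_AREA_CODES:
            continue  # skip customer-service style toll-free lines
        number = f"({area}) {exchange}-{line}"
        if number not in found:
            found.append(number)
    return found


if __name__ == "__main__":
    sample = "Customer service: 1-800-555-0100. Corporate office: (214) 555-0142"
    print(extract_local_numbers(sample))
```

Running this over the home page, footer, and contact page in turn, and recording which page produced each number, would also give the score a stated reason when a site only lists a toll-free line.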

Posted 4 weeks ago

Lead Generation & Data Enrichment Specialist

Hourly: $20.00 - $30.00
Intermediate
Est. time: 1 to 3 months, Less than 30 hrs/week

Victory Land Sales is a veteran-owned Texas land company that helps hardworking Americans purchase rural land through affordable owner financing. We've sold over 4,000 acres and are growing rapidly. We're looking for a detail-oriented Lead Generation & Data Enrichment Specialist to help us build highly targeted buyer lead lists for our marketing and sales campaigns. Position Overview We need someone who can identify potential land buyers, scrape lead data from multiple sources, enrich contact information, and verify lead quality before delivery. Your work will directly support our sales and marketing team by providing accurate, high-quality buyer leads. Responsibilities Lead Scraping & Research Identify and scrape buyer leads from online sources Build targeted prospect lists based on specific criteria Research potential land buyers and investors Gather relevant demographic and contact information Data Enrichment Find and append: Email addresses Phone numbers Mailing addresses Social media profiles (when available) Business information (when applicable) Data Validation Verify email deliverability Validate phone numbers Remove duplicates Ensure data accuracy and completeness Maintain clean CRM-ready lead lists Database Management Organize leads into spreadsheets or CRM systems Segment lists based on buyer profiles Deliver structured data in requested formats Ideal Candidate You have experience with: Lead generation Web scraping Data mining Contact enrichment List building CRM management

Posted 3 weeks ago

AI Architect: Local RAG & Ingestion MVP

Hourly
Intermediate
Est. time: Less than 1 month, Less than 30 hrs/week

Forum Intelligence: Project Brief & Initial Rollout

1. Executive Summary & Objective
Forum Intelligence is beginning as a localized data retrieval, processing, and archiving system designed to scrape public municipal records and state legislative data for public oversight. The immediate objective is to build a functional, highly resilient prototype focused on the Tri-Cities region (Burbank, Glendale, and Pasadena, California). The system will autonomously ingest messy, unstructured municipal data (City Council meeting minutes, agendas, public notices, legislative PDF text, and recorded mp4 files), clean it, and make it fully searchable and queryable via a localized AI agentic framework.

2. Phase 1 Scope: The Tri-Cities Rollout
The engineer will be responsible for building two primary pillars:

A. Resilient Scraper Bots
• Target Ingestion: Monitor and pull data from Burbank, Glendale, and Pasadena municipal portals and California legislative feeds.
• Data Types: Brittle HTML sites, heavily nested tables, public notices, legislative drafts, and massive unstructured PDF archives.
• Requirements: The scraping architecture must be exceptionally robust, utilizing intelligent error handling, retry semantics, and pagination tracking to handle frequent municipal website layout changes without breaking the pipeline.

B. Ingestion & Vector Pipeline
• Parsing: Extracting clean text from poorly formatted documents and scanned PDFs.
• Local RAG (Retrieval-Augmented Generation): Chunking and embedding the data locally into a vector database (e.g., pgvector, Chroma, or Milvus) to enable semantically accurate entity linking and contextual search.

3. Targeted Hardware Stack
To ensure maximum data security, strict public oversight integrity, and predictable operational costs, Forum Intelligence is skipping commercial cloud APIs in favor of an on-premise, localized NVIDIA enterprise deployment. The production roadmap aligns precisely with the new computing patterns detailed in NVIDIA’s latest hardware roadmap:
• Inference & Token Generation: Running local open-weight frontier models (e.g., Neotron 3 Ultra or Claude/Llama equivalents) optimized for reasoning and long-context tool use.
• Compute & Orchestration: The backend infrastructure is architected around NVIDIA’s dedicated agentic architecture, utilizing high-instructions-per-clock (IPC) Vera CPUs paired with Vera Rubin GPUs.
• Memory & Storage Processing: Utilizing NVIDIA’s unified memory fabric and data processing units (DPUs) for ultra-low latency context management, KV caching, and fast vector database retrieval.

4. Immediate Milestones for the Engineer
1. Architecture Design: Map out the database schema and local inference ingestion loop.
2. Tri-Cities Scraper Deployment: Write and deploy the initial automated bots for Burbank, Glendale, and Pasadena.
3. Local MVP Pipeline: Demonstrate a local RAG pipeline where a user can query the Tri-Cities scraped records and receive grounded answers with exact source attributions.

The above was AI generated from months-long conversations with Gemini. The goal is to prove the concept then roll out to LA County, state of CA, and then the country.
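
The chunking step named in the Ingestion & Vector Pipeline section could look roughly like the sketch below. It is a minimal illustration under stated assumptions: the function name, the word-based splitting, and the default sizes are all hypothetical, and a real pipeline might well split on sentences or tokens instead. The overlap keeps context that straddles a chunk boundary retrievable from either side.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping word windows suitable for embedding."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    minutes = "The council approved the amended agenda and moved to public comment"
    for piece in chunk_words(minutes, chunk_size=5, overlap=2):
        print(piece)
```

Each chunk would then be embedded and stored alongside its source document and position, which is what later makes exact source attribution possible.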

Posted 4 weeks ago

Data Researcher: Systematic Lead Generation for Group Health Insurance Brokers (Zip-by-Zip Search)

Hourly: $6.00 - $18.00
Expert
Est. time: More than 6 months, Less than 30 hrs/week

Job Title: Data Researcher Needed: Systematic Lead Generation for Group Health Insurance Brokers (Zip-by-Zip Search)

Job Description:

Overview: We are seeking a meticulous, systematic Data Research Specialist to help us build a comprehensive, national database of Group Health Insurance Brokerages, state by state. The goal of this project is to prospect target areas systematically, zip code by zip code, to ensure 100% market coverage without missing smaller local firms. This is a highly structured, volume-based prospecting task. Based on a 20-hour work week, we are looking for 500 or more leads a week as a goal. If you have experience with deep-web research, B2B lead generation, and working with strict data formatting in Google Sheets, we want to hear from you.

Key Responsibilities:
• Conduct systematic, zip code-by-zip code research using search engines, local directories, map data, and industry licensing databases to identify group health insurance brokerages.
• Identify and target the Sales Leadership contact at each brokerage (e.g., VP of Sales, Sales Director, Agency Principal).
• Locate accurate, verified direct contact details for those individuals.
• Input data perfectly into a structured Excel sheet with zero formatting errors.

*** A link to a shared Excel document will be provided. This is where all data will be populated.

Required Data Points (Template Provided Below): For every single brokerage identified, you must collect:
1. Agency/Brokerage Name
2. First Name of Sales Leadership Contact
3. Last Name of Sales Leadership Contact
4. Verified Business Email Address
5. Office Phone Number
6. Agency Website URL
7. Zip Code
8. City
9. State

Project Requirements & Skills:
• Absolute Accuracy: High-quality, verified data only. Generic info emails (e.g., info@, sales@) should be a last resort; we heavily prioritize direct, personal executive email addresses.
• Systematic Approach: Ability to strictly follow a provided list of zip codes and check off areas as they are cleared.
• Tool Proficiency: Experience with lead generation and verification tools (e.g., LinkedIn Sales Navigator, Hunter.io, Apollo, NeverBounce, or similar) is a massive plus.
• Communication: Responsive, capable of providing daily or milestone updates on Google Sheets.

Posted 3 weeks ago

Senior Python/FastAPI Developer for Real Estate Public Records Platform

Hourly: $30.00 - $60.00
Expert
Est. time: 3 to 6 months, 30+ hrs/week

We are building an early-stage real estate data platform that collects, cleans, enriches, and serves public-record and legal-notice data for real estate investors and professionals. This is not a greenfield build. We already have an existing backend repo with API routes, database models, migrations, scraping workers, tests, Docker configuration, and cloud deployment pieces. We need a strong backend engineer who can step into the existing system, understand what is working, identify what is risky, and help us get the backend stable enough for launch. The right person is practical, scrappy, and comfortable working in a startup environment where the goal is not perfection. The goal is to find the highest-leverage path to a reliable product. The platform involves: -Public-record and legal-notice data -Property data enrichment -API endpoints used by a frontend application -Data quality, reliability, and launch-readiness Current Backend Stack The backend is built primarily in Python and includes: -FastAPI -SQLAlchemy and Alembic -Postgres / Google Cloud SQL -MongoDB helper/caching layer -Scraping and ETL pipeline for public-record and legal-notice data -Playwright/Patchright-based scraping -reCAPTCHA-aware scraping workflows -LLM-based data extraction / AI-assisted parsing of unstructured notice data -Pydantic models -Google Cloud integrations: Cloud Run, Cloud Scheduler, Pub/Sub, Secret Manager, Cloud Storage, Artifact Registry -Docker -Pulumi infrastructure-as-code -GitHub Actions CI/CD -pytest, Ruff, uv You do not need to be world-class in every tool listed above, but you should be strong enough in Python backend systems, scraping/data pipelines, and cloud deployment to quickly understand the architecture and make sound technical decisions. 
What We Need Help With We need someone who can: -Review and understand the current backend architecture -Stabilize and improve the scraping / ETL pipeline for public-record and legal-notice data -Make sure public-record and legal-notice data is collected, parsed, stored, and served correctly -Improve backend APIs used by the frontend -Improve data quality checks for incomplete, missing, or inconsistent property records -Build and maintain property enrichment workflows using external data sources -Help design database models for richer property history and event tracking -Improve LLM-assisted parsing of unstructured legal notice data where appropriate -Debug deployment, CI/CD, Cloud Run, and infrastructure issues -Improve logging, error handling, monitoring, and observability -Strengthen test coverage where it matters -Help document the backend so future developers can contribute -Coordinate with our frontend developer to support product launch -Help prioritize backend work based on launch impact, data reliability, and technical risk Who This Is For You are likely a strong fit if you: -Like working inside existing codebases -Can diagnose messy systems without needing everything rewritten -Think in practical tradeoffs, not just ideal architecture -Are comfortable with incomplete documentation -Have experience with scraping/ETL workflows and unstructured data extraction -Can explain technical risks clearly to a non-technical founder -Prefer shipping useful improvements over debating perfect abstractions -Are willing to own outcomes, not just complete assigned tickets Who This Is Not For This is probably not the right fit if you: -Only want clean, fully documented codebases -Prefer to rebuild from scratch by default -Need enterprise-level process before making progress -Are an agency sending rotating developers -Only want tightly defined tickets with no ambiguity -Are uncomfortable with scraping, data quality, or production debugging Hiring Process We want to keep the 
hiring process practical and focused on real work. 1. Initial Screening We will review your proposal, background, and screening question responses. 2. Real-World Technical Scenario Strong candidates may be asked to respond to a specific backend issue from our current roadmap. We are looking for how you think, what tradeoffs you notice, and how clearly you communicate. 3. Paid Finalist Review A small number of finalists may be invited to complete a paid review of the existing backend codebase before any larger implementation work begins. Budget / Working Style We are an early-stage company and are looking for a practical, startup-minded developer. This is a paid contract role, but we are not looking for enterprise-agency rates. We value clear communication, efficient execution, and someone who can help us prioritize the highest-leverage backend work first. The first paid technical review may be structured as a fixed-price milestone. Continued implementation work may be hourly or milestone-based depending on fit. Long-Term Opportunity Our goal is to find someone who can become a long-term backend partner for the product, not just complete isolated tickets. For the right person, there may be an opportunity to grow into a technical lead / backend ownership role with additional upside tied to company performance. We are looking for someone who wants to help take a real product to market, but the initial engagement will be paid, scoped clearly, and focused on proving mutual fit.

Posted 4 days ago

Developer/Technical Lead for Start-Up

Hourly: $45.00 - $70.00
Intermediate
Est. time: More than 6 months, 30+ hrs/week

Developer Scope of Work Project Overview & Engagement Terms Domexa Labs for MyCondoCompliance (mycondocompliance.com). MyCondoCompliance is an enterprise and consumer-facing web platform built to aggregate, OCR, analyze, and report on condominium association compliance data throughout Florida (starting with Miami-Dade County). 1. Key Engagement Expectations: Dedicated Weekly Support: We require reliable, continuous development capacity week-over-week to support platform growth, new features, maintenance, and internal system updates. Flexible Monthly Hours: Hours will flex on a month-to-month basis depending on business priorities, product release cycles, and current backlogs. Minimum 2-3 hours/week, not to exceed 15hrs/week. Rapid Turnaround & Steady Communication: We operate in a fast-paced environment. Quick turnarounds on hotfixes, active updates on tasks, and daily/structured communication are critical. Language Requirement: Excellent, professional verbal and written English is a strict requirement for technical syncs, documentation, and coordination. 2. Technical Infrastructure & DevOps Architecture The MVP is complete, live, and deployed. You will inherit the following technical ecosystem: Infrastructure Stack: - Code Repository: Managed via GitHub. - Front-end Hosting: Deployed and managed on Netlify. - CI/CD: Automatically triggers deployment to production on master branch updates, and to staging/dev on dev branch updates. - Back-end Hosting & Infrastructure: Managed on Digital Ocean inside a Kubernetes environment. - DNS Administration: Managed on Digital Ocean. - Third-Party API Integrations: - Mapbox: Powers map-based search and property discovery. - Mailgun: Handles transactional email delivery. - Chatbase: Integrated for natural language querying and chat functionality. - TipTap: Rich text editor powering board notes and internal editing. 3. 
Scope Evolution & Core Pipelines As our incoming developer, you will be expected to maintain, debug, and expand upon the core features built during our initial execution phases. A. Data Pipelines & OCR Ingestion Engine - Website Scraper/ETL: Continuous ingestion pipelines that pull structured condo data and metadata from county public registers. - Normalization Engine: Ingestion pipeline that categorizes incoming unstructured documents into strict schemas - OCR & Vectorization: All ingested documents are automatically processed via an OCR layer, and the resulting plaintext is indexed into a vector database for semantic search and Retrieval-Augmented Generation (RAG). B. Autonomous AI Processing Agents We run specialized Python/Node microservices to process aggregated document metadata: - Granular Extraction: AI agents systematically query vector databases to extract critical datapoints - Audit Trails & Provenance: Each extracted datapoint must carry verification properties—linking back directly to the document source, specific page/snippet, and extraction timestamp. C. Portal Tiering & Client Features - Consumer Interface: Detailed property pages, dynamic scoring components, PDF report compilation and downloads. - Enterprise Interface: Multi-tenant web app allowing real estate, financial, and legal clients to access deep search, structured list filtering (e.g., filtering condos by unit counts, reserve posture, specific clauses such as "Kauffman language", and termination criteria), and batch export controls. - Admin Dashboard: Tracks user engagement metrics, domain lookups, purchase histories, and mailing list extractions.
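
The Audit Trails & Provenance requirement above (document source, page/snippet, and extraction timestamp on every datapoint) could be modeled along these lines. This is a sketch under assumptions, not the platform's actual schema: the class, field names, and sample values are hypothetical.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ExtractedDatapoint:
    """One extracted value plus the provenance needed to audit it."""
    field_name: str       # hypothetical field label, e.g. "unit_count"
    value: str
    source_document: str  # identifier or path of the source document
    page: int             # 1-based page number within that document
    snippet: str          # the text span the value was read from
    extracted_at: str     # ISO 8601 timestamp in UTC


def make_datapoint(field_name: str, value: str, source_document: str,
                   page: int, snippet: str,
                   now: Optional[datetime] = None) -> ExtractedDatapoint:
    """Build a datapoint, refusing records that could not be traced back later."""
    if page < 1:
        raise ValueError("page numbers are 1-based")
    if not snippet.strip():
        raise ValueError("a datapoint without a supporting snippet cannot be audited")
    timestamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return ExtractedDatapoint(field_name, value, source_document,
                              page, snippet, timestamp.isoformat())


if __name__ == "__main__":
    point = make_datapoint("unit_count", "120", "declaration.pdf", 3,
                           "The condominium consists of 120 units.")
    print(asdict(point))
```

Making the record immutable and rejecting empty snippets at construction time means a value with no traceable source never reaches the database in the first place.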

Posted 2 months ago

AI-Powered Real Estate Website Growth & Automation Specialist (SEO, PPC, CRM, AI Integrations)

Fixed price
Intermediate
Est. budget: $500.00

Overview Curiel Homes is looking for a highly skilled freelance specialist to help transform our website into a true lead conversion machine. We are a modern real estate and mortgage brand focused on growth, automation, local authority, and scalable systems. We want someone who understands: Real estate lead generation SEO and local search strategy AI integrations and automation Google PPC CRM workflows and follow-up systems Website performance optimization Conversion-focused design This is NOT just a website design project. We are looking for someone strategic who can help build long-term systems and automation that drive leads and conversions. Current Goals We want to improve and/or implement: Website + SEO Neighborhood-focused SEO landing pages Local authority content strategy Faster mobile page speed ADA/screen reader optimization Structured blog strategy based on real search queries Mortgage and down payment assistance pages Google Business Profile integration AI + Automation AI chat follow-up and lead nurturing CRM automation and workflow optimization Retargeting audience setup Review scraping/integration tools (ex: Birdeye.ai) AI-assisted content generation Smart lead capture systems Data + Content Add/edit videos throughout website Fed-related market data integrations Content hub/blog strategy Automated market insights and reporting ideas Ideal Candidate We are looking for someone who has: Experience working with lead-gen businesses Strong understanding of SEO and technical SEO Experience with Google PPC Experience with AI tools and automation Experience integrating CRMs and lead funnels Strong communication and strategic thinking Ability to recommend scalable systems Portfolio of websites or funnels that generated measurable results Bonus if you have experience with: IDX websites Sierra Interactive, Follow Up Boss, KVCore, or similar CRMs AI chat systems Local SEO domination strategies Conversion rate optimization What We Need From You When applying, please 
include: AI/automation experience What platforms/tools you recommend Your approach to improving conversion rates Estimated timeline Your preferred pricing structure Project Scope We are open to: Hourly consulting Fixed project pricing Ongoing monthly partnership Potential for long-term work if it is a strong fit.
