AI Engineer — Autonomous Research & Signal Engine
Worldwide
## About the work

We run a B2B program that researches genuine, off-work signals about specific people (their interests, activities, and community involvement) from public sources, and uses those signals to design real, personalized experiences. We have a working manual pipeline and a proven methodology. We now want to package it as an **internal engine** our team can run repeatably and reliably.

This is a **backend build: no frontend, no UI work.** The engine will be exposed as an MCP server (and a plain REST API underneath) so our team can call it directly from their existing AI tooling.

---

## What you'll build (v1)

- A **person-research agent** that resolves a named contact to a single verified identity across the public web and reliably collects their public professional and activity footprint.
- A **signal-report generator** that extracts off-work interests, each tied to a specific source URL and a calibrated confidence rating.
- An **experience-recommendation** step that maps a verified signal to a real, locally-reachable, costed option.
- Persistence: a database that stores contacts, signals, sources, and confidence state so results are repeatable.
- An **MCP server** exposing clean tools (e.g. `research_person`, `get_signal_report`, `source_experience`) plus a REST API.

Reliably collecting and structuring public data from major professional networks is central to this build. In your proposal, tell us how you'd approach this technically and what reliability and compliance risks you'd manage.

---

(Our specific sourcing and confidence rules are shared under NDA after an initial screen.)

---

## Must have

- **Strong Python**: agent orchestration, data pipelines, research logic.
- **LLM agent experience**: hands-on with the Anthropic and/or OpenAI APIs, tool/function calling, and structured JSON output. Raw API fluency matters more to us than framework name-dropping.
- **Reliable public-web data collection**: httpx/requests, HTML parsing, and headless browsers (Playwright preferred), holding up against anti-bot measures and rate limits.
- **Async Python** (asyncio): for concurrent, performant agent runs.
- **Grounded generation / hallucination mitigation**: source attribution, validation, and ideally eval harnesses.

## Strongly preferred

- Experience **building MCP servers**.
- **FastAPI** and **Postgres**.
- Proxy / rate-limit handling for dependable data collection.

## Nice to have

- Identity-resolution / entity-matching across disparate sources.
- PII-handling and data-protection awareness (GDPR/CCPA).

---

## How to apply

1. Share **1–2 relevant systems** you've built: agentic research, enrichment, or grounded generation.
2. In **3–4 sentences**: how would you reliably collect and interpret public profile/activity data, and what are the top risks you'd manage?
3. Note whether you've built MCP servers, and with which stack.
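
For illustration only, the signal-report requirement under "What you'll build (v1)" (each interest tied to a specific source URL and a confidence rating) could be sketched in plain Python roughly as follows. The class names, the 0.0 to 1.0 confidence scale, and the `min_confidence` threshold are assumptions made for this sketch; the actual sourcing and confidence rules are shared under NDA and may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Signal:
    """One off-work interest tied to the public source it came from."""
    interest: str
    source_url: str
    confidence: float  # assumed 0.0-1.0 scale; the real rules are not public


@dataclass
class Contact:
    """A resolved contact plus the signals collected for them so far."""
    name: str
    signals: list[Signal] = field(default_factory=list)


def get_signal_report(contact: Contact, min_confidence: float = 0.5) -> dict:
    """Build a JSON-ready report, keeping only attributed, confident signals."""
    kept = [
        s for s in contact.signals
        # Drop anything without a usable source URL (no attribution, no signal).
        if s.source_url.startswith(("http://", "https://"))
        and s.confidence >= min_confidence
    ]
    # Highest-confidence signals first.
    kept.sort(key=lambda s: s.confidence, reverse=True)
    return {
        "contact": contact.name,
        "signals": [
            {
                "interest": s.interest,
                "source_url": s.source_url,
                "confidence": s.confidence,
            }
            for s in kept
        ],
    }
```

A report built this way never contains an interest without a source URL, which is one simple way to enforce the source-attribution requirement before any LLM output reaches the database or the MCP tool response.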
- Hours: more than 30 hrs/week
- Duration: 1-3 months
- Experience level: Intermediate
- Rate: $15.00 - $30.00 hourly
- Remote job
- Project type: Complex project
Activity on this job
- Proposals: 50+
- Interviewing: 0
- Invites sent: 0
- Unanswered invites: 0
About the client
- United States, Louisville (11:13 AM)
- $20K total spent; 16 hires, 7 active
- 451 hours
- Sales & Marketing; small company (2-9 people)