Add Local AI to App
Worldwide
NLP Engineer: Natural-Language Q&A / Semantic Search Engine (Self-Hosted, No LLM API)

OVERVIEW
I need a natural-language question-answering and semantic search engine built on top of a structured dataset I'll provide, integrated into my app's existing search. A user types a plain-English question and gets a direct, grounded answer plus ranked supporting results.

Hard constraint: this must NOT call any generative LLM API (OpenAI, Anthropic, Google, Cohere, etc.) and must not depend on any paid per-query AI service. The system runs on self-hosted / local models only.

To be clear about what "no LLM" means here: embedding models, classifiers, NER, and extractive QA models are expected and fine; I use neural embedding models already. What I do not want is a generative chat LLM (or an API wrapper around one) producing the answers.

WHAT YOU'LL BUILD
- A natural-language query endpoint (FastAPI) that takes a question and returns a direct answer + ranked supporting matches
- Query understanding: intent classification + named entity recognition (entity names, identifiers, dates, categories, and other fields relevant to the dataset)
- Dense vector retrieval over a knowledge base using Qdrant (I already run Qdrant), with optional hybrid keyword/BM25 search
- Answering via extractive QA (span extraction from retrieved passages) and/or deterministic template-filled answers, NOT generation
- Cross-encoder reranking for precision
- An ingestion + embedding pipeline that builds the index from a structured dataset I provide

Coverage is defined by the connected data. The system answers factual questions grounded in the dataset: attribute lookups ("what is the [attribute] of [entity]?"), relationship queries ("what is linked to [entity]?"), and filtered lists ("which records match [criteria]?"), including numeric/value fields if included. It is not expected to do open-ended reasoning or offer opinions; that's the point of avoiding a generative LLM.
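
To illustrate the kind of deterministic, template-filled answering described above, here is a minimal sketch for the attribute-lookup case. Everything in it is a hypothetical placeholder (the toy `RECORDS`, the regex, and the `answer_attribute_question` name are not from my dataset or codebase). In the real system, intent classification and NER would presumably replace the regex, and retrieval from Qdrant / PostgreSQL would replace the dictionary lookup.

```python
import re
from typing import Optional

# Hypothetical toy records standing in for the structured dataset.
RECORDS = {
    "acme widget": {"weight": "2.5 kg", "category": "hardware"},
    "beta gadget": {"weight": "0.8 kg", "category": "electronics"},
}

# Matches the attribute-lookup shape: "what is the <attribute> of <entity>?"
ATTRIBUTE_QUESTION = re.compile(
    r"^\s*what\s+is\s+the\s+(?P<attribute>.+?)\s+of\s+(?P<entity>.+?)\s*\??\s*$",
    re.IGNORECASE,
)


def answer_attribute_question(question: str, records: dict) -> Optional[str]:
    """Return a template-filled answer, or None if the question is not covered."""
    match = ATTRIBUTE_QUESTION.match(question)
    if match is None:
        return None  # not an attribute lookup; another handler would take over

    attribute = match.group("attribute").lower()
    entity = match.group("entity")

    record = records.get(entity.lower())
    if record is None or attribute not in record:
        return None  # no grounded value in the data, so no answer is produced

    # The answer is filled from the stored value only; nothing is generated.
    return f"The {attribute} of {entity} is {record[attribute]}."


print(answer_attribute_question("What is the weight of Acme Widget?", RECORDS))
```

Returning `None` when the data has no matching value keeps answers grounded: the service can fall back to ranked search results instead of guessing.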

TECH STACK (must integrate with)
- Python, FastAPI
- Qdrant (existing instance)
- sentence-transformers / Hugging Face models
- PostgreSQL
- Containerized deploy (Docker; bonus if you've used Modal serverless GPU)

DELIVERABLES
- Working FastAPI service with documented endpoints
- Reproducible ingestion + embedding pipeline (rebuild the index from raw data with one command)
- Written rationale for model choices (embedding model, QA model, reranker)
- Test suite + evaluation results on a held-out question set I provide
- Dockerfile / deployment instructions
- Short handoff doc so my team can maintain and extend it

MUST NOT
- Call any external generative LLM API or paid per-query AI service
- Ship a black box: code must be readable, documented, and maintainable
- Hard-code answers; the system must generalize from the dataset

ACCEPTANCE CRITERIA
- Answers factual questions grounded in the dataset with [suggested: ≥85%] accuracy on a provided test set
- Query latency [suggested: 300ms p95] on a single GPU (or CPU target if we agree on one)
- Clean integration with my existing FastAPI + Qdrant setup
- Index rebuildable from scratch by me, following your docs

REQUIRED SKILLS
- NLP, information retrieval, semantic search
- sentence-transformers, text embeddings, vector databases (Qdrant / FAISS / Weaviate)
- Extractive QA (fine-tuning or applying BERT / RoBERTa / DistilBERT for span extraction)
- Intent classification and named entity recognition
- Python, FastAPI, Docker

NICE TO HAVE
- Prior domain-specific QA / knowledge-base retrieval work
- Hybrid search (BM25 + dense; Elasticsearch / OpenSearch)
- Cross-encoder reranking experience

TO APPLY
Answer both (I'm filtering out anyone who'll just wrap an LLM):
1. Without using a generative LLM (GPT / Claude / Gemini), how would you build a system that answers a factual question like "what is the [attribute] of [a given entity]?" from a structured dataset? Name the retrieval approach and the specific models you'd use and why.
2. Link a retrieval, QA, or semantic-search system you've built. What embedding model and (if any) QA/reranking models did you use, and how did you evaluate

$500.00, Fixed-price
- Experience Level: Intermediate
- Remote Job
- Project Type: Ongoing project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 18 hours ago
- Interviewing: 2
- Invites sent: 1
- Unanswered invites: 0
About the client
- USA, Inver Grove Heights, 4:50 PM
- $9.1K total spent; 4 hires, 3 active
- Tech & IT; Individual client