Add Local AI to App

Posted yesterday

Worldwide

Summary

NLP Engineer: Natural-Language Q&A / Semantic Search Engine (Self-Hosted, No LLM API)

OVERVIEW

I need a natural-language question-answering and semantic search engine built on top of a structured dataset I'll provide, integrated into my app's existing search. A user types a plain-English question and gets a direct, grounded answer plus ranked supporting results.

Hard constraint: this must NOT call any generative LLM API (OpenAI, Anthropic, Google, Cohere, etc.) and must not depend on any paid per-query AI service. The system runs on self-hosted / local models only.

To be clear about what "no LLM" means here: embedding models, classifiers, NER, and extractive QA models are expected and fine; I use neural embedding models already. What I do not want is a generative chat LLM (or an API wrapper around one) producing the answers.

WHAT YOU'LL BUILD

- A natural-language query endpoint (FastAPI) that takes a question and returns a direct answer + ranked supporting matches
- Query understanding: intent classification + named entity recognition (entity names, identifiers, dates, categories, and other fields relevant to the dataset)
- Dense vector retrieval over a knowledge base using Qdrant (I already run Qdrant), with optional hybrid keyword/BM25 search
- Answering via extractive QA (span extraction from retrieved passages) and/or deterministic template-filled answers, NOT generation
- Cross-encoder reranking for precision
- An ingestion + embedding pipeline that builds the index from a structured dataset I provide

Coverage is defined by the connected data. The system answers factual questions grounded in the dataset: attribute lookups ("what is the [attribute] of [entity]?"), relationship queries ("what is linked to [entity]?"), and filtered lists ("which records match [criteria]?"), including numeric/value fields if included. It is not expected to do open-ended reasoning or opinion; that's the point of avoiding a generative LLM.
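
As an illustration of the "deterministic template-filled answers" path for attribute lookups described above, here is a minimal sketch. Everything in it is an assumption made for illustration: the `RECORDS` toy data, the regex, and the function name `answer_attribute_question` are hypothetical. In the real system the regex would presumably be replaced by the intent classifier and NER step, and the lookup would go against PostgreSQL or Qdrant payloads rather than an in-memory dict.

```python
import re

# Hypothetical toy records standing in for the structured dataset.
RECORDS = {
    "acme widget": {"name": "Acme Widget", "price": "19.99 USD", "category": "hardware"},
    "bolt kit": {"name": "Bolt Kit", "price": "4.50 USD", "category": "fasteners"},
}

# Assumed question shape: "what is the <attribute> of <entity>?"
# A trained intent classifier + NER model would replace this regex in practice.
ATTRIBUTE_PATTERN = re.compile(
    r"^what is the (?P<attribute>[\w ]+?) of (?P<entity>.+?)\??$",
    re.IGNORECASE,
)


def answer_attribute_question(question, records=RECORDS):
    """Return a template-filled answer, or None if the question is not covered."""
    match = ATTRIBUTE_PATTERN.match(question.strip())
    if match is None:
        return None  # not an attribute lookup; another handler would take over
    attribute = match.group("attribute").strip().lower()
    entity_key = match.group("entity").strip().lower()
    record = records.get(entity_key)
    if record is None or attribute not in record:
        return None  # no grounded answer available, so do not guess
    # The answer text comes only from the record, never from a generator.
    return f"The {attribute} of {record['name']} is {record[attribute]}."


if __name__ == "__main__":
    print(answer_attribute_question("What is the price of Acme Widget?"))
```

Returning `None` instead of a guess when the entity or attribute is missing keeps every answer grounded in the dataset, which matches the "coverage is defined by the connected data" requirement.
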
TECH STACK (must integrate with)

- Python, FastAPI
- Qdrant (existing instance)
- sentence-transformers / Hugging Face models
- PostgreSQL
- Containerized deploy (Docker; bonus if you've used Modal serverless GPU)

DELIVERABLES

- Working FastAPI service with documented endpoints
- Reproducible ingestion + embedding pipeline (rebuild the index from raw data with one command)
- Written rationale for model choices (embedding model, QA model, reranker)
- Test suite + evaluation results on a held-out question set I provide
- Dockerfile / deployment instructions
- Short handoff doc so my team can maintain and extend it

MUST NOT

- Call any external generative LLM API or paid per-query AI service
- Ship a black box; code must be readable, documented, and maintainable
- Hard-code answers; the system must generalize from the dataset

ACCEPTANCE CRITERIA

- Answers factual questions grounded in the dataset with [suggested: ≥85%] accuracy on a provided test set
- Query latency [suggested: 300ms p95] on a single GPU (or CPU target if we agree on one)
- Clean integration with my existing FastAPI + Qdrant setup
- Index rebuildable from scratch by me, following your docs

REQUIRED SKILLS

- NLP, information retrieval, semantic search
- sentence-transformers, text embeddings, vector databases (Qdrant / FAISS / Weaviate)
- Extractive QA (fine-tuning or applying BERT / RoBERTa / DistilBERT for span extraction)
- Intent classification and named entity recognition
- Python, FastAPI, Docker

NICE TO HAVE

- Prior domain-specific QA / knowledge-base retrieval work
- Hybrid search (BM25 + dense; Elasticsearch / OpenSearch)
- Cross-encoder reranking experience

TO APPLY: answer both (I'm filtering out anyone who'll just wrap an LLM):

1. Without using a generative LLM (GPT / Claude / Gemini), how would you build a system that answers a factual question like "what is the [attribute] of [a given entity]?" from a structured dataset? Name the retrieval approach and the specific models you'd use and why.
2. Link a retrieval, QA, or semantic-search system you've built. What embedding model and (if any) QA/reranking models did you use, and how did you evaluate
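
The acceptance criteria above (accuracy on a held-out test set and p95 query latency) could be checked with a small harness along these lines. This is a sketch under stated assumptions, not the client's evaluation method: the names `percentile` and `evaluate` are hypothetical, accuracy is assumed to mean case-insensitive exact match, and p95 is computed with the nearest-rank method on wall-clock time per call.

```python
import math
import time


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


def evaluate(answer_fn, test_set):
    """Score answer_fn on (question, expected_answer) pairs.

    Assumption: an answer counts as correct on a case-insensitive exact
    match; the real scoring rule would come from the provided test set.
    """
    correct = 0
    latencies_ms = []
    for question, expected in test_set:
        start = time.perf_counter()
        predicted = answer_fn(question)
        latencies_ms.append((time.perf_counter() - start) * 1000)
        if predicted is not None and predicted.strip().lower() == expected.strip().lower():
            correct += 1
    return {
        "accuracy": correct / len(test_set),
        "p95_latency_ms": percentile(latencies_ms, 95),
    }


if __name__ == "__main__":
    # Stub answerer standing in for the real query endpoint.
    canned = {"q1": "a", "q2": "b", "q3": "wrong", "q4": "d"}
    report = evaluate(canned.get, [("q1", "a"), ("q2", "b"), ("q3", "c"), ("q4", "d")])
    print(report["accuracy"], report["p95_latency_ms"])
```

The resulting numbers can then be compared against whatever thresholds are agreed on, for example the suggested accuracy and 300 ms p95 figures in the listing.
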

  • $500.00

    Fixed-price
  • Intermediate
    Experience Level
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
iOS Development
Android App Development
Nice-to-have skills
Mobile App Development
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: 18 hours ago
  • Interviewing:
    2
  • Invites sent:
    1
  • Unanswered invites:
    0
About the client
Member since May 29, 2025
  • USA
    Inver Grove Heights 4:50 PM
  • $9.1K total spent
    4 hires, 3 active
  • Tech & IT
    Individual client
