AI Engineer / RAG Pipeline Developer for Compliance Law Management Information System

Posted last week

Worldwide

Summary

Key Responsibilities

You will be responsible for building an end-to-end pipeline including:

1. Data Collection & Crawling
- Design and implement web crawling pipelines for legal/compliance sources
- Extract structured and unstructured legal content from websites and portals
- Ensure compliance with robots.txt and legal scraping constraints

2. Document Processing (PDF + Text)
- Build robust PDF parsing and extraction pipeline using tools like Docling
- Handle complex legal documents (tables, footnotes, multi-column layouts)
- Clean, normalize, and structure extracted content for downstream AI use

3. RAG Pipeline Development
- Design and implement Retrieval-Augmented Generation architecture
- Chunking strategies optimized for legal/compliance context
- Embedding generation and metadata enrichment
- Query understanding and response synthesis using LLMs

4. Vector Database (Pinecone)
- Set up and optimize Pinecone vector database
- Design indexing schema (metadata, filters, namespaces)
- Optimize retrieval speed and accuracy
- Implement hybrid search if needed (keyword + vector)

5. AI/LLM Integration
- Integrate LLMs (OpenAI / open-source models)
- Build prompt engineering for compliance/legal reasoning
- Ensure traceability and citation-backed responses

Required Skills
- Strong experience building RAG systems in production
- Hands-on experience with Pinecone or other vector databases
- Experience with PDF parsing tools (Docling, PyMuPDF, Unstructured, etc.)
- Strong Python backend development skills
- Experience with web scraping/crawling frameworks (Scrapy, Playwright, etc.)
- Familiarity with LLM APIs (OpenAI, Anthropic, or open-source models)
- Understanding of embeddings, vector search, and semantic retrieval
- Experience handling large-scale document pipelines

Nice to Have
- Experience with legal tech or compliance systems
- Knowledge of information retrieval / NLP
- Experience with LangChain, LlamaIndex, or similar frameworks
- Cloud deployment (AWS/GCP/Azure)
- Docker / Kubernetes experience

Deliverables
- Fully functional ingestion + crawling pipeline
- PDF processing system using Docling or equivalent
- Pinecone vector database setup with optimized schema
- Working RAG system with API endpoints
- Documentation of architecture and setup
- Optional: simple UI for testing queries

Project Type
- Short-term MVP with potential for long-term extension
- Possibility of ongoing development and scaling

How to Apply

Please include:
- Relevant experience building RAG systems
- Examples of similar AI or document intelligence projects
- Your preferred stack for RAG pipelines
- Any experience with legal/compliance data systems

  • Less than 30 hrs/week
    Hourly
  • 1-3 months
    Duration
  • Expert
    Experience Level
  • $10.00 - $40.00
    Hourly
  • Remote Job
  • Complex project
    Project Type

Contract-to-hire opportunity

This lets talent know that this job could become full time.
Skills and Expertise
Mandatory skills
Database Architecture
Activity on this job
  • Proposals: 50+
  • Last viewed by client: 2 days ago
  • Hires:
    1
  • Interviewing:
    0
  • Invites sent:
    0
  • Unanswered invites:
    0
About the client
Member since Aug 13, 2024
  • United States
    Columbus 3:12 AM
  • $984 total spent
    5 hires, 1 active
  • 35 hours
