Senior Search Architect (Lucene / Elasticsearch / OpenSearch / Algolia) – 200 Million Resumes

Posted 3 days ago

Worldwide

Summary

# Senior Search Architect (Lucene / Elasticsearch / OpenSearch / Vector Search) – Build a 200M Candidate Search Platform

# IMPORTANT – Please Read Before Applying

## We are building a search platform comparable to:

* **Juicebox.ai**
* **Exa.ai**
* **LinkedIn Recruiter**
* **LinkedIn Sales Navigator**

If you have built search platforms with similar complexity and scale, we would like to speak with you.

**Please do NOT apply if you are planning to learn these technologies during this project.**

We are specifically looking for someone who has already designed and built large-scale production search systems and can demonstrate previous work.

---

# Overview

We are building a next-generation AI-powered recruiting platform capable of searching **200 million resumes and LinkedIn profiles**.

This is **not** a standard Elasticsearch implementation.

We need an experienced Search Architect who has built production systems capable of:

* Hundreds of millions of searchable documents
* Complex recruiter-style faceted search
* Sub-second response times
* Extremely fast facet counts
* High relevance ranking
* Horizontal scalability

Our goal is to create the best recruiter search platform in the industry.

---

# About the Data

We currently have approximately **200 million resumes and LinkedIn profiles**, all stored in JSON format.

The platform should support continuous ingestion and updates while maintaining excellent search performance.

---

# Responsibilities

Design the overall search architecture and recommend the best technology stack.

Responsibilities include:

* Search architecture
* Data modeling
* Index design
* Ranking strategy
* Relevance tuning
* Faceted search
* Distributed indexing
* Cluster architecture
* Performance optimization
* Capacity planning
* Scalability planning

---

# Search Features

The search experience should feel similar to LinkedIn Recruiter and Juicebox.ai.
Recruiters should be able to search using combinations of:

## Companies

* Current company
* Previous companies
* Multiple companies
* Include companies
* Exclude companies

## Job Titles

* Current title
* Previous titles
* Multiple titles
* Include
* Exclude

## Skills

* Multiple skills
* Required skills
* Excluded skills
* Skill normalization
* Synonyms

## Location

* Country
* State
* City
* ZIP Code
* Radius search (5–100 miles)
* Multiple cities
* Multiple states

## Education

* Schools
* Universities
* Ivy League filter
* Degrees
* Multiple schools

## Experience

* Total years of experience
* Years at current company
* Years in industry
* Years in specific role

## Additional Filters

* Industry
* Certifications
* Security clearance
* Employment type
* Languages
* Work authorization
* Remote / Hybrid / Onsite
* Seniority
* Management level

Most filters should support:

* Include
* Exclude
* Multiple values
* Boolean AND / OR logic
* Phrase search
* Exact search
* Prefix search
* Fuzzy matching
* Synonyms

---

# Search Performance Requirements

The platform should support:

* Sub-second search response
* Sub-second facet counts
* Fast aggregations
* Fast autocomplete
* Instant suggestions
* Efficient deep pagination
* Millions of searches per day
* Horizontal scalability
* High availability
* Fault tolerance

---

# AI & Semantic Search

We intend to use open-source LLMs to enrich resume data before indexing.

Possible enrichment includes:

* Skill extraction
* Skill normalization
* Company normalization
* Industry classification
* Job title normalization
* Resume classification
* Candidate summarization

We are also open to implementing:

* Vector embeddings
* Semantic search
* Hybrid keyword + vector search
* Retrieval-augmented ranking

We are interested in your recommendations based on production experience.

---

# Technology

We are **not committed to any specific technology stack**.
We are open to using whichever technologies provide the best scalability, relevance, maintainability, and performance.

Examples include:

### Search

* Lucene
* Elasticsearch
* OpenSearch
* Algolia
* Solr
* Vespa
* Typesense

### Vector Search

* Weaviate
* Qdrant
* Milvus
* Pinecone
* pgvector
* Redis Search
* FAISS
* HNSW

### Databases

We are open to using any database architecture if it makes sense for the solution, including:

* PostgreSQL
* MongoDB
* ClickHouse
* Cassandra
* ScyllaDB
* DynamoDB
* Neo4j
* Other specialized databases

### APIs

Experience with the following is highly desirable:

* GraphQL
* REST APIs

### Graph Technologies

Knowledge of the following is a significant plus for future recruiter relationship intelligence features:

* Neo4j
* Graph databases
* Relationship graphs

---

# What We're Looking For

You should have real-world experience with:

* 100M+ indexed documents
* Elasticsearch/OpenSearch clusters
* Lucene internals
* Distributed indexing
* Large-scale faceted search
* Search relevance tuning
* Custom analyzers
* Synonyms
* Ranking algorithms
* Aggregations
* Performance optimization
* Capacity planning
* High-volume production systems

Experience building recruiting, staffing, ATS, CRM, people search, enterprise search, e-commerce, or internet-scale search products is highly preferred.

---

# Mandatory Requirements

Please **do not apply** unless you have built a comparable production search platform.

Your proposal should include:

* The largest search system you have built
* Approximate number of indexed documents
* Number of daily searches
* Search technology used
* Database(s) used
* Whether embeddings/vector search were used
* Cluster size
* Average query latency
* Average facet latency
* Your role on the project
* Team size

---

# Interview Requirements

You must be able to demonstrate previous work.
Examples include:

* Live demo
* Recorded demo
* Screenshots
* Architecture diagrams
* Code walkthrough
* Performance dashboards

You should be comfortable discussing:

* Architecture decisions
* Indexing strategy
* Ranking
* Relevance tuning
* Scaling strategy
* Search optimization
* Cluster management
* Performance bottlenecks

---

# Please Do Not Apply If

* You have only built small Elasticsearch projects.
* You have only taken Elasticsearch courses.
* You have never managed a production search cluster.
* You plan to learn these technologies while working on this project.
* You cannot demonstrate similar previous work.

---

# Long-Term Opportunity

This is expected to become a long-term technical leadership role.

We are looking for someone who can own the search architecture, mentor our engineering team, and help us build one of the world's most capable AI-powered recruiting search platforms.

If you have built search systems comparable to **Juicebox.ai, Exa.ai, LinkedIn Recruiter, LinkedIn Sales Navigator, HireEZ, SeekOut, ZoomInfo, Apollo, Indeed, or similar internet-scale search platforms**, we'd like to hear from you.

  • Less than 30 hrs/week
    Hourly
  • 1-3 months
    Duration
  • Expert
    Experience Level
  • $50.00 - $200.00
    Hourly
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
Apache Lucene
Elasticsearch
Activity on this job
  • Proposals: 50+
  • Last viewed by client: 3 days ago
  • Interviewing: 40
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Mar 18, 2012
  • United States
    Edison 6:53 PM
  • $211K total spent
    328 hires, 11 active
  • 2,835 hours
  • HR & Business Services
    Large company (100-1,000 people)
