Senior Search Architect (Lucene / Elasticsearch / OpenSearch / Algolia) – 200 Million Resumes
Worldwide
# Senior Search Architect (Lucene / Elasticsearch / OpenSearch / Vector Search) – Build a 200M Candidate Search Platform

# IMPORTANT – Please Read Before Applying

## We are building a search platform comparable to:

* **Juicebox.ai**
* **Exa.ai**
* **LinkedIn Recruiter**
* **LinkedIn Sales Navigator**

If you have built search platforms with similar complexity and scale, we would like to speak with you.

**Please do NOT apply if you are planning to learn these technologies during this project.**

We are specifically looking for someone who has already designed and built large-scale production search systems and can demonstrate previous work.

---

# Overview

We are building a next-generation AI-powered recruiting platform capable of searching **200 million resumes and LinkedIn profiles**.

This is **not** a standard Elasticsearch implementation.

We need an experienced Search Architect who has built production systems capable of:

* Hundreds of millions of searchable documents
* Complex recruiter-style faceted search
* Sub-second response times
* Extremely fast facet counts
* High relevance ranking
* Horizontal scalability

Our goal is to create the best recruiter search platform in the industry.

---

# About the Data

We currently have approximately **200 million resumes and LinkedIn profiles**, all stored in JSON format.

The platform should support continuous ingestion and updates while maintaining excellent search performance.

---

# Responsibilities

Design the overall search architecture and recommend the best technology stack.

Responsibilities include:

* Search architecture
* Data modeling
* Index design
* Ranking strategy
* Relevance tuning
* Faceted search
* Distributed indexing
* Cluster architecture
* Performance optimization
* Capacity planning
* Scalability planning

---

# Search Features

The search experience should feel similar to LinkedIn Recruiter and Juicebox.ai.
Recruiters should be able to search using combinations of:

## Companies

* Current company
* Previous companies
* Multiple companies
* Include companies
* Exclude companies

## Job Titles

* Current title
* Previous titles
* Multiple titles
* Include
* Exclude

## Skills

* Multiple skills
* Required skills
* Excluded skills
* Skill normalization
* Synonyms

## Location

* Country
* State
* City
* ZIP Code
* Radius search (5–100 miles)
* Multiple cities
* Multiple states

## Education

* Schools
* Universities
* Ivy League filter
* Degrees
* Multiple schools

## Experience

* Total years of experience
* Years at current company
* Years in industry
* Years in specific role

## Additional Filters

* Industry
* Certifications
* Security clearance
* Employment type
* Languages
* Work authorization
* Remote / Hybrid / Onsite
* Seniority
* Management level

Most filters should support:

* Include
* Exclude
* Multiple values
* Boolean AND / OR logic
* Phrase search
* Exact search
* Prefix search
* Fuzzy matching
* Synonyms

---

# Search Performance Requirements

The platform should support:

* Sub-second search response
* Sub-second facet counts
* Fast aggregations
* Fast autocomplete
* Instant suggestions
* Efficient deep pagination
* Millions of searches per day
* Horizontal scalability
* High availability
* Fault tolerance

---

# AI & Semantic Search

We intend to use open-source LLMs to enrich resume data before indexing.

Possible enrichment includes:

* Skill extraction
* Skill normalization
* Company normalization
* Industry classification
* Job title normalization
* Resume classification
* Candidate summarization

We are also open to implementing:

* Vector embeddings
* Semantic search
* Hybrid keyword + vector search
* Retrieval-augmented ranking

We are interested in your recommendations based on production experience.

---

# Technology

We are **not committed to any specific technology stack**.
We are open to using whichever technologies provide the best scalability, relevance, maintainability, and performance.

Examples include:

### Search

* Lucene
* Elasticsearch
* OpenSearch
* Algolia
* Solr
* Vespa
* Typesense

### Vector Search

* Weaviate
* Qdrant
* Milvus
* Pinecone
* pgvector
* Redis Search
* FAISS
* HNSW

### Databases

We are open to using any database architecture if it makes sense for the solution, including:

* PostgreSQL
* MongoDB
* ClickHouse
* Cassandra
* ScyllaDB
* DynamoDB
* Neo4j
* Other specialized databases

### APIs

Experience with:

* GraphQL
* REST APIs

is highly desirable.

### Graph Technologies

Knowledge of:

* Neo4j
* Graph databases
* Relationship graphs

is a significant plus for future recruiter relationship intelligence features.

---

# What We're Looking For

You should have real-world experience with:

* 100M+ indexed documents
* Elasticsearch/OpenSearch clusters
* Lucene internals
* Distributed indexing
* Large-scale faceted search
* Search relevance tuning
* Custom analyzers
* Synonyms
* Ranking algorithms
* Aggregations
* Performance optimization
* Capacity planning
* High-volume production systems

Experience building recruiting, staffing, ATS, CRM, people search, enterprise search, e-commerce, or internet-scale search products is highly preferred.

---

# Mandatory Requirements

Please **do not apply** unless you have built a comparable production search platform.

Your proposal should include:

* The largest search system you have built
* Approximate number of indexed documents
* Number of daily searches
* Search technology used
* Database(s) used
* Whether embeddings/vector search were used
* Cluster size
* Average query latency
* Average facet latency
* Your role on the project
* Team size

---

# Interview Requirements

You must be able to demonstrate previous work.
Examples include:

* Live demo
* Recorded demo
* Screenshots
* Architecture diagrams
* Code walkthrough
* Performance dashboards

You should be comfortable discussing:

* Architecture decisions
* Indexing strategy
* Ranking
* Relevance tuning
* Scaling strategy
* Search optimization
* Cluster management
* Performance bottlenecks

---

# Please Do Not Apply If

* You have only built small Elasticsearch projects.
* You have only taken Elasticsearch courses.
* You have never managed a production search cluster.
* You plan to learn these technologies while working on this project.
* You cannot demonstrate similar previous work.

---

# Long-Term Opportunity

This is expected to become a long-term technical leadership role.

We are looking for someone who can own the search architecture, mentor our engineering team, and help us build one of the world's most capable AI-powered recruiting search platforms.

If you have built search systems comparable to **Juicebox.ai, Exa.ai, LinkedIn Recruiter, LinkedIn Sales Navigator, HireEZ, SeekOut, ZoomInfo, Apollo, Indeed, or similar internet-scale search platforms**, we'd like to hear from you.
- Hourly: Less than 30 hrs/week
- Duration: 1-3 months
- Experience Level: Expert
- Hourly rate: $50.00 - $200.00
- Remote Job
- Project Type: Ongoing project
Activity on this job
- Proposals: 50+
- Last viewed by client: 3 days ago
- Interviewing: 40
- Invites sent: 0
- Unanswered invites: 0
About the client
- United States, Edison, 6:53 PM
- $211K total spent; 328 hires, 11 active
- 2,835 hours
- HR & Business Services; large company (100-1,000 people)