Full-Stack Python + Web Developer for Voice-AI Prototype (FastAPI & Firebase)

Posted yesterday

Worldwide

Summary

### Job Title: Full-Stack Python + Web Developer for Secure Voice-AI Prototype (FastAPI & Firebase)

### Project Overview:

We are seeking an experienced independent full-stack developer to build a lightweight, highly optimized browser-based prototype of a voice-interaction companion application. The core engine links browser-based audio capture with the Google GenAI SDK (Gemini) and Google Cloud Firestore. The primary objective of this architecture is long-term conversational memory with a flat storage footprint: an automated text-compaction loop keeps each client's stored data at a roughly constant size instead of letting it grow with every conversation.

### Core Technical Scope:

1. **Frontend Web UI:** A minimalist, clean web page featuring a secure login/signup screen (Firebase Auth) and a central chat dashboard with a prominent "Hold to Speak" microphone button.
2. **Browser Microphone Handling:** Implement JavaScript (MediaRecorder API) to capture user voice input directly in the browser, package it as a lightweight payload, and stream it to the backend.
3. **Multi-Tenant Cloud Database:** Configure a Google Firebase Firestore schema with strict per-customer data isolation paths (`/clients/client_id`) enforced by security rules.
4. **AI Processing & Context Caching:** Integrate the official google-genai SDK using the gemini-2.5-flash-lite model. Implement Context Caching on the system instructions to minimize recurring token ingestion overhead.
5. **Memory Compaction Loop:** Develop a background function that runs after each interaction. It must read the existing text summary plus the new chat transcript, generate a newly merged, highly condensed 4-sentence profile text block, and overwrite the client's database record, completely discarding the raw transcript.
6. **Real-Time News Grounding:** Enable the native google_search tool in the Gemini configuration to allow retrieval of live global and national news headlines when the user asks for them by voice.
7. **Spotify Intent Routing:** Implement Function/Tool Calling. When the user requests a song, genre, or artist, Gemini must output a structured JSON tool directive containing a Spotify search deep link. The web frontend must capture this link and automatically open it in Spotify.
8. **Vocal Output Integration:** Route Gemini's response text to a Text-to-Speech API (such as ElevenLabs or Google Cloud TTS) and stream the resulting audio back to the browser for automatic playback.

### Developer Requirements:

- Strong proficiency in Python (FastAPI or Flask) and Firebase/Firestore architecture.
- Hands-on experience integrating the official Google GenAI SDK and third-party voice APIs (ElevenLabs/Deepgram/Whisper).
- Ability to write secure, isolated database rules.
- Excellent, clear communication skills in plain English.
- Willingness to sign a standard mutual NDA before full project documentation or branding is revealed.

### Project Type & Budget:

- One-time project
- Experience Level: Intermediate
- Budget: Fixed-Price (Milestone-Based), $1,000 to $1,500

### Milestone Delivery Plan:

- Milestone 1 ($300): Google Firebase project set up, secure client-isolated Firestore rules deployed, and standard frontend login/chat layout built.
- Milestone 2 ($500): Gemini 2.5 Flash-Lite API linked with active Context Caching, native Google Search grounding active, and the automated background text-summarization loop fully functional when driven by mock text scripts.
- Milestone 3 ($400 to $700): JavaScript browser microphone recording linked to the server, Spotify tool calling integrated, ElevenLabs voice engine connected, and successful deployment to a live testing link (Render or Heroku).
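For applicants, the Memory Compaction Loop in item 5 of the Core Technical Scope can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the client's design: a plain dict stands in for Firestore, the `summarize` callable stands in for the Gemini call, and the names `compact_memory` and `clamp_sentences` are hypothetical. The sentence cap is a safety net in case the model returns more than four sentences.

```python
import re
from typing import Callable, Dict

MAX_SENTENCES = 4  # the posting asks for a 4-sentence profile block


def clamp_sentences(text: str, limit: int = MAX_SENTENCES) -> str:
    """Keep at most `limit` sentences, using a naive split after . ! or ?"""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(sentences[:limit])


def compact_memory(
    store: Dict[str, str],                 # assumption: stand-in for Firestore
    client_id: str,
    transcript: str,
    summarize: Callable[[str, str], str],  # assumption: stand-in for the Gemini call
) -> str:
    """Merge the old profile with the new transcript and overwrite the profile.

    The raw transcript is only passed through; it is never written to the store.
    """
    path = f"clients/{client_id}"          # mirrors the /clients/client_id layout
    old_profile = store.get(path, "")
    merged = summarize(old_profile, transcript)
    store[path] = clamp_sentences(merged)  # overwrite, so storage stays flat
    return store[path]
```

In a real build, `summarize` would wrap a Gemini request and the dict write would become a Firestore document overwrite scheduled as a background task after the response is sent. Because the profile is overwritten rather than appended to, stored data per client stays at roughly four sentences no matter how many conversations take place.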

  • $1,000.00

    Fixed-price
  • Intermediate
    Experience Level
  • Remote Job
  • One-time project
    Project Type

Contract-to-hire opportunity

Skills and Expertise
Mandatory skills
Python
API
API Integration
Activity on this job
  • Proposals: 20 to 50
  • Last viewed by client: 7 hours ago
  • Interviewing: 1
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since Jun 29, 2026
  • Pakistan
    Sargodha, 12:56 AM
