Voice Activity Detection (VAD) Engine — React Native / Expo — iOS & Android — Fixed Price
Worldwide
Job Description

We are looking for an experienced React Native / Expo mobile developer to implement a Voice Activity Detection (VAD) engine on an existing personal safety mobile application. This is a fixed-price engagement totaling $2,300 across two milestones. Work begins upon written notice from the client confirming a backend infrastructure milestone currently in progress has been completed.

The full Statement of Work will be provided upon selection. This post summarizes the scope. Serious candidates with demonstrable mobile audio experience only.

About the Project

The application operates under a 51-jurisdiction recording consent compliance engine that automatically loads state-specific recording rules based on GPS-detected location across all 50 U.S. states plus Washington D.C. The compliance engine is fully built and operational. Your job is to add a Voice Activity Detection layer on top of it.

Rather than firing compliance notices at the start of every session regardless of context, the VAD engine makes the compliance system context-aware. It detects whether another person is actually present near the device and triggers the appropriate jurisdiction-specific notice only when a second party is detected. It also distinguishes live human voices from recorded audio such as podcasts, radio, and television, preventing false triggers while ensuring legally required notices fire when a real person is present.

This is precision mobile engineering work on a compliance-sensitive platform. Attention to detail and the ability to work to exact technical specifications are required.

Scope of Work

The engagement covers four integrated deliverable areas:

Part A: Liveness Detection Engine

On-device detection that distinguishes a live human voice from recorded audio using five independent signals: acoustic environment analysis, voice directionality, prosody analysis, device audio cross-correlation, and OS phone call detection.
All five signals must be implemented and combined into a confidence score before any detection is classified.

Part B: Six-Tier VAD Classification

The engine classifies all detected audio into one of six tiers:

- Tier 1: User alone. No compliance notice fires.
- Tier 2: Live human voice detected. Jurisdiction-specific compliance notice fires.
- Tier 3: Recorded audio detected. No notice fires; ambient indicator appears.
- Tier 4: Active phone call detected. OS-level phone call API triggers immediate compliance response.
- Tier 5: Manual override. User-initiated compliance notice regardless of VAD state.
- Tier 6: Ambiguous detection. Defaults to conservative legal interpretation; notice fires.

Part C: Sensitivity Adjustment System

A user-configurable three-level sensitivity toggle (Strict, Moderate, and Sensitive) controlling the confidence threshold required before a compliance notice fires. Includes a false positive recovery flow, UI implementation in Settings, session indicator during active sessions, and persistence across devices via backend account sync.

Part D: 51-Jurisdiction Compliance Integration

Connecting the VAD output to the existing compliance engine so the correct jurisdiction-specific notice fires automatically:

- Profile A, one-party states: soft chime, recording uninterrupted
- Profile B, all-party consent states: overt announcement at override volume, audio gated until confirmed
- Profile C, mixed states: periodic beep for call duration or live voice presence
- Profile D, private place states: private place confirmation dialog
- Unknown jurisdiction fallback: defaults to Profile B strictest rules

Integration Testing

As part of this engagement you will perform end-to-end integration testing on real Android and iOS hardware to confirm the VAD features work correctly with the backend systems.
You will run six structured integration tests covering VAD event logging, vault metadata stamping, sensitivity preference sync, session announcement timing, false positive dismissal logging, and SOS override behavior. You will deliver a written Integration Test Report to the client alongside your Milestone B code submission documenting PASS or FAIL for each test with the exact observed API responses.

Milestones

Milestone A: $1,100

Liveness Detection Engine (Part A), Six-Tier VAD Classification (Part B), Sensitivity Adjustment System (Part C), and 51-Jurisdiction Compliance Integration mobile-side wiring (Part D). Payment released after client verifies all Milestone A acceptance criteria on real Android and iOS hardware.

Milestone B: $1,200

Part D complete implementation: all six tier responses wired to correct jurisdiction profiles, sensitivity interaction across all profiles, API call stubs for three backend endpoints per the agreed API contract, local sensitivity preference storage, and delivery of the completed Integration Test Report. Milestone B does not begin until Milestone A is accepted in writing. Payment released after client verifies all Milestone B acceptance criteria on real Android and iOS hardware.

Total Fixed Price: $2,300

No scope changes or additions will be made without a written amendment. Any deliverable that fails acceptance criteria will be corrected at no additional charge. Payment is released on client verification on real hardware only; no emulator testing accepted.

Technical Requirements

- Strong React Native and Expo experience; the mobile codebase is built on Expo
- Demonstrable experience with on-device audio processing, VAD, or mobile speech detection; Silero VAD via ONNX Runtime Mobile is the recommended implementation approach
- Solid understanding of iOS AVAudioSession category and mode configuration
- Solid understanding of Android AudioManager mode configuration
- Ability to work against a defined API contract and write mobile-side API call stubs without requiring backend access at any point during development
- Real iOS and Android test devices; you must have access to real hardware
- Strong written English for integration test reporting and client communication
- Ability to work independently; the lead backend developer works on a separate workstream and the mobile contractor does not require backend access to complete this engagement

What You Will Receive Upon Selection

Upon selection you will receive the full Statement of Work, which includes:

- Complete technical specification for all four deliverable areas
- The five liveness detection signals with full implementation requirements
- The six-tier classification architecture with exact behavior definitions
- The 51-jurisdiction response matrix covering all profiles and the unknown jurisdiction fallback
- The sensitivity system including the false positive recovery flow and cooldown logic
- The SOS non-negotiable override architecture
- The list of existing compliance engine behaviors that must not be changed
- The API contract for the three backend endpoints your mobile code writes stubs against
- The complete acceptance criteria checklist
- The integration test report format and six required test procedures

How to Apply

Please respond with the following. Applications that do not address all three points will not be reviewed.

1. VAD and audio detection experience
Describe your experience with on-device VAD or audio classification on mobile. What libraries or approaches have you used?
What platforms have you shipped this type of work on?

2. Technical approach
How would you approach distinguishing a live human voice from a podcast playing through a phone speaker on both iOS and Android? What signals would you use and how would you combine them into a confidence score?

3. Availability and timeline
Confirm your availability to begin work upon written activation notice from the client. Provide your estimated timeline to complete Milestone A from the date work begins.

We evaluate candidates based on the technical depth of their application. Generic proposals and cover letters that do not address the three questions above will not be considered.
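
For orientation only, here is a rough TypeScript sketch of how the scope described above might fit together: liveness signals combined into a confidence score, a sensitivity-dependent threshold, and a six-tier classification. It is not the specification. Every identifier, weight, and threshold below is a hypothetical placeholder, as are the mapping of Strict / Moderate / Sensitive to numeric thresholds, the precedence of manual override over phone call detection, and the choice to treat phone call detection as a gate rather than a weighted input. The actual requirements are defined in the Statement of Work.

```typescript
// Illustrative sketch only. All names, weights, and thresholds are
// hypothetical placeholders, not values from the Statement of Work.

type Sensitivity = "Strict" | "Moderate" | "Sensitive";
type Tier = 1 | 2 | 3 | 4 | 5 | 6;

// Four scored liveness signals, each normalized to 0..1 (1 = strongly "live").
interface ScoredSignals {
  acousticEnvironment: number;
  voiceDirectionality: number;
  prosody: number;
  deviceAudioCrossCorrelation: number; // 1 = voice does not match device playback
}

interface VadFrame {
  signals: ScoredSignals;
  otherVoiceDetected: boolean; // e.g. output of a speech detector
  phoneCallActive: boolean;    // fifth signal: OS phone call detection
  manualOverride: boolean;     // user tapped the manual notice control
}

// Assumed relative weights for the scored signals.
const SIGNAL_WEIGHTS: ScoredSignals = {
  acousticEnvironment: 0.25,
  voiceDirectionality: 0.25,
  prosody: 0.2,
  deviceAudioCrossCorrelation: 0.3,
};

// Assumed: "Sensitive" fires a notice at the lowest confidence.
const NOTICE_THRESHOLD: Record<Sensitivity, number> = {
  Strict: 0.8,
  Moderate: 0.65,
  Sensitive: 0.5,
};

// Assumed: at or below this confidence the audio is treated as recorded.
const RECORDED_CEILING = 0.3;

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Weighted average of the scored signals, in 0..1.
function livenessConfidence(s: ScoredSignals): number {
  const keys = Object.keys(SIGNAL_WEIGHTS) as (keyof ScoredSignals)[];
  let weighted = 0;
  let total = 0;
  for (const k of keys) {
    weighted += SIGNAL_WEIGHTS[k] * clamp01(s[k]);
    total += SIGNAL_WEIGHTS[k];
  }
  return weighted / total;
}

function classifyTier(frame: VadFrame, sensitivity: Sensitivity): Tier {
  if (frame.manualOverride) return 5;      // Tier 5: manual override
  if (frame.phoneCallActive) return 4;     // Tier 4: active phone call
  if (!frame.otherVoiceDetected) return 1; // Tier 1: user alone
  const confidence = livenessConfidence(frame.signals);
  if (confidence >= NOTICE_THRESHOLD[sensitivity]) return 2; // live voice
  if (confidence <= RECORDED_CEILING) return 3;              // recorded audio
  return 6; // Tier 6: ambiguous, handled conservatively
}

// Per the tier list above, only Tier 1 and Tier 3 fire no notice.
function noticeFires(tier: Tier): boolean {
  return tier !== 1 && tier !== 3;
}

// Usage: a mid-confidence detection evaluated at the "Moderate" setting.
const exampleFrame: VadFrame = {
  signals: {
    acousticEnvironment: 0.6,
    voiceDirectionality: 0.5,
    prosody: 0.55,
    deviceAudioCrossCorrelation: 0.5,
  },
  otherVoiceDetected: true,
  phoneCallActive: false,
  manualOverride: false,
};
const exampleTier = classifyTier(exampleFrame, "Moderate");
console.log(`tier=${exampleTier} noticeFires=${noticeFires(exampleTier)}`);
```

A strong answer to question 2 would go well beyond this, for example by explaining how each signal is measured on iOS and Android and how the weights would be validated on real hardware.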
$2,300.00
- Fixed-price
- Experience level: Intermediate
- Remote job
- Project type: One-time project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 2 days ago
- Hires: 1
- Interviewing: 0
- Invites sent: 1
- Unanswered invites: 0
About the client
- USA, Abingdon, 6:43 AM
- $69K total spent; 50 hires, 13 active