AI Voice Model Consultant with Real-World Experience in Voice AI & Audio Synthesis

Posted 2 weeks ago

Worldwide

Summary

We are a forward-looking team building the next generation of creative AI audio tools. Our mission is to create personalized, high-fidelity AI voice experiences that empower artistic expression. We are seeking a seasoned AI Voice Model Expert to architect and build a groundbreaking system for AI-generated singing and voice interaction.

The Core Project:
Your primary mission will be to design and develop a system that allows a user to sing any song in their own voice. This involves capturing a user's unique vocal tone, timbre, and singing style from recordings and applying it to generate high-quality vocal covers for any chosen song.

Key Responsibilities:
You will be responsible for the end-to-end development lifecycle, from research and prototyping to deployment and MLOps.

1. Voice Cloning & Singing Voice Conversion (SVC/RVC):
-   Research, adapt, and implement state-of-the-art models for voice cloning and singing voice conversion (e.g., RVC, So-VITS-SVC, DiffSinger, VoiceLab).
-   Build robust pipelines for dataset preprocessing, feature extraction (e.g., F0, HuBERT features), and model training on user-provided voice data.
-   Focus on achieving exceptional quality, capturing nuances like vocal inflection, vibrato, and emotional delivery.
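
To give a sense of the feature-extraction work mentioned above, here is a minimal sketch of F0 (fundamental frequency) estimation using plain autocorrelation on a single frame. It is an illustration only, not the method this project prescribes: the function names, the 0.8 peak threshold, and the 80 to 1000 Hz search range are all assumptions, and a real pipeline would more likely use a dedicated pitch tracker.

```python
import math


def synth_sine(freq_hz, sample_rate, num_samples):
    """Generate a pure sine tone (stand-in for a sung note) as a list of floats."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(num_samples)]


def autocorr(frame, lag):
    """Autocorrelation of the frame at one lag, normalized by the overlap length."""
    overlap = len(frame) - lag
    return sum(frame[i] * frame[i + lag] for i in range(overlap)) / overlap


def estimate_f0(frame, sample_rate, fmin=80.0, fmax=1000.0, threshold=0.8):
    """Return an F0 estimate in Hz, or None if no clear pitch is found.

    Picks the first local autocorrelation peak that reaches `threshold`
    times the zero-lag energy (an assumed heuristic that helps avoid
    locking onto a multiple of the true period).
    """
    if len(frame) < 3:
        return None
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 2)
    if max_lag < min_lag:
        return None
    r0 = autocorr(frame, 0)
    if r0 <= 0.0:
        return None  # silent frame
    prev = autocorr(frame, min_lag - 1)
    cur = autocorr(frame, min_lag)
    for lag in range(min_lag, max_lag + 1):
        nxt = autocorr(frame, lag + 1)
        if cur >= threshold * r0 and cur > prev and cur >= nxt:
            return sample_rate / lag
        prev, cur = cur, nxt
    return None


if __name__ == "__main__":
    frame = synth_sine(220.0, 8000, 1024)
    print(estimate_f0(frame, 8000))
```

Because lags are whole samples, the estimate is quantized: at an 8 kHz sample rate, neighboring lags near 220 Hz are roughly 6 Hz apart, so the result is approximate.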

2. Text-to-Speech & Speech-to-Text (TTS/STT):
-   Integrate or develop TTS components for voice agent functionalities, ensuring natural prosody and alignment with the user's cloned voice characteristics.
-   Utilize STT systems for potential lyric alignment, transcription, or interactive voice commands.

3. Voice Agent & Interactive Systems:
-   Design the architecture for a voice agent that can handle user requests (e.g., "generate a cover of Song X," "make the voice sound more powerful").
-   Create a seamless workflow for song input, lyric synchronization, and vocal generation.
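
As a rough illustration of the request handling described above, the sketch below maps the two example utterances to structured intents with regular expressions. The `AgentRequest` type, the intent names, and the patterns are hypothetical and chosen only for this example; a production agent would probably sit behind an STT front end and a more capable language-understanding component.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRequest:
    """Structured form of a user request (hypothetical schema)."""
    intent: str                      # "generate_cover", "adjust_voice", or "unknown"
    argument: Optional[str] = None   # song title or requested voice quality


# Assumed phrasings, modeled on the two example requests in this posting.
_COVER = re.compile(r"^(?:generate|make|create)\s+a\s+cover\s+of\s+(.+)$", re.IGNORECASE)
_ADJUST = re.compile(r"^make\s+the\s+voice\s+sound\s+(?:more\s+)?(.+)$", re.IGNORECASE)


def parse_request(text: str) -> AgentRequest:
    """Turn a transcribed utterance into an AgentRequest."""
    cleaned = text.strip().rstrip(".!?")
    match = _COVER.match(cleaned)
    if match:
        return AgentRequest("generate_cover", match.group(1).strip())
    match = _ADJUST.match(cleaned)
    if match:
        return AgentRequest("adjust_voice", match.group(1).strip().lower())
    return AgentRequest("unknown")


if __name__ == "__main__":
    print(parse_request("generate a cover of Song X"))
    print(parse_request("Make the voice sound more powerful."))
```

Keeping the parsed request as a small data object makes it straightforward to hand off to the song-input, lyric-synchronization, and vocal-generation steps of the workflow.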

4. Music & Lyrics AI:
-   Explore and integrate models for music source separation (e.g., Demucs) to isolate instrumental backing tracks and original vocals.
-   Investigate AI for lyrical analysis, alignment, and potentially even lyric generation or style transfer to match a user's style.

5. MLOps & Engineering Excellence:
-   Architect, build, and maintain scalable ML pipelines for training, fine-tuning, and inference.
-   Implement model versioning, monitoring, and automated retraining pipelines.
-   Containerize models (Docker) and deploy them on scalable cloud infrastructure (e.g., AWS, GCP, Azure).
-   Ensure the entire system is reliable, efficient, and maintainable.
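
For the model-versioning responsibility above, a minimal in-memory sketch might look like the following. It is an assumption-laden illustration, not a required design: `ModelRegistry`, `ModelVersion`, the model name, and the metric values are all made up for the example, and in practice this role would likely be filled by a tool such as MLflow or Weights & Biases.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ModelVersion:
    """One registered version of a voice model (hypothetical schema)."""
    version: int
    sha256: str
    metrics: dict


class ModelRegistry:
    """Tracks versions per model name, deduplicating identical weights by hash."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}

    def register(self, name: str, weights: bytes, metrics: dict) -> ModelVersion:
        """Register weights under `name`; return the existing entry if already seen."""
        digest = hashlib.sha256(weights).hexdigest()
        history = self._versions.setdefault(name, [])
        for existing in history:
            if existing.sha256 == digest:
                return existing
        entry = ModelVersion(len(history) + 1, digest, dict(metrics))
        history.append(entry)
        return entry

    def latest(self, name: str) -> ModelVersion:
        """Return the most recently registered version for `name`."""
        return self._versions[name][-1]


if __name__ == "__main__":
    registry = ModelRegistry()
    # Placeholder bytes and metric values, purely illustrative.
    registry.register("founder-voice", b"weights-a", {"val_loss": 0.42})
    registry.register("founder-voice", b"weights-b", {"val_loss": 0.37})
    print(registry.latest("founder-voice").version)
```

Hashing the weights gives each version a stable identity, which is the same idea that monitoring and automated retraining pipelines build on when deciding whether a new artifact actually differs from the deployed one.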

Deliverables & Outcomes:
-   A scalable, cloud-based platform for user voice model training and inference.
-   A well-documented API and/or a simple, user-friendly interface for end-users to generate new vocals.
-   A library of trained, high-quality voice models, starting with the founder's voice.
-   A robust MLOps foundation for continuous improvement and scaling of the AI capabilities.

Must-Have:
-   5+ years of experience in ML engineering with a strong focus on generative AI and deep learning.
-   Proven expertise in digital signal processing (DSP) and audio/music processing.
-   Hands-on experience with SVC/RVC frameworks and a deep understanding of the underlying architectures (e.g., GANs, VAEs, Diffusion Models).
-   Strong proficiency in Python and core ML libraries (PyTorch, TensorFlow).
-   Solid experience with MLOps tools (e.g., MLflow, Kubeflow, Weights & Biases) and cloud deployment.
-   A strong portfolio or examples of past projects in AI voice synthesis, music generation, or a closely related field.

Highly Desirable:
-   Experience with TTS/STT systems (e.g., Tacotron, WaveNet, Whisper, VALL-E).
-   Knowledge of music information retrieval (MIR) and lyrics processing.
-   Experience building interactive voice/AI agents.
-   Understanding of ethical AI principles, especially concerning voice cloning and deepfakes.

How to Apply:
Please submit the following:
1.  Your resume/CV.
2.  A link to your portfolio, GitHub, or demos of relevant work (e.g., AI-generated vocals, SVC models, deployed ML systems).
3.  A brief cover letter describing your specific approach to building a custom singing voice conversion system, including the tools and models you would prioritize and why.
4.  [Optional] Any open-source contributions or research papers in related fields.

  • $5.00
    Fixed-price
  • Intermediate
    Experience Level
  • Remote Job
  • One-time project
    Project Type

Skills and Expertise
Mandatory skills
Voice Acting
Voice-Over
Nice-to-have skills
Deep Learning
Model Deployment
Tools
Eleven Labs
Google Cloud AI

Activity on this job
  • Proposals: Less than 5
  • Last viewed by client: 2 weeks ago
  • Hires: 1
  • Interviewing: 0
  • Invites sent: 0
  • Unanswered invites: 0

About the client
Member since Aug 31, 2025
  • United States
    Bluefield, 9:50 AM
  • $1.5K total spent
    132 hires, 13 active
