AI Video Automation Platform - DEVELOPER
Worldwide
I want to build an internal web platform that turns a script into a finished YouTube video automatically. I want to upload a script, select a saved AI character (avatar) and voice, and click Create.

The system then automatically, via API:
- Generates the voiceover
- Generates AI avatar footage of the character speaking the script
- Pulls matching stock b-roll for each scene
- Assembles the complete edited video (avatar intro, b-roll body, split-screen segments) with no human editor

These channels are the output format and quality we're building toward. Watch a few videos before applying:
- www.youtube.com/@EliasYoderAmish/videos
- www.youtube.com/@EstherYoderAmish/featured
- www.youtube.com/@AmishPantry
- www.youtube.com/@TheGraysonReport
- www.youtube.com/@DrEdmundHale

Dashboard steps & capabilities:
- Script upload → automatic scene breakdown with timestamps and b-roll keywords (LLM API)
- Character library: save avatars per channel and create new ones (AI image → animated talking avatar)
- Voice library: saved voices per character
- Auto-pull b-roll from Pexels/Pixabay APIs matched to each scene
- Auto-generate avatar footage speaking the script (typically the opening minutes + transitions, not the full runtime)
- Auto-assemble the final video following editing rules (avatar open → b-roll → split-screen → avatar close)
- Render queue, project history, per-video cost tracking

Requirements:
- Total generation cost must land around $3-5 per video (15–20 min videos). Expensive convenience APIs like HeyGen can't be the default: use low-cost or open-source approaches (open-source lip-sync on rented GPUs, budget TTS, free stock APIs, FFmpeg assembly), or propose your own way to hit the number. You will still have access to HeyGen and other tools.
- Every layer must be swappable: voice, avatar, b-roll, and rendering providers can be switched later without rebuilding. There should be an option to select which program to use, choosing between a VO tool or an avatar tool. The APIs stay stored in the dashboard.
- Fully automated end-to-end: script in, video out. A simple review screen to swap clips before final render is a plus.

Relevant skills: FFmpeg / programmatic video assembly, AI media pipelines (ComfyUI, Replicate, fal.ai, RunPod), TTS and image/video generation APIs, full-stack web development.

To apply, start your application with the word BISON and answer:
- How would you build this, layer by layer, with the specific tools you'd use and cost per video?
- Show me anything you've built involving AI video, avatars, or automated media.
- Timeline to a working v1?

Final step before hiring is a small paid test: I give you a short script, you produce a 2-minute sample video.
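
As a rough illustration of the "every layer must be swappable" and per-video cost tracking requirements above, a minimal Python sketch might keep a registry of providers per layer and a cost ledger for each video. Everything named here (`ProviderRegistry`, `CostLedger`, `run_pipeline`, the stub providers) is hypothetical, and the dollar figures are made-up placeholders, not real tools or real prices.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Layer names follow the posting's wording; the order is the pipeline order.
LAYERS = ("voice", "avatar", "broll", "render")

# Assumed provider contract: take the job payload, return (output_path, cost_usd).
Provider = Callable[[dict], Tuple[str, float]]


@dataclass
class ProviderRegistry:
    """Holds every registered provider per layer and the one currently selected."""
    providers: Dict[str, Dict[str, Provider]] = field(
        default_factory=lambda: {layer: {} for layer in LAYERS}
    )
    selected: Dict[str, str] = field(default_factory=dict)

    def register(self, layer: str, name: str, fn: Provider) -> None:
        if layer not in self.providers:
            raise ValueError(f"unknown layer: {layer}")
        self.providers[layer][name] = fn

    def select(self, layer: str, name: str) -> None:
        if name not in self.providers.get(layer, {}):
            raise KeyError(f"no provider {name!r} registered for {layer!r}")
        self.selected[layer] = name

    def run(self, layer: str, payload: dict) -> Tuple[str, float]:
        return self.providers[layer][self.selected[layer]](payload)


@dataclass
class CostLedger:
    """Accumulates per-layer spend for a single video."""
    entries: List[Tuple[str, float]] = field(default_factory=list)

    def add(self, layer: str, usd: float) -> None:
        self.entries.append((layer, usd))

    def total(self) -> float:
        return round(sum(usd for _, usd in self.entries), 2)


def run_pipeline(registry: ProviderRegistry, script: str, budget_usd: float):
    """Run each layer with its selected provider and check spend against a budget."""
    ledger = CostLedger()
    payload = {"script": script}
    for layer in LAYERS:
        output, usd = registry.run(layer, payload)
        payload[layer] = output  # later layers can read earlier outputs
        ledger.add(layer, usd)
    return payload, ledger, ledger.total() <= budget_usd


def demo_registry() -> ProviderRegistry:
    """Stub providers with placeholder costs, standing in for real integrations."""
    reg = ProviderRegistry()
    reg.register("voice", "stub_tts", lambda p: ("voice.wav", 0.5))
    reg.register("avatar", "stub_lipsync", lambda p: ("avatar.mp4", 1.0))
    reg.register("broll", "stub_stock", lambda p: ("broll.mp4", 0.0))
    reg.register("render", "stub_ffmpeg", lambda p: ("final.mp4", 0.25))
    for layer, name in [("voice", "stub_tts"), ("avatar", "stub_lipsync"),
                        ("broll", "stub_stock"), ("render", "stub_ffmpeg")]:
        reg.select(layer, name)
    return reg


if __name__ == "__main__":
    payload, ledger, within_budget = run_pipeline(demo_registry(), "Sample script", 5.0)
    print(payload["render"], ledger.total(), within_budget)
```

Switching a provider in this sketch is a `register` plus `select` call on one layer; the other layers and `run_pipeline` stay untouched, which is the property the requirement asks for.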
- Hourly: Less than 30 hrs/week
- Duration: 1-3 months
- Experience Level: Intermediate
- Hourly rate: $15.00 - $35.00
- Remote Job
- Project Type: Complex project
Activity on this job
- Proposals: 20 to 50
- Last viewed by client: 3 hours ago
- Interviewing: 6
- Invites sent: 8
- Unanswered invites: 5
About the client
- USA, Boynton Beach, 3:52 AM
- $20 total spent; 1 hire, 0 active
- Media & Entertainment; small company (2-9 people)