AI Vision & Generative Video Prototype Engineer for Spatial Media Platform
Worldwide
We are building a confidential next-generation AI media prototype that combines real-world visual recognition, multimodal content understanding, generative video creation, and spatial-style visual playback.

This is not a standard mobile app project. A smartphone or camera-equipped device may be used as the temporary capture and playback interface, but the core work is the AI pipeline: recognizing physical visual/text-based content, interpreting its meaning, generating a short AI-created visual sequence, and playing it back through a simplified reflective display/demo setup. The detailed product concept will be shared only after NDA.

We are looking for an experienced engineer to build:

* Camera-based capture of physical visual/text content
* OCR and computer vision processing
* Multimodal AI interpretation of the captured content
* Structured JSON generation for scene, character, tone, action, and visual direction
* AI-generated image/video or animated visual sequence creation
* Playback mode optimized for a simple reflective display-style demo
* Lightweight backend/API pipeline for AI processing and asset storage
* A polished demo flow suitable for investor, partner, or internal product presentations

**Required Skills:**

* Computer vision and OCR
* Multimodal AI / LLM integration
* AI image generation and/or AI video generation
* Prompt engineering and structured JSON outputs
* Backend/API development
* Camera input and media processing
* Prototype UI or demo interface development
* Fast MVP execution under a 12-week timeline

**Nice to Have:**

* Experience with AR, spatial media, projection-style visuals, or reflective display demos
* Experience creating AI characters, animated scenes, or generative video workflows
* Unity, WebGL, React, Flutter, Swift, Kotlin, or similar demo-rendering experience
* Experience building confidential product prototypes or investor demo systems

**Confidentiality Requirements:**

The selected freelancer must sign an NDA, assign all work product/IP to the client, and keep the entire project fully confidential. The project, concept, screenshots, demos, prompts, workflows, code, and outputs may not be shared publicly or used in a portfolio without written permission.

**Application Requirements:**

Please share relevant examples of projects involving AI vision, OCR, multimodal AI, generative image/video, AR/spatial media, or interactive prototype systems. Screenshots, short videos, documents, or links are welcome. Please also describe what you would build in the first 2 weeks to prove technical feasibility.
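
For applicants sketching a feasibility plan, the "structured JSON" step in the pipeline could be validated roughly as follows. This is a minimal illustration only: the five field names come from the list in this posting (scene, character, tone, action, visual direction), while the `SceneSpec` class, the `parse_scene_spec` helper, the string-only field types, and the sample values are assumptions, not the client's actual schema.

```python
import json
from dataclasses import dataclass

# Field names taken from the posting; the string-only typing is an assumption.
REQUIRED_FIELDS = ("scene", "character", "tone", "action", "visual_direction")


@dataclass
class SceneSpec:
    """Hypothetical container for one interpreted capture."""
    scene: str
    character: str
    tone: str
    action: str
    visual_direction: str


def parse_scene_spec(raw: str) -> SceneSpec:
    """Parse model output into a SceneSpec, rejecting missing or empty fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(data.get(name), str) or not data[name].strip()
    ]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    # Ignore any extra keys the model may have added.
    return SceneSpec(**{name: data[name] for name in REQUIRED_FIELDS})


if __name__ == "__main__":
    # Made-up sample standing in for a multimodal model's response.
    sample = json.dumps({
        "scene": "a quiet harbor at dawn",
        "character": "an old fisherman",
        "tone": "calm",
        "action": "casts a net",
        "visual_direction": "slow pan, soft light",
    })
    print(parse_scene_spec(sample))
```

Validating the model's output before it reaches the image/video generation step keeps malformed responses from propagating into the more expensive stages of the pipeline.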
- $500.00, Fixed-price
- Experience Level: Expert
- Remote Job
- Project Type: Ongoing project
Activity on this job
- Proposals: 20 to 50
- Interviewing: 0
- Invites sent: 0
- Unanswered invites: 0
About the client
- South Korea, Seoul, 1:23 AM
- $67K total spent; 85 hires, 32 active
- 282 hours
- Large company (100-1,000 people)