AI Vision & Generative Video Prototype Engineer for Spatial Media Platform

Posted 2 weeks ago

Worldwide

Summary

We are building a confidential next-generation AI media prototype that combines real-world visual recognition, multimodal content understanding, generative video creation, and spatial-style visual playback. This is not a standard mobile app project. A smartphone or camera-equipped device may be used as the temporary capture and playback interface, but the core work is the AI pipeline: recognizing physical visual/text-based content, interpreting its meaning, generating a short AI-created visual sequence, and playing it back through a simplified reflective display/demo setup. The detailed product concept will be shared only after an NDA is signed.

We are looking for an experienced engineer to build:

* Camera-based capture of physical visual/text content
* OCR and computer vision processing
* Multimodal AI interpretation of the captured content
* Structured JSON generation for scene, character, tone, action, and visual direction
* AI-generated image/video or animated visual sequence creation
* Playback mode optimized for a simple reflective display-style demo
* Lightweight backend/API pipeline for AI processing and asset storage
* A polished demo flow suitable for investor, partner, or internal product presentations

**Required Skills:**

* Computer vision and OCR
* Multimodal AI / LLM integration
* AI image generation and/or AI video generation
* Prompt engineering and structured JSON outputs
* Backend/API development
* Camera input and media processing
* Prototype UI or demo interface development
* Fast MVP execution under a 12-week timeline

**Nice to Have:**

* Experience with AR, spatial media, projection-style visuals, or reflective display demos
* Experience creating AI characters, animated scenes, or generative video workflows
* Unity, WebGL, React, Flutter, Swift, Kotlin, or similar demo-rendering experience
* Experience building confidential product prototypes or investor demo systems

**Confidentiality Requirements:** The selected freelancer must sign an NDA, assign all work product/IP to the client, and keep the entire project fully confidential. The project, concept, screenshots, demos, prompts, workflows, code, and outputs may not be shared publicly or used in a portfolio without written permission.

**Application Requirements:** Please share relevant examples of projects involving AI vision, OCR, multimodal AI, generative image/video, AR/spatial media, or interactive prototype systems. Screenshots, short videos, documents, or links are welcome. Please also describe what you would build in the first 2 weeks to prove technical feasibility.
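
For applicants planning a feasibility proof, the "structured JSON generation" step named in the deliverables could look roughly like the sketch below. It is only an illustration under stated assumptions: the field names simply mirror the posting's list (scene, character, tone, action, visual direction), while `SceneSpec`, `parse_scene_spec`, and `to_generation_prompt` are hypothetical names, and the actual schema is not disclosed before the NDA.

```python
import json
from dataclasses import dataclass

# Assumption: one string field per item in the posting's list.
# The real schema is confidential and may differ.
REQUIRED_FIELDS = ("scene", "character", "tone", "action", "visual_direction")


@dataclass
class SceneSpec:
    scene: str
    character: str
    tone: str
    action: str
    visual_direction: str


def parse_scene_spec(raw: str) -> SceneSpec:
    """Parse a model's JSON output, rejecting missing or empty fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [
        name for name in REQUIRED_FIELDS
        if not isinstance(data.get(name), str) or not data[name].strip()
    ]
    if missing:
        raise ValueError(f"missing or empty fields: {missing}")
    return SceneSpec(**{name: data[name].strip() for name in REQUIRED_FIELDS})


def to_generation_prompt(spec: SceneSpec) -> str:
    """Flatten the spec into one text prompt for a downstream generator."""
    return (
        f"{spec.scene}. {spec.character} {spec.action}. "
        f"Tone: {spec.tone}. Visual direction: {spec.visual_direction}."
    )
```

Validating the model output before it reaches the image or video generator keeps malformed responses from failing later in the pipeline, where they are harder to diagnose.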

Fixed-price: $500.00
Experience Level: Expert
Remote Job
Project Type: Ongoing project

Contract-to-hire opportunity

This lets talent know that this job could become full time.

Skills and Expertise

Mandatory skills

Web Application

AI Agent Development

Activity on this job

Proposals: 20 to 50
Interviewing: 0
Invites sent: 0
Unanswered invites: 0

About the client

Member since Dec 26, 2020

South Korea
Seoul, 1:23 AM
$67K total spent
85 hires, 32 active
282 hours
Large company (100-1,000 people)
