Senior AI/ML Engineer (Computer Vision, Multimodal, LLM, PyTorch)

Posted 2 weeks ago

Worldwide

Summary

Senior AI and ML Engineer for Computer Vision, Multimodal, LLM, and PyTorch

About the role

We are an AI product company building a multimodal perception engine that fuses video, image, voice, and text into one conversational, uncertainty-aware pipeline. We are looking for a senior AI and ML engineer to own the research and development of the core models across computer vision, conversational NLP, and multimodal fusion. This is hands-on research and development. You will design, build, fine-tune, and quantitatively validate the models that power the engine.

What you will do

- Build the multimodal fusion pipeline that combines video, image, voice, and text.
- Develop the computer vision components, including segmentation, recognition, and portion or 3D estimation.
- Build the conversational, uncertainty-aware disambiguation layer using LLM and Transformer fine-tuning.
- Adapt open multimodal foundation models through transfer learning, rather than training from scratch.
- Define and run quantitative validation against measurable accuracy targets.

Critical skills, must have

- Strong Python and deep learning with PyTorch.
- Computer vision for image and video, including segmentation, detection, and recognition.
- Multimodal AI and vision-language models that fuse multiple modalities into one model.
- NLP and LLM fine-tuning with Transformers and Hugging Face.
- Experience adapting and fine-tuning open foundation models through transfer learning.

Good to have

- Video analysis and temporal models, 3D reconstruction, or monocular depth and volume estimation.
- ASR and speech-to-text, for example Whisper.
- Uncertainty quantification and model calibration.
- Applied statistics and data science, including time-series correlation and multiple-comparison control.
- MLOps, including experiment tracking, model serving, and cloud GPU on AWS or GCP.

Requirements

- Residency in an EU country is required.
- A PhD in AI, computer vision, machine learning, or a closely related field is a big plus.
- Able to work independently and own the model research and development end-to-end.

To apply

Briefly describe one multimodal or computer vision project you led, including the problem, the models and frameworks you used, and your specific role. Links to a portfolio, GitHub, or papers are welcome.

Hourly: More than 30 hrs/week
Duration: 1-3 months
Experience Level: Expert
Hourly: $30.00 - $60.00
Remote Job
Project Type: Ongoing project

Contract-to-hire opportunity

This lets talent know that this job could become full time.

Skills and Expertise

Mandatory skills

Natural Language Processing

Activity on this job

Proposals: 15 to 20
Last viewed by client: last week
Interviewing: 10
Invites sent: 30
Unanswered invites: 5

About the client

Member since Dec 1, 2019

Cyprus
Paphos 3:18 AM
$16K total spent
16 hires, 1 active
401 hours
