AI Engineer for AI Answer-Sheet Evaluation Product (Milestone-Based)

Posted 6 days ago

Worldwide

Summary

I’m building an AI product for educational institutions that helps evaluate descriptive / handwritten answer sheets faster and more consistently. The product is intended to support a teacher / evaluator workflow where they can:

- Create an exam / evaluation session
- Upload: question paper, marking scheme / rubric, scanned student answer sheets
- Receive AI-generated outputs for each answer sheet, including: extracted answers question-by-question, suggested marks, grading justifications, confidence flags, inline annotation suggestions / comments tied to relevant answer content
- Review the answer sheet in a teacher-facing UI
- Edit marks and comments
- Add / edit inline annotations directly on the answer sheet
- Finalize marks and export results

I’m looking for an AI Engineer who can own the AI pipeline for answer-sheet understanding, grading, and inline annotation generation for the product. There will be a separate full-stack engineer building the UI layer. Your role is to build the AI backend layer that powers that product experience.

You will own the AI intelligence layer of the product, including:

- Answer-sheet ingestion and OCR pipeline
- Parsing and structuring uploaded answer sheets
- Question-wise answer segmentation / mapping
- Rubric-aware grading / scoring suggestions
- Grading justifications and confidence signals
- Inline annotation / feedback generation tied to relevant answer content
- Creating evaluation datasets and quality measurement for the AI outputs
- API / structured outputs that the full-stack app can consume reliably
- Iterative quality improvements of the AI pipeline

What the AI system needs to do

Given a question paper, a marking scheme / rubric, and scanned student answer sheets, the AI pipeline should produce structured outputs that support the teacher review workflow.
At a high level, the pipeline should do the following:

- Extract text / content from uploaded answer sheets
- Identify which answer content belongs to which question / sub-question
- Generate question-wise structured answer outputs
- Score each answer against the input rubric / marking scheme
- Generate short grading justifications and confidence signals
- Generate inline annotation suggestions tied to relevant answer content
- Return all of this in a structured format that the web app can display

Milestone-based engagement structure

Milestone 1 Deliverables (by Week 2)

- Create a sample evaluation dataset for testing
- Working first-pass AI pipeline on the sample dataset
- Structured segmentation outputs with extracted answer text, question number, page references
- For segmented answers, generate structured grading outputs
- Structured annotation outputs: typo / spelling comments, missing-point comments, general feedback comments
- AI-to-app integration contract / API output format
- Initial quality baseline for segmentation, grading and annotation quality

Milestone 2 Deliverables (by Week 4)

- Improve handling for multi-page answers, sub-questions / nested numbering where relevant, noisy OCR / imperfect scans
- High grading quality: score consistency, justification quality, rubric alignment
- Good annotation quality for inline review of answers
- Cleaner confidence / failure outputs for the app
- Stable evaluation framework with reliable metrics and result generation

Milestone 3 Deliverables (by Week 6)

- Fully ready AI pipeline for the agreed scope
- Final improvements to grading / segmentation / annotation quality
- Stable structured outputs for the app
- Fully stable integration of AI pipeline with the application and UX layer
- Fully working and automated evaluation framework
- Comprehensive quality evaluation results output and summary

Availability and collaboration requirements

This product will be built in a fast iteration loop with multiple contributors, so there is a requirement to be collaborative and responsive during the project.

- Be available on chat for quick questions, clarifications, and design / implementation discussions
- Attend short daily standups / check-ins to share progress, blockers, and next steps
- Collaborate effectively with other engineers working on adjacent parts of the product and actively manage dependencies across the AI and application layers
- Participate in code reviews for relevant changes made by other engineers so the overall MVP stays coherent and integrated
- Handle open-ended product / technical problems as they come up during the project
- Be comfortable balancing speed with enough structure and quality for a real product
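
To make the "AI-to-app integration contract / API output format" item from Milestone 1 a little more concrete, here is a minimal Python sketch of what one question-level output record could look like. Every class, field name, and value below (`QuestionResult`, `Annotation`, `needs_review`, the 0.7 threshold, the sample answer) is a hypothetical placeholder for illustration only, not an agreed format; defining the actual contract is part of the Milestone 1 work.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class Annotation:
    """One inline comment tied to a span of the extracted answer (hypothetical shape)."""
    kind: str        # e.g. "spelling", "missing_point", "general"
    comment: str
    page: int        # page of the scanned sheet the comment refers to
    start_char: int  # character offsets into extracted_text
    end_char: int


@dataclass
class QuestionResult:
    """AI output for one question on one answer sheet (hypothetical shape)."""
    question_number: str   # e.g. "1(a)" for a sub-question
    pages: List[int]       # pages the answer spans
    extracted_text: str
    suggested_marks: float
    max_marks: float
    justification: str
    confidence: float      # assumed to be in the range 0.0 to 1.0
    annotations: List[Annotation] = field(default_factory=list)

    def needs_review(self, threshold: float = 0.7) -> bool:
        # Flag low-confidence results so the teacher-facing UI can highlight them.
        return self.confidence < threshold


def to_json(results: List[QuestionResult]) -> str:
    """Serialize results into a JSON payload the web app could consume."""
    return json.dumps([asdict(r) for r in results], indent=2)


# Made-up sample record, purely to show the shape of the payload.
result = QuestionResult(
    question_number="1(a)",
    pages=[1],
    extracted_text="Photosynthesis occurs in the chloroplast.",
    suggested_marks=3.0,
    max_marks=5.0,
    justification="Names the site but omits the light reaction.",
    confidence=0.62,
    annotations=[
        Annotation("missing_point", "Light reaction not mentioned.", 1, 0, 41)
    ],
)
payload = to_json([result])
print(payload)
```

Keeping the suggested marks, justification, confidence, and annotations in one record per question is just one possible design; it would let the full-stack app render a question-by-question review screen from a single payload.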

$1,000.00 (Fixed-price)
Experience Level: Intermediate
Remote Job
Project Type: Ongoing project

Skills and Expertise

Mandatory skills

Generative AI

AI Evaluation

Activity on this job

Proposals: 10 to 15
Last viewed by client: 4 days ago
Hires: 1
Interviewing: 1
Invites sent: 0
Unanswered invites: 0

About the client

Member since May 31, 2021

India
Mumbai (7:07 PM local time)
$200 total spent
3 hires, 2 active
