AI Engineer for AI Answer-Sheet Evaluation Product (Milestone-Based)

Posted 6 days ago

Worldwide

Summary

I’m building an AI product for educational institutions that helps evaluate descriptive / handwritten answer sheets faster and more consistently. The product is intended to support a teacher / evaluator workflow where they can:

- Create an exam / evaluation session
- Upload: question paper, marking scheme / rubric, scanned student answer sheets
- Receive AI-generated outputs for each answer sheet, including: extracted answers question-by-question, suggested marks, grading justifications, confidence flags, inline annotation suggestions / comments tied to relevant answer content
- Review the answer sheet in a teacher-facing UI
- Edit marks and comments
- Add / edit inline annotations directly on the answer sheet
- Finalize marks and export results

I’m looking for an AI Engineer who can own the AI pipeline for answer-sheet understanding, grading, and inline annotation generation for the product. There will be a separate full-stack engineer building the UI layer. Your role is to build the AI backend layer that powers that product experience.

You will own the AI intelligence layer of the product, including:

- Answer-sheet ingestion and OCR pipeline
- Parsing and structuring uploaded answer sheets
- Question-wise answer segmentation / mapping
- Rubric-aware grading / scoring suggestions
- Grading justifications and confidence signals
- Inline annotation / feedback generation tied to relevant answer content
- Creating evaluation datasets and quality measurement for the AI outputs
- API / structured outputs that the full-stack app can consume reliably
- Iterative quality improvements of the AI pipeline

What the AI system needs to do

Given a question paper, a marking scheme / rubric, and scanned student answer sheets, the AI pipeline should produce structured outputs that support the teacher review workflow.
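As a rough illustration only, a per-answer output record might look something like the Python sketch below. This is not a fixed contract: the actual AI-to-app format is one of the deliverables of this engagement, and every class name, field name, and validation rule here is a hypothetical assumption made for the example.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Annotation:
    """Hypothetical inline comment tied to a span of the extracted answer text."""
    kind: str        # assumed categories, e.g. "spelling", "missing_point", "general"
    comment: str
    page: int        # 1-based page of the scanned sheet
    start_char: int  # offsets into GradedAnswer.extracted_text
    end_char: int


@dataclass
class GradedAnswer:
    """Hypothetical record for one question on one student's answer sheet."""
    question_id: str
    pages: list[int]
    extracted_text: str
    suggested_marks: float
    max_marks: float
    justification: str
    confidence: float  # assumed scale 0.0 to 1.0; low values flag closer review
    annotations: list[Annotation] = field(default_factory=list)

    def __post_init__(self):
        # Reject records the review UI could not display sensibly.
        if not 0 <= self.suggested_marks <= self.max_marks:
            raise ValueError("suggested_marks must be within [0, max_marks]")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be within [0, 1]")
        for a in self.annotations:
            if not 0 <= a.start_char < a.end_char <= len(self.extracted_text):
                raise ValueError("annotation span falls outside extracted_text")


def to_json(answer: GradedAnswer) -> str:
    """Serialize a record to JSON for the web app to consume."""
    return json.dumps(asdict(answer), indent=2)


sample = GradedAnswer(
    question_id="2(b)",
    pages=[3, 4],
    extracted_text="Photosynthesis occurs in the chloroplast.",
    suggested_marks=3.0,
    max_marks=5.0,
    justification="Names the organelle but omits the light-dependent stage.",
    confidence=0.72,
    annotations=[
        Annotation(
            kind="missing_point",
            comment="Mention the light-dependent reactions.",
            page=3,
            start_char=0,
            end_char=14,
        )
    ],
)
print(to_json(sample))
```

Validating at construction time is one possible design choice: it keeps malformed marks or annotation spans from reaching the teacher-facing UI, at the cost of the pipeline having to handle the raised errors.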
At a high level, the pipeline should do the following:

- Extract text / content from uploaded answer sheets
- Identify which answer content belongs to which question / sub-question
- Generate question-wise structured answer outputs
- Score each answer against the input rubric / marking scheme
- Generate short grading justifications and confidence signals
- Generate inline annotation suggestions tied to relevant answer content
- Return all of this in a structured format that the web app can display

Milestone-based engagement structure

Milestone 1 Deliverables (By Week 2)

- Sample evaluation dataset for testing
- Working first-pass AI pipeline on the sample dataset
- Structured segmentation outputs with extracted answer text, question number, page references
- Structured grading outputs for the segmented answers
- Structured annotation outputs: typo / spelling comments, missing-point comments, general feedback comments
- AI-to-app integration contract / API output format
- Initial quality baseline for segmentation, grading and annotation quality

Milestone 2 Deliverables (By Week 4)

- Improved handling for multi-page answers, sub-questions / nested numbering where relevant, noisy OCR / imperfect scans
- High grading quality: score consistency, justification quality, rubric alignment
- Good annotation quality for inline review of answers
- Cleaner confidence / failure outputs for the app
- Stable evaluation framework with reliable metrics and result generation

Milestone 3 Deliverables (By Week 6)

- Fully ready AI pipeline for the agreed scope
- Final improvements to grading / segmentation / annotation quality
- Stable structured outputs for the app
- Fully stable integration of AI pipeline with the application and UX layer
- Fully working and automated evaluation framework
- Comprehensive quality evaluation results output and summary

Availability and collaboration requirements

This product will be built in a fast iteration loop with multiple contributors, so you will need to be collaborative and responsive throughout the project:

- Be available on chat for quick questions, clarifications, and design / implementation discussions
- Attend short daily standups / check-ins to share progress, blockers, and next steps
- Collaborate effectively with other engineers working on adjacent parts of the product and actively manage dependencies across the AI and application layers
- Participate in code reviews for relevant changes made by other engineers so the overall MVP stays coherent and integrated
- Handle open-ended product / technical problems as they come up during the project
- Be comfortable balancing speed with enough structure and quality for a real product
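For the grading quality baseline and evaluation framework named in the milestones, one simple starting point could be to compare AI-suggested marks against teacher-assigned marks on the sample dataset. The sketch below is illustrative only: the function name, the metric choices, and the 0.5-mark tolerance are assumptions for the example, and the real metrics would be agreed during the project.

```python
def grading_agreement(ai_marks, teacher_marks, tolerance=0.5):
    """Compare AI-suggested marks with teacher marks for the same answers.

    Returns mean absolute error, the share of exact matches, and the share
    of answers where the two marks differ by at most `tolerance`.
    """
    if len(ai_marks) != len(teacher_marks):
        raise ValueError("mark lists must have the same length")
    if not ai_marks:
        raise ValueError("at least one graded answer is required")

    diffs = [abs(a - t) for a, t in zip(ai_marks, teacher_marks)]
    n = len(diffs)
    return {
        "mean_abs_error": sum(diffs) / n,
        "exact_match_rate": sum(d == 0 for d in diffs) / n,
        "within_tolerance_rate": sum(d <= tolerance for d in diffs) / n,
    }


# Made-up marks for four answers, purely to show the call shape.
report = grading_agreement([4, 3.5, 2, 5], [4, 3, 3, 5])
print(report)
```

Segmentation and annotation quality would need their own measures; this covers only the score-agreement part.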

  • $1,000.00
    Fixed-price
  • Intermediate
    Experience Level
  • Remote Job
  • Ongoing project
    Project Type
Skills and Expertise
Mandatory skills
Generative AI
AI Evaluation
Activity on this job
  • Proposals: 10 to 15
  • Last viewed by client: 4 days ago
  • Hires: 1
  • Interviewing: 1
  • Invites sent: 0
  • Unanswered invites: 0
About the client
Member since May 31, 2021
  • India
    Mumbai 7:07 PM
  • $200 total spent
    3 hires, 2 active
