You will get a Custom Computer Vision Solution for Images & Videos

Ahmed N.

Project details

Computer vision is not just about running a pre-trained model; it’s about building a reliable system that works on your real data.

I will design and develop a custom computer vision solution tailored to your use case, whether it’s image analysis, video processing, object detection, recognition, or tracking. The solution will be built with clean architecture, tested for accuracy, and delivered in a production-ready format.

This service is ideal for:
o) Object detection & tracking
o) Image classification
o) Face recognition & verification
o) Pose or activity recognition
o) Video analysis & monitoring
o) Custom vision-based automation

✅ What you’ll get
o) Problem analysis & solution design
o) Custom computer vision model or pipeline
o) Data preprocessing & augmentation
o) Model training or fine-tuning
o) Inference pipeline for images or videos
o) Evaluation results & performance metrics
o) Clean code with usage instructions
o) API or script-based delivery
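
As a rough illustration of the "Evaluation results & performance metrics" item above, detection quality is commonly summarized with intersection over union (IoU) between a predicted box and a ground-truth box. The snippet below is only a minimal sketch, not the code delivered with this service; the `iou` function name and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for the example.

```python
from typing import Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Return intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp to zero so disjoint boxes produce no intersection area.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)
    ground_truth = (1.0, 1.0, 3.0, 3.0)
    print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold such as 0.5; the threshold used in a real project depends on the use case.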

Machine Learning Tools

Amazon SageMaker, ChatGPT, Google Sheets, GPT-3, Keras, Microsoft Excel, MLflow, NLTK, NumPy, OpenCV, pandas, Python, Python Scikit-Learn, PyTorch, scikit-learn, SciPy, SQL, TensorFlow, Tesseract OCR, Vertex AI, Word2vec, XGBoost

What's included

Service Tiers	Starter $249	Standard $399	Advanced $549
Delivery Time	3 days	5 days	7 days
Number of Revisions	1	2	3
Number of Model Variations	1	2	3
Number of Scenarios	1	1	2
Number of Graphs/Charts	1	2	3
Model Validation/Testing
Model Documentation
Data Source Connectivity
Source Code

About Ahmed

AI Engineer | RAG Systems, Agentic AI, Computer Vision, Automations.

Karachi, Pakistan

I build production AI systems that actually ship — RAG pipelines, agentic workflows, and computer vision solutions used by real users, not just demos.

If you're building an AI product and need an engineer who can take it from prototype to deployed system, I'm likely a fit.

🔧 What I Build

➔ RAG Systems & LLM Pipelines
Production-grade retrieval pipelines with hybrid search (dense + sparse), reranking, citation grounding, and hallucination reduction. Built on LangChain, LlamaIndex, pgvector, Pinecone, and ChromaDB — connected to your actual data, not generic demos.
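
To make "hybrid search (dense + sparse)" concrete, one common way to merge a dense (embedding) ranking with a sparse (keyword) ranking is reciprocal rank fusion (RRF). The listing does not say which fusion method is used, so RRF is an assumption here, as are the `reciprocal_rank_fusion` name, the document IDs, and the constant `k = 60`. Treat this as a minimal sketch rather than the actual pipeline.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Higher fused score means a better position.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical results from a dense retriever and a sparse (keyword) retriever.
    dense_hits = ["a", "b", "c"]
    sparse_hits = ["b", "d", "a"]
    print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

Documents that rank well in both lists rise to the top, which is why a reranking step is usually applied only to the first few fused results.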

➔ Agentic AI & Automations
Multi-step AI agents with tool use, autonomous web research, structured output validation, and workflow orchestration via n8n. Systems that make decisions and execute tasks — not just chatbots that answer questions.

➔ LLM Fine-Tuning
Parameter-efficient fine-tuning using LoRA/QLoRA across multiple model architectures including Gemma, LLaVA, iDefics, and InternVL on custom vision-language datasets. Full pipeline: dataset prep, training, eval, and inference-ready weights.

➔ Computer Vision
Object detection and tracking (YOLO, ByteTrack), pose estimation (MediaPipe), OCR pipelines (Tesseract, EasyOCR, Google Vision API), and multimodal document understanding. Deployed on real video and image data.

➔ Document Intelligence & OCR
Automated extraction pipelines for PDFs, scanned documents, and structured forms. Integrates Google Document AI, GPT-4 Vision, and custom post-processing — reducing manual document handling significantly.

🧰 Core Stack

LangChain · LlamaIndex · OpenAI · Groq · Gemini · Hugging Face
Python · FastAPI · Flask · Next.js · Streamlit
pgvector · Pinecone · ChromaDB · PostgreSQL · MongoDB
AWS · GCP · Docker · n8n · Supabase · Vercel
OpenCV · YOLO · MediaPipe · Tesseract · PyTorch · TensorFlow

📁 Portfolio includes: RAG document pipelines, multimodal LLM fine-tuning, medical Q&A systems, computer vision sports analytics, OCR automation, and full-stack AI applications.

🚀 How I Work

I don't disappear after handing off code. Every project includes:
— Clean, modular, well-documented code you can maintain
— Clear milestones with regular progress updates
— Honest scoping upfront: if something will take longer or cost more than expected, I'll tell you before we start, not after

I work best with clients who have a clear problem and want an engineer who takes ownership of the solution end-to-end.

💼 Project Types I Take On

✅ RAG chatbots and document Q&A systems
✅ Agentic workflows and AI automation pipelines
✅ LLM fine-tuning for domain-specific tasks
✅ Computer vision systems (detection, tracking, OCR)
✅ Full-stack AI applications (Python backend + modern frontend)
✅ n8n / workflow automation with AI integration

🔍 A Few Things I've Built

— Finance app with Urdu voice input, SMS parsing for Pakistani banks, RAG-powered chat over transactions, and zero monthly API cost
— Multimodal LLM fine-tuning across 4 architectures (Gemma, LLaVA, iDefics, InternVL) using LoRA/QLoRA on custom vision-language data
— Medical RAG system with hybrid search, citation grounding, and hallucination reduction over clinical PDFs
— Football analytics pipeline: player tracking, team classification via SigLIP embeddings, pitch homography, Voronoi control maps
— Document extraction API deployed on AWS, integrated into n8n workflow for end-to-end automation with Google Sheets export

If you're building something in this space, message me with a brief description — I'll give you a straight answer on fit, timeline, and approach.

Steps for completing your project

After purchasing the project, send requirements so Ahmed can start the project.

Delivery time starts when Ahmed receives requirements from you.

Ahmed works on your project following the steps below:

o) Requirements Review & Feasibility Analysis
o) Data Preparation & Pipeline Design

Revisions may occur after the delivery date.

Review the work, release payment, and leave feedback for Ahmed.

Starter ($249): Basic Vision Task

Single use case. Images only. Simple detection or classification.

Delivery Time 3 days
Number of Revisions 1
Number of Model Variations 1
Number of Scenarios 1
Number of Graphs/Charts 1
- Model Validation/Testing
- Model Documentation
- Data Source Connectivity
- Source Code

3 days delivery — Jul 3, 2026

Revisions may occur after this date.

Upwork Payment Protection

Fund the project upfront. Ahmed gets paid once you are satisfied with the work.