I need an application (the programming language is irrelevant, but preferably Java) that will do OCR for text on colored images. The task is somewhat similar to CAPTCHA recognition, as the text background is always colored and the images are usually small. I’m open to buying an existing solution you have or to having a new one developed from scratch.
The app should be able to:
1) parse a CSV file with links to various pictures and animated GIFs (formats to be supported are JPG, PNG, GIF, animated GIF)
2) for each picture: download it and recognize the text on it
3) provide an export file with the following structure:
image link | text on the image
ad.co/a.png | omg! I love this!
af.co/b.gif | When I was your age, phones did not exist, son.
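For reference, the three steps above could be wired together roughly as follows. This is only a sketch under several assumptions that are not part of the requirements: the CSV has one link per row in its first column and no header row, links without a scheme are fetched over plain HTTP, and for an animated GIF the frame producing the longest text is taken as the caption. The `TextRecognizer` interface and all class and method names are hypothetical; the actual recognition engine is left as a placeholder. Note that `ImageReader.read(i)` returns raw GIF frames, so GIFs that rely on frame compositing may need extra handling.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ImageTextExport {

    /** Hypothetical hook: the actual recognition engine would implement this. */
    interface TextRecognizer {
        String recognize(BufferedImage frame);
    }

    /** Assumes one link per row in the first CSV column, with no header row. */
    static List<String> parseLinks(List<String> csvLines) {
        List<String> links = new ArrayList<>();
        for (String line : csvLines) {
            String link = line.split(",", 2)[0].trim();
            if (!link.isEmpty()) links.add(link);
        }
        return links;
    }

    /** Decodes every frame: one for JPG/PNG, several for an animated GIF. */
    static List<BufferedImage> readFrames(InputStream in) throws IOException {
        List<BufferedImage> frames = new ArrayList<>();
        try (ImageInputStream stream = ImageIO.createImageInputStream(in)) {
            if (stream == null) throw new IOException("Cannot open image stream");
            Iterator<ImageReader> readers = ImageIO.getImageReaders(stream);
            if (!readers.hasNext()) throw new IOException("Unsupported image format");
            ImageReader reader = readers.next();
            try {
                reader.setInput(stream);
                int count = reader.getNumImages(true); // scans the file to count frames
                for (int i = 0; i < count; i++) frames.add(reader.read(i));
            } finally {
                reader.dispose();
            }
        }
        return frames;
    }

    /** Assumption: the frame yielding the longest text carries the caption. */
    static String recognizeAll(List<BufferedImage> frames, TextRecognizer ocr) {
        String best = "";
        for (BufferedImage frame : frames) {
            String text = ocr.recognize(frame);
            if (text != null && text.trim().length() > best.length()) best = text.trim();
        }
        return best;
    }

    /** Builds one export row in the "image link | text on the image" layout. */
    static String formatRow(String link, String text) {
        return link + " | " + text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) throws IOException {
        TextRecognizer ocr = frame -> ""; // placeholder: no real OCR happens here
        List<String> links = parseLinks(
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8));
        try (PrintWriter out = new PrintWriter(
                Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8))) {
            out.println("image link | text on the image");
            for (String link : links) {
                // Assumption: links without a scheme (as in the sample rows) are plain HTTP.
                String url = link.contains("://") ? link : "http://" + link;
                try (InputStream in = URI.create(url).toURL().openStream()) {
                    out.println(formatRow(link, recognizeAll(readFrames(in), ocr)));
                } catch (IOException | IllegalArgumentException e) {
                    out.println(formatRow(link, "")); // keep the row even if the download fails
                }
            }
        }
    }
}
```

It would be run as `java ImageTextExport links.csv export.txt`, where both file names are illustrative.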
Here are examples of the pictures and GIFs you'd be dealing with:
We already have a large set of images with the text on them recognized (~10,000 images), and we can produce a much larger set (~100,000) within a week if you need it for training.
We have tried using Tesseract (without any modifications) and got horrible results (<20% accuracy), so your solution shouldn’t rely on Tesseract.
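To make an accuracy figure like the one above comparable between solutions, a candidate can be scored against the labeled set. The sketch below assumes exact-match scoring per image, ignoring case and repeated whitespace; that definition is an assumption made for illustration, not necessarily how the figure above was measured, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class AccuracyCheck {

    /** Assumption: case and runs of whitespace do not count as errors. */
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    /** Fraction of labeled images whose predicted text matches the label exactly. */
    static double exactMatchAccuracy(Map<String, String> labels,
                                     Map<String, String> predictions) {
        if (labels.isEmpty()) return 0.0;
        int correct = 0;
        for (Map.Entry<String, String> e : labels.entrySet()) {
            // A missing prediction is treated as empty text, i.e. a miss.
            String predicted = predictions.getOrDefault(e.getKey(), "");
            if (normalize(e.getValue()).equals(normalize(predicted))) correct++;
        }
        return (double) correct / labels.size();
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("ad.co/a.png", "omg! I love this!");
        labels.put("af.co/b.gif", "When I was your age, phones did not exist, son.");

        Map<String, String> predictions = new LinkedHashMap<>();
        predictions.put("ad.co/a.png", "OMG!  I love this!");   // matches after normalization
        predictions.put("af.co/b.gif", "When l was your age");  // truncated, counts as a miss

        System.out.println(exactMatchAccuracy(labels, predictions)); // prints 0.5
    }
}
```

A softer metric such as per-character edit distance may be more informative for near misses; exact match is used here only because it is the simplest to state.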
When you bid on this task, please include the following:
1) tell me if you have a ready solution or if you are going to be developing a new one
2) (if a ready solution is not available) give your best estimate of how long it'll take to develop a workable prototype
3) tell us what language the solution is written in (C, C++, Java, Python, etc.)
Bidders with a ready solution in Java are obviously preferred.