Hourly - Intermediate ($$) - Est. Time: Less than 1 month, 10-30 hrs/week - Posted
I need an OCR solution to convert the domain names that are published as images at the following link into text: https://www.afnic.fr/en/products-and-services/services/daily-list-of-registered-domain-names/ Please let me know: 1. How accurate the results can be (I need close to 100% accuracy). 2. What language stack you are going to use for this.
Skills: OCR Tesseract OCR algorithms
Hourly - Intermediate ($$) - Est. Time: 3 to 6 months, Less than 10 hrs/week - Posted
Hello, I am looking for a solution that will analyze a JPEG image file, extract the text, and dump it into a database in unstructured form. The next step would be to analyze that unstructured data and pull out the key pieces of information to create a structured database with specific fields. USE CASE: The idea is to take the daily image files put out by the county, which contain information such as deeds, debts, and liens, scan those images, extract all of the text within each document, and then add certain pieces of information to a database. For example, suppose an image file contains a notice of a mortgage for a property. This image would include pieces of information such as the "folio number" (the county's unique identifier for the property), the mortgage amount, and possibly the terms of the mortgage. I would like the solution to extract that data and load it into a database table, so that I can then link it with another table about the property. I was thinking of using Apache Tika to extract the data and then Pig to parse it. However, if you are an expert in this, you may have a better way.
Skills: OCR Tesseract Machine learning OCR algorithms OpenCV
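As a rough illustration of the two-step pipeline described in this posting (store the raw extracted text, then pull key fields into a structured table), here is a minimal Python sketch. It assumes the OCR step has already produced plain text. The table names, the regular expressions, and the sample folio format are hypothetical and would need adjusting to the county's actual documents; SQLite stands in for whichever database is eventually chosen.

```python
import re
import sqlite3

# Hypothetical patterns: real county documents will need their own formats.
FOLIO_RE = re.compile(
    r"folio\s*(?:number|no\.?|#)?\s*[:\-]?\s*([\d\-]+)", re.IGNORECASE
)
AMOUNT_RE = re.compile(r"\$\s*([\d,]+(?:\.\d{2})?)")


def extract_fields(text):
    """Pull the folio number and the first dollar amount out of OCR text."""
    folio = FOLIO_RE.search(text)
    amount = AMOUNT_RE.search(text)
    return {
        "folio_number": folio.group(1) if folio else None,
        "mortgage_amount": (
            float(amount.group(1).replace(",", "")) if amount else None
        ),
    }


def store_document(conn, source_file, text):
    """Save the raw text, then save the parsed fields linked to it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_documents ("
        "id INTEGER PRIMARY KEY, source_file TEXT, raw_text TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mortgages ("
        "document_id INTEGER REFERENCES raw_documents(id), "
        "folio_number TEXT, mortgage_amount REAL)"
    )
    # Step 1: unstructured dump of everything the OCR produced.
    cur = conn.execute(
        "INSERT INTO raw_documents (source_file, raw_text) VALUES (?, ?)",
        (source_file, text),
    )
    doc_id = cur.lastrowid
    # Step 2: structured row built from the parsed fields.
    fields = extract_fields(text)
    conn.execute(
        "INSERT INTO mortgages VALUES (?, ?, ?)",
        (doc_id, fields["folio_number"], fields["mortgage_amount"]),
    )
    conn.commit()
    return doc_id
```

Keeping the raw text in its own table means the parsing rules can be revised and re-run later without repeating the OCR step, and the folio number column gives a natural key for joining against a separate property table.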
Hourly - Intermediate ($$) - Est. Time: 3 to 6 months, Less than 10 hrs/week - Posted
I'm looking for someone with experience in performing Optical Character Recognition (OCR) on scanned PDFs. I have many thousands of scanned PDFs whose text I need for an internal project. The scanned PDFs contain blocks of text as well as tables that require OCR. The nature of this project requires the OCR to be as close to 100% accurate as possible. Any technology is acceptable (Tesseract, ABBYY, etc.) as long as that accuracy target is met. I will provide the files in PDF format via Dropbox, and the deliverable should be in .txt format (no formatting other than line breaks required). Proficiency in English (written and spoken) is a must-have requirement for this job, as is the ability to communicate status updates and issues. There is a short-term need to digitize 2,000 files, with potential for follow-on work of up to 500 files a month thereafter.
Skills: OCR Tesseract Adobe PDF Computer vision English Spelling
Hourly - Expert ($$$) - Est. Time: More than 6 months, Less than 10 hrs/week - Posted
Hello, I am looking to expand my OCR team to take on additional tasks and OCR problems. I am looking specifically for someone with experience in: 1. Tesseract. 2. OpenCV. 3. MATLAB. This is a long-term opportunity. I look forward to hearing from you.
Skills: OCR Tesseract MATLAB OpenCV