We have paper copies of phone books listing approximately 5,000 individuals, and we need them digitized. Each page (which we will scan to PDF) contains approximately 25 names, so there will be around 200 pages to process.
A sample page (with private info blacked out) is pasted. The pages are taken from several different phone books, so the formatting might not always be identical to this page. Notice that although there is usually one student per line, occasionally one entry will span multiple lines, for example if a student has two phone numbers instead of one.
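If you go the OCR route, multi-line entries will need to be stitched back together before the fields are split out. Here is a rough sketch of one way to do that. It assumes (and this is only an assumption, since layouts vary between the phone books) that a continuation line starts with whitespace while a new entry starts at the left margin. The function name `group_entries` and the sample lines are hypothetical placeholders.

```python
def group_entries(lines):
    """Group raw text lines into entries.

    Assumption: a line that starts with whitespace continues the
    previous entry; any other non-blank line starts a new entry.
    """
    entries = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line[0].isspace() and entries:
            # Continuation line, e.g. a second phone number.
            entries[-1].append(line.strip())
        else:
            entries.append([line.strip()])
    return entries


# Hypothetical placeholder lines, not real data.
raw = ["Doe, Jane  555-0100", "    555-0101", "", "Roe, Rick  555-0102"]
grouped = group_entries(raw)
```

Pages that do not follow this indentation rule would need a different test for what counts as a continuation line.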
The deliverable should be an Excel file that contains the information in a standard format (which I will provide), e.g.: Source Phonebook Name, Student Name, Phone#, Phone#2, Street Address, City, State, ZIP, Mother Name, Father Name.
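To make the column layout concrete, here is a minimal sketch that writes rows in that column order. It uses only the column names from the example list above, so the final template may differ. It writes CSV (which Excel opens directly) rather than a native Excel file, purely to keep the sketch free of third-party libraries. The helper `write_rows` and the sample entry are hypothetical.

```python
import csv
import io

# Column names taken from the example list in this posting;
# the final template may differ.
COLUMNS = [
    "Source Phonebook Name", "Student Name", "Phone#", "Phone#2",
    "Street Address", "City", "State", "ZIP", "Mother Name", "Father Name",
]


def write_rows(rows, out):
    """Write dict rows as CSV in the fixed column order; missing fields stay blank."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical placeholder entry, not real data.
sample = [{
    "Source Phonebook Name": "Phonebook A",
    "Student Name": "Jane Doe",
    "Phone#": "555-0100",
    "City": "Springfield",
}]

buf = io.StringIO()
write_rows(sample, buf)
csv_text = buf.getvalue()
```

Leaving missing fields blank (for example, no Phone#2) keeps every row aligned to the same ten columns.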
You are free to use OCR or manual entry, whichever is easier. If you use OCR, it is expected that you will manually check the results to ensure accuracy.
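As an aid to that manual check (not a replacement for it), a simple script can flag rows that look suspicious so they get a closer look. The sketch below assumes 10-digit phone numbers and 5-digit or ZIP+4 codes. Those formats are assumptions, not requirements stated here, and the function name `needs_review` is hypothetical.

```python
import re


def needs_review(row):
    """Return a list of reasons a row deserves a manual look (empty if none)."""
    reasons = []
    if not row.get("Student Name", "").strip():
        reasons.append("missing student name")
    for field in ("Phone#", "Phone#2"):
        value = row.get(field, "").strip()
        digits = re.sub(r"\D", "", value)  # keep digits only
        # Assumption: a valid number has exactly 10 digits.
        if value and len(digits) != 10:
            reasons.append(f"{field} is not 10 digits")
    zip_code = row.get("ZIP", "").strip()
    # Assumption: ZIP is 5 digits, optionally followed by -4 digits.
    if zip_code and not re.fullmatch(r"\d{5}(-\d{4})?", zip_code):
        reasons.append("ZIP looks malformed")
    return reasons
```

A row that passes these checks can still contain OCR errors (a misread name, for instance), so every row still needs to be compared against the scanned page.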