PDF Data Extraction, Anki SRS, Excel, Java, Visual Basic, HTML

Closed - This job posting has been filled and work has been completed.
Web, Mobile & Software Dev - Scripts & Utilities - Posted 3 years ago


Hours to be determined
More than 6 months


This project has two parts. Preference will go to someone who can complete both parts, for continuity's sake.
I am also interested in help with the other small tasks listed at the end of this posting.

Specify whether you would like to work on:
-Part 1
-Part 1+2
-Part 1+2 and Other

Part 1
I have spent some time trying to extract captions from PDF files using regular expressions and the extraction tools in Acrobat X, but I do not have a firm enough grasp of PDF structure to extract the text consistently. I have several medical textbooks from which I would like to extract the images and their corresponding captions, and then create an Excel or CSV file containing the captions and image file names for my personal study. We would start with a single 900-page textbook and work forward from there.
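To make the Part 1 deliverable concrete, here is a minimal sketch of the caption-matching and CSV step only. It assumes the page text has already been pulled out of the PDF by some other tool, that captions start with a label such as "Figure 12-3" or "Fig. 4.1", and that extracted images are saved under a name derived from that label. The pattern, the function names, and the `fig_{label}.png` naming scheme are all hypothetical; real textbooks will likely need their own patterns.

```python
import csv
import re

# Hypothetical caption pattern: a label such as "Figure 12-3" or "Fig. 4.1"
# at the start of a line, then caption text up to a blank line or end of text.
CAPTION_RE = re.compile(
    r"^(?:Figure|Fig\.)\s+(\d+(?:[-.]\d+)*)[.:]?\s+(.+?)(?=\n\s*\n|\Z)",
    re.MULTILINE | re.DOTALL,
)


def find_captions(page_text):
    """Return (figure_label, caption) pairs found in already-extracted text."""
    pairs = []
    for match in CAPTION_RE.finditer(page_text):
        label = match.group(1)
        # Collapse line breaks and repeated spaces inside the caption.
        caption = " ".join(match.group(2).split())
        pairs.append((label, caption))
    return pairs


def caption_rows(pairs, image_name="fig_{label}.png"):
    """Build CSV rows; the image naming scheme is an assumption."""
    rows = [["image_file", "caption"]]
    for label, caption in pairs:
        rows.append([image_name.format(label=label), caption])
    return rows


def write_csv(rows, csv_path):
    """Write the rows to a CSV file that Excel can open."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
```

The hard part, getting clean text and images out of the PDF in the first place, is deliberately left out of this sketch, since that is the part that depends on PDF structure.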

Part 2
Search and export data/image pairs by phrase (Excel VB macro, Java, etc.)
I would like to be able to type a phrase, such as 'Penicillin', and generate a CSV file with the following structure:
{Penicillin, img1file, img1caption, img2file, img2caption, etc.}
The CSV will be generated from the initial database built in Part 1 and imported into a flashcard program (Anki).
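As an illustration of the Part 2 output, the sketch below builds one row in the structure shown above. It assumes the initial database is a CSV with `image_file` and `caption` columns and that a phrase "matches" when it appears anywhere in a caption, ignoring case. Those column names, the matching rule, and the function names are assumptions for illustration, not requirements.

```python
import csv


def load_pairs(csv_path):
    """Read (image_file, caption) pairs; the column names are assumed."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return [(row["image_file"], row["caption"])
                for row in csv.DictReader(handle)]


def build_card_row(phrase, pairs):
    """Return [phrase, img1file, img1caption, img2file, img2caption, ...]."""
    needle = phrase.casefold()
    row = [phrase]
    for image_file, caption in pairs:
        # Case-insensitive substring match against the caption text.
        if needle in caption.casefold():
            row.extend([image_file, caption])
    return row


def export_phrase(phrase, pairs, out_path):
    """Write the single result row for one phrase and return it."""
    row = build_card_row(phrase, pairs)
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow(row)
    return row
```

For example, `export_phrase("Penicillin", load_pairs("captions.csv"), "penicillin.csv")` would write one row, where both file names are placeholders. Rows will differ in length depending on how many captions match, so the import into Anki may need padding or a fixed maximum number of images per card.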

Other small projects I am interested in working on would involve skills such as:
Anki 2.0 SRS template creation
Website/HTML data extraction
SDK development

Skills: pdf, import
