EPUB to XML conversion library in Python

Closed - This job posting has been filled and work has been completed.
Web, Mobile & Software Dev Scripts & Utilities Posted 2 years ago

Hourly Job

Part Time
1 to 3 months

Details

I'm looking for an experienced Python developer, with an attention to detail. Ideally you know a bit the lxml library, XSLT transforms, and the EPUB format.

The job is to write a small Python library that:
(Part 1) converts an EPUB file to an XML document with a specific syntax (see below).
(Part 2) the opposite, ie. convert an XML document to EPUB.

XML document syntax
======================
I will give you the complete reference for the syntax but essentially it is simply:
<document>
<title>My doc</title>
<chapter>
  <page>
   <title>Introduction</title>
     <section>
       <subsection>
         <text>Lorem ipsum dolor sit amet...</text>
       </subsection>

       [...]

     </section>
  </page>
</chapter>
</document>

For (Part 1), I will provide a small open-source library to read an EPUB file. You will then use lxml to parse the EPUB pages and convert them to XML. The difficulty is to retrieve the document structure (section & subsection) from the EPUB html titles h1..h6.

(Part 2) will probably need an XSLT transformation and an EPUB creation library.


About the Client

(4.99) 29 reviews

Japan
Tokyo 01:21 PM

40 Jobs Posted
78% Hire Rate, 1 Open Job

Over $20,000 Total Spent
38 Hires, 0 Active

$12.11/hr Avg Hourly Rate Paid
1,947 Hour

Member Since May 18, 2009