In an app for scanning invoices, receipts, and other paper documents with text on them, I need to rotate and crop the image of the invoice or receipt.
I'm looking for a CNN to do this, preferably implemented in convnetjs, as I think the network should be fairly lightweight.
A number of receipts will be provided, but they are unlabeled.
This is a classification+localization problem where a dual-head CNN might be appropriate.
That is, the image should be classified as containing a piece of paper or not, and the coordinates of the paper's four corners should be estimated (for cropping).
Based on feedback, I have posted a separate job for the labeling part. I will be providing 500 labeled images. The labels are in the form of bounding boxes in a GDocs document.
Although this might not be enough data to get a high-quality model, I expect to re-train the model at some point in the future when I have more data.
Because the labels are now bounding boxes, the job will not cover skew or rotation, only cropping.
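To make the cropping step concrete, here is a minimal sketch of how a predicted box could be turned into a pixel crop rectangle. It assumes the model outputs the box as normalized (x_min, y_min, x_max, y_max) values in [0, 1]; that output format and the function name `box_to_crop` are my own illustration, not a requirement.

```python
import math

def box_to_crop(box, width, height):
    """Convert a normalized (x_min, y_min, x_max, y_max) box into an
    integer pixel rectangle (left, top, right, bottom).

    Assumption: box values are fractions of the image width/height.
    Values outside [0, 1] are clamped so the crop stays inside the image.
    """
    x0, y0, x1, y1 = (min(max(v, 0.0), 1.0) for v in box)
    # Tolerate boxes whose corners arrive in the wrong order.
    x0, x1 = min(x0, x1), max(x0, x1)
    y0, y1 = min(y0, y1), max(y0, y1)
    # Round outwards so that no part of the paper is cut off.
    left = math.floor(x0 * width)
    top = math.floor(y0 * height)
    right = math.ceil(x1 * width)
    bottom = math.ceil(y1 * height)
    return left, top, right, bottom

crop = box_to_crop((0.25, 0.5, 0.75, 1.0), width=400, height=200)
```

The resulting tuple follows the (left, upper, right, lower) order that, for example, Pillow's `Image.crop` expects.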
Some (minimal) preparation is needed to get the labels into proper form. If you have problems with that, I can help you.
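As a rough idea of the preparation involved, the sketch below normalizes pixel bounding boxes into [0, 1] targets. It assumes the labels have been exported to CSV with the columns `filename, x_min, y_min, x_max, y_max, width, height`; that layout is hypothetical, and the actual document may need a different parser.

```python
import csv
import io

def load_labels(csv_text):
    """Parse pixel bounding boxes from CSV text and normalize them to [0, 1].

    Assumption: the column names below are hypothetical and stand in for
    whatever the exported label document actually contains.
    Returns a dict mapping filename -> [x_min, y_min, x_max, y_max].
    """
    labels = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row["width"]), float(row["height"])
        # Sort each coordinate pair so that min <= max even if swapped.
        x0, x1 = sorted((float(row["x_min"]), float(row["x_max"])))
        y0, y1 = sorted((float(row["y_min"]), float(row["y_max"])))
        box = [x0 / width, y0 / height, x1 / width, y1 / height]
        # Clamp boxes that were drawn slightly outside the image.
        labels[row["filename"]] = [min(max(v, 0.0), 1.0) for v in box]
    return labels

# Made-up example row, for illustration only.
example = (
    "filename,x_min,y_min,x_max,y_max,width,height\n"
    "receipt_001.jpg,120,80,520,880,640,960\n"
)
labels = load_labels(example)
```

Normalized targets stay valid when the images are resized to the network's input size.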
Also, the framework will be TensorFlow, not convnetjs. The model should fit on a mobile phone though (TensorFlow supports this).
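For orientation, the dual-head idea, restricted to bounding boxes, might look roughly like the following in TensorFlow/Keras. This is only a sketch: the 128x128 input size, the layer widths, the output names `has_paper` and `bbox`, and the choice of losses are all assumptions of mine and have not been tuned on the data.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_model(input_size=128):
    """Small CNN with a shared trunk and two heads.

    Assumptions: RGB inputs already scaled to [0, 1]; box targets are
    normalized (x_min, y_min, x_max, y_max) values in [0, 1].
    """
    inputs = tf.keras.Input(shape=(input_size, input_size, 3))
    x = inputs
    # Depthwise-separable convolutions keep the parameter count low.
    for filters in (16, 32, 64, 128):
        x = layers.SeparableConv2D(filters, 3, padding="same",
                                   activation="relu")(x)
        x = layers.MaxPooling2D()(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(64, activation="relu")(x)
    # Head 1: does the image contain a piece of paper?
    has_paper = layers.Dense(1, activation="sigmoid", name="has_paper")(x)
    # Head 2: normalized bounding box of the paper.
    bbox = layers.Dense(4, activation="sigmoid", name="bbox")(x)
    return tf.keras.Model(inputs, {"has_paper": has_paper, "bbox": bbox})

model = build_model()
model.compile(
    optimizer="adam",
    # Note: this plain setup also applies the box loss to images
    # without paper; masking those samples is left out of the sketch.
    loss={"has_paper": "binary_crossentropy", "bbox": "mse"},
)
```

A network of this size should be small enough to convert for on-device use, for example with TensorFlow Lite, but the architecture itself is open to better suggestions.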