We use Excel files to synchronize our system with third-party providers.
The structure of each Excel file depends on the type of data and on the provider, so we need a tool that loads the data into our database.
1. Excel files should be loaded from Amazon S3; the format is specified by the file's path in S3
2. Excel files may not be in a table format
3. Data from the files should be extracted and transformed according to the schema of the type
4. Data from the file should be validated for duplicates and missing required values
5. Some values should be translated using lookups against values stored in the system DB
6. Valid data should be inserted into the system DB (MySQL)
7. Errors should be written to a file
8. There should be a console to view and manage process errors
9. The process should be configurable, since new providers are added occasionally and formats may change
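
One possible way to keep such a process configurable is to describe each provider/type combination as data and run a single generic routine over it. The Python sketch below illustrates requirements 1, 3, 4, 5 and 9 under several assumptions that are not part of the requirements: the S3 key layout `<provider>/<data_type>/<file>.xlsx`, the `FieldSpec`, `Schema` and `SCHEMAS` names, and the sample provider `acme` are all hypothetical. Downloading the workbook from S3 and reading its cells (for example with boto3 and openpyxl), inserting into MySQL, and writing the error file are left out; rows are assumed to arrive as dictionaries keyed by the provider's column headers.

```python
from dataclasses import dataclass
from typing import Optional


def parse_s3_key(key):
    """Derive (provider, data_type) from a key such as 'acme/products/file.xlsx'.

    The key layout is an assumption made for this sketch.
    """
    provider, data_type, _filename = key.split("/", 2)
    return provider, data_type


@dataclass
class FieldSpec:
    source: str                    # column header used in the provider's file
    required: bool = False         # reject the row when the value is empty
    lookup: Optional[str] = None   # name of a lookup table loaded from the system DB


@dataclass
class Schema:
    fields: dict        # target column name -> FieldSpec
    unique_key: tuple   # target columns used to detect duplicates


# Hypothetical registry: adding a provider or format means adding an entry here.
SCHEMAS = {
    ("acme", "products"): Schema(
        fields={
            "sku": FieldSpec("Item Code", required=True),
            "name": FieldSpec("Description", required=True),
            "category_id": FieldSpec("Category", lookup="categories"),
        },
        unique_key=("sku",),
    ),
}


def process_rows(rows, schema, lookups):
    """Transform and validate rows; return (valid_records, errors)."""
    valid, errors, seen = [], [], set()
    for row_number, raw in enumerate(rows, start=1):
        record, problems = {}, []
        for target, spec in schema.fields.items():
            value = raw.get(spec.source)
            if value in (None, ""):
                if spec.required:
                    problems.append(f"missing required value: {spec.source}")
                record[target] = None
                continue
            if spec.lookup is not None:
                table = lookups[spec.lookup]
                if value not in table:
                    problems.append(f"unknown {spec.lookup} value: {value!r}")
                    continue
                value = table[value]  # translate to the system's own value
            record[target] = value
        key = tuple(record.get(column) for column in schema.unique_key)
        if not problems and key in seen:
            problems.append(f"duplicate key: {key}")
        if problems:
            errors.append({"row": row_number, "problems": problems})
        else:
            seen.add(key)
            valid.append(record)
    return valid, errors


# Usage with made-up sample data.
provider, data_type = parse_s3_key("acme/products/2024-01.xlsx")
rows = [
    {"Item Code": "A1", "Description": "Widget", "Category": "Tools"},
    {"Item Code": "A1", "Description": "Widget again", "Category": "Tools"},
    {"Item Code": "", "Description": "No code", "Category": "Tools"},
    {"Item Code": "B2", "Description": "Gadget", "Category": "Unknown"},
]
lookups = {"categories": {"Tools": 7}}  # would be loaded from the system DB
valid, errors = process_rows(rows, SCHEMAS[(provider, data_type)], lookups)
```

In this sketch only the first sample row ends up in `valid`; the other three are reported in `errors` with their row numbers, which could then feed the error file and the management console. Files that are not in a plain table layout would need an extra extraction step per format before they reach `process_rows`.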