We have recently won a grant to develop our new product, Pikhaya, which uses open data to provide free market intelligence to help entrepreneurs find viable business premises.
There are 350 local authorities in England and Wales, of which 70 currently publish compliant data we need to include in our model.
Each quarter, we will need to extract data from their websites, transform it into a common, machine-readable (CSV) format, and then upload it into our database.
This is the first time we are running this process, and we aim to hire three different freelancers for it. After the first round of ETL (extract, transform, load), the most accurate freelancer will then work with us on a regular basis to produce the data we require.
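To give a rough idea of the extract, transform and upload steps, here is a minimal sketch in Python. It is an illustration only, not our actual tooling: the sample rows, the column names, the `premises` table and the in-memory SQLite database are all assumptions made for the example, and a real run would read the downloaded council file instead of the `RAW` string.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a downloaded council file.
RAW = """Property Ref,Address,Rateable Value,Status
1001,"1 High St, Chorley",12500,Empty
1002,"2 Market St, Chorley",8000,Occupied
"""


def extract(text):
    """Parse raw CSV text into a list of dicts keyed by the source headers."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, authority):
    """Map source columns onto an assumed common schema."""
    return [
        {
            "authority": authority,
            "property_ref": row["Property Ref"].strip(),
            "address": row["Address"].strip(),
            "rateable_value": int(row["Rateable Value"]),
            "status": row["Status"].strip().lower(),
        }
        for row in rows
    ]


def load(conn, rows):
    """Insert the standardised rows into an assumed 'premises' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS premises ("
        "authority TEXT, property_ref TEXT, address TEXT, "
        "rateable_value INTEGER, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO premises VALUES "
        "(:authority, :property_ref, :address, :rateable_value, :status)",
        rows,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the real database
load(conn, transform(extract(RAW), "Chorley"))
empty_count = conn.execute(
    "SELECT COUNT(*) FROM premises WHERE status = 'empty'"
).fetchone()[0]
```

Keeping the three steps as separate functions means the same `load` step can be reused while `extract` and `transform` are adapted to each authority's file.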
Scope of work:
70 current local authorities, most with only one dataset (usually Excel or CSV) to download and transform. One example is Chorley (http://chorley.gov.uk/Pages/AtoZ/Information.aspx). We are specifically interested in data on empty business premises, which is not always available (sometimes only ‘all’ premises are published, sometimes only ‘occupied’). On the example page, you’ll need to click on Freedom of Information and then the file name: ‘Chorley - Ratepayer account data February 16.csv’.
You will need to be comfortable transforming messy, inconsistently structured tabular data into a standardised, machine-readable format with a high level of data accuracy.
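As an illustration of the kind of inconsistency involved, the sketch below maps differently worded column headers onto one set of names and cleans up currency values. The alias lists and canonical names are hypothetical, chosen for the example rather than taken from any authority's file, so treat it as one possible approach and not a specification.

```python
import re

# Hypothetical header variants; real files will need their own alias lists.
HEADER_ALIASES = {
    "property_ref": {"property ref", "prop ref", "property reference"},
    "address": {"address", "property address", "full address"},
    "rateable_value": {"rateable value", "rv", "current rateable value"},
}


def normalise_header(header):
    """Return the canonical name for a source header, or None if unknown."""
    # Lower-case and collapse punctuation, underscores and spacing.
    key = re.sub(r"[^a-z0-9]+", " ", header.lower()).strip()
    for canonical, aliases in HEADER_ALIASES.items():
        if key in aliases:
            return canonical
    return None


def parse_money(value):
    """Convert strings such as '£12,500.00' to whole pounds; blank gives None."""
    cleaned = re.sub(r"[£,\s]", "", value)
    return int(float(cleaned)) if cleaned else None
```

Returning `None` for an unrecognised header, instead of guessing, makes unexpected columns easy to flag for manual review.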
We’ll provide you with the details in Google Sheets for each local authority site from which you need to download data. We are also in the process of filing Freedom of Information requests with the other 280 local authorities (so we may have more shortly).
Please respond with:
1 The time you think it will take you (we’d like it completed by early March);
2 Examples of work which reflect the approach you would take with our brief (i.e. something relatively similar);
3 How you will ensure data quality;
4 An estimated price for the full job (although you can bill by the hour).