I'm looking for someone to write the script, set it up, and walk me through in detail what they did in this article:
If you notice, it looks like they did a bit of cleaning of the data as they imported it into Solr (or Elasticsearch, if it can handle this much data)... Actually, I may want to load it into Amazon CloudSearch. Or, if it's cheaper, we could use SolrCloud and something like this: https://aws.amazon.com/marketplace/pp/B008ASMOSK (see the reviews; there could be an issue with disk permissions). I would also love instructions on how I could move the cluster we create over to a dedicated server stack if that's a lot cheaper (and it could be, please advise).
So, it looks like the import script was in Ruby. If you read the article carefully, you'll see what they did.
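Since the article isn't linked here, the exact script will differ, but a typical Ruby import of this kind (light record cleaning, then posting batches to Solr's JSON update handler) looks roughly like the sketch below. The field names and the `clean_record` rules are illustrative assumptions, not the article's actual logic:

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical cleaning step: trim whitespace and drop nil/empty fields.
# The article's actual cleaning rules may differ.
def clean_record(record)
  record.each_with_object({}) do |(key, value), doc|
    next if value.nil?
    value = value.strip if value.is_a?(String)
    doc[key] = value unless value == ""
  end
end

# Solr's /update handler accepts a JSON array of documents.
def solr_update_body(records)
  JSON.generate(records.map { |r| clean_record(r) })
end

# The actual POST needs a live Solr core, so it's sketched here as a comment:
# uri = URI("http://localhost:8983/solr/mycore/update?commit=true")
# Net::HTTP.post(uri, solr_update_body(records),
#                "Content-Type" => "application/json")
```

For a dataset this large you'd post in batches (say, a few thousand documents per request) and commit once at the end rather than per batch.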
Next, I'd love to get pointed to some ways to configure Solr or CloudSearch (which I think is similar to Solr) so I can work on indexing, ranking, and serving queries over this much data, and also learn how I can work with it in a "search engine" format to get started. An admin panel of some sort would be great, or some direction on how to search the data in a web interface.
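For the web-interface part: Solr ships with a built-in admin UI (typically at http://localhost:8983/solr/) where you can browse cores and run queries interactively. Programmatically, searches go through the `/select` handler. Here's a minimal sketch of building a select URL; the core name and query field are placeholders for whatever schema you end up with (CloudSearch has its own, different query API):

```ruby
require "uri"

# Build a Solr /select query URL. The core name and query syntax
# are placeholders; adjust to your schema.
def solr_query_url(host, core, q, rows: 10)
  params = URI.encode_www_form(q: q, rows: rows, wt: "json")
  "http://#{host}:8983/solr/#{core}/select?#{params}"
end

# Fetching and parsing the results would then be:
# require "net/http"; require "json"
# JSON.parse(Net::HTTP.get(URI(solr_query_url("localhost", "mycore", "title:ruby"))))
```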
Please let me know: #1) your experience in doing something like this, #2) how long it would take you, #3) what you would do, and #4) how much it would cost.