I would like to use a headless browser with JS support to extract data from websites.
Specifically, I would like to use Chrome on CentOS using a Perl wrapper.
Please do not suggest any other technology for this project, as I specifically want to try it with Chrome.
I am looking for a prototype that has the following functionality:
- Use Chrome in headless mode. (Optimize memory usage if possible)
- Invoke the script with a URL (http and https supported) and return the full HTML response.
- Configure options:
  * Use a proxy (format: xxx.xxx.xxx.xxx:pppp)
  * Change the user agent
  * Do not load inline images
  * Return the full HTTP header
  * Delete cookies after ending
- Simple log file that shows the total time and memory usage for each request.
- Install everything on my CentOS server and document how to install, so that I can install on another server myself.
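As a rough illustration of the requirements above (a sketch under stated assumptions, not a finished design), a Perl wrapper could build a Chrome command line and read the rendered page from its output. The sketch assumes the binary is installed as `google-chrome` and that the installed version accepts `--headless` together with `--dump-dom`; this may differ between Chrome releases. The option names (`proxy`, `user_agent`, `no_images`) and the function names are hypothetical. Cookies are discarded by using a throwaway profile directory. `--dump-dom` prints only the rendered DOM, so the HTTP headers and the memory figures are not covered here; headers would probably need the DevTools protocol instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Time::HiRes ();

# Build the Chrome argument list from a hash of options.
# Option names (proxy, user_agent, no_images) are hypothetical.
sub chrome_args {
    my (%opt) = @_;
    my @args = (
        '--headless',
        '--disable-gpu',
        '--dump-dom',                          # print the rendered DOM to stdout
        "--user-data-dir=$opt{profile_dir}",   # throwaway profile holds the cookies
    );
    push @args, "--proxy-server=$opt{proxy}"           if $opt{proxy};
    push @args, "--user-agent=$opt{user_agent}"        if $opt{user_agent};
    push @args, '--blink-settings=imagesEnabled=false' if $opt{no_images};
    push @args, $opt{url};
    return @args;
}

# Run Chrome once and return the HTML plus the elapsed wall-clock time.
sub fetch_html {
    my (%opt) = @_;
    my $chrome  = $opt{chrome} || 'google-chrome';   # assumed binary name
    my $profile = tempdir(CLEANUP => 1);             # directory is removed at program exit
    my @cmd     = ($chrome, chrome_args(%opt, profile_dir => $profile));

    my $start = Time::HiRes::time();
    # List-form pipe open avoids passing the URL through a shell.
    open(my $fh, '-|', @cmd) or die "cannot start $chrome: $!";
    my $html = do { local $/; <$fh> };
    close $fh;
    my $elapsed = Time::HiRes::time() - $start;

    return (defined $html ? $html : '', $elapsed);
}

# Command-line entry point: only runs when a URL argument is given.
if (!caller && @ARGV) {
    my $url = shift @ARGV;
    my ($html, $elapsed) = fetch_html(url => $url, no_images => 1);
    printf STDERR "%s fetched in %.2fs\n", $url, $elapsed;
    print $html;
}
```

Saved under a hypothetical name such as `fetch.pl`, it could be run as `perl fetch.pl https://example.com/ > page.html`, with the timing line going to standard error where it could be appended to a log file.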
I hope this is not an extremely long project, so I would prefer that you complete it within the next few days.
Less than 30 hrs/week
Project length: less than 1 month
I am looking for freelancers with the lowest rates