The general idea is to run a crawler against previously added domains, collect all external links, and save them to a MySQL database. The process must be repeatable for existing or newly added domains on a daily or weekly basis. The crawler needs a simple management panel, built on the xcrud framework, for viewing all results and managing crawler options.
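As a rough illustration of the link-collection step, here is a minimal Python sketch that uses only the standard library. The names `LinkCollector`, `normalize_host`, `host_of`, `external_links` and the `watch_list` parameter are hypothetical and not part of the brief. It assumes a link is external when its host differs from the crawled page's host (ignoring a leading "www."), so subdomains of the crawled site count as external. Fetching pages and writing to MySQL are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalize_host(host):
    """Lower-case a host name and drop a leading 'www.'."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def host_of(url):
    """Normalized host of a URL, or '' if the URL has none."""
    return normalize_host(urlparse(url).hostname or "")


def external_links(page_url, html, watch_list=None):
    """Return external http(s) links found in html, in page order.

    watch_list (assumption): optional list of domain names; when given,
    only links to those domains or their subdomains are kept.
    """
    collector = LinkCollector()
    collector.feed(html)
    own_host = host_of(page_url)
    watched = None
    if watch_list is not None:
        watched = [normalize_host(d) for d in watch_list]
    found = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlparse(absolute).scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, etc.
        host = host_of(absolute)
        if not host or host == own_host:
            continue  # internal link
        if watched is not None and not any(
            host == d or host.endswith("." + d) for d in watched
        ):
            continue  # not on the watch list
        if absolute not in found:
            found.append(absolute)  # de-duplicate, keep first occurrence
    return found
```

Calling `external_links(page_url, html)` collects every external link, while passing `watch_list=["partner.org"]` keeps only links that point at that domain, which corresponds to the two collection modes described in the requirements.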
Simple management panel to view and control the crawling process
The crawler must collect external links, with an option to collect either all external links or only links that contain domain names from a previously created list
Save all links to a MySQL database
Detect when a site is not responding and save that event to the database
Add / remove / edit domains to crawl
Run / stop the crawler from the management panel
Able to resume a task from the point of the last failure
Able to see the current state of the crawling process
Able to send email notifications if the crawler stops unexpectedly
Able to add a domain to the domain list by URL: http://mycrawler.com/add_domain.php?domain_name=www.newdomain.com
(at this point only adding a domain is required) - the script should return the result
(added, or an error message if one occurred)
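The add-domain behaviour described above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the required implementation: the names `open_store` and `add_domain`, the `domains` table, the result format, and the validation pattern are all hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs on its own.

```python
import re
import sqlite3

# Assumption: a plain host-name check (dot-separated labels plus an
# alphabetic top-level label). The brief does not specify validation rules.
DOMAIN_RE = re.compile(
    r"(?=.{1,253}$)([a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}"
)


def open_store():
    """Create an in-memory SQLite database (stand-in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE domains (name TEXT PRIMARY KEY)")
    return conn


def add_domain(conn, domain_name):
    """Add a domain and report the outcome as a small dict.

    Returns {"status": "added", ...} on success, or
    {"status": "error", "message": ...} when the name is invalid
    or already present.
    """
    name = (domain_name or "").strip().lower()
    if not DOMAIN_RE.fullmatch(name):
        return {"status": "error", "message": "invalid domain name"}
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO domains (name) VALUES (?)", (name,))
    except sqlite3.IntegrityError:
        # The PRIMARY KEY constraint rejects duplicates.
        return {"status": "error", "message": "domain already exists"}
    return {"status": "added", "domain": name}
```

In this sketch the value of the `domain_name` query parameter would be passed straight to `add_domain`, and the returned dict would be rendered as the script's response. Using a parameterized query rather than string concatenation keeps the user-supplied value out of the SQL text.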
The xcrud framework will be provided.
A dev space will be provided.
More details will be provided if you are interested.