I need the transcripts from ted.com scraped and put into well-formatted text files along with a metadata CSV file.
For each video at this address (https://www.ted.com/talks) I need the transcripts for each language put into a consistently formatted text file.
1. Each of the approximately 2,700 videos at https://www.ted.com/talks has transcripts translated into many languages. For each video, I would like a folder labeled with the talk title. Each individual language transcript for that video should then be put into its own text file inside this folder. So if the video has 35 language translations, the folder should contain 35 text files, one for each language. Each file should be named with its corresponding ISO 639-2 language code.
2. A metadata CSV file with information found on each video's web page, where the main column is the folder name (i.e., talk_title) and there are additional columns for speaker_name, speaker_description, speaker_who_they_are, related_tags, recorded_date, number_of_views, and language_iso_codes.
3. The Python script used to do all of this.
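
To make the expected output layout concrete, here is a minimal sketch of the writing side only (one folder per talk, one text file per language, and one metadata CSV). It does not fetch anything from ted.com. The function names, the folder-name cleanup rule, and the ";" separator for multi-value columns are my own assumptions for illustration, not requirements; the column names are the ones listed above.

```python
import csv
import re
from pathlib import Path

# Column names taken from the metadata requirements in this posting.
METADATA_COLUMNS = [
    "talk_title", "speaker_name", "speaker_description",
    "speaker_who_they_are", "related_tags", "recorded_date",
    "number_of_views", "language_iso_codes",
]


def safe_folder_name(talk_title):
    """Make a talk title usable as a folder name (assumed cleanup rule)."""
    # Drop characters that are not allowed in file names on common systems.
    cleaned = re.sub(r'[\\/:*?"<>|]', "", talk_title)
    # Collapse runs of whitespace into single underscores.
    cleaned = re.sub(r"\s+", "_", cleaned.strip())
    return cleaned or "untitled"


def write_talk(output_dir, talk_title, transcripts):
    """Write one text file per language into a folder named after the talk.

    `transcripts` maps a language code (e.g. "eng") to the transcript text.
    Returns the folder name, which is also the talk_title key in the CSV.
    """
    folder = safe_folder_name(talk_title)
    talk_dir = Path(output_dir) / folder
    talk_dir.mkdir(parents=True, exist_ok=True)
    for code, text in transcripts.items():
        # Same formatting for every file: UTF-8, trimmed, one trailing newline.
        (talk_dir / f"{code}.txt").write_text(text.strip() + "\n", encoding="utf-8")
    return folder


def write_metadata(csv_path, rows):
    """Write the metadata CSV, one row (a dict) per talk."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=METADATA_COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

For example, calling write_talk("output", "Some talk title", {"eng": "...", "fra": "..."}) would create output/Some_talk_title/eng.txt and output/Some_talk_title/fra.txt, and the returned folder name would go into the talk_title column of the CSV, with language_iso_codes stored as something like "eng;fra".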
March 15, 2018
I am willing to pay higher rates for the most experienced freelancers.