Language learning tool: High-frequency sentence identifier
Purpose: allow language learners to identify and collect sentences that are particularly valuable to study and memorize because they use several high-frequency words and do not include any obscure words.
Suggested mode of operation:
-For each language, the program would draw upon data from at least two sources:
(1) a spreadsheet (CSV) or other easily manipulable data store into which the 10,000 or 20,000 most common words of the language are pasted. In column one, assign each word a frequency score: the higher the frequency, the higher the score. For example, if the list has 500 words, the least common (most obscure) word in that list could be assigned a score of 1, and the most common word a score of 500.
(2) a list or corpus of example sentences, whether scraped in real time from Wikipedia or other websites (e.g. Project Gutenberg, Tatoeba), or manually created and made...
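
As a rough illustration of how the two data sources above might fit together, here is a minimal Python sketch. Everything in it beyond the rank-based score from item (1) is an assumption made for the example: the function names (assign_scores, load_scores, score_sentence, rank_sentences) are hypothetical, the CSV is assumed to have no header row, with the score in column one and the word in column two, a word counts as "obscure" simply when it is missing from the frequency list, and a sentence is rated by the average score of its words. Other rules (a sum, a minimum count of high-frequency words) would fit the stated purpose just as well.

```python
import csv
import re


def assign_scores(words_most_common_first):
    """Rank-based scores: the least common word gets 1, the most common gets len(list)."""
    n = len(words_most_common_first)
    return {word.lower(): n - i for i, word in enumerate(words_most_common_first)}


def load_scores(csv_file):
    """Read (score, word) rows from an open CSV file.

    Assumed layout: no header row, score in column one, word in column two.
    """
    return {row[1].lower(): int(row[0]) for row in csv.reader(csv_file) if len(row) >= 2}


def tokenize(sentence):
    """Split a sentence into lowercase runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


def score_sentence(sentence, scores):
    """Return the average word score, or None if any word is not on the list."""
    tokens = tokenize(sentence)
    if not tokens or any(token not in scores for token in tokens):
        return None  # empty sentence, or it contains an "obscure" word
    return sum(scores[token] for token in tokens) / len(tokens)


def rank_sentences(sentences, scores):
    """Keep only sentences free of obscure words, best average score first."""
    scored = [(score_sentence(s, scores), s) for s in sentences]
    kept = [(value, s) for value, s in scored if value is not None]
    return [s for value, s in sorted(kept, key=lambda pair: pair[0], reverse=True)]
```

With a toy five-word list such as ["the", "cat", "sat", "on", "mat"], rank_sentences would keep "The cat sat on the mat." and drop "The cat sat on the ottoman.", because "ottoman" is not on the list. Averaging rather than summing keeps long sentences from outranking short ones on length alone; that is a design choice for this sketch, not something the proposal specifies.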