We are looking for an experienced NLP expert to write Python code that generates a set of variations of a phrase (in U.S. English) that are semantically similar to the original. In Dialogflow, the NLU system we use for our chatbot, entities are matched based on a set of phrases for each reference term within an entity's definition.
We want you to write code that computationally generates sets of phrases for each of a couple hundred reference terms (and/or human-generated equivalent phrases). As much as possible, these sets should include the full range of ways that a user could express the entity.
For example, a reference term might be "autobiography or biography of an artist". A list of computationally generated similar phrases should definitely include variations that directly express the concept, such as:
autobiographies of artists
biography of a painter
an artist's life story
book about a sculptor's life
It is desirable, though not required, to also include phrases that signal the user's interest in the reference term more indirectly. For example, a phrase could reference an instance of the thing to which the term refers, such as "the story of Van Gogh's life" or "books about members of the Harlem Renaissance."
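To make the kind of output we have in mind more concrete, here is a minimal sketch that produces direct variations like the ones listed above by filling phrase templates from a substitution table. The table, the templates, and the function names are all hypothetical and hand-written for illustration; we would expect a real solution to derive the alternatives from a resource such as WordNet or word vectors rather than hard-code them.

```python
from itertools import product

# Hypothetical, hand-written alternatives for illustration only. A real
# solution would derive these from WordNet, word vectors, or a similar source.
ALTERNATIVES = {
    "subject": ["artist", "painter", "sculptor"],
    "genre": ["biography", "autobiography", "life story"],
}

# Hypothetical phrase templates; the second one does not use {genre}.
TEMPLATES = [
    "{genre} of {article} {subject}",
    "book about {article} {subject}'s life",
]


def indefinite_article(word):
    """Pick 'a' or 'an' from the first letter (a rough spelling-based rule)."""
    return "an" if word[0].lower() in "aeiou" else "a"


def generate_variations(templates, alternatives):
    """Fill every template with every subject/genre pair and deduplicate."""
    phrases = set()
    pairs = product(alternatives["subject"], alternatives["genre"])
    for subject, genre in pairs:
        for template in templates:
            phrases.add(
                template.format(
                    genre=genre,
                    subject=subject,
                    article=indefinite_article(subject),
                )
            )
    return sorted(phrases)


variations = generate_variations(TEMPLATES, ALTERNATIVES)
```

A template approach like this only covers direct variations; the indirect phrases described above would need a different source of knowledge.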
Our chatbot is intended for children ages 7-12, so prioritizing words and grammatical structures that people in this demographic are most likely to use would increase the effectiveness of the output.
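One possible way to approach this prioritization, sketched under our own assumptions, is to score each candidate phrase by the share of its words that appear in a list of words children commonly use, and to rank candidates by that score. The word list below is a tiny hypothetical placeholder; we do not have an age-specific vocabulary list to provide, so the source of such a list would be part of the work.

```python
# Hypothetical placeholder vocabulary; a real list would come from a
# children's word-frequency resource chosen by the freelancer.
COMMON_WORDS = {
    "a", "an", "the", "of", "about",
    "book", "story", "life", "artist", "painter",
}


def familiarity(phrase, common_words=COMMON_WORDS):
    """Return the fraction of words in the phrase found in common_words."""
    words = [w.strip('.,!?"').lower() for w in phrase.split()]
    # Treat possessives such as "artist's" as the base word "artist".
    words = [w[:-2] if w.endswith("'s") else w for w in words]
    if not words:
        return 0.0
    known = sum(1 for w in words if w in common_words)
    return known / len(words)


def rank_for_children(phrases):
    """Order phrases from most to least familiar vocabulary."""
    return sorted(phrases, key=familiarity, reverse=True)


ranked = rank_for_children(
    ["autobiography of an artist", "an artist's life story"]
)
```

This only addresses word choice; preferring simpler grammatical structures would need a separate signal.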
We will provide lists of entities and their reference terms as CSV files. Your deliverables will be sets of similar terms for each reference term and the Python code that generated them.
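As a rough sketch of the input and output handling, the code below reads entity and reference-term pairs from a CSV file and writes one row per generated variation. The column names `entity`, `reference_term`, and `variation` are assumptions made for this example; the actual layout of the files we provide may differ, and the sample data is made up.

```python
import csv
import io


def read_reference_terms(csv_file):
    """Read (entity, reference_term) pairs from an open CSV file.

    Assumes a header row with 'entity' and 'reference_term' columns.
    """
    reader = csv.DictReader(csv_file)
    return [(row["entity"], row["reference_term"]) for row in reader]


def write_variations(csv_file, rows):
    """Write one output row per variation of each reference term."""
    writer = csv.writer(csv_file)
    writer.writerow(["entity", "reference_term", "variation"])
    for entity, term, variations in rows:
        for variation in variations:
            writer.writerow([entity, term, variation])


# In-memory stand-ins for real files, using made-up sample data.
sample_input = io.StringIO(
    "entity,reference_term\n"
    "book_type,autobiography or biography of an artist\n"
)
terms = read_reference_terms(sample_input)

output = io.StringIO()
write_variations(
    output,
    [(entity, term, ["an artist's life story"]) for entity, term in terms],
)
```

With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`, as the `csv` module documentation recommends.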
We are open to the full range of linguistic (e.g., using NLTK with WordNet) and statistical approaches (e.g., using spaCy and Gensim) to the task at hand. However, we do not have a domain- or audience-specific training corpus we can provide.
Hours to be determined
Project length: less than 1 month
I am willing to pay higher rates for the most experienced freelancers