The project aims to produce a self-learning system that learns from data in an unsupervised or semi-supervised way. Both the language model and the precise neural network (NN) structure are still to be worked out.
You should have strong skills in neural networks as applied to language problems. You should have a general knowledge of language modelling concepts, including Shannon entropy, n-grams, the curse of dimensionality, computational cost in terms of training time, RNN, NNLM, RNNLM, Bayes' theorem, the Markov assumption, backpropagation, BPTT, corpora, perplexity, smoothing (Kneser-Ney), and empirical methods, and you should know how to implement an NN (e.g. using Java).
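As a rough indication of the level of n-gram knowledge expected, the following is a minimal sketch of a bigram model with perplexity evaluation. It is an illustration only, not part of the project design: the class name `BigramPerplexity` and its methods are hypothetical, add-one (Laplace) smoothing is used in place of Kneser-Ney to keep the example short, and a closed vocabulary is assumed (every test word was seen in training).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration: bigram language model with add-one smoothing.
public class BigramPerplexity {
    private final Map<String, Integer> contextCounts = new HashMap<>();
    private final Map<String, Integer> bigramCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();

    // Count bigrams, and how often each word occurs as a bigram context.
    public void train(List<String> tokens) {
        vocabulary.addAll(tokens);
        for (int i = 0; i + 1 < tokens.size(); i++) {
            contextCounts.merge(tokens.get(i), 1, Integer::sum);
            bigramCounts.merge(tokens.get(i) + " " + tokens.get(i + 1), 1, Integer::sum);
        }
    }

    // P(word | previous) with add-one smoothing over the training vocabulary.
    public double probability(String previous, String word) {
        int pair = bigramCounts.getOrDefault(previous + " " + word, 0);
        int context = contextCounts.getOrDefault(previous, 0);
        return (pair + 1.0) / (context + vocabulary.size());
    }

    // Perplexity = 2^(-(1/N) * sum of log2 P(w_i | w_{i-1})), N = number of bigrams.
    public double perplexity(List<String> tokens) {
        if (tokens.size() < 2) {
            throw new IllegalArgumentException("Need at least two tokens");
        }
        double logSum = 0.0;
        for (int i = 1; i < tokens.size(); i++) {
            logSum += Math.log(probability(tokens.get(i - 1), tokens.get(i))) / Math.log(2.0);
        }
        return Math.pow(2.0, -logSum / (tokens.size() - 1));
    }

    public static void main(String[] args) {
        BigramPerplexity model = new BigramPerplexity();
        model.train(Arrays.asList("the", "cat", "sat"));
        model.train(Arrays.asList("the", "dog", "sat"));
        System.out.println(model.perplexity(Arrays.asList("the", "cat", "sat")));
    }
}
```

Lower perplexity means the model finds the test sequence less surprising. A neural language model would replace the count-based `probability` method with a network's output distribution, while the perplexity calculation stays the same.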
You need to have basic Linux skills and the ability to adapt and learn.
The main task will be to analyse data and produce a language model, which is then implemented as an NN.
You should have familiarity with the work of Bengio, Schwenk, Rosenfeld, and others who have worked in the field.
Most of the work is design and modelling rather than Java development, but some Java coding is needed to implement the solution (e.g. using Neuroph).
You should be motivated by self-learning problems and understand the computational problems involved in training such systems.
You will have an opportunity to work with others, so you need not know every aspect of the subjects listed above, but the ability to learn new material effectively is important.