I am a nearly blind bioinformatics graduate student. My dissertation focuses on improving our understanding of the mechanisms underlying the aging process, with the goal of developing novel ways to delay, stop and hopefully reverse its adverse effects and thereby extend lifespan and healthspan.
Since I am legally blind, I cannot perform experimental wet-lab work efficiently. My research approaches, methods and analyses are therefore largely limited to those that can be performed on a computer because, unlike wet-lab work, computers can be made more accessible to the visually impaired through screen reader and screen magnification software such as ZoomText (see http://www.aisquared.com/products/).
Therefore, in order to graduate, I need to predict which manipulations could extend lifespan, e.g. altering gene expression patterns, protein abundance distributions, epigenetic marks, environmental conditions, or nutrient composition and abundance. Due to lack of funding, we cannot generate our own data. Hence, to publish in peer-reviewed journals, we depend on publicly available information and databases to make new discoveries that their original publishers have not already made.
We use yeast as our model organism because my adviser has a yeast lab in which he can validate my predictions. If I can show my adviser sufficiently strong evidence that a predicted manipulation could extend lifespan in yeast, he will carry it out in his lab (e.g. by knocking out or over-expressing the relevant genes). If he can show that it does extend lifespan, then I can graduate.
The teaching materials needed for my genomics, bioinformatics, machine-learning and network analysis classes (e.g. electronic textbooks, slides, exercises, assignments, homework, tutorials, solutions, projects, commands and review sheets) are available in digital format online or on my hard drive. Those four courses require very sophisticated skills in R.
Some of the R skills I need help to master are listed below:
1) Downloading data from the internet into the R environment
2) Inspecting data using R
3) Plotting genomic data using R
4) Making statistical summaries of data
5) Checking data for obvious problems, e.g. parsing errors (strange symbols)
6) Network analysis, e.g. with the R package igraph
7) Making custom charts (e.g. custom labels)
8) Checking whether two analyses give the same results
9) Efficiently entering data found in tables
10) Extracting data from charts from papers
11) Using R with a version control system (git)
12) Clustering and categorizing data
13) Taking subsets of data
14) Plotting hierarchical data over time
15) Analyzing time series data, e.g. with the R packages zoo and xts
16) Analyzing distributions, e.g. histograms, normality testing
17) Interpolating missing data
18) Detecting outliers
19) Exporting charts in vector format, e.g. PDF
20) Using R Markdown, knitr and LaTeX to allow reproducible analysis
21) Using Bioconductor packages such as limma
22) Gene enrichment analysis
23) Visualizing data from different sources in the same plot using different scales
24) Converting between different nomenclatures, gene identifiers, species
25) Replicating published analytical results based on submitted raw data
26) Analyzing transcriptomic, proteomic and epigenetic datasets
27) Applying machine learning
28) Predicting gene and protein functions, regulation, transcription factors (TF) and transcription factor binding sites (TFBS)
29) Looking for mathematical similarities and differences in the behavior of time series of gene expression and of protein and metabolite abundances, to be used for grouping and prediction
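
To make the list more concrete, a few of these skills (items 2, 4, 13, 17 and 18) can be sketched in base R without any extra packages. The data frame below is a made-up toy example, and the names expr, late, filled and outliers are hypothetical. This is only a minimal sketch under those assumptions, not a recommended workflow.

```r
# Hypothetical toy data: level of one gene over time (values are made up).
expr <- data.frame(
  time_h = c(0, 2, 4, 6, 8, 10),
  level  = c(1.0, 1.2, NA, 1.8, 9.5, 2.2)
)

# Items 2 and 4: inspect the structure and print a summary of each column.
str(expr)
print(summary(expr))

# Item 13: take a subset (rows measured after the 4-hour time point).
late <- subset(expr, time_h > 4)

# Item 17: fill the missing value by linear interpolation.
# approx() drops incomplete (x, y) pairs before interpolating.
filled <- expr
filled$level <- approx(expr$time_h, expr$level, xout = expr$time_h)$y

# Item 18: flag values lying more than 1.5 times the hinge spread
# beyond the hinges, the same rule a box plot uses for its outlier points.
outliers <- boxplot.stats(filled$level)$out
print(outliers)
```

All of the output here is plain text in the console, so it can be read with a screen reader or a screen magnifier.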