You will answer data mining questions to help me prepare for an exam. These are last year's homework questions.
1- Refer to the milk transportation-cost data in table (T6.10.xls).
a. View the entire data set in three dimensions. Rotate the coordinate axes in various directions. Check for unusual observations. (Use the MATLAB 3D scatter plot function scatter3.)
b. For each dimension, draw a box plot and identify the outliers.
c. Highlight the set of points corresponding to gasoline trucks. Do any of the gasoline-truck points appear to be multivariate outliers? Are there orientations of the x1, x2, x3 space for which the set of points representing gasoline trucks can be readily distinguished from the set of points representing diesel trucks?
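One way parts a to c could be approached in MATLAB is sketched below. It is only a sketch: the column layout of T6.10.xls (x1, x2, x3 in columns 1-3, fuel type in column 4 coded 1 = gasoline, 2 = diesel) is an assumption, and random placeholder data stands in for the real table so the script runs on its own. The squared Mahalanobis distance compared with a chi-square cutoff is one common multivariate-outlier check, not necessarily the one the course expects. boxplot needs the Statistics and Machine Learning Toolbox; isoutlier is base MATLAB (R2017a or later).

```matlab
% Placeholder data so the sketch runs without the file. For the real data use e.g.
%   T = readmatrix('T6.10.xls'); X = T(:,1:3); fuel = T(:,4);   % ASSUMED layout
rng(0);
X    = [randn(20,3) + 10; randn(15,3) + 12];   % stand-in for x1, x2, x3
fuel = [ones(20,1); 2*ones(15,1)];             % ASSUMED coding: 1 = gasoline, 2 = diesel
isGas = (fuel == 1);

% (a) and (c): 3-D scatter with gasoline trucks highlighted; drag to rotate
figure;
scatter3(X(isGas,1),  X(isGas,2),  X(isGas,3),  36, 'r', 'filled'); hold on;
scatter3(X(~isGas,1), X(~isGas,2), X(~isGas,3), 36, 'b');
xlabel('x1'); ylabel('x2'); zlabel('x3');
legend('gasoline', 'diesel'); grid on; rotate3d on; hold off;

% (b): box plot per dimension, plus the same 1.5*IQR rule as a logical mask
figure;
boxplot(X, 'Labels', {'x1', 'x2', 'x3'});
out = isoutlier(X, 'quartiles');       % out(i,j) is true if X(i,j) is a box-plot outlier
[rowIdx, colIdx] = find(out);          % observation and variable of each flagged value

% (c): squared Mahalanobis distance of each gasoline truck from the gasoline mean
Xg = X(isGas, :);
d  = Xg - mean(Xg);                    % centered rows (implicit expansion, R2016b+)
D2 = sum((d / cov(Xg)) .* d, 2);       % d_i * inv(S) * d_i' for every row i
cutoff  = 7.815;                       % approx. 95th percentile of chi-square with 3 d.f.
suspect = find(D2 > cutoff);           % candidate multivariate outliers
```

Rotating the first figure (or calling view(az, el) with different angles) is how part c's question about a separating orientation would be explored by eye.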
2- The attached table (T1-6.xls) contains some of the raw multiple-sclerosis data. Two different visual stimuli (S1 and S2) produced responses in both the left eye (L) and the right eye (R) of subjects in the study groups. The values recorded in the table include x1 (subject's age); x2 (total response of both eyes to stimulus S1, that is, S1L+S1R); x3 (difference between the responses of the eyes to stimulus S1, S1L-S1R); and so forth. The last column of the table gives MS: 0 denotes the non-multiple-sclerosis group and 1 the multiple-sclerosis group.
a. Plot the two-dimensional scatter diagram for the variables x2 and x4 for the multiple-sclerosis data. Comment on the appearance of the diagram for each class.
b. Compute the covariance matrix and its eigenvalues and normalized eigenvectors.
c. Using the PCA method, find the necessary principal components and plot the data with scatter3 function in Matlab.
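The three parts of question 2 could be sketched in MATLAB as follows. The layout of T1-6.xls is an assumption here (five variables x1..x5 in columns 1-5, the MS label in column 6), and placeholder data is used so the script runs without the file. Keeping three components is chosen only because scatter3 needs three axes; the cumulative explained-variance vector is what would justify the number of components on the real data.

```matlab
% Placeholder data. For the real data use e.g.
%   T = readmatrix('T1-6.xls'); X = T(:,1:5); ms = T(:,6);   % ASSUMED layout
rng(1);
X  = randn(40, 5);                     % stand-in for x1..x5
ms = [zeros(25,1); ones(15,1)];        % 0 = non-MS, 1 = MS

% (a) x2 against x4, one marker style per class
figure;
plot(X(ms==0,2), X(ms==0,4), 'bo'); hold on;
plot(X(ms==1,2), X(ms==1,4), 'r^'); hold off;
xlabel('x2'); ylabel('x4'); legend('non-MS', 'MS');

% (b) sample covariance matrix and its eigen-decomposition
S = cov(X);
[V, D] = eig(S);                       % columns of V are unit-length eigenvectors
[lambda, idx] = sort(diag(D), 'descend');
V = V(:, idx);                         % reorder eigenvectors to match sorted eigenvalues

% (c) PCA: project the centered data on the leading eigenvectors
explained = cumsum(lambda) / sum(lambda);   % cumulative share of total variance
Z = (X - mean(X)) * V(:, 1:3);              % scores on the first three components
figure;
scatter3(Z(:,1), Z(:,2), Z(:,3), 36, ms, 'filled');   % color encodes the MS label
xlabel('PC1'); ylabel('PC2'); zlabel('PC3'); grid on; rotate3d on;
```

Because eig does not guarantee any ordering, the explicit sort is what makes the first column of V the direction of largest variance.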
3- Verify that λ1 = 9, λ2 = 4 and e1' = [2 1], e2' = [-1 2] are the eigenvalues and eigenvectors of the
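A verification of this kind amounts to checking A*e = lambda*e for each pair. The matrix is not shown in the question as given here, so the sketch below assumes A = [8 2; 2 5], which is the symmetric matrix that the stated eigenvalues 9 and 4 and vectors [2 1], [-1 2] imply through the spectral decomposition. Treat that matrix as an assumption and substitute the one from the original exercise.

```matlab
A      = [8 2; 2 5];          % ASSUMED matrix, reconstructed from the stated eigenpairs
lambda = [9 4];               % eigenvalues given in the question
E      = [2 -1; 1 2];         % columns are e1 = [2 1]' and e2 = [-1 2]'

% Direct check of the defining equation A*e = lambda*e
res = zeros(1, 2);
for k = 1:2
    res(k) = norm(A*E(:,k) - lambda(k)*E(:,k));
    fprintf('lambda = %d: norm(A*e - lambda*e) = %g\n', lambda(k), res(k));
end

% Equivalent check: det(A - lambda*I) should vanish at each eigenvalue
detCheck = [det(A - 9*eye(2)), det(A - 4*eye(2))];

% Normalized eigenvectors (each column scaled to unit length)
En = E ./ sqrt(sum(E.^2, 1));

% Cross-check against MATLAB's own solver
ev = sort(eig(A), 'descend');
```

The given vectors are not unit length; dividing each by sqrt(5) gives the normalized form, which is what En holds.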
4- Using the matrix
a. Calculate A’A and obtain its eigenvalues and eigenvectors.
b. Calculate AA’ and obtain its eigenvalues and eigenvectors. Check that the nonzero eigenvalues are the same as those in part a.
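Parts a and b can be checked numerically along the following lines. The matrix for this question is not shown above, so the 2-by-3 matrix in the sketch is purely hypothetical and should be replaced by the one from the exercise. The comparison relies on a general fact: A'A and AA' share their nonzero eigenvalues, and the larger of the two products has additional zero eigenvalues.

```matlab
A = [1 2 0; 0 1 1];           % HYPOTHETICAL matrix, for illustration only

% (a) eigen-decomposition of A'A (3-by-3 here)
[V1, D1] = eig(A' * A);
ev1 = sort(diag(D1), 'descend');

% (b) eigen-decomposition of AA' (2-by-2 here)
[V2, D2] = eig(A * A');
ev2 = sort(diag(D2), 'descend');

% Compare the nonzero eigenvalues, treating tiny values as numerical zeros
tol = 1e-10;
nz1 = ev1(ev1 > tol);
nz2 = ev2(ev2 > tol);
disp([nz1 nz2]);              % the two columns should agree

% The shared values are also the squared singular values of A
sv2 = svd(A).^2;
```

The columns of V1 and V2 are the corresponding unit-length eigenvectors, in the order eig returned them; apply the same sort index if they need to line up with ev1 and ev2.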