Go to www.Kaggle.com. Create an account. Click on Datasets. Download a few datasets that interest you. Open the datasets in Excel.
The objective is to tell a story or scenario about the relationship between variables, or about a prediction of future values.
In doing so, you should:
- Select a dataset on which you can run a regression.
- Make sure the relationship between the response variable and the predictors (independent variables) is meaningful. (You may clean the data by removing some variables or observations.)
- Meet the minimum requirement of 100 observations and 4 predictors. (10 pts)
- Run the regression and interpret the result in terms of all the key elements you learned for interpreting the generated summary output. (30 pts)
- Use two different confidence levels (99% and 90%) and compare the results at each level. (20 pts)
- For each of the two regressions you run, generate residual vs. fitted plots and normal probability plots, and interpret the plots. (15 pts)
- Creativity and meaningful application of what you learned in the regression topic are highly important. (15 pts)
- Format of the Word report (10 pts)
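To illustrate what comparing the two confidence levels can reveal, here is a minimal Python sketch. It is not part of the required Excel workflow. The coefficient and standard error are hypothetical values, as if read from a regression summary table, and the function names are made up for this example. It uses the normal critical value as an approximation. Regression summary outputs typically use the t distribution, so with 100 or more observations the bounds from Excel should be close to these but slightly wider.

```python
from statistics import NormalDist


def confidence_interval(coef, std_err, level):
    """Return (lower, upper) bounds for a coefficient at a confidence level.

    Assumption: the normal critical value stands in for the t critical
    value, which is a reasonable approximation for large samples only.
    """
    critical = NormalDist().inv_cdf(0.5 + level / 2)
    margin = critical * std_err
    return coef - margin, coef + margin


def excludes_zero(interval):
    """True when the whole interval lies on one side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# Hypothetical coefficient and standard error for one predictor.
coef, std_err = 2.0, 0.9

ci_90 = confidence_interval(coef, std_err, 0.90)
ci_99 = confidence_interval(coef, std_err, 0.99)

print("90% interval:", ci_90, "excludes zero:", excludes_zero(ci_90))
print("99% interval:", ci_99, "excludes zero:", excludes_zero(ci_99))
```

With these hypothetical numbers the 90% interval excludes zero while the wider 99% interval includes it, so the same predictor looks significant at one level and not at the other. This kind of difference is worth discussing in the report.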
Write a two-page Word report (approximately 1000 words) on the story you told or the problem you addressed with the selected dataset. Do not copy and paste the regression tables or plots into the body of the report.
Name the regression results Table (1) and Table (2), and number the plots in the same way. Put them in an appendix at the end of the report, and refer to them by number in the report.
Submit one Word report and one Excel project file.
Submission has two phases. In the first phase, you submit the project by the first deadline. In two weeks you will receive feedback on the project. You then have ten days to go over the comments, improve the project, and submit it before the second deadline. The first phase is worth 80% of the total project points and the second phase is worth 20%.
WARNING: The Kaggle dataset may download as a CSV file. Save the file as an Excel file before you do any work.
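Because the download may be a CSV file, a candidate dataset can be screened against the minimum size requirement before any work is done in Excel. The following is a minimal Python sketch under stated assumptions: the function name and the sample data are hypothetical, one numeric column is assumed to be the response variable, and only columns whose values all parse as numbers are counted as possible predictors.

```python
import csv
import io


def summarize_csv(handle):
    """Return (row count, names of columns whose non-empty values are all numeric)."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    is_numeric = {name: True for name in columns}
    has_value = {name: False for name in columns}
    n_rows = 0
    for row in reader:
        n_rows += 1
        for name in columns:
            value = (row[name] or "").strip()
            if not value:
                continue  # skip missing cells
            has_value[name] = True
            try:
                float(value)
            except ValueError:
                is_numeric[name] = False
    numeric = [n for n in columns if is_numeric[n] and has_value[n]]
    return n_rows, numeric


def meets_minimum(n_rows, numeric_columns, min_rows=100, min_predictors=4):
    """Check the size requirement, reserving one numeric column as the response."""
    return n_rows >= min_rows and len(numeric_columns) - 1 >= min_predictors


# Tiny hypothetical sample; for a real file, use
# open("your_dataset.csv", newline="") in place of io.StringIO(sample).
sample = "price,area,rooms,city\n100,50,2,Oslo\n150,70,3,Bergen\n"
n_rows, numeric_columns = summarize_csv(io.StringIO(sample))
print(n_rows, numeric_columns, meets_minimum(n_rows, numeric_columns))
```

The tiny sample is far too small to qualify. A suitable dataset would have at least 100 rows and at least five numeric columns (one response and four predictors).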