Task: Choose an existing data set or create your own data, then carry out exploratory data analysis and regression analysis to explain the relationships among the variables involved.
Team members: For this project you may choose to work with one or two other people and submit a joint project. If you cannot find a team member, I will assign a teammate to you.
Grading policies: Team members will receive the same grade for the project, and it is up to you to make sure that the work is shared equitably. The project is worth 100 points in total, divided into three parts:
Initial report (20 pts): due by 10/12/2015;
Presentation (20 pts): each team will give a 25-minute presentation of the project (dates to be assigned);
Final report (60 pts): due by a week from the final exam (TBA);
General guidelines of the project
Identify the problem of interest: choose a data set, describe the data set and identify the problem you are interested in;
Perform preliminary studies of the data: data visualization, checking model assumptions, etc.;
Select most promising predictors: what variables are potentially most useful for your problem;
It is expected to have at least one variable that is not numerical (so you need to introduce a dummy variable for your regression model);
Choose the best regression model that serves your purpose and justify your choice (e.g. model diagnostics, outliers, normality, model selection criteria, etc.);
Interpret the final regression model;
Discuss your findings: What do the results mean?
Put all your code in the appendix.
Find your own data set online (e.g. google “regression data set”); you will find plenty.
Make sure at least one of the predictors you use is categorical, i.e. a factor rather than a numerical variable.
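As an illustration of the dummy-variable requirement, here is a minimal sketch in plain Python. It is not the prescribed method for the project, and the language choice is an assumption; any statistical package will do the same with less code. The data are made up for the example, and the names dummy_code, solve and ols are hypothetical helpers. A two-level categorical predictor (group, with levels A and B) is coded as a single 0/1 column with A as the baseline, and the model is fitted by least squares.

```python
# Illustrative sketch only: the data below are made up, not from a real data set.

def dummy_code(values, baseline):
    """Return (levels, rows): one 0/1 indicator column per non-baseline level."""
    levels = sorted(set(values) - {baseline})
    rows = [[1.0 if v == lev else 0.0 for lev in levels] for v in values]
    return levels, rows

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def ols(design, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Made-up data: y is generated exactly as 1 + 2*x + 3*[group == "B"].
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
group = ["A", "B", "A", "B", "A", "B"]
y = [1 + 2 * xi + (3 if g == "B" else 0) for xi, g in zip(x, group)]

# Design matrix columns: intercept, x, dummy for group B (A is the baseline).
levels, dummies = dummy_code(group, baseline="A")
design = [[1.0, xi] + d for xi, d in zip(x, dummies)]

coef = ols(design, y)
names = ["intercept", "x"] + [f"group[{lev}]" for lev in levels]
for name, b in zip(names, coef):
    print(f"{name}: {b:.3f}")
```

The coefficient on the dummy column is the estimated shift in the response for group B relative to the baseline group A, holding x fixed. A factor with k levels needs k - 1 dummy columns when the model includes an intercept.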
people/gang/regression_i/requirement.txt · Last modified: 2015/08/29 15:33 by gang