Module Code: 
H8STATS2 
Long Title

Statistics II

Title

Statistics II

Module Level: 
LEVEL 8 
EQF Level: 
6 
EHEA Level: 
First Cycle 
Module Coordinator: 
Sophie Flanagan 
Module Author: 
ORLA LAHART 
Departments: 
School of Computing

Specifications of the qualifications and experience required of staff 
Master’s and/or PhD degree in a numerate/scientific discipline, with experience in practical applications of statistical techniques. May also have industry experience.

Learning Outcomes 
On successful completion of this module the learner will be able to: 
LO1: Analyse and select the appropriate statistical methodology to solve data analysis problems or make predictions.
LO2: Understand the concepts of normality, independence, and homoscedasticity for the selection of statistical tests and forecasting techniques.
LO3: Critically evaluate the outcome of statistical significance tests using advanced concepts, such as statistical power, sample size, and multiple comparisons.
LO4: Conduct advanced statistical analyses to answer real-life questions and demonstrate ability to solve problems.
LO5: Interpret and clearly communicate the results of statistical tests to take informed decisions using data in the appropriate contexts.
Dependencies 
Module Recommendations
This is prior learning (or a practical skill) that is required before enrolment on this module. While the prior learning is expressed as named NCI module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).

68040 
H8STATS1 
Statistics I 
Corequisite Modules

No Corequisite modules listed 
Module Content & Assessment
Indicative Content 
Inferential statistics revisited
Hypothesis testing. One-way ANOVA and t-test; discussion of tools and programming languages for statistics, such as Python, R, and SPSS

Exploratory data analysis
Correlations, Chi-square test of independence, box plots, confidence intervals, simple linear regression

Regressions
Multiple linear and polynomial regressions, robust and quantile regressions, stepwise regression, and model selection

Normality
Tests and plots for normality, including Q-Q plots, Shapiro-Wilk and Kolmogorov-Smirnov tests, and Box-Cox transformation. Reporting results

Statistical power & sample size
Cohen’s d and Hedges’s g, and other effect size suggestions. Power calculations
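For illustration, the two effect sizes named above can be computed as follows. This is a minimal Python sketch using only the standard library; the function names and the sample data are hypothetical, and the Hedges' g correction uses the common small-sample approximation rather than the exact factor.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n1, n2 = len(a), len(b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def hedges_g(a, b):
    """Hedges' g: Cohen's d scaled by an approximate small-sample correction."""
    n1, n2 = len(a), len(b)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d(a, b) * correction


# Hypothetical example data (not taken from the module materials)
group_a = [2, 4, 6]
group_b = [1, 3, 5]
d = cohens_d(group_a, group_b)
g = hedges_g(group_a, group_b)
```

With small samples g is noticeably smaller than d, which is why the corrected measure is often preferred when group sizes are modest.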

Two-way ANOVA
Review of main assumptions. Data preparation. Conduct, interpret, and report ANOVA results

Post-hoc tests
Multiple comparisons and p-value inflation. Tukey’s HSD and Bonferroni correction. Dunn’s and Dunnett's tests. False discovery rate. Reporting results
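As an illustration of the multiple-comparison adjustments listed above, the sketch below implements the Bonferroni correction and the Benjamini-Hochberg procedure, one common way of controlling the false discovery rate. It is a minimal Python example using only the standard library; the function names and the p-values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four tests
raw = [0.01, 0.04, 0.03, 0.20]
bonf = bonferroni(raw)
bh = benjamini_hochberg(raw)
```

The Bonferroni values are never smaller than the Benjamini-Hochberg ones, reflecting that controlling the family-wise error rate is stricter than controlling the false discovery rate.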

Non-parametric tests on contingency tables
One- and two-factor analysis using Chi-squared test for count data. Reporting results

Non-parametric tests on populations
Mann-Whitney, Wilcoxon, and Kruskal-Wallis tests. Reporting results

Factor Analysis and PCA
Collinearity. Kaiser-Meyer-Olkin test. Scree plot. Loadings and interpretation. Reporting results

Time Series Analysis
Smoothing data and seasonality. Forecasting with Holt-Winters and ARIMA. Relationship between time series forecasting and supervised learning
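One way to see the relationship between time series forecasting and supervised learning mentioned above is to reframe a series as feature/target pairs built from lagged values. The following is a minimal Python sketch of that idea; the function name, the number of lags, and the series are hypothetical.

```python
def make_lagged(series, n_lags):
    """Turn a series into (features, target) pairs using the previous n_lags values."""
    rows = []
    for i in range(n_lags, len(series)):
        features = series[i - n_lags:i]  # the n_lags observations before time i
        target = series[i]               # the value to be predicted at time i
        rows.append((features, target))
    return rows


# Hypothetical series
pairs = make_lagged([1, 2, 3, 4, 5], n_lags=2)
```

Each pair can then be passed to any regression method, which is what makes standard supervised learning tools applicable to forecasting problems.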

Survival Analysis
Censored data, Kaplan-Meier estimator, and Cox model
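To illustrate how censored observations enter the Kaplan-Meier estimator named above, here is a minimal Python sketch using only the standard library. The function name and the data are hypothetical; censored observations count towards the number at risk but never as events.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where at least one event occurred.

    times  : observed durations
    events : 1 if the event occurred at that time, 0 if the observation was censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Everyone still under observation just before time t is at risk.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical durations; the second observation at t=2 and the one at t=4 are censored
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

The estimated survival probability only drops at times where an event is observed, so the curve is a step function.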

Assessment Breakdown  % 
Coursework  50.00% 
End of Module Assessment  50.00% 
Assessments: Full Time
Coursework 
Assessment Type: 
Assignment 1 
% of total: 
50 
Assessment Date: 
n/a 
Outcome addressed: 
1,2,3,4,5 
Non-Marked: 
No 
Assessment Description: In this assignment the student will prepare data for a two-way ANOVA and for any two non-parametric tests from Week 8 and Week 9. The student may use Python, R, or SPSS, but should not rely on only one tool; variety is expected. It is not necessary to replicate a test across tools, i.e. a test performed in R need not be repeated in SPSS and/or Python. A data file from the Census of Ireland is suggested, though students may choose a different file (subject to approval by the lecturer). The task is to prepare a statistical report based on the data in the file.

End of Module Assessment 
Assessment Type: 
Terminal Exam 
% of total: 
50 
Assessment Date: 
End-of-Semester 
Outcome addressed: 
1,2,3,5 
Non-Marked: 
No 
Assessment Description: The end-of-semester examination paper is two hours in duration. Learners are usually asked to answer four out of five questions. Questions will usually be essay-style but may also include other formats (e.g., a plan for an extended business data analysis project or a technical figure). Marks will be awarded based on clarity, structure, relevant examples, depth of topic knowledge, and an understanding of the potential and limits of solutions.

Reassessment Requirement 
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.

NCIRL reserves the right to alter the nature and timings of assessment
Module Workload
Module Target Workload Hours 0 Hours 
Workload: Full Time 
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Weekly lecture | 24 | Per Semester | 2.00
Tutorial | Weekly tutorial | 12 | Per Semester | 1.00
Independent Learning Time | Research & autonomous student learning | 89 | Per Semester | 7.42
Total Weekly Contact Hours: 3.00
Workload: Part Time 
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload
Lecture | Weekly lecture | 24 | Per Semester | 2.00
Tutorial | Weekly tutorial | 12 | Per Semester | 1.00
Independent Learning Time | Research & autonomous student learning | 89 | Per Semester | 7.42
Total Weekly Contact Hours: 3.00
Module Resources
Recommended Book Resources 


David Spiegelhalter. (2019), The Art of Statistics, Pelican, p.256, [ISBN: 0241258766].

Bernard Rosner. (2015), Fundamentals of Biostatistics, 8th. Cengage Learning, USA, [ISBN: 9781305268920].

Peter Dalgaard. Introductory Statistics with R, Springer, p.364, [ISBN: 0387790535].
 Supplementary Book Resources 


Andy Field. (2018), Discovering Statistics using IBM SPSS Statistics, 5th. SAGE Publications, Incorporated, p.775, [ISBN: 9781544328225].

EMC Education Services. (2015), Data Science and Big Data Analytics, John Wiley & Sons, Incorporated, [ISBN: 111887613X].

Eugene Demidenko. (2019), Advanced Statistics with Applications in R, 10, John Wiley & Sons, p.880, [ISBN: 9781118387986].
 Recommended Article/Paper Resources 


Valentin Amrhein, et al. (2019), Scientists rise up against statistical significance, Nature, 567, p.305–307,

Naomi Altman, Martin Krzywinski. (2017), P values and the search for significance, Nature Methods, 14, p.3–4,
 Other Resources 


[Website], Choosing the correct stat test.

[Website], The Khan Academy. http://www.khanacademy.org/.

[Website], Learn with Dr Eugene O’Loughlin. http://www.youtube.com/eoloughlin.

[Website], Central Statistics office. http://www.cso.ie.

[Website], Glossary of Statistical Terms. http://bit.ly/LIRYpQ.

[Website], HyperStat Online Statistics Textbook. http://davidmlane.com/hyperstat/.

[Website], The R Project for Statistical Computing. http://www.r-project.org/.

