NCI Courses - H8MACL - Machine Learning

Module Code:	H8MACL
Long Title	Machine Learning
Title	Machine Learning
Module Level:	LEVEL 8
EQF Level:	6
EHEA Level:	First Cycle

Credits:	10

Module Coordinator:	Sophie Flanagan

Module Author:	Orla Lahart

Departments:	School of Computing

Specifications of the qualifications and experience required of staff	The lecturer must hold an MSc or PhD degree in computer science or a cognate discipline, and have experience in lecturing machine learning and coding in Python. The lecturer may also have industry experience. Lab assistants are required for tutorials and should have experience in Python coding and a reasonable knowledge of machine learning techniques.

Learning Outcomes
On successful completion of this module the learner will be able to:
#	Learning Outcome Description
LO1	Comprehend, compare and contrast fundamental machine learning concepts and techniques
LO2	Comprehend and assess potential ethical implications of machine learning.
LO3	Extract, transform, explore and clean data in preparation for machine learning
LO4	Build and evaluate machine learning models on various problem domains
LO5	Summarise, critique and present results from machine learning for decision-making

Dependencies
Module Recommendations This is prior learning (or a practical skill) that is required before enrolment on this module. While the prior learning is expressed as named NCI module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).
No recommendations listed
Co-requisite Modules
No Co-requisite modules listed

Entry requirements

Module Content & Assessment

Indicative Content
Introduction and Ethics in Machine Learning: Forms of learning (supervised, unsupervised, reinforcement); Ethics in data sourcing & handling; Review of regulatory & privacy components (including the Data Protection Act); Ethical implications of machine learning; Methodologies (e.g., KDD, SEMMA & CRISP-DM); Review of basic data exploration statistics.
Data Preprocessing: Data cleaning (i.e., handling missing values, outliers, noisy data); Data integration (i.e., entity integration problem, and handling of redundant, correlated, duplicated, and conflicting data); Data transformation (i.e., normalization, binning, log transformation, scaling); Data reduction (i.e., dimensionality reduction such as PCA and MCA, attribute subset selection, sampling).
Regression: Review of linear and multiple linear regression; Assessing the model’s accuracy; Model selection (i.e., AIC and BIC); Measuring predictors’ importance; Subset selection; Shrinkage methods.
Classification: Introduction to classification; Review of logistic regression; Review of k-nearest neighbours; Classification performance measures (e.g., confusion matrix, precision and recall, ROC curve).
Model Evaluation and Selection: Bias-variance trade-off; Curse of dimensionality; Evaluation methods (i.e., split validation, cross-validation, and bootstrap methods); Understanding, detecting and handling (massive) class imbalance.
Unsupervised Learning: Introduction to unsupervised learning; Notions of distance and similarity; Partitioning methods (e.g., k-Means, k-Medoids); Plotting and understanding clusters; Cluster evaluation metrics (i.e., Davies-Bouldin index, silhouette coefficient).
Tree-Based Models: Decision trees; Regression and classification trees; Node purity; Pruning.
Ensemble Models: Bagging; Random Forest; Boosting; Stacking.
Naïve Bayes Classification: Introduction to Naïve Bayes; Bayes' theorem; Maximum a posteriori hypothesis; Class conditional independence; Naïve Bayes classifier.
Introduction to Artificial Neural Networks: Feedforward neural network architecture; Sigmoid activation function; Backpropagation.
Introduction to Deep Learning: Introduction to deep learning; Deep feedforward networks; Recurrent and recursive neural networks; Evaluation of deep learning.
Text Analysis: Text tokenization; Text normalization; Feature extraction (e.g., bag-of-words model, TF-IDF model); Sentiment analysis.
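
As an illustration of the classification performance measures listed under Classification above, the following is a minimal Python sketch that derives confusion-matrix counts, precision and recall for a binary problem. The function names and the sample labels are hypothetical and are not part of the module specification.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def precision_recall(y_true, y_pred, positive=1):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    c = confusion_counts(y_true, y_pred, positive)
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In lab work the same quantities would more likely be obtained from a library such as scikit-learn; the hand-written version is shown only to make the definitions explicit.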

Assessment Breakdown	%
Coursework	100.00%

Assessments

Full Time

Coursework

Assessment Type:	Formative Assessment	% of total:	Non-Marked
Assessment Date:	n/a	Outcome addressed:	1,2,3,4,5
Non-Marked:	Yes
Assessment Description: Formative feedback will be provided on in-class individual or group activities, in written or oral format or online through Moodle. In addition, in-class discussions will be undertaken as part of the practical approach to learning.

Assessment Type:	Assignment	% of total:	25
Assessment Date:	Week 4	Outcome addressed:	2,3
Non-Marked:	No
Assessment Description: Learners may be provided with one or more datasets and will be required to apply suitable data cleaning, pre-processing and transformation operations on different attributes of the datasets. In addition, learners will be required to identify and discuss ethical implications of handling and applying machine learning to these datasets.

Assessment Type:	Project	% of total:	75
Assessment Date:	Week 12	Outcome addressed:	1,3,4,5
Non-Marked:	No
Assessment Description: This assessment will evaluate learners’ comprehension of fundamental machine learning theory and concepts, and of their applicability and limitations for different problems. Learners will have to (1) identify a topic of interest and one relevant research or business question in that topic; (2) select at least two datasets useful to answer the question; (3) apply data pre-processing and transformation techniques to prepare the datasets for machine learning analysis; (4) perform exploratory analysis on these datasets; (5) apply, evaluate and optimize suitable machine learning techniques to extract knowledge from the selected datasets that is useful for a decision-making process in the topic of choice; (6) report and interpret the findings to answer the question of interest; and (7) produce a video presentation highlighting the project’s main objectives, methodology, main findings and challenges faced.
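
The pre-processing and evaluation steps of the project described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed approach: the helper names, the toy data, the choice of mean imputation, min-max scaling, a 1-nearest-neighbour classifier and a single hold-out split are all hypothetical choices made for brevity.

```python
import math
import random


def impute_mean(rows):
    """Replace missing values (None) with the mean of their column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        present = [r[j] for r in rows if r[j] is not None]
        means.append(sum(present) / len(present))
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)]
            for r in rows]


def min_max_scale(rows):
    """Rescale every column to the range [0, 1]."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(r, lows, highs)]
            for r in rows]


def predict_1nn(train_x, train_y, x):
    """Return the label of the closest training point (Euclidean distance)."""
    dists = [math.dist(t, x) for t in train_x]
    return train_y[dists.index(min(dists))]


def holdout_accuracy(x, y, test_fraction=0.25, seed=0):
    """Shuffle, hold out a fraction of rows, and score 1-NN on them."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(len(idx) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    train_x = [x[i] for i in train_idx]
    train_y = [y[i] for i in train_idx]
    correct = sum(predict_1nn(train_x, train_y, x[i]) == y[i] for i in test_idx)
    return correct / n_test


# Toy data (hypothetical): two well-separated groups, one missing value.
raw = [[1.0, 2.0], [1.2, None], [0.9, 2.1], [1.1, 1.9],
       [8.0, 9.0], [8.2, 9.1], [7.9, 8.8], [8.1, 9.2]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# For brevity the scaler is fitted on all rows; in a real project it should
# be fitted on the training split only, to avoid leaking test information.
features = min_max_scale(impute_mean(raw))
accuracy = holdout_accuracy(features, labels)
print(f"hold-out accuracy: {accuracy:.2f}")
```

A full project would replace the single hold-out split with cross-validation and compare several techniques, as the assessment description requires.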

No End of Module Assessment

No Workplace Assessment

Reassessment Requirement
Coursework Only This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.

NCIRL reserves the right to alter the nature and timings of assessment

Module Workload

Module Target Workload Hours 0 Hours

Workload: Full Time
Workload Type	Workload Description	Hours	Frequency	Average Weekly Learner Workload
Lecture	On-line/Classroom activities	24	Per Semester	2.00
Tutorial	Practical & Tutorial activities	36	Per Semester	3.00
Independent Learning	Independent Learning activities	190	Per Semester	15.83
Total Weekly Contact Hours				5.00
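
The weekly averages in the table above are consistent with dividing each semester total by 12 teaching weeks. The 12-week figure is an inference from the numbers (for example, 24 hours giving 2.00 per week) and is not stated in this descriptor; the short Python check below makes that assumption explicit.

```python
# Assumption: a 12-week semester, inferred from the table (24 h -> 2.00 h/week).
SEMESTER_WEEKS = 12

# Semester totals copied from the workload table.
workload_hours = {
    "Lecture": 24,
    "Tutorial": 36,
    "Independent Learning": 190,
}

# Average weekly learner workload, rounded to two decimals as in the table.
weekly = {name: round(hours / SEMESTER_WEEKS, 2)
          for name, hours in workload_hours.items()}

# Contact hours cover lectures and tutorials only.
contact_weekly = weekly["Lecture"] + weekly["Tutorial"]
total_hours = sum(workload_hours.values())

for name, value in weekly.items():
    print(f"{name}: {value:.2f} hours/week")
print(f"Total weekly contact hours: {contact_weekly:.2f}")
print(f"Total semester workload: {total_hours} hours")
```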

Module Resources

Recommended Book Resources
Ethem Alpaydin. (2020), Introduction to Machine Learning, 4th ed., MIT Press, Cambridge, MA, p.712, [ISBN: 978-0262043793].
Shai Shalev-Shwartz, Shai Ben-David. (2015), Understanding Machine Learning, 2nd. Cambridge University Press, New York, NY, p.397, [ISBN: 978-1107512825].
Sebastian Raschka, Vahid Mirjalili. (2019), Python Machine Learning, Packt Publishing, Birmingham, p.770, [ISBN: 978-1789955750].
Supplementary Book Resources
Kartik Hosanagar. (2019), A Human's Guide to Machine Intelligence, Penguin, London, p.272, [ISBN: 9780525560890].
Trevor Hastie, Robert Tibshirani, Jerome Friedman. (2017), The Elements of Statistical Learning, 2nd ed., Springer, New York, NY, p.767, [ISBN: 978-0387848570].
John D. Kelleher, Brian Mac Namee, Aoife D'Arcy. (2020), Fundamentals of Machine Learning for Predictive Data Analytics, 2nd ed., MIT Press, Cambridge, MA, p.856, [ISBN: 9780262044691].
Wes McKinney. (2017), Python for Data Analysis, O'Reilly Media, Sebastopol, CA, p.550, [ISBN: 978-1491957660].
Aurélien Géron. (2019), Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, O'Reilly, Sebastopol, CA, p.819, [ISBN: 978-1492032649].
Charu C. Aggarwal. (2019), Neural Networks and Deep Learning, Springer, p.524, [ISBN: 978-3030068561].
Ian Goodfellow, Yoshua Bengio, Aaron Courville. (2016), Deep Learning, MIT Press, Cambridge, MA, p.775, [ISBN: 978-0262035613].
Dipanjan Sarkar. (2016), Text Analytics with Python, Apress, Bangalore, p.385, [ISBN: 978-1484223871].
Benjamin Bengfort, Tony Ojeda, Rebecca Bilbro. (2018), Applied Text Analysis with Python, O'Reilly Media, Sebastopol, CA, p.310, [ISBN: 978-1491963043].
Valentina E. Balas, Sanjiban S. Roy, Dharmendra Sharma, Pijush Samui. (2019), Handbook of Deep Learning Applications, Springer, Cham, p.383, [ISBN: 978-3-030-11478-7].
Recommended Article/Paper Resources
Lipton, Z. C. & Steinhardt, J. (2019), Troubling trends in machine learning scholarship, Queue, 17, p.80, https://doi.org/10.1145/3317287.3328534
Raschka, S., Patterson, J., & Nolet, C. (2020), Machine learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence, Information, 11, p.193, https://doi.org/10.3390/info11040193
Supplementary Article/Paper Resources
Joshi, A. V. (2020), Emerging trends in machine learning, Machine Learning and Artificial Intelligence, p.12713, https://doi.org/10.1007/978-3-030-26622-6_13
Jordan, M. I. & Mitchell, T. M. (2015), Machine learning: Trends, perspectives, and prospects, Science, 349, p.25526, https://doi.org/10.1126/science.aaa8415
Other Resources
[Website], Machine Learning Repository, Center for Machine Learning and Intelligent Systems, https://archive.ics.uci.edu/ml/index.php
[Website], Kaggle platform for predictive modelling competitions, https://www.kaggle.com
[Website], Central Statistics Office, http://www.cso.ie
[Website], Eurostat, http://ec.europa.eu/eurostat
[Website], Data.gov, https://www.data.gov
[Website], Google Dataset Search, https://datasetsearch.research.google.com/
[Website], Google Cloud Public Datasets, https://cloud.google.com/public-datasets/
