Module Code: |
H8MACL |
Long Title
|
Machine Learning
|
Title
|
Machine Learning
|
Module Level: |
LEVEL 8 |
EQF Level: |
6 |
EHEA Level: |
First Cycle |
Module Coordinator: |
Sophie Flanagan |
Module Author: |
ORLA LAHART |
Departments: |
School of Computing
|
Specifications of the qualifications and experience required of staff |
The lecturer must hold an MSc or PhD degree in computer science or a cognate discipline, and have experience in lecturing machine learning and coding in Python. The lecturer may also have industry experience. Lab assistants are required for tutorials; they should have experience in Python coding and a reasonable knowledge of machine learning techniques.
|
Learning Outcomes |
On successful completion of this module the learner will be able to: |
# | Learning Outcome Description |
LO1 | Comprehend, compare and contrast fundamental machine learning concepts and techniques |
LO2 | Comprehend and assess potential ethical implications of machine learning |
LO3 | Extract, transform, explore and clean data in preparation for machine learning |
LO4 | Build and evaluate machine learning models on various problem domains |
LO5 | Summarise, critique and present results from machine learning for decision-making |
Dependencies |
Module Recommendations
This is prior learning (or a practical skill) that is required before enrolment on this module. While the prior learning is expressed as named NCI module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).
|
No recommendations listed |
Co-requisite Modules
|
No Co-requisite modules listed |
Module Content & Assessment
Indicative Content |
Introduction and Ethics in Machine Learning
Forms of learning (Supervised, Unsupervised, Reinforcement)
Ethics in data sourcing & handling
Review of regulatory & privacy components (including the Data Protection Act)
Ethical implications of Machine Learning
Methodologies (e.g., KDD, SEMMA & CRISP-DM)
Review of basic data exploration statistics
|
Data Preprocessing
Data cleaning (i.e., handling missing values, outliers, noisy data)
Data integration (i.e., entity integration problem, and handling of redundant, correlated, duplicated, and conflicting data)
Data transformation (i.e., normalization, binning, log transformation, scaling)
Data reduction (i.e., dimensionality reduction like PCA and MCA, attribute subset selection, sampling)
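
As an illustration of the data transformation topics listed above, the following is a minimal Python sketch using only the standard library. The function names and the sample values are hypothetical and are not taken from the module materials; in practice a library such as scikit-learn or pandas would normally be used.

```python
import math

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative attribute."""
    return [math.log1p(v) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width intervals (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

# Hypothetical attribute with one extreme value.
incomes = [12, 15, 20, 22, 30, 250]
scaled = min_max_scale(incomes)
logged = log_transform(incomes)
bins = equal_width_bins(incomes, 3)
```

With a single extreme value, min-max scaling and equal-width binning squeeze the remaining values together, which is one reason a log transformation or outlier handling is often applied first.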
|
Regression
Review of linear and multiple linear regression
Assessing the model’s accuracy
Model selection (i.e., AIC and BIC)
Measuring predictors’ importance
Subset selection
Shrinkage methods
|
Classification
Introduction to classification
Review of logistic regression
Review of k-nearest neighbours
Classification performance measures (e.g., Confusion matrix, precision and recall, ROC curve)
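
The classification performance measures named above can be sketched for the binary case as follows. This is a minimal illustration under stated assumptions: the function names and the labels are hypothetical, and class 1 is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def precision_recall(tp, fp, fn):
    """Precision = tp / (tp + fp); recall = tp / (tp + fn)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical true labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall = precision_recall(tp, fp, fn)
```

Repeating the calculation over a range of decision thresholds yields the points from which an ROC curve is drawn.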
|
Model Evaluation and Selection
Bias-Variance trade-off
Curse of dimensionality
Evaluation methods (i.e., split validation, cross-validation, and bootstrap methods)
Understanding, detecting and handling (massive) class imbalance
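
To illustrate the cross-validation topic above, the sketch below splits sample indices into k folds without shuffling. The function name is hypothetical and the approach is a simplified assumption; library implementations typically also offer shuffling and stratification, which matter when classes are imbalanced.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs for k-fold CV."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    indices = list(range(n_samples))
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

# Each sample appears in exactly one test fold.
folds = k_fold_indices(10, 3)
```

A model would be trained on each `train` list and scored on the matching `test` list, with the k scores averaged to estimate generalization performance.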
|
Unsupervised Learning
Introduction to unsupervised learning
Notions of distance and similarity
Partitioning methods (e.g., k-Means, k-Medoids)
Plotting and understanding clusters
Cluster evaluation metrics (i.e., Davies-Bouldin index, silhouette coefficient)
|
Tree-Based Models
Decision Trees
Regression and classification trees
Node purity
Pruning
|
Ensemble Models
Bagging
Random Forest
Boosting
Stacking
|
Naïve Bayes Classification
Introduction to Naïve Bayes
Bayes' theorem
Maximum a posteriori hypothesis
Class conditional independence
Naïve Bayes classifier
|
Introduction to Artificial Neural Networks
Feedforward neural network architecture
Sigmoid activation function
Backpropagation
|
Introduction to Deep Learning
Introduction to deep learning
Deep feedforward networks
Recurrent and recursive neural networks
Evaluation of deep learning
|
Text Analysis
Text tokenization
Text normalization
Feature extraction (e.g., Bag of words model, TF-IDF model)
Sentiment analysis
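
The tokenization and feature extraction topics above can be illustrated with a small TF-IDF sketch. The assumptions are: whitespace tokenization with lower-casing, term frequency as a proportion of document length, and the unsmoothed inverse document frequency log(N / df). The function names and sample sentences are hypothetical, and library implementations often use smoothed variants that give different numbers.

```python
import math
from collections import Counter

def tokenize(text):
    """Very simple tokenizer: lower-case and split on whitespace."""
    return text.lower().split()

def tf_idf(docs):
    """Return one {term: weight} dictionary per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # bag-of-words counts for this document
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical mini-corpus.
docs = ["the film was great", "the film was dull", "great acting great plot"]
weights = tf_idf(docs)
```

Terms that occur in fewer documents, such as "dull" here, receive a higher weight than terms shared across documents, such as "the".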
|
Assessment Breakdown | % |
Coursework | 100.00% |
Assessments: Full Time
Coursework |
Assessment Type: |
Formative Assessment |
% of total: |
Non-Marked |
Assessment Date: |
n/a |
Outcome addressed: |
1,2,3,4,5 |
Non-Marked: |
Yes |
Assessment Description: Formative assessment will be provided on the in-class individual or group activities. Feedback will be provided in written or oral format, or online through Moodle. In addition, in-class discussions will be undertaken as part of the practical approach to learning. |
|
Assessment Type: |
Assignment |
% of total: |
25 |
Assessment Date: |
Week 4 |
Outcome addressed: |
2,3 |
Non-Marked: |
No |
Assessment Description: Learners may be provided with one or more datasets and will be required to apply suitable data cleaning, pre-processing and transformation operations on different attributes of the datasets. In addition, learners will be required to identify and discuss ethical implications of handling and applying machine learning to these datasets. |
|
Assessment Type: |
Project |
% of total: |
75 |
Assessment Date: |
Week 12 |
Outcome addressed: |
1,3,4,5 |
Non-Marked: |
No |
Assessment Description: This assessment will evaluate learners’ comprehension of fundamental machine learning theory and concepts, and of their applicability and limitations for different problems. Learners will have to (1) identify a topic of interest and one relevant research or business question in that topic; (2) select at least two datasets useful to answer the question; (3) apply data pre-processing and transformation techniques to prepare the datasets for machine learning analysis; (4) perform exploratory analysis on these datasets; (5) apply, evaluate and optimize suitable machine learning techniques to extract knowledge from the selected datasets that is useful for a decision-making process in the topic of choice; (6) report and interpret the findings to answer the question of interest; and (7) prepare a video presentation highlighting the project’s main objectives, methodology, main findings and challenges faced. |
|
No End of Module Assessment |
Reassessment Requirement |
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.
|
NCIRL reserves the right to alter the nature and timings of assessment
Module Workload
Module Target Workload Hours 0 Hours |
Workload: Full Time |
Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload |
Lecture | On-line/Classroom activities | 24 | Per Semester | 2.00 |
Tutorial | Practical & Tutorial activities | 36 | Per Semester | 3.00 |
Independent Learning | Independent Learning activities | 190 | Per Semester | 15.83 |
Total Weekly Contact Hours | 5.00 |
Module Resources
Recommended Book Resources |
---|
-
Ethem Alpaydin. (2020), Introduction to Machine Learning, 4th ed., MIT Press, Cambridge, MA, p.712, [ISBN: 978-0262043793].
-
Shai Shalev-Shwartz, Shai Ben-David. (2015), Understanding Machine Learning, 2nd. Cambridge University Press, New York, NY, p.397, [ISBN: 978-1107512825].
-
Sebastian Raschka, Vahid Mirjalili. (2019), Python Machine Learning, Packt Publishing, Birmingham, p.770, [ISBN: 978-1789955750].
| Supplementary Book Resources |
---|
-
Kartik Hosanagar. (2019), A Human's Guide to Machine Intelligence, Penguin, London, p.272, [ISBN: 9780525560890].
-
Trevor Hastie, Robert Tibshirani, Jerome Friedman. (2017), The Elements of Statistical Learning, 2nd ed., Springer, New York, NY, p.767, [ISBN: 978-0387848570].
-
John D. Kelleher, Brian Mac Namee, Aoife D'Arcy. (2020), Fundamentals of Machine Learning for Predictive Data Analytics, 2nd ed., MIT Press, Cambridge, MA, p.856, [ISBN: 9780262044691].
-
Wes McKinney. (2017), Python for Data Analysis, O'Reilly Media, Sebastopol, CA, p.550, [ISBN: 978-1491957660].
-
Aurélien Géron. (2019), Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, O'Reilly, Sebastopol, CA, p.819, [ISBN: 978-1492032649].
-
Charu C. Aggarwal. (2019), Neural Networks and Deep Learning, Springer, p.524, [ISBN: 978-3030068561].
-
Ian Goodfellow, Yoshua Bengio, Aaron Courville. (2016), Deep Learning, MIT Press, Cambridge, MA, p.775, [ISBN: 978-0262035613].
-
Dipanjan Sarkar. (2016), Text Analytics with Python, Apress, Bangalore, p.385, [ISBN: 978-1484223871].
-
Benjamin Bengfort, Tony Ojeda, Rebecca Bilbro. (2018), Applied Text Analysis with Python, O'Reilly Media, Sebastopol, CA, p.310, [ISBN: 978-1491963043].
-
Valentina E. Balas, Sanjiban S. Roy, Dharmendra Sharma, Pijush Samui. (2019), Handbook of Deep Learning Applications, Springer, Cham, p.383, [ISBN: 978-3-030-11478-7].
| Recommended Article/Paper Resources |
---|
-
Lipton, Z. C. & Steinhardt, J. (2019), Troubling trends in machine learning scholarship, Queue, 17, p.80,
-
Raschka, S., Patterson, J., & Nolet, C. (2020), Machine learning in Python: Main developments and technology trends in data science, machine learning, and artificial intelligence, Information, 11, p.193,
| Supplementary Article/Paper Resources |
---|
-
Joshi, A. V. (2020), Emerging trends in machine learning, Machine Learning and Artificial Intelligence, p.12713,
-
Jordan, M. I. & Mitchell, T. M. (2015), Machine learning: Trends, perspectives, and prospects, Science, 349, p.25526,
| Other Resources |
---|
-
[Website], Machine Learning Repository, Center for Machine Learning and Intelligent Systems,
-
[Website], Kaggle platform for predictive modelling competitions,
-
[Website], Central Statistics Office,
-
[Website], Eurostat,
-
[Website], Data.gov,
-
[Website], Google Dataset Search,
-
[Website], Google Cloud Public Datasets,
|
|