Module Code: |
H8TA |
Long Title
|
Text Analytics
|
Title
|
Text Analytics
|
Module Level: |
LEVEL 8 |
EQF Level: |
6 |
EHEA Level: |
First Cycle |
Module Author: |
Isabel O'Connor |
Departments: |
School of Computing
|
Specifications of the qualifications and experience required of staff |
Master’s / PhD degree in a computing or cognate discipline. May have industry experience also.
|
Learning Outcomes |
On successful completion of this module the learner will be able to: |
# |
Learning Outcome Description |
LO1 |
Rationalise and defend methodological choices in pre-processing methods for text analytics |
LO2 |
Build and critically evaluate text analytics models in a variety of contexts |
LO3 |
Execute and document corpus-based case studies |
LO4 |
Evaluate and discuss the impact of machine learning models applied to text corpora |
Dependencies |
Module Recommendations
This is prior learning (or a practical skill) that is required before enrolment on this module. While the prior learning is expressed as named NCI module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).
|
No recommendations listed |
Co-requisite Modules
|
No Co-requisite modules listed |
Entry requirements |
Learners should have attained the knowledge, skills and competence gained from stage 3 of the BSc (Hons) in Data Science
|
Module Content & Assessment
Indicative Content |
Introduction
Introduction to: Text Analytics; Key Domains, Methods and Ethics; software libraries / packages and Web APIs, e.g. NLTK, LIWC, S4, GATE, AlchemyAPI, Natural Language API, Mallet, tm/tidyverse/tidytext, etc.
|
Vector and Document Spaces
Elementary Methods: Bag(s) of Words; N-grams; Document and Language Classification via Vector Spaces and the Zipfian Distribution; Dictionary-based approaches.
|
Vector and Document Spaces
Vector Spaces: Term Document / Document Term Matrices, TF-IDF, Word2Vec, Doc2Vec
|
Text Understanding and Semantics
Topic Modelling: Latent Dirichlet Allocation; Explicit Semantic Analysis; Latent Semantic Analysis; Hierarchical Dirichlet Process; and associated methods, e.g. Singular Value Decomposition and Non-negative Matrix Factorisation.
|
Text Understanding and Semantics
Part-of-Speech Tagging, Entity Extraction / Identification, SPARQL and Linked Data, Aspect-based Reasoning
|
Knowledge Graphs and Network Analysis
Introduction to graph-based models for document corpora, Introduction to network analysis for graph-based models
|
Computational Linguistics
Interrogating structure, intent and language use independent of content. Key use cases: Affect Analysis; Deception Detection; Psychometric Profiling; Author Fingerprinting.
|
Applied Machine Learning
Case Studies in applying (un)supervised machine and/or deep learning to text analytics.
|
Assessment Breakdown | % |
Coursework | 100.00% |
Assessments: Full Time
Coursework |
Assessment Type: |
Continuous Assessment |
% of total: |
Non-Marked |
Assessment Date: |
n/a |
Outcome addressed: |
1,2,3,4 |
Non-Marked: |
Yes |
Assessment Description: Ongoing independent and group problem solving activities and feedback. |
|
Assessment Type: |
Project |
% of total: |
50 |
Assessment Date: |
n/a |
Outcome addressed: |
1,2 |
Non-Marked: |
No |
Assessment Description: Students will submit a report (4000 words) on a case study in which they apply 3 methods covered in the first 6 teaching weeks, as outlined in the indicative content above. The report should discuss the preparation of the corpora for each method, and rationalise the use and effectiveness of each method applied. It should also discuss related work in the area, covering the context of the text data as well as studies applied to similar data sets. |
|
Assessment Type: |
Project |
% of total: |
50 |
Assessment Date: |
n/a |
Outcome addressed: |
3,4 |
Non-Marked: |
No |
Assessment Description: Students will submit a report (4000 words) on a case study in which they apply a further 2 methods from teaching weeks 7-10 and a further 2 methods not yet included, in conjunction with a selection of machine learning models: at least 1 unsupervised and at least 1 supervised. The report should discuss the preparation of the corpora for each method, and rationalise the use and effectiveness of each method applied. It should also discuss related work in the area, covering the context of the text data as well as studies applied to similar data sets. |
|
No End of Module Assessment |
Reassessment Requirement |
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.
|
Reassessment Description: Should learners not achieve a 40% pass mark, they will either sit a repeat terminal exam or undertake an assessment that assesses all learning outcomes.
|
NCIRL reserves the right to alter the nature and timings of assessment.
Module Workload
Module Target Workload Hours 0 Hours |
Workload: Full Time |
| Workload Type | Workload Description | Hours | Frequency | Average Weekly Learner Workload |
|---|---|---|---|---|
| Lecture | Classroom & Demonstrations (hours) | 24 | Per Semester | 2.00 |
| Tutorial | Other hours (Practical/Tutorial) | 24 | Per Semester | 2.00 |
| Independent Learning | Independent learning (hours) | 202 | Per Semester | 16.83 |
| Total Weekly Contact Hours | | | | 4.00 |
Module Resources
Recommended Book Resources |
---|
-
Bird, S., Klein, E. & Loper, E. (2009), Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, O’Reilly.
-
Goldberg, Y. (2017), Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies), Morgan & Claypool Publishers.
-
Silge, J. & Robinson, D. (2017), Text Mining with R: A Tidy Approach, O’Reilly.
-
Rodrigues, M. & Teixeira, A. (2015), Advanced Applications of Natural Language Processing for Performing Information Extraction, Springer.
| Supplementary Book Resources |
---|
-
Biemann, C. & Mehler, A. (2014), Text Mining, Springer.
-
Pennebaker, J. (2013), The Secret Life of Pronouns: What Our Words Say About Us, Bloomsbury Press.
-
Sarkar, D. (2016), Text Analytics with Python: A Practical Real-World Approach to Gaining Actionable Insights from your Data, Apress.
-
Wachsmuth, H. (2015), Text Analysis Pipelines, Springer.
| This module does not have any article/paper resources |
---|