MASTER OF SCIENCE (DATA SCIENCE AND ANALYTICS)

This programme was introduced in the 2017/18 academic year to meet the ever-growing demand for professional practitioners in the field of Big Data Analytics.

The goal of this programme is to produce a workforce of professional practitioners in the field of Big Data Analytics who are capable of making the right decisions based on the availability of comprehensive data. Therefore, the educational objective of this programme is to produce computing practitioners who:

[PEO1]   have advanced knowledge in the field of Data Science and Analytics and are capable of adopting the best methodologies, tools and techniques to provide innovative solutions across various sectors.

[PEO2]   have leadership skills, and are able to communicate as well as interact effectively with diverse stakeholders.

[PEO3]   have positive attitudes, lifelong-learning capabilities and an entrepreneurial mind-set for a successful career.

[PEO4]   uphold and defend ethical and professional practices in maintaining self and professional integrity.


  • Credit requirements: 44 units                                  

    (i)    Core Courses (Taught Courses): 24 units (Code: T)

    (a) CDS501/4 – Principles and Practices of Data Science and Analytics
    (b) CDS502/4 – Big Data Storage and Management
    (c) CDS503/4 – Machine Learning
    (d) CDS504/4 – Enabling Technologies and Infrastructures for Big Data
    (e) CDS505/4 – Data Visualisation and Visual Analytics
    (f) CDS506/4 – Research, Consultancy and Professional Skills

     

    (ii)   Elective Courses: 12 units (Code: E)

           Choose any three (3) courses from those listed below:

    Business Analytics
    (a) CDS511/4 – Consumer Behavioural and Social Media Analytics
    (b) CDS512/4 – Business Intelligence and Decision Analytics
    (c) CDS513/4 – Predictive Business Analytics

    Multimodal Analytics
    (d) CDS521/4 – Multimodal Information Retrieval
    (e) CDS522/4 – Text and Speech Analytics
    (f) CDS523/4 – Forensic Analytics and Digital Investigations

     (iii)  Project (Core): 8 units (Code: T)

             CDS590/8 – Consultancy Project and Practicum

    This experiential, work-based learning course prepares students to be data scientists or analytics consultants by enhancing their knowledge and skills in the research, planning and implementation of a consultancy project in the field of data science/analytics, which can be applied to real-life situations. Students are required to complete the practicum at their respective workplaces or at their chosen/assigned organisations. Students work under the supervision of a lecturer and an industry supervisor. They are required to solve a real-world problem or tap opportunities related to data science and analytics during their practicum.

     The prerequisite for this course is CDS506, which must be taken in the preceding semester. Students are required to secure a practicum placement together with a project proposal during CDS506.

    At the end of this course, the students will be able to:

    • Devise a solution to a real-world problem using appropriate data science techniques.
    • Communicate the progress and achievements of the practicum effectively in oral form.
    • Work collaboratively in a multi-ethnic environment with superiors, colleagues, staff and supervisors.
    • Display professional behaviours such as trustworthiness, honesty and compliance with predefined workplace policies.
    • Display confidence and the ability to overcome challenges in completing the project and practicum.
    • Perform project tasks with proper planning to meet project milestones.
    • Display a high level of responsibility and accountability to lead the project independently.
  • The programme is offered on a full-time basis with a minimum period of candidature of three (3) semesters and a maximum of six (6) semesters. The study schemes are as follows:

    1.5 years (applicable to full-time study scheme only):

    [Figure: 1.5-year study scheme]

    2 years (applicable to full-time study scheme only):

    [Figure: 2-year study scheme]

    2.5 years (applicable to full-time study scheme only):

    [Figure: 2.5-year study scheme]

    Course offering is given in the table below:

    [Table: course offering]

  • CDS501/4 – Principles and Practices of Data Science and Analytics

    This course introduces the basic goals and techniques of the data science and analytics process, together with some theoretical foundations, including useful statistical and machine learning concepts, so that the process can transform hypotheses and data into actionable predictions. The course provides basic principles for the important steps of the process, which include collecting, curating and analysing data, building predictive models, and reporting and presenting results to audiences of all levels. The R programming language and statistical analysis techniques are introduced through examples from areas such as marketing, business intelligence and decision support.

    At the end of this course, the students will be able to:

    • Organize effectively all the necessary steps in any real-world data science and analytics project.
    • Adapt the R programming language and useful statistical and machine learning techniques in data science and analytics projects.
    • Practice all the skills needed by the data scientist, which include acquiring the data, managing the data, choosing the modelling technique, writing the code, and verifying and presenting the results.

    CDS502/4 – Big Data Storage and Management

    Storing and managing big data raises different issues from those of conventional databases. Big data involves huge amounts of data (volume), supports heterogeneous data formats (variety) and can be accessed at high speed (velocity). The course covers the fundamentals of big data storage and management and related issues. Understanding the various storage infrastructures involves understanding technologies ranging from traditional storage to cloud-based storage. The course provides exposure to recent technologies for manipulating, storing and analyzing big data. These technologies include, but are not limited to, Hadoop, MongoDB and Apache Cassandra.

    At the end of this course, the students will be able to:

    • Compare the various data storage infrastructures, advanced concepts and technologies.
    • Build a database to support big data using a related big data storage system.
    • Identify and master the principles of modern and traditional approaches to storing and managing large data.

    CDS503/4 – Machine Learning

    Upon successful completion of the course, students will have a broad understanding of machine learning algorithms. Students will acquire skills in applying relevant machine learning techniques to address real-world problems, and will be able to adapt or combine key elements of existing machine learning algorithms. Topics covered in this course include supervised and unsupervised learning techniques, parametric and non-parametric methods, Bayesian learning, kernel machines and decision trees. The course will also discuss recent applications of machine learning. Students are expected to gain hands-on experience during labs and assignments that address practical challenges. An understanding of the current state of the art in machine learning is developed through a review of key research papers, allowing students to pursue further research in machine learning.

    At the end of this course, the students will be able to:

    • Apply relevant machine learning algorithms to typical real-world problems.
    • Manipulate machine learning algorithms which can be adapted to more complex scenarios.
    • Synthesize findings and recommendations.

     

    CDS504/4 – Enabling Technologies and Infrastructures for Big Data

    Data science is advancing the inductive conduct of science and is driven by big data available on the Internet. This course will explain the technologies and techniques to improve the access, security, and performance of big data processing and storage systems. This course will help students:

    • Acquire the necessary skills as an analyst for big data systems.
    • Identify the security aspects of the data and determine the appropriate measures to protect it.
    • Gain exposure to and training in designing basic infrastructure for big data applications, taking into account the sensitive nature of low-power edge devices.

    This course includes parallel and distributed processing, grid and cloud computing, big data tools, big data processing techniques, network infrastructure and architecture, network performance and security for big data.

    At the end of this course, the students will be able to:

    • Distinguish major concepts of data science, namely high-performance parallel and distributed computing, computing with emerging technologies, and network performance.
    • Identify the needs and issues in big data security in order to protect sensitive data and apply suitable access controls.
    • Design a cloud platform and efficient techniques that can support end-users running latency-sensitive big data applications on low-powered edge devices.

    CDS505/4 - Data Visualisation and Visual Analytics

    This course discusses the use of computer-supported, interactive and visual representations of data to amplify cognition, help people reason effectively about information, find patterns and meaning in the data, and easily explore datasets from different perspectives, particularly in data-intensive environments. The course covers techniques from two branches of visual representation of data, namely data visualisation and visual analytics. In data visualisation, the course covers scientific visualisation techniques (representations of empirically gathered scientific datasets) such as contours, isosurfaces and volume rendering, as well as specific techniques in information visualisation (representations of abstract datasets), which include tables, networks and trees, and map colour. In visual analytics, a visualisation process features a significant amount of computational analysis and human-computer interaction. The topics covered in this part of the course therefore include view manipulation, multiple views, reduction of items and attributes, and focus + context, as well as analysis case studies involving a visualisation system or tool.

    At the end of this course, the students will be able to:

    • Select the right visualisation techniques for any given problem or application.
    • Adapt visualisation techniques for a particular application.
    • Apply several techniques, either by designing or developing specific visualisation techniques or by using existing tools.

    CDS506/4 - Research, Consultancy and Professional Skills

    The course provides the knowledge and effective skills required in research, consultancy and professional practice. The research section covers literature review, development of research questions, usage of theories, research design, data collection and related analysis techniques. In the consultancy section, students will be equipped with the mindset, tools and skills to provide effective consulting advice to clients. In the final section, professional issues and the ethical, legal and social aspects of conducting research and consultancy will also be discussed.

    At the end of this course, the students will be able to:

    • Combine theory and consultation techniques to effectively meet clients' needs.
    • Adapt a structured and effective research method in data science and analytics research.
    • Correlate professional issues inherent in research methods and consultancy.

    CDS511/4 - Consumer Behavioural and Social Media Analytics

    This course provides a broad, interdisciplinary treatment of research and practice in two areas: behaviour analytics, and web and social media analytics. Specifically, behaviour analytics concerns the process of systematically converting multimodal human behavioural cues (facial, speech, textual, etc.) into machine-readable form in order to model human behaviour automatically. The focus is on humans as consumers. This involves human-computer interaction (HCI), user behaviour modelling, computational models of emotions, and emotion sensing and recognition. Web and social media analytics concerns strategies for leveraging powerful social media data on customer needs, behaviour and preferences. Students will learn strategies for deriving insights from these data that are crucial for business decisions. Students will be encouraged to explore statistical, machine learning and analytical tools such as SPSS, R, WEKA, Google Analytics, TrueSocial Metrics and Clicky for analysis.

    It is worth noting that an understanding of the current state of the art in consumer behavioural and social media analytics is developed through a review of key research papers and book chapters, allowing students to pursue further research in this area if needed.

    At the end of this course, the students will be able to:

    • Distinguish the suitable metrics for assessing multimodal human behavioural cues from a consumer perspective.
    • Identify human behavioural cues across a variety of contexts with state-of-the-art tools to facilitate better interaction and decision making.
    • Construct predictive models (by extracting, analyzing and deriving insights) from the related web and social media data for data-informed decision-making within a business perspective.

    CDS512/4 - Business Intelligence and Decision Analytics

    The course will focus on the knowledge and skills needed to select, apply and evaluate business intelligence and decision analytics techniques that discover knowledge which can add value to a company. The course will also discuss innovative applications and exploitation of current techniques and approaches related to business intelligence and performance measurement, and mathematical models to facilitate the decision-making process in business and operations.

    At the end of this course, the students will be able to:

    • Elaborate concepts, technologies and theories related to business intelligence and decision analytics.
    • Integrate the use of different types of business intelligence models and tools, and decision analytics models, in solving various real-life problems.
    • Propose improvement strategies for enhancing business performance by applying business intelligence and decision analytics techniques.

    CDS513/4 - Predictive Business Analytics

    The course provides the theory behind predictive analytics, together with the methods, principles and techniques for conducting predictive business analytics projects. The course introduces the underlying algorithms as well as the principles and best practices that govern the art of predictive analytics and translate big data into meaningful, usable business information. The course also explores the tips and tricks that are essential for successful predictive modelling in areas such as business performance, the pharmaceutical industry, finance, accounting and organization management. The course takes a technology approach to addressing a big data analytics challenge by applying the concepts taught in the course in the context of the predictive analytics project lifecycle. Students will be exposed to a predictive business analytics tool.

    At the end of this course, the students will be able to:

    • Apply appropriate predictive business analytics techniques and tools to effectively interpret big data.
    • Revise and adapt insights that can lead to actionable results and pragmatic business solutions.
    • Construct a business challenge as a predictive business analytics challenge.

    CDS521/4 - Multimodal Information Retrieval

    This course provides the basic concepts, principles and applications of multimodal (text, image, video and audio) retrieval. It covers basic techniques for content processing, indexing, representation, ranking, querying and evaluation in multimodal information retrieval. In addition, advanced techniques such as large-scale retrieval, multimodal analysis and cross-media retrieval will be covered in current contexts such as mobile devices, social media and big data.

    At the end of this course, the students will be able to:

    • Summarize and criticize the state of the art of multimodal information retrieval.
    • Adapt the framework, models and techniques of multimodal information retrieval.
    • Solve problems in emerging multimodal applications using the learned techniques.

    CDS522/4 - Text and Speech Analytics

    A great deal of information resides in documents and speech. This information, however, is not directly usable because it is unstructured. The course focuses on the theory and applications of natural language processing and speech processing to retrieve linguistic knowledge from these sources. Linguistic knowledge of words, syntax and the semantics of sentences will be combined with machine learning algorithms and statistical approaches to find, organize, categorize, analyze and interpret unstructured and semi-structured text, allowing users to obtain advice for decision-making.

    At the end of this course, the students will be able to:

    • Describe basic concepts and algorithms in natural language and speech processing, for example tokenization, morphological analysis, n-grams, tagging, parsing, word sense disambiguation and decoding.
    • Manipulate natural language processing and speech processing approaches to obtain different levels of linguistic information such as word, sentence and semantics for text analytics.
    • Design custom solutions using natural language processing and speech processing techniques for text and speech analytics problems in organizations.

    CDS523/4 - Forensic Analytics and Digital Investigations

    This course introduces fundamental knowledge and techniques of computer forensics and digital investigations. Starting with an overview of the profession of the digital investigator, it explains issues in digital forensics and investigations on big data, and current practices for processing crime and incident scenes. Next, the principles of interpretation of evidence, ways of controlling and preserving evidence, and techniques for manual interpretation of raw binary data will be detailed. Students will learn advanced techniques in forensic investigations on big data: methods to identify big data evidence, collecting and analysing the data, the proper techniques to report and present forensic findings, and the proper way to act as an expert witness in reporting the results of investigations.

    In addition, technical and legal difficulties involved in searching, extracting, maintaining and storing digital evidence will be explained along with the legal implications of such investigations and the rules of legal procedure relevant to electronic evidence.

    At the end of this course, the students will be able to:

    • Conduct digital investigations that conform to accepted professional standards and are based on the investigative process: identification, preservation, examination, analysis and reporting.
    • Identify and document potential security breaches of computer data that suggest violations of legal, ethical, moral, policy and/or societal standards.
    • Master the principles and practices of big data forensics and digital investigations.
    • Access and critically evaluate relevant technical and legal information and emerging industry trends.

    CDS590/8 - Consultancy Project and Practicum

    This experiential, work-based learning course prepares students to be data scientists or analytics consultants by enhancing their knowledge and skills in the research, planning and implementation of a consultancy project in the field of data science/analytics, which can be applied to real-life situations. Students are required to complete the practicum at their respective workplaces or at their chosen/assigned organisations. Students work under the supervision of a lecturer and an industry supervisor. They are required to solve a real-world problem or tap opportunities related to data science and analytics during their practicum.

  • Outcome-Based Education (OBE) is a method of curriculum design, teaching and learning that emphasises the outcomes of the learning process: for instance, what current students have gained from the teaching and learning process and, for graduates, what they have gained a few years after they have left the university.

    The focus of OBE lies on the professional attributes of the graduates, where the motivation is to fulfil the enduring demands of the country and the global market.


    Programme Learning Outcomes

    At the end of this programme, the students will be able to:

    PLO1 (C1 - Knowledge & Understanding): integrate advanced knowledge related to current practices and research issues in Data Science and Analytics;

    PLO2 (C3A - Practical Skills): conduct standard approaches and apply practical skills, tools or investigative techniques which are at the forefront of Data Science and Analytics;

    PLO3 (C2 - Cognitive Skills): recommend innovative solutions and ideas that are at the forefront of developments in Data Science and Analytics;

    PLO4 (C3C - Communication Skills): communicate clearly the knowledge, skills, ideas, critique and rationale using appropriate methods to peers, experts and non-experts;

    PLO5 (C3B - Interpersonal Skills): work together and interact effectively with different people in learning and working communities and other groups and networks;

    PLO6 (C5 - Ethics & Professionalism): uphold professional and ethical practices in conducting research and delivering services related to the field of Data Science and Analytics;

    PLO7 (C4A - Personal Skills): exhibit capabilities to extend knowledge through life-long learning related to Data Science and Analytics;

    PLO8 (C4B - Entrepreneurship Skill): exhibit an entrepreneurial mind-set related to Data Science and Analytics;

    PLO9 (C3F - Leadership, Autonomy & Responsibility): demonstrate leadership, autonomy and responsibility in delivering services related to Data Science and Analytics;

    PLO10 (C3D - Digital Skill): competently use and adapt a wide range of suitable digital technologies and appropriate software to enhance professional practice in Data Science and Analytics; and

    PLO11 (C3E - Numeracy Skill): utilise numerical skills to acquire, interpret and extend knowledge in Data Science and Analytics.

    The following table provides the matrix for programme learning outcomes of this programme. 

    No.  |  Course Code/Unit  |  Course Title  |  Programme Learning Outcomes

    Programme Learning Outcomes: Knowledge & Understanding  |  Practical Skill  |  Cognitive Skill  |  Communication Skill  |  Interpersonal Skill  |  Ethics & Professionalism  |  Personal Skill  |  Entrepreneurship Skill  |  Leadership, Autonomy & Responsibility  |  Digital Skill  |  Numeracy Skill

    CORE COURSES

    1.  |  CDS501/4  |  Principles and Practices of Data Science and Analytics  |  √
    2.  |  CDS502/4  |  Big Data Storage and Management  |  √
    3.  |  CDS503/4  |  Machine Learning  |  √ √ √ √
    4.  |  CDS504/4  |  Enabling Technologies and Infrastructures for Big Data  |  √ √ √
    5.  |  CDS505/4  |  Data Visualisation and Visual Analytics  |  √ √ √ √
    6.  |  CDS506/4  |  Research, Consultancy and Professional Skills  |  √ √ √ √ √ √
    7.  |  CDS590/8  |  Consultancy Project and Practicum  |  √ √ √ √ √ √ √

    ELECTIVE COURSES

    8.  |  CDS511/4  |  Consumer Behavioural and Social Media Analytics  |  √ √ √ √
    9.  |  CDS512/4  |  Business Intelligence and Decision Analytics
    10.  |  CDS513/4  |  Predictive Business Analytics  |  √ √ √ √ √
    11.  |  CDS521/4  |  Multimodal Information Retrieval  |  √ √ √
    12.  |  CDS522/4  |  Text and Speech Analytics  |  √ √ √
    13.  |  CDS523/4  |  Forensic Analytics and Digital Investigations  |  √ √ √ √

School of Computer Sciences, Universiti Sains Malaysia, 11800 USM Penang, Malaysia
Tel: +604-653 3647 / 2158 / 2155  |  Fax: +604-653 3684