HEALTH DATA ANALYTICS

Vincenzo MATTA

0622900031
DIPARTIMENTO DI INGEGNERIA DELL'INFORMAZIONE ED ELETTRICA E MATEMATICA APPLICATA
EQF7
DIGITAL HEALTH AND BIOINFORMATIC ENGINEERING
2021/2022

COMPULSORY
YEAR OF COURSE 1
YEAR OF DIDACTIC SYSTEM 2018
SPRING SEMESTER
CFU  HOURS  ACTIVITY
1. HEALTH DATA ANALYTICS MOD.1
2    16     LESSONS
1    8      LAB
2. HEALTH DATA ANALYTICS MOD.2
2    16     LESSONS
1    8      LAB
Objectives
THE COURSE PROVIDES BASIC METHODOLOGICAL AND TECHNOLOGICAL TOOLS FOR DATA ANALYSIS, STATISTICAL INFERENCE AND SIGNIFICANCE ASSESSMENT OF CLINICAL/BIOMEDICAL INFORMATION.

KNOWLEDGE AND UNDERSTANDING
- METHODOLOGIES FOR THE ANALYSIS OF CLINICAL/BIOMEDICAL DATA (REGRESSION AND STATISTICAL INFERENCE TECHNIQUES; HYPOTHESIS TESTS AND DECISION STRATEGIES; CLUSTERING ALGORITHMS).
- ARCHITECTURES OF DATA-ANALYSIS AND DECISION-SUPPORT SYSTEMS, E.G., OLAP (ONLINE ANALYTICAL PROCESSING), COMMONLY ADOPTED IN CLINICAL/BIOMEDICAL APPLICATIONS.

APPLYING KNOWLEDGE AND UNDERSTANDING
- EXTRACTING USEFUL INFORMATION FROM CLINICAL/BIOMEDICAL DATASETS BY APPLYING THE DATA ANALYSIS TECHNIQUES ILLUSTRATED DURING THE COURSE.
- MASTERING SOFTWARE FRAMEWORKS AND TOOLS FOR THE ANALYSIS OF CLINICAL/BIOMEDICAL DATASETS.
- DESIGNING OLAP SOLUTIONS TO SUPPORT DECISIONS IN CLINICAL/BIOMEDICAL APPLICATIONS.
Prerequisites
FUNDAMENTALS OF PROBABILITY AND PROGRAMMING.
Contents
- INTRODUCTION (LECTURE/PRACTICE/LABORATORY HOURS: 2/0/0)
REGRESSION VS. DECISION. PREDICTION VS. INFERENCE. PARAMETRIC VS. NON-PARAMETRIC METHODS. SUPERVISED VS. UNSUPERVISED METHODS.

- REGRESSION (LECTURE/PRACTICE/LABORATORY HOURS: 8/3/4)
LINEAR REGRESSION. SUBSET SELECTION. SHRINKAGE (RIDGE, LASSO). PRINCIPAL COMPONENT REGRESSION. HIGH-DIMENSIONAL DATA. CROSS-VALIDATION. LOCAL AVERAGING (E.G., NEAREST-NEIGHBOR, NAÏVE-KERNEL). APPLICATION OF REGRESSION AND STATISTICAL INFERENCE TO CLINICAL/BIOMEDICAL DATA.

- DECISION (LECTURE/PRACTICE/LABORATORY HOURS: 6/3/4)
HYPOTHESIS TESTS. NAÏVE-BAYES. GRADIENT DESCENT AND STOCHASTIC GRADIENT DESCENT ALGORITHMS. LOGISTIC REGRESSION. APPLICATION OF DECISION STRATEGIES AND STATISTICAL SIGNIFICANCE ASSESSMENT IN RELATION TO CLINICAL/BIOMEDICAL DATA.

- CLUSTERING (LECTURE/PRACTICE/LABORATORY HOURS: 6/2/4)
K-MEANS ALGORITHM. HIERARCHICAL CLUSTERING. EXPECTATION-MAXIMIZATION ALGORITHM. DBSCAN ALGORITHM. PRINCIPAL COMPONENT ANALYSIS. APPLICATION OF CLUSTERING TECHNIQUES TO CLINICAL/BIOMEDICAL DATA.

- HEALTH DATA ANALYTICS SYSTEMS (LECTURE/PRACTICE/LABORATORY HOURS: 4/0/2)
SYSTEM ARCHITECTURES FOR DATA ANALYSIS AND DECISION SUPPORT IN THE CLINICAL/BIOMEDICAL CONTEXT. OVERVIEW OF STANDARD TECHNOLOGIES FOR HEALTH DATA ANALYTICS (E.G., OLAP). FRAMEWORKS RELEVANT TO DISTRIBUTED IMPLEMENTATIONS (E.G., APACHE SPARK).

TOTAL LECTURE/PRACTICE/LABORATORY HOURS: 26/8/14

- SOFTWARE TOOLS
R; MATLAB; PYTHON; APACHE SPARK.
Teaching Methods
THE COURSE INCLUDES THEORETICAL LECTURES, CLASSROOM EXERCISES, AND THE USE OF SOFTWARE TOOLS FOR DATA ANALYSIS.
Verification of learning
ACHIEVEMENT OF THE LEARNING OUTCOMES WILL BE ASSESSED THROUGH A PROJECT ON THE ANALYSIS OF CLINICAL/BIOMEDICAL DATA.
Texts
AN INTRODUCTION TO STATISTICAL LEARNING, G. JAMES, D. WITTEN, T. HASTIE, R. TIBSHIRANI, SPRINGER, 2013.

STATISTICS FOR HEALTH DATA SCIENCE, R. ETZIONI, M. MANDEL, AND R. GULATI, SPRINGER, 2021.

SUPPLEMENTARY TEACHING MATERIAL WILL BE AVAILABLE ON THE UNIVERSITY E-LEARNING PLATFORM (HTTP://ELEARNING.UNISA.IT) ACCESSIBLE TO STUDENTS USING THEIR OWN UNIVERSITY CREDENTIALS.
More Information
THE COURSE IS HELD IN ENGLISH.
Data source: ESSE3 [Last synchronization: 2022-11-21]