Fabio POSTIGLIONE | STATISTICAL DATA ANALYSIS
cod. 0622700059
DIPARTIMENTO DI INGEGNERIA DELL'INFORMAZIONE ED ELETTRICA E MATEMATICA APPLICATA | |
EQF7 | |
COMPUTER ENGINEERING | |
2019/2020 |
COMPULSORY | |
YEAR OF COURSE 1 | |
YEAR OF DIDACTIC SYSTEM 2017 | |
SECOND SEMESTER |
SSD | CFU | HOURS | ACTIVITY |
---|---|---|---|
STATISTICAL DATA ANALYSIS | | | |
SECS-S/02 | 3 | 24 | LESSONS |
SECS-S/02 | 3 | 24 | EXERCISES |
STATISTICAL DATA ANALYSIS | | | |
ING-INF/03 | 1 | 8 | EXERCISES |
ING-INF/03 | 2 | 16 | LESSONS |
Objectives | |
---|---|
THE COURSE HAS A TWOFOLD PURPOSE: I) ILLUSTRATING THE MAIN METHODOLOGIES OF INTEREST FOR STATISTICAL DATA ANALYSIS; II) APPLYING THESE METHODOLOGIES TO RELEVANT PRACTICAL PROBLEMS, USING TOOLS COMMONLY EMPLOYED FOR STATISTICAL ANALYSIS, DATA VISUALIZATION AND PROCESSING. KNOWLEDGE AND UNDERSTANDING. • ACQUISITION OF THE MAIN TECHNIQUES FOR STATISTICAL INFERENCE AND DATA ANALYSIS. • PARAMETRIC VS. NONPARAMETRIC APPROACHES; SUPERVISED VS. UNSUPERVISED APPROACHES. • ACQUISITION OF THE MAIN TECHNIQUES AND TOOLS FOR BIG DATA ANALYSIS. APPLYING KNOWLEDGE AND UNDERSTANDING. • ABILITY TO APPLY THE MAIN TECHNIQUES FOR STATISTICAL INFERENCE AND DATA ANALYSIS TO PRACTICAL PROBLEMS (E.G., SOCIAL OR BIOMEDICAL DATA). • ABILITY TO EXAMINE BIG DATA ARRANGED IN RATHER COMPLEX AND/OR HETEROGENEOUS STRUCTURES. • ABILITY TO USE SOFTWARE (E.G., R, MATLAB) FOR STATISTICAL DATA ANALYSIS, DATA VISUALIZATION AND PROCESSING. • ABILITY TO USE TOOLS OF PRACTICAL INTEREST FOR DATA ANALYTICS (E.G., APACHE SPARK). |
Prerequisites | |
---|---|
PREREQUISITES: SUITABLE KNOWLEDGE OF MATHEMATICS AND FUNDAMENTALS OF PROBABILITY AND STATISTICS. |
Contents | |
---|---|
- FUNDAMENTALS OF STATISTICS (HOURS FOR LECTURES/EXERCISES: 7/3) STATISTICAL INFERENCE, PARAMETRIC METHODS, MAXIMUM LIKELIHOOD. DECISION THEORY. BAYESIAN APPROACH. - DATA NORMALIZATION. WHITENING (1/1) - INTRODUCTION TO SUPERVISED LEARNING AND LINEAR MODELS (6/3) MULTIPLE LINEAR REGRESSION. GENERALIZED LINEAR MODELS. - CLASSIFICATION (11/5) LOGISTIC REGRESSION. LINEAR DISCRIMINANT ANALYSIS. BAYESIAN FORMULATION OF REGRESSION/CLASSIFICATION. BIAS AND VARIANCE. NAÏVE BAYES. NONPARAMETRIC SUPERVISED APPROACHES. EXAMPLES: NAÏVE-KERNEL, NEAREST-NEIGHBOR AND K-NEAREST-NEIGHBOR. - RESAMPLING (2/1) CROSS-VALIDATION (LOO, K-FOLD). BOOTSTRAP. - LINEAR MODEL SELECTION AND REGULARIZATION (9/3) STEPWISE SELECTION. RIDGE REGRESSION. LASSO. DIMENSIONALITY REDUCTION. PRINCIPAL COMPONENT REGRESSION. EXTENSION TO HIGH-DIMENSIONAL DATA. SPARSITY-AWARE METHODS FOR BIG DATA ANALYTICS. - GENERALIZED ADDITIVE MODELS AND TREE-BASED METHODS (HOURS: LESSONS/EXERCISES/LABORATORY 1/0/0) - SUPPORT VECTOR MACHINES (1/1) - UNSUPERVISED LEARNING (11/5) PRINCIPAL COMPONENTS ANALYSIS. CENTROID-BASED CLUSTERING: K-MEANS. HIERARCHICAL CLUSTERING. OTHER EXAMPLES OF CLUSTERING. GAUSSIAN MIXTURES AND THE EXPECTATION-MAXIMIZATION ALGORITHM. DENSITY-BASED CLUSTERING: DBSCAN. - NONPARAMETRIC STATISTICS AND INTRODUCTION TO FUNCTIONAL DATA ANALYSIS (2/0) - SOFTWARE AND TOOLS: R, MATLAB, APACHE SPARK. |
Teaching Methods | |
---|---|
THE COURSE INCLUDES THEORETICAL LECTURES AND CLASSROOM EXERCISES, SOME OF WHICH ARE CARRIED OUT ON COMPUTERS. |
Verification of learning | |
---|---|
THE FINAL EXAM CONSISTS OF THE DISCUSSION OF A PROJECT, AIMED AT EVALUATING: THE KNOWLEDGE AND UNDERSTANDING OF THE CONCEPTS PRESENTED DURING THE COURSE; THE ABILITY TO SOLVE STATISTICAL DATA ANALYSIS PROBLEMS BY APPLYING THE METHODS AND TOOLS ILLUSTRATED DURING THE COURSE. PERSONAL JUDGEMENT, COMMUNICATION SKILLS AND LEARNING ABILITY ARE ALSO EVALUATED. |
Texts | |
---|---|
AN INTRODUCTION TO STATISTICAL LEARNING, G. JAMES, D. WITTEN, T. HASTIE, R. TIBSHIRANI, SPRINGER, 2013. AN ELEMENTARY INTRODUCTION TO STATISTICAL LEARNING, S. KULKARNI, G. HARMAN, WILEY, 2010. |
More Information | |
---|---|
THE COURSE LANGUAGE IS ENGLISH. |
Data source: ESSE3 [Last synchronization: 2021-02-19]