APPLIED STATISTICS

Paolo ADDESSO

0612700133
DEPARTMENT OF INFORMATION AND ELECTRICAL ENGINEERING AND APPLIED MATHEMATICS
EQF6
COMPUTER ENGINEERING
2024/2025



YEAR OF COURSE 3
YEAR OF DIDACTIC SYSTEM 2022
SPRING SEMESTER
CFU   HOURS   ACTIVITY
3     24      LESSONS
1     8       EXERCISES
2     16      LAB
Objectives
THE COURSE AIMS AT PROVIDING:
- THE MAIN TOOLS TO VISUALIZE THE DATA AND DESCRIBE THEM THROUGH SIMPLE MODELS;
- THE MOST RELEVANT METHODS TO DESIGN AN EXPERIMENT, TO VALIDATE A MODEL, AND TO EXAMINE THE DIFFERENT INFLUENCE FACTORS;
- THE ABILITY TO USE THE MAIN SOFTWARE TOOLS FOR DATA ANALYSIS.

KNOWLEDGE AND UNDERSTANDING
UNDERSTANDING NON-DETERMINISTIC PHENOMENA DESCRIBED BY PROBABILITY THEORY, BASIC NOTIONS OF DESCRIPTIVE STATISTICS TO REPRESENT UNIVARIATE AND MULTIVARIATE DATA AND STATISTICAL INFERENCE. ANALYSIS OF THE MAIN FACTORS USEFUL TO DESCRIBE A PHENOMENON. LINEAR REGRESSION MODELS. HYPOTHESIS TESTS.

APPLYING KNOWLEDGE AND UNDERSTANDING
EXAMINING SIMPLE DATASETS OF PRACTICAL INTEREST WITH SOFTWARE TOOLS COMMONLY ADOPTED IN APPLIED STATISTICS.
ESTIMATING THE PARAMETERS OF LINEAR REGRESSION MODELS.
PLANNING SIMPLE EXPERIMENTS TO COLLECT DATA AND EXAMINING THE INFLUENCE FACTORS USING ANALYSIS OF VARIANCE.
Prerequisites
PREREQUISITES: SUITABLE KNOWLEDGE OF MATHEMATICS. BASICS OF PROBABILITY THEORY.

PREPARATORY COURSES: SIGNAL ANALYSIS (ANALISI DEI SEGNALI).
Contents
DIDACTIC UNIT 1: INTRODUCTION TO THE COURSE, DESCRIPTIVE STATISTICS AND INTRODUCTION TO R (LECTURE/PRACTICE/LABORATORY HOURS 6/0/4)
- 1 (2 HOURS LECTURE): INTRODUCTION TO THE COURSE. ELEMENTS OF PROBABILITY THEORY: COMBINATORICS, RANDOM VARIABLES, AND CORRELATION COEFFICIENT.
- 2 (2 HOURS LECTURE): VECTORS OF RANDOM VARIABLES. LAW OF LARGE NUMBERS. CENTRAL LIMIT THEOREM. MAIN RANDOM VARIABLE MODELS.
- 3 (2 HOURS LECTURE): INTRODUCTION TO DESCRIPTIVE STATISTICS: HISTOGRAMS, MEAN, MEDIAN, MODE. DISPERSION INDICES: STANDARD DEVIATION AND SAMPLE VARIANCE.
- 4 (2 HOURS LABORATORY): INTRODUCTION TO R AND BASIC COMMANDS, DATAFRAMES. VISUALIZATION IN R OF ASYMPTOTIC THEOREMS AND DESCRIPTIVE STATISTICS CONCEPTS.
- 5 (2 HOURS LABORATORY): CORRELATION ANALYSIS IN R, SCATTER PLOT, BOX PLOT, CORRELATION PLOT.

KNOWLEDGE AND UNDERSTANDING.
PROBABILISTIC MODELS OF COMMON USE IN STATISTICS. VECTORS OF RANDOM VARIABLES AND JOINT AND MARGINAL DISTRIBUTIONS. MAIN PROBABILITY THEOREMS TO ANALYZE DATA. INDICES AND GRAPHIC TOOLS TO DESCRIBE THE VARIABILITY OF RANDOM MEASURES. ANALYZING THE CORRELATIONS AMONG RANDOM VARIABLES.

APPLYING KNOWLEDGE AND UNDERSTANDING
EVALUATION AND VISUALIZATION OF RANDOM VARIABLE DISTRIBUTIONS. PLOTS AND TABLES OF COMMON USE IN STATISTICS. MANAGING SIMPLE DATASETS WITH SOFTWARE TOOLS FOR DATA ANALYSIS. VISUALIZING THE CORRELATION BETWEEN PAIRS OF RANDOM VARIABLES.
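
AS AN ILLUSTRATION OF THE DESCRIPTIVE INDICES AND OF THE CORRELATION COEFFICIENT LISTED IN THIS UNIT, THE FOLLOWING IS A MINIMAL SKETCH. THE COURSE LABORATORIES USE R; PYTHON WITH ITS STANDARD LIBRARY IS USED HERE ONLY AS AN ASSUMPTION OF THIS EXAMPLE. THE DATA AND THE HELPER NAME pearson ARE INVENTED FOR ILLUSTRATION AND DO NOT COME FROM THE COURSE MATERIAL.

```python
import math
import statistics as st

# Hypothetical paired measurements (illustrative values only).
x = [2.0, 4.0, 4.0, 5.0, 7.0, 8.0]
y = [1.0, 3.0, 2.0, 5.0, 6.0, 7.0]


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    mean_a, mean_b = st.mean(a), st.mean(b)
    cross = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    spread_a = sum((ai - mean_a) ** 2 for ai in a)
    spread_b = sum((bi - mean_b) ** 2 for bi in b)
    return cross / math.sqrt(spread_a * spread_b)


# Location indices (mean, median, mode) and dispersion indices
# (sample variance and standard deviation, both with divisor n - 1).
summary = {
    "mean": st.mean(x),
    "median": st.median(x),
    "mode": st.mode(x),
    "sample_variance": st.variance(x),
    "sample_sd": st.stdev(x),
}
r = pearson(x, y)

print(summary)
print("correlation:", round(r, 4))
```

A VALUE OF THE COEFFICIENT CLOSE TO 1 INDICATES A STRONG POSITIVE LINEAR ASSOCIATION, WHICH IS WHAT A SCATTER PLOT OF THESE TWO LISTS WOULD SHOW.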


DIDACTIC UNIT 2: INFERENTIAL STATISTICS (LECTURE/PRACTICE/LABORATORY HOURS 6/2/4)
- 6 (2 HOURS LECTURE): ELEMENTS OF ESTIMATION THEORY: POINT AND INTERVAL ESTIMATION. ESTIMATION OF THE NORMAL POPULATION MEAN WITH KNOWN VARIANCE.
- 7 (2 HOURS LECTURE): CHI-SQUARE AND STUDENT'S T DISTRIBUTIONS. INTERVAL ESTIMATION OF THE MEAN AND VARIANCE OF THE NORMAL POPULATION WITH UNKNOWN VARIANCE. ONE-SIDED CONFIDENCE INTERVALS WITH EXAMPLES.
- 8 (2 HOURS LECTURE): ESTIMATORS AND THEIR PROPERTIES (LINEARITY, UNBIASEDNESS, EFFICIENCY, CONSISTENCY, ASYMPTOTIC NORMALITY, SUFFICIENCY). MSE AND INTRODUCTION TO THE BIAS-VARIANCE TRADE-OFF. LIKELIHOOD FUNCTION.
- 9 (2 HOURS EXERCISES): EXAMPLES OF LIKELIHOOD FUNCTIONS. EXERCISES ON THE MAXIMUM LIKELIHOOD (ML) ESTIMATOR.
- 10 (2 HOURS LABORATORY): CONFIDENCE INTERVALS, CHI-SQUARE AND T DISTRIBUTIONS IN R.
- 11 (2 HOURS LABORATORY): ML ESTIMATION EXERCISES IN R.

KNOWLEDGE AND UNDERSTANDING.
ESTIMATION OF THE PARAMETERS OF DISTRIBUTIONS COMMONLY USED IN PRACTICE. CONFIDENCE INTERVALS IN SIMPLE CASES. PROPERTIES OF THE ESTIMATORS USEFUL FOR STATISTICAL LEARNING.

APPLYING KNOWLEDGE AND UNDERSTANDING
ESTIMATION OF THE MEAN AND THE VARIANCE OF DATA COLLECTED IN A DATASET. COMPUTATION OF CONFIDENCE INTERVALS FOR THE PARAMETERS FROM A SAMPLE DRAWN FROM A NORMAL DISTRIBUTION. MAXIMUM LIKELIHOOD ESTIMATION PROCEDURES IN SOME PRACTICAL EXAMPLES.
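
THE SIMPLEST CASE IN THIS UNIT, THE TWO-SIDED CONFIDENCE INTERVAL FOR THE MEAN OF A NORMAL POPULATION WITH KNOWN VARIANCE, CAN BE SKETCHED AS FOLLOWS, TOGETHER WITH THE MAXIMUM LIKELIHOOD ESTIMATES OF THE MEAN AND VARIANCE OF A NORMAL SAMPLE. THIS IS AN ILLUSTRATIVE SKETCH IN PYTHON (THE COURSE USES R); THE SAMPLE VALUES, THE KNOWN STANDARD DEVIATION AND THE CONFIDENCE LEVEL ARE ASSUMPTIONS OF THE EXAMPLE. THE UNKNOWN-VARIANCE CASE WOULD REQUIRE STUDENT'S T QUANTILES, WHICH THE PYTHON STANDARD LIBRARY DOES NOT PROVIDE, SO IT IS NOT SHOWN.

```python
import math
from statistics import NormalDist, fmean

# Hypothetical sample, assumed drawn from a normal population.
sample = [9.8, 10.2, 10.1, 9.9, 10.4, 9.6, 10.0, 10.0]
sigma = 0.25   # population standard deviation, assumed known
alpha = 0.05   # 1 - alpha is the confidence level (95% here)

n = len(sample)
xbar = fmean(sample)

# Two-sided interval: xbar +/- z_{1 - alpha/2} * sigma / sqrt(n).
z = NormalDist().inv_cdf(1 - alpha / 2)
half_width = z * sigma / math.sqrt(n)
ci = (xbar - half_width, xbar + half_width)

# Maximum likelihood estimates for a normal sample: the sample mean and
# the variance with divisor n (biased, unlike the n - 1 version).
mu_ml = xbar
var_ml = sum((v - mu_ml) ** 2 for v in sample) / n

print("sample mean:", xbar)
print("95% CI for the mean:", ci)
print("ML variance estimate:", var_ml)
```

THE ML VARIANCE ESTIMATE USES THE DIVISOR N RATHER THAN N - 1, WHICH IS A CONCRETE INSTANCE OF A BIASED BUT CONSISTENT ESTIMATOR.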


DIDACTIC UNIT 3: STATISTICAL HYPOTHESIS TESTING (LECTURE/PRACTICE/LABORATORY HOURS 8/2/2)
- 12 (2 HOURS LECTURE): STATISTICAL HYPOTHESIS TESTING: INTRODUCTION. TYPE 1 AND TYPE 2 ERRORS OF A HYPOTHESIS TEST. NULL HYPOTHESIS AND CRITICAL REGION.
- 13 (2 HOURS LECTURE): HYPOTHESIS TESTING ON THE MEAN OF A NORMAL SAMPLE, WITH KNOWN VARIANCE: EXERCISE ON A ONE-SIDED TEST.
- 14 (2 HOURS EXERCISES): HYPOTHESIS TESTING ON THE MEAN OF A NORMAL SAMPLE, WITH UNKNOWN VARIANCE: EXAMPLE OF A TWO-SIDED TEST. EXERCISE ON A TEST ON THE DIFFERENCE OF THE MEANS OF TWO NORMAL SAMPLES (WITH THE SAME VARIANCE).
- 15 (2 HOURS LECTURE): TEST ON THE VARIANCE OF A NORMAL SAMPLE. LEVEL OF SIGNIFICANCE AND POWER OF A TEST. ROC CURVE AND AUC.
- 16 (2 HOURS LECTURE): GOODNESS OF FIT AND Q-Q PLOT. NEYMAN-PEARSON LEMMA AND EXAMPLE OF A TEST ON THE MEAN OF A NORMAL SAMPLE WITH KNOWN VARIANCE. DEFINITION OF THE P-VALUE AND ITS USE IN HYPOTHESIS TESTING.
- 17 (2 HOURS LABORATORY): IMPLEMENTATION IN R OF TESTS ON THE MEAN OF A NORMAL SAMPLE, WITH KNOWN AND WITH UNKNOWN VARIANCE. Q-Q PLOT IN R AND VISUAL VERIFICATION OF THE NORMALITY ASSUMPTION IN DATASETS.

KNOWLEDGE AND UNDERSTANDING.
HYPOTHESIS TESTS AND THEIR ERRORS. HYPOTHESIS TESTS IN COMMON CASES. PERFORMANCE FIGURES OF TESTS BETWEEN TWO HYPOTHESES. VERIFYING NORMAL DISTRIBUTION ASSUMPTIONS.

APPLYING KNOWLEDGE AND UNDERSTANDING
COMPUTER-AIDED IMPLEMENTATION OF HYPOTHESIS TESTS ON THE PARAMETERS OF THE VARIABLES IN SIMPLE DATASETS. PLOTS FOR GOODNESS-OF-FIT.
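
A ONE-SIDED TEST ON THE MEAN OF A NORMAL SAMPLE WITH KNOWN VARIANCE, ONE OF THE CASES LISTED IN THIS UNIT, CAN BE SKETCHED AS FOLLOWS. THE EXAMPLE IS IN PYTHON RATHER THAN IN R, AND THE FUNCTION NAME z_test_greater, THE DATA, THE NULL VALUE AND THE SIGNIFICANCE LEVEL ARE ALL HYPOTHETICAL CHOICES MADE FOR ILLUSTRATION.

```python
import math
from statistics import NormalDist, fmean


def z_test_greater(sample, mu0, sigma, alpha=0.05):
    """One-sided z-test of H0: mu = mu0 against H1: mu > mu0.

    Assumes a normal population with known standard deviation sigma.
    Returns the test statistic, the p-value and the decision
    (True means H0 is rejected at level alpha).
    """
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    # p-value: probability, under H0, of a statistic at least this large.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value, p_value < alpha


# Hypothetical measurements (illustrative values only).
sample = [10.3, 10.5, 10.1, 10.4, 10.6, 10.2, 10.4, 10.3, 10.5]
z, p_value, reject = z_test_greater(sample, mu0=10.0, sigma=0.3)

print("z statistic:", round(z, 4))
print("p-value:", p_value)
print("reject H0:", reject)
```

REJECTING WHEN THE P-VALUE IS BELOW ALPHA IS EQUIVALENT TO CHECKING WHETHER THE STATISTIC FALLS IN THE CRITICAL REGION Z > Z_{1 - ALPHA}.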


DIDACTIC UNIT 4: DESIGN OF EXPERIMENTS, ANALYSIS OF VARIANCE, LINEAR REGRESSION (LECTURE/PRACTICE/LABORATORY HOURS 8/0/6)
- 18 (2 HOURS LECTURE): ELEMENTS OF DESIGN OF EXPERIMENTS: COMPLETELY RANDOMIZED DESIGNS, RANDOMIZED BLOCK DESIGNS. ANALYSIS OF VARIANCE (ANOVA): DEFINITION AND FISHER-SNEDECOR TEST STATISTIC.
- 19 (2 HOURS LECTURE): ANOVA FOR COMPLETELY RANDOMIZED DESIGNS. ANOVA FOR RANDOMIZED BLOCK DESIGNS. TUKEY TEST FOR MULTIPLE COMPARISONS.
- 20 (2 HOURS LABORATORY): ANOVA (ONE-WAY AND TWO-WAY) AND TUKEY'S TEST IN R.
- 21 (2 HOURS LECTURE): REGRESSION ANALYSIS: INTRODUCTION, LEAST SQUARES ESTIMATORS, MEAN AND VARIANCE OF THE ESTIMATORS. CONFIDENCE INTERVALS ON THE ESTIMATED PARAMETERS.
- 22 (2 HOURS LECTURE): HYPOTHESIS TEST ON THE SLOPE PARAMETER OF SIMPLE LINEAR REGRESSION. POLYNOMIAL REGRESSION WITH EXAMPLES.
- 23 (2 HOURS LABORATORY): INTRODUCTION TO REGRESSION ANALYSIS IN R.
- 24 (2 HOURS LABORATORY): POLYNOMIAL REGRESSION AND BASICS OF MODEL SELECTION (FOR POLYNOMIAL REGRESSION) IN R.

KNOWLEDGE AND UNDERSTANDING.
DESIGNING SIMPLE EXPERIMENTS TO COLLECT DATA. ANALYZING INFLUENCING FACTORS THROUGH THE ANALYSIS OF VARIANCE. LINEAR REGRESSION MODELS. TESTS TO EVALUATE THE STATISTICAL SIGNIFICANCE OF LINEAR MODEL PARAMETERS. PERFORMANCE INDICES FOR THE ASSESSMENT OF THE FIT OF A REGRESSION MODEL TO DATA.

APPLYING KNOWLEDGE AND UNDERSTANDING
APPLICATION OF THE ANALYSIS OF VARIANCE TO SIMPLE DATASETS AND IDENTIFICATION OF THE FACTORS INFLUENCING THE MEASURED QUANTITIES. BUILDING SIMPLE AND POLYNOMIAL REGRESSION MODELS FROM SIMPLE DATASETS. EVALUATION OF THE FIT OF A REGRESSION MODEL IN SIMPLE DATASETS.
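
THE TWO CORE COMPUTATIONS OF THIS UNIT, THE ONE-WAY ANOVA F STATISTIC AND THE LEAST SQUARES FIT OF A SIMPLE LINEAR REGRESSION WITH ITS R-SQUARED INDEX, CAN BE SKETCHED AS FOLLOWS. THIS IS AN ILLUSTRATIVE PYTHON SKETCH (THE COURSE USES R); THE FUNCTION NAMES one_way_anova_f AND fit_line AND ALL DATA VALUES ARE ASSUMPTIONS OF THE EXAMPLE. TURNING THE F STATISTIC INTO A P-VALUE REQUIRES THE FISHER-SNEDECOR DISTRIBUTION, WHICH IS NOT AVAILABLE IN THE PYTHON STANDARD LIBRARY AND IS THEREFORE OMITTED.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """F statistic for a completely randomized (one-way) design."""
    k = len(groups)                       # number of treatments
    n = sum(len(g) for g in groups)       # total number of observations
    grand_mean = fmean(v for g in groups for v in g)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def fit_line(x, y):
    """Least squares slope, intercept and R^2 for y = a + b * x."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: three treatments with three observations each.
groups = [[5.0, 6.0, 7.0], [8.0, 9.0, 10.0], [11.0, 12.0, 13.0]]
f_stat = one_way_anova_f(groups)

# Hypothetical data for the regression example.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept, r_squared = fit_line(x, y)

print("F statistic:", f_stat)
print("slope:", slope, "intercept:", intercept, "R^2:", r_squared)
```

A LARGE F STATISTIC SUGGESTS THAT THE TREATMENT MEANS DIFFER, WHILE AN R-SQUARED CLOSE TO 1 SUGGESTS THAT THE FITTED LINE EXPLAINS MOST OF THE VARIABILITY OF THE RESPONSE.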


TOTAL LECTURE/PRACTICE/LABORATORY HOURS 28/4/16
Teaching Methods
THE COURSE INCLUDES THEORETICAL LECTURES AND CLASSROOM EXERCISES. SOME CLASSROOM EXERCISES CAN BE SOLVED BY USING R DURING THE LABORATORY ACTIVITY.
TO TAKE PART IN THE FINAL ASSESSMENT AND EARN THE CREDITS FOR THE COURSE, THE STUDENT MUST HAVE ATTENDED AT LEAST 70% OF THE HOURS OF ASSISTED TEACHING ACTIVITIES.

Verification of learning
THE FINAL EXAM CONSISTS OF A TEAM PROJECT WORK AND AN ORAL INTERVIEW. THE PROJECT WORK AIMS AT EVALUATING THE ABILITY TO ANALYZE A SIMPLE DATASET IN R. THE ORAL INTERVIEW AIMS AT EVALUATING THE ABILITY TO ANALYZE DATA BY APPLYING THE METHODS AND TOOLS ILLUSTRATED DURING THE COURSE, AND THE KNOWLEDGE AND UNDERSTANDING OF THE CONCEPTS PRESENTED DURING THE COURSE. PERSONAL JUDGEMENT AND COMMUNICATION SKILLS ARE ALSO EVALUATED.
Texts
D. PICCOLO, STATISTICA PER LE DECISIONI, 3RD ED., IL MULINO, 2020.
S. M. IACUS, G. MASAROTTO, LABORATORIO DI STATISTICA CON R, 2ND ED., MCGRAW-HILL, 2014.

SUGGESTED TEXTBOOKS FOR FURTHER READING:
M. GUIDA, AFFIDABILITÀ, ARACNE, 2020.
A. PAPOULIS, S. U. PILLAI, PROBABILITY, RANDOM VARIABLES AND STOCHASTIC PROCESSES, 4TH ED., MCGRAW-HILL, 2001.

SUPPLEMENTARY TEACHING MATERIAL WILL BE AVAILABLE ON THE UNIVERSITY E-LEARNING PLATFORM (HTTP://ELEARNING.UNISA.IT) ACCESSIBLE TO STUDENTS USING THEIR OWN UNIVERSITY CREDENTIALS.
More Information
THE COURSE LANGUAGE IS ITALIAN.
DATA SOURCE: ESSE3 [LAST SYNCHRONIZATION: 2024-11-18]