Giuseppina ALBANO | DATA ANALYSIS AND STATISTICAL MODELING
cod. 0222800003
DEPARTMENT OF MANAGEMENT & INNOVATION SYSTEMS | |
EQF7 | |
DATA SCIENCE E GESTIONE DELL'INNOVAZIONE | |
2024/2025 |
COMPULSORY | |
YEAR OF COURSE 1 | |
CURRICULUM YEAR 2022 | |
SPRING SEMESTER |
SSD | CFU | HOURS | ACTIVITY | |
---|---|---|---|---|
SECS-S/01 | 9 | 63 | LESSONS |
Exam | Date | Session | |
---|---|---|---|
ALBANO | 12/12/2024 - 10:00 | ORDINARY SESSION | |
ALBANO | 12/12/2024 - 10:00 | MAKE-UP SESSION | |
ALBANO | 09/01/2025 - 10:00 | ORDINARY SESSION | |
ALBANO | 09/01/2025 - 10:00 | MAKE-UP SESSION |
Objectives | |
---|---|
THE MAIN GOAL OF THE COURSE IS TO ALLOW STUDENTS TO ACQUIRE KNOWLEDGE OF THE MAIN TECHNIQUES OF: I) DATA ANALYSIS AND VISUALIZATION; II) STATISTICAL INFERENCE; III) CONSTRUCTION OF PREDICTIVE MODELS.
KNOWLEDGE AND UNDERSTANDING: AT THE END OF THE COURSE, STUDENTS WILL ALSO HAVE ACQUIRED THE COMPUTATIONAL KNOWLEDGE NECESSARY FOR THE CORRECT IMPLEMENTATION OF CLASSICAL STATISTICAL INFERENCE TECHNIQUES USING THE “R” PROGRAMMING LANGUAGE.
APPLYING KNOWLEDGE AND UNDERSTANDING: IN ADDITION TO KNOWLEDGE, STUDENTS WILL DEVELOP: I) THE SKILLS NECESSARY TO ANALYZE AND INTERPRET QUANTITATIVE INFORMATION AND TO PRODUCE INDICATORS, STATISTICAL MODELS AND REPORTS THAT SUPPORT DECISION-MAKING ACTIVITIES IN SEVERAL FIELDS; II) THE ABILITY TO ANALYZE AND EVALUATE, AS WELL AS TO INDEPENDENTLY PRESENT, DOCUMENTS THAT INCLUDE QUANTITATIVE INFORMATION, MAKING CRITICAL JUDGMENTS ON THE DATA COLLECTION METHODS, THE PROCESSING OF THE COLLECTED INFORMATION, THE INFERENCE TECHNIQUES, THE PREDICTIVE MODELS BUILT AND THE VALIDITY OF THE CONCLUSIONS REACHED.
MAKING JUDGMENTS: AT THE END OF THE COURSE, THE STUDENT WILL BE ABLE TO: • CRITICALLY EVALUATE AND INDEPENDENTLY IMPLEMENT ADEQUATE DATA SCIENCE SOLUTIONS IN DIFFERENT CONTEXTS; • ASSESS THE POTENTIAL AND LIMITATIONS OF THE TECHNIQUES AND MODELS LEARNED; • CHOOSE THE DECISION-MAKING CRITERIA, METHODOLOGIES, TECHNIQUES AND TECHNOLOGIES MOST SUITABLE FOR SOLVING SPECIFIC PROBLEMS AND CLASSES OF PROBLEMS.
COMMUNICATION SKILLS: THE STUDENT WILL BE ABLE TO: • COMMUNICATE KNOWLEDGE, IDEAS, PROBLEMS AND SOLUTIONS IN A CLEAR AND UNAMBIGUOUS WAY, USING TECHNICAL TERMINOLOGY APPROPRIATELY AND ADAPTING THE METHODS OF EXPRESSION TO THE CULTURAL AND PROFESSIONAL CHARACTERISTICS OF THE AUDIENCE; • PRODUCE APPROPRIATE SUMMARIES TO EFFECTIVELY COMMUNICATE THE RESULTS OF DATA ANALYSIS (INCLUDING BIG DATA) AND HIGHLIGHT THE ESSENTIAL ASPECTS USEFUL FOR IDENTIFYING SOLUTIONS.
LEARNING ABILITY: THE STUDENT WILL DEVELOP THE ABILITY TO: • STUDY INDEPENDENTLY, EFFECTIVELY INTEGRATING THE ACQUIRED KNOWLEDGE; • EFFECTIVELY UNDERTAKE HIGHER-LEVEL TRAINING COURSES. |
Prerequisites | |
---|---|
SOME BASIC NOTIONS WILL BE RECALLED ON:
0. DESCRIPTIVE STATISTICS
1. ELEMENTARY PROBABILITY CALCULUS
2. RANDOM VARIABLES: DEFINITION, DISTRIBUTION FUNCTION, EXPECTED VALUE AND VARIANCE, NOTABLE RANDOM VARIABLES
3. POINT ESTIMATION: PROPERTIES OF ESTIMATORS, CONSTRUCTION METHODS
4. HYPOTHESIS TESTING: DEFINITIONS, Z TEST
5. CONFIDENCE INTERVALS: DEFINITIONS, Z CONFIDENCE INTERVAL |
Contents | |
---|---|
INTRODUCTION TO MULTIVARIATE DATA ANALYSIS (23 H)
1. DATA HANDLING
2. R, RSTUDIO, OBJECTS IN R, PACKAGES IN R, IMPORTING / EXPORTING DATA INTO / FROM R
3. DATA DISPLAY - UNIVARIATE AND BIVARIATE GRAPHICS
4. DATA DISPLAY - MULTIVARIATE GRAPHICS
5. CREATING A STATISTICAL REPORT - R MARKDOWN
MULTIVARIATE STATISTICS: INTERDEPENDENCE ANALYSIS (UNSUPERVISED STATISTICAL LEARNING) (10 H)
1. PRINCIPAL COMPONENT ANALYSIS: DEFINITIONS AND AIM, SOLUTION, SELECTION OF COMPONENTS, INTERPRETATION OF RESULTS, CORRELATION CIRCLE
CONSTRUCTION OF ESTIMATORS (10 H)
1. MAXIMUM LIKELIHOOD METHOD
2. ML ESTIMATORS FOR RANDOM VARIABLES (BERNOULLI, POISSON, NORMAL)
3. PROPERTIES OF ML ESTIMATORS
STATISTICAL MODELS FOR DEPENDENCE ANALYSIS (20 H)
1. DEFINITION OF A STATISTICAL MODEL
2. LINEAR AND NON-LINEAR MODELS
3. CLASSIFICATION OF MODELS
4. SIMPLE LINEAR REGRESSION: ASSUMPTIONS, INTERPRETATION, ESTIMATION OF THE PARAMETERS, GOODNESS OF FIT, T TEST
5. MULTIPLE LINEAR REGRESSION: ASSUMPTIONS, INTERPRETATION, ESTIMATION OF THE PARAMETERS, VALIDATION OF THE MODEL, VIOLATIONS OF THE ASSUMPTIONS
6. LOGIT MODEL: ASSUMPTIONS, INTERPRETATION, ML ESTIMATION OF THE PARAMETERS, Z TEST |
Teaching Methods | |
---|---|
THE COURSE CONSISTS OF 63 HOURS OF CLASSROOM TEACHING, DIVIDED INTO THEORETICAL LESSONS, SUPPLEMENTARY TOPICS AND EXERCISES. ATTENDANCE AT LESSONS IS NOT MANDATORY BUT STRONGLY RECOMMENDED. |
Verification of learning | |
---|---|
THE ASSESSMENT OF LEARNING CONSISTS OF THE DRAFTING AND DISCUSSION OF A STATISTICAL REPORT ON A REAL DATASET AGREED UPON WITH THE TEACHER. THE SCORE RANGES FROM 0 TO 30 AND THE WRITTEN PART IS PASSED IF THE SCORE IS GREATER THAN OR EQUAL TO 18. THE ORAL EXAM, CONSISTING OF THE DISCUSSION OF THE REPORT, LASTS ABOUT 20 MINUTES AND FOCUSES ON THE TOPICS COVERED IN THE REPORT AND ON THE THEORETICAL TOPICS OF THE COURSE. THE FINAL EVALUATION IS THE MEAN OF THE SCORES ACHIEVED IN THE WRITTEN AND ORAL EXAMINATIONS: THIS OVERALL SCORE, EXPRESSED OUT OF THIRTY, IS THE FINAL GRADE, AND THE EXAM IS PASSED IF IT REACHES THE MINIMUM THRESHOLD OF 18. IN EVALUATING THE ORAL EXAMINATION, IN ADDITION TO KNOWLEDGE OF THE SUBJECT MATTER, THE ACCURACY OF THE LANGUAGE AND THE ABILITY TO USE THE ACQUIRED STATISTICAL TOOLS CRITICALLY WILL ALSO BE CONSIDERED. |
Texts | |
---|---|
SLIDES PROVIDED BY THE TEACHER.
ANALISI DEI DATI E DATA MINING PER LE DECISIONI AZIENDALI, SERGIO ZANI, ANDREA CERIOLI, PUBLISHER: GIUFFRÈ.
R FOR DATA SCIENCE, GARRETT GROLEMUND AND HADLEY WICKHAM. |
More Information | |
---|---|
THE TEACHING LANGUAGE IS ITALIAN. ERASMUS STUDENTS MAY, IF NEEDED, ARRANGE WITH THE TEACHER TO USE ENGLISH-LANGUAGE TEXTBOOKS. COURSE ATTENDANCE IS NOT MANDATORY, BUT IT IS STRONGLY RECOMMENDED GIVEN THE NATURE OF THE SUBJECT. NON-ATTENDING STUDENTS MUST PREPARE THE PLANNED SYLLABUS INDEPENDENTLY. |
BETA VERSION. Data source: ESSE3 [Last synchronization: 2024-11-29]