OLGA SACCO | STATISTICAL ANALYSIS OF ENVIRONMENTAL DATA
cod. 0512700034
DEPARTMENT OF CHEMISTRY AND BIOLOGY "ADOLFO ZAMBELLI" | |
EQF6 | |
ENVIRONMENTAL SCIENCES | |
2024/2025 |
COMPULSORY | |
YEAR OF COURSE 2 | |
YEAR OF DIDACTIC SYSTEM 2022 | |
SPRING SEMESTER |
SSD | CFU | HOURS | ACTIVITY | |
---|---|---|---|---|
CHIM/04 | 4 | 32 | LESSONS | |
CHIM/04 | 2 | 24 | EXERCISES |
Exam | Date | Session | |
---|---|---|---|
STATISTICAL ANALYSIS OF ENVIRONMENTAL DATA | 07/04/2025 - 10:00 | ORDINARY SESSION |
Objectives | |
---|---|
THE GENERAL OBJECTIVE OF THE COURSE IS TO CONTRIBUTE TO THE TRAINING OF AN ENVIRONMENTAL OPERATOR WITH EXPERTISE IN THE VARIOUS ASPECTS OF STATISTICAL ANALYSIS IN THE ENVIRONMENTAL FIELD. SPECIFICALLY, THE COURSE AIMS TO PROVIDE THE STUDENT WITH KNOWLEDGE OF, AND THE ABILITY TO APPLY, STATISTICAL REASONING, DATA ORGANISATION, THE PRODUCTION OF DESCRIPTIVE AND EXPLORATORY GRAPHS, THE USE OF STATISTICAL MODELS FOR DATA ANALYSIS, AND THE COMMUNICATION AND DISCUSSION OF THE RESULTS OF AN ANALYSIS OF ENVIRONMENTAL DATA.
KNOWLEDGE AND UNDERSTANDING (DUBLIN DESCRIPTOR N.1): THE COURSE AIMS TO ALLOW THE STUDENT TO GAIN KNOWLEDGE, WITH THE SUPPORT OF ADVANCED TEXTBOOKS, OF THE BASIC CONCEPTS OF STATISTICAL ANALYSIS CONCERNING: - THEORETICAL-METHODOLOGICAL ASPECTS - TECHNICAL-PRACTICAL ASPECTS THROUGH THE USE OF THE STATISTICAL LANGUAGE R. THE COURSE ALSO AIMS TO GIVE THE STUDENT THE TOOLS TO DEVELOP AND/OR APPLY INNOVATIVE IDEAS, POSSIBLY ALSO IN RESEARCH, FOR ANALYZING ENVIRONMENTAL DATA SETS WITH APPROPRIATE STATISTICAL TECHNIQUES. FINALLY, THE COURSE AIMS TO PROVIDE THE STUDENT WITH THE KNOWLEDGE NEEDED TO UNDERSTAND AND ADDRESS CUTTING-EDGE TOPICS IN THE FIELD: - THE CHOICE OF STATISTICALLY RELIABLE DATA - STATISTICS APPLIED TO THE ANALYSIS OF ENVIRONMENTAL DATA.
APPLIED KNOWLEDGE AND UNDERSTANDING (DUBLIN DESCRIPTOR N.2): THE COURSE AIMS TO ENABLE THE STUDENT TO CONCEIVE AND SUPPORT ARGUMENTS THAT ADDRESS ISSUES RELATING TO THE DIRECT OR INDIRECT COLLECTION OF ENVIRONMENTAL DATA, ALSO WITH THE USE OF MULTIVARIATE STATISTICAL METHODS AND SPECIALIST SOFTWARE (THE R ENVIRONMENT). THE KNOWLEDGE ACQUIRED WILL ALLOW THE STUDENT TO SOLVE PROBLEMS IN NEW (UNFAMILIAR) AND INTERDISCIPLINARY FIELDS RELATING TO STATISTICAL ANALYSIS FOR THE ENVIRONMENT, IN BOTH THEIR THEORETICAL AND PRACTICAL ASPECTS.
INDEPENDENCE OF JUDGMENT (DUBLIN DESCRIPTOR N.3): THE COURSE AIMS TO ENABLE THE STUDENT TO COLLECT AND INTERPRET DATA RELEVANT TO ENVIRONMENTAL PROBLEMS IN ORDER TO CORRECTLY SET UP (OR EVALUATE) A STATISTICAL MODEL. SPECIFICALLY, THE STUDENT WILL BE ABLE TO INDEPENDENTLY INTEGRATE THEIR KNOWLEDGE SO AS TO BETTER MANAGE THE COMPLEXITY OF THE THEORETICAL PROBLEMS OF ENVIRONMENTAL STATISTICAL ANALYSIS, FORMULATING JUDGMENTS, EVEN WITH PARTIAL DATA, BASED ON THE CHOICE OF THE MOST APPROPRIATE STATISTICAL METHODS FOR THE ANALYSIS OF ENVIRONMENTAL PHENOMENA AND ON KNOWLEDGE OF THE HYPOTHESES THAT JUSTIFY THE METHODS USED.
COMMUNICATION SKILLS (DUBLIN DESCRIPTOR N.4): THE COURSE WILL ENABLE THE STUDENT TO COMMUNICATE TO SPECIALIST AND NON-SPECIALIST AUDIENCES THE CONCEPTS AND CONCLUSIONS DERIVED FROM THE KNOWLEDGE ACQUIRED, AS WELL AS THE RATIONALE UNDERLYING THEM, IN RELATION TO ENVIRONMENTAL STATISTICAL ANALYSIS, WITH REGARD TO BOTH THE THEORETICAL-METHODOLOGICAL AND THE TECHNICAL-PRACTICAL ASPECTS.
ABILITY TO LEARN (DUBLIN DESCRIPTOR N.5): THE COURSE WILL PROVIDE THE STUDENT WITH THE SKILLS NECESSARY FOR SELF-MANAGED AND AUTONOMOUS STUDY, ALLOWING THEM TO FIND INDIVIDUAL PATHS TO ADDRESS THE COMPLEX AND MULTIDISCIPLINARY PROBLEMS OF USING THE R STATISTICAL ENVIRONMENT TO ESTIMATE A MODEL AND EVALUATE ITS GOODNESS OF FIT. |
Prerequisites | |
---|---|
KNOWLEDGE OF GENERAL MATHEMATICAL TOOLS SUCH AS LINEAR ALGEBRA (MATRICES, VECTORS, AND RELATED OPERATIONS; EIGENVALUES AND EIGENVECTORS) IS REQUIRED. IT IS ALSO VERY USEFUL TO BE COMFORTABLE WITH MATHEMATICAL FORMALISM, SUCH AS READING FORMULAS AND EQUATIONS, AT THE LEVEL OF A GENERAL UNDERGRADUATE MATHEMATICS COURSE. BASIC COMPUTER SKILLS ARE ALSO NECESSARY FOR THE PRACTICAL DATA ANALYSIS. NO PRIOR KNOWLEDGE OF THE R PROGRAMMING LANGUAGE IS REQUIRED. |
Contents | |
---|---|
THE COURSE IS ORGANIZED INTO A THEORETICAL AND METHODOLOGICAL PART (32 HOURS) AND A TECHNICAL-PRACTICAL PART (24 HOURS), FOR A TOTAL OF 56 HOURS AND 6 CFU.
THE MAIN TOPICS OF THE THEORETICAL AND METHODOLOGICAL PART ARE:
- STATISTICS IN ENVIRONMENTAL SCIENCE
- DESIGN OF EXPERIMENTS AND SAMPLING
- CLASSIFICATION OF DATA AND MEASUREMENT SCALES
- THE USE OF CLASSIFICATION TABLES
- GRAPHIC REPRESENTATIONS OF UNIVARIATE DISTRIBUTIONS
- MEAN, MODE, MEDIAN
- DATA DISPERSION AND VARIABILITY
- SYMMETRY AND KURTOSIS
- DISTRIBUTIONS AND DENSITIES: MATHEMATICAL ASPECTS FROM PROBABILITY THEORY
- COMBINATORIAL CALCULATIONS
- DISCRETE AND CONTINUOUS DISTRIBUTIONS OF GENERAL INTEREST
- CHI-SQUARE, STUDENT AND FISHER DISTRIBUTIONS
- COMPARISON OF OBSERVED AND THEORETICAL DISTRIBUTIONS
- HYPOTHESIS TESTING
- INFERENCE ON A GROUP OF MEANS BY T-TEST
- NON-PARAMETRIC TESTS
- ANALYSIS OF VARIANCE (ANOVA)
- FROM STATISTICAL PROBABILITY: ESTIMATORS, CORRELATION MEASURES, MEASURES OF ASSOCIATION AND ELEMENTS OF DESCRIPTIVE STATISTICS
- CONCEPTS OF STATISTICAL INDEPENDENCE/STATISTICAL DEPENDENCE, CORRELATION, ASSOCIATION, CAUSALITY
- MULTIPLE LINEAR REGRESSION
- STEPWISE REGRESSION AND MODEL SELECTION
- MEASURES OF DISTANCE, SIMILARITY AND DISSIMILARITY: MATHEMATICAL PROPERTIES AND EXAMPLES
- TRANSFORMATIONS OF RANDOM VARIABLES
- INTRODUCTION TO CLUSTERING
- HIERARCHICAL CLUSTERING: IDEAS, PRINCIPLES AND BASIC ALGORITHMS
- PARTITIONAL CLUSTERING: IDEAS, PRINCIPLES AND BASIC ALGORITHMS
THE MAIN TOPICS OF THE TECHNICAL AND PRACTICAL PART ARE:
- INTRODUCTION TO THE R PROGRAMMING ENVIRONMENT
- R AND R PACKAGES, INSTALLATION AND COMMAND LINES
- RSTUDIO
- DATA AND DATA STRUCTURES IN R
- READING AND WRITING FILES IN R
- USE OF GRAPHICAL FUNCTIONS IN R
- CONTROL STRUCTURES IN R
- FUNCTIONS AND GRAPHICS FOR DESCRIPTIVE STATISTICS IN R
- PROBABILITY DISTRIBUTIONS IN R
- SIMPLE LINEAR REGRESSION IN R
- MULTIPLE LINEAR REGRESSION IN R |
Teaching Methods | |
---|---|
THE COURSE CONSISTS OF 56 HOURS OF THEORETICAL/METHODOLOGICAL LESSONS AND PRACTICAL EXERCISES OR LABORATORY WORK (6 CFU). IN PARTICULAR, THERE WILL BE 32 HOURS OF TEACHING ON THEORETICAL-METHODOLOGICAL ASPECTS AND 24 HOURS OF COMPUTER LABORATORY FOR TECHNICAL AND PRACTICAL APPLICATIONS. THE COURSE IS ORGANIZED AS FOLLOWS: CLASSROOM LESSONS ON ALL COURSE TOPICS (16 LESSONS OF 2 HOURS EACH) AND PRACTICAL COMPUTER EXERCISES ON ALL COURSE TOPICS (8 LESSONS OF 3 HOURS EACH). THE TECHNICAL-PRACTICAL EXERCISES WILL FOLLOW THE THEORETICAL LESSONS ON THE SAME SUBJECT. FOR THE PRACTICAL EXERCISES, STUDENTS WILL USE THEIR OWN COMPUTERS AND INSTALL THE STATISTICAL SOFTWARE R (WHICH IS OPEN-SOURCE), AND THEY CAN WORK INDIVIDUALLY OR IN PAIRS. ALL COURSE MATERIAL (SLIDES, EXAMPLES IN R, EXERCISES AND DATASETS) WILL BE PROVIDED AT THE BEGINNING OF THE COURSE. STUDENTS ARE KINDLY INVITED TO READ THE CLASS MATERIAL BEFORE EACH LESSON SO THAT THEY CAN BENEFIT MORE FROM, AND INTERACT DURING, THE CLASS HOURS. |
Verification of learning | |
---|---|
THE ACHIEVEMENT OF THE COURSE'S OBJECTIVES IS CERTIFIED BY AN EXAM CONSISTING OF A WRITTEN PART AND A PRACTICAL PART. THE FINAL GRADE DEPENDS ON THE SCORES OBTAINED IN EACH PART. A SCORE OF AT LEAST 18/30 IS NECESSARY TO PASS THE EXAM. THE WRITTEN EXAM CONSISTS OF 10 QUESTIONS ON THE THEORETICAL AND METHODOLOGICAL ASPECTS COVERED IN THE COURSE PROGRAM (1 POINT PER QUESTION). THE PRACTICAL WORK CONSISTS OF A FEW EXERCISES, CARRIED OUT WITH A STATISTICAL PROGRAM, ON THE TOPICS PRESENTED IN CLASS (UP TO 20 POINTS). THE MAXIMUM SCORE (30/30) IS ACHIEVED WHEN THE STUDENT ANSWERS THE THEORETICAL QUESTIONS CORRECTLY AND CARRIES OUT THE PROPOSED EXERCISES CORRECTLY WITH THE AID OF THE STATISTICAL SOFTWARE. |
Texts | |
---|---|
1) WALTER W. PIEGORSCH, A. JOHN BAILER, "ANALYZING ENVIRONMENTAL DATA", WILEY (2005)
2) RICHARD G. BRERETON, "CHEMOMETRICS: DATA ANALYSIS FOR THE LABORATORY AND CHEMICAL PLANT", WILEY (2003)
3) PETER DALGAARD, "INTRODUCTORY STATISTICS WITH R", SPRINGER
4) COURSE NOTES |
BETA VERSION Data source ESSE3 [Last synchronization: 2025-03-26]