METODOLOGIE INFORMATICHE PER L'ANALISI DEI DATI

Amelia Giuseppina NOBILE

0222500012
DIPARTIMENTO DI SCIENZE AZIENDALI - MANAGEMENT & INNOVATION SYSTEMS
INFORMATION TECHNOLOGY AND MANAGEMENT
2015/2016

YEAR OF COURSE 2
YEAR OF DIDACTIC SYSTEM 2014
FIRST SEMESTER
CFU  HOURS  ACTIVITY
6    36     LESSONS
Objectives
THE COURSE AIMS TO DEVELOP METHODS AND TECHNIQUES FOR PROCESSING AND ANALYZING DATA USING ONE OF THE MOST POWERFUL AND FLEXIBLE STATISTICAL SOFTWARE ENVIRONMENTS, THE R PROGRAMMING LANGUAGE (HTTP://WWW.R-PROJECT.ORG/). THE FIRST PART OF THE COURSE PRESENTS THE MAIN FEATURES OF R AND ITS BASIC FUNCTIONALITY. THE SECOND PART ADDRESSES A NUMBER OF APPLIED PROBLEMS IN DATA PROCESSING AND ANALYSIS, TOGETHER WITH THE MOST WIDELY USED TECHNIQUES OF DESCRIPTIVE AND INFERENTIAL STATISTICS. THE LAST PART DEALS WITH THE SIMULATION OF SOME STOCHASTIC PROCESSES. THE COURSE AIMS TO ENABLE STUDENTS TO APPLY THEIR KNOWLEDGE AND EXPERTISE IN FIELDS THAT REQUIRE QUANTITATIVE SUPPORT FOR DEVELOPING COMPUTER APPLICATIONS FOR THE MANAGEMENT, MANIPULATION AND ANALYSIS OF STATISTICAL DATA.
Prerequisites
BASIC KNOWLEDGE OF PROBABILITY AND STATISTICS
Contents
THE R INTEGRATED ENVIRONMENT. INTRODUCTION AND HISTORICAL NOTES. HOW TO INTERACT WITH R. VECTORS, ARRAYS AND MATRICES. LISTS. DATA FRAMES. FACTORS. DEFINITION OF NEW FUNCTIONS. CONDITIONAL STATEMENTS. ITERATIVE STATEMENTS. SCRIPTS AND OUTPUT DIRECTORY.
TABLES AND GRAPHS. SIMPLE FREQUENCY DISTRIBUTIONS. JOINT (TWO-WAY) FREQUENCY DISTRIBUTIONS. CONDITIONAL FREQUENCY DISTRIBUTIONS. THE MAIN GRAPHICAL REPRESENTATIONS. HIGH-LEVEL, LOW-LEVEL AND INTERACTIVE GRAPHICS FUNCTIONS. BAR CHARTS, PIE CHARTS AND STICK CHARTS. HISTOGRAMS. BOXPLOTS. GRAPHICAL REPRESENTATIONS OF TABLES. GRAPHICAL REPRESENTATIONS TO COMPARE VARIABLES. SCATTERPLOTS. GRAPHS OF FUNCTIONS.
DESCRIPTIVE STATISTICS IN R. INTRODUCTION TO DESCRIPTIVE STATISTICS. DISCRETE AND CONTINUOUS EMPIRICAL DISTRIBUTION FUNCTIONS. MEASURES OF LOCATION AND DISPERSION. SAMPLE MEAN, SAMPLE MEDIAN AND SAMPLE MODE. PERCENTILES AND QUARTILES. SAMPLE VARIANCE, SAMPLE STANDARD DEVIATION AND COEFFICIENT OF VARIATION. THE SHAPE OF A FREQUENCY DISTRIBUTION. SKEWNESS AND KURTOSIS. CORRELATION, COVARIANCE AND CORRELATION COEFFICIENT.
CLUSTER ANALYSIS WITH R. INTRODUCTION TO CLUSTER ANALYSIS. BASICS AND DEFINITIONS. DISTANCE FUNCTIONS AND SIMILARITY MEASURES. OPTIMIZATION METHODS. HIERARCHICAL METHODS. ANALYSIS OF THE DENDROGRAM. NON-HIERARCHICAL METHODS.
RANDOM VARIABLES WITH R. DISCRETE PROBABILITY DISTRIBUTIONS AND THEIR SIMULATION (BERNOULLI, BINOMIAL, GEOMETRIC, PASCAL, HYPERGEOMETRIC, POISSON). CONTINUOUS PROBABILITY DISTRIBUTIONS AND THEIR SIMULATION (UNIFORM, EXPONENTIAL, NORMAL, CHI-SQUARE, STUDENT'S T). SOME IMPORTANT RESULTS ON THESE RANDOM VARIABLES, ANALYZED THROUGH SIMULATION IN R.
INFERENTIAL STATISTICS WITH R. STATISTICAL INFERENCE. SAMPLING DISTRIBUTIONS. POINT ESTIMATION. CONFIDENCE INTERVALS. CONFIDENCE INTERVALS FOR THE MEAN AND THE VARIANCE OF A NORMAL POPULATION. CONFIDENCE INTERVALS FOR THE PARAMETER OF BERNOULLI, POISSON AND EXPONENTIAL POPULATIONS. DIFFERENCES BETWEEN MEANS OF NORMAL POPULATIONS. DIFFERENCES BETWEEN MEANS OF BERNOULLI POPULATIONS. TECHNIQUES OF MULTIVARIATE STATISTICAL ANALYSIS.
Teaching Methods
THE TEACHING METHOD COMBINES THEORETICAL LESSONS WITH CONTINUOUS EXERCISES AND PROBLEMS, ALL CONNECTED TO THE METHODOLOGIES FOR DATA ANALYSIS AND FOR THE SIMULATION OF CERTAIN STOCHASTIC PROCESSES. CLASS ATTENDANCE IS STRONGLY RECOMMENDED. STUDENTS ARE GUIDED TO LEARN, CRITICALLY AND RESPONSIBLY, WHAT THE TEACHER PRESENTS DURING THE LECTURES. THEY ARE ENCOURAGED TO SHARE WITH THE ENTIRE CLASS THEIR IDEAS FOR DEVELOPING AND IMPLEMENTING SOLUTIONS TO STATISTICAL AND COMPUTATIONAL PROBLEMS, AND TO ACQUIRE SKILLS AND EXPERTISE IN MANAGING THE COMPLEXITY OF NEW PROBLEMS RELATED TO DATA ANALYSIS. TO SUPPORT STUDENTS IN THEIR STUDY, THE TEACHER WILL PROVIDE COMPREHENSIVE LECTURE NOTES DURING THE COURSE, COVERING THE VARIOUS TOPICS AND PROBLEMS ADDRESSED.
Verification of learning
THE EXAM CONSISTS OF A PROJECT AND AN ORAL TEST. THE GRADE WILL DEPEND ON THE STUDENT'S KNOWLEDGE AND ABILITY TO APPLY THE METHODS TO SOLVE CONCRETE APPLICATION PROBLEMS. STUDENTS WHO HAVE ATTENDED REGULARLY HAVE AN ADVANTAGE IN THE ORAL DISCUSSION BECAUSE, DURING THE LESSONS, THEY HAVE BEEN ENCOURAGED TO LEARN AND TO CONNECT THE VARIOUS TOPICS SYSTEMATICALLY AND CRITICALLY, AS WELL AS TO MANAGE THE COMPLEXITY OF NEW PROBLEMS.
Texts
- MICHAEL J. CRAWLEY (2012) THE R BOOK, WILEY.
- JANE M. HORGAN (2009) PROBABILITY WITH R. AN INTRODUCTION WITH COMPUTER SCIENCE APPLICATIONS, WILEY.
- MARIA L. RIZZO (2008) STATISTICAL COMPUTING WITH R, CHAPMAN & HALL/CRC TAYLOR & FRANCIS GROUP.
- TEACHER'S LECTURE NOTES.
  BETA VERSION Data source ESSE3 [Last synchronization: 2016-09-30]