DATA SCIENCE

Domenico PARENTE

0222600021
DIPARTIMENTO DI SCIENZE AZIENDALI - MANAGEMENT & INNOVATION SYSTEMS
EQF7
BUSINESS INNOVATION AND INFORMATICS - BUSINESS, INNOVAZIONE ED INFORMATICA
2018/2019

COMPULSORY
YEAR OF COURSE 2
YEAR OF DIDACTIC SYSTEM 2016
FIRST SEMESTER
CFU    HOURS    ACTIVITY
10     60       LESSONS
Objectives
THE COURSE (60 HOURS, 10 ECTS) AIMS TO PROVIDE STUDENTS WITH A BODY OF KNOWLEDGE IN DATA ANALYSIS THAT ENABLES THE SCALABLE MANAGEMENT OF COMPLEX SYSTEMS. IT ALSO AIMS TO DEVELOP THE ANALYTICAL CAPABILITIES NEEDED TO SOLVE COMPLEX PROBLEMS WHOSE SOLUTIONS CALL FOR SYNERGISTIC APPROACHES COMBINING DATA MINING ALGORITHMS, ADVANCED COMPUTATIONAL PARADIGMS AND DISTRIBUTED SYSTEMS FOR DATA MANAGEMENT, TARGETED AT DATA-DRIVEN DISCOVERY AND PREDICTIVE ANALYSIS.

THE STUDENT, AT THE END OF THE COURSE, WILL HAVE ACQUIRED THEORETICAL KNOWLEDGE AND PRACTICAL SKILLS RELATED TO DATA ANALYSIS AND ANALYTICS (FOR SOLVING PROBLEMS RELATED TO THE ACQUISITION AND MANAGEMENT OF BIG DATA) AND THE ABILITY TO USE THE MAIN TECHNIQUES AND TOOLS FOR THE RESOLUTION OF SPECIFIC PROBLEMS.

THE STUDENT WILL BE ENCOURAGED TO DEVELOP ANALYTICAL SKILLS FOR EXTRACTING INTRINSIC DATA FEATURES AND THE ABILITY TO BUILD AN ABSTRACTION THAT EMPHASIZES THE NATURE OF THE PROCESSED DATA.

THE COURSE AIMS TO FOSTER THE DEVELOPMENT OF SKILLS IN DATA COLLECTION AND DATA ANALYSIS, THROUGH HYBRID APPROACHES THAT COMBINE COMPLEX STRATEGIES TO EXTRACT EFFECTIVE INFORMATION FROM RAW DATA.
Prerequisites
BASIC NOTIONS OF DATABASES.
Contents
THE GOAL IS TO PROVIDE A SOLID AND MODERN ACADEMIC PREPARATION FOR UNDERSTANDING AND MANAGING THE VARIOUS PERSPECTIVES AND NUANCES INVOLVED IN DATA ANALYSIS.
THE TOPICS INCLUDE THE MANIPULATION OF LARGE-SCALE DATA; AN OVERVIEW OF EXISTING TECHNOLOGIES AND THEIR OPERATIONAL CONTEXTS; METHODOLOGICAL AND FORMAL APPROACHES TO DATA ANALYTICS; DATA VISUALIZATION AND PROVENANCE; HIGH-LEVEL MODELING THROUGH SEMANTIC WEB TECHNOLOGIES.

THE COURSE IS DIVIDED INTO TWO PARTS:
FIRST PART
- INTRODUCTORY LESSON ON DATA SCIENCE, ITS USES AND ITS ROLE IN DIFFERENT APPLICATION AREAS (2H).
- DATA ANALYSIS, DATA VISUALIZATION (2H).
- BASIC STATISTICS FOR DATA DESCRIPTION AND ANALYSIS (4H).
- PRINCIPAL UNSUPERVISED, SUPERVISED AND PREDICTIVE ALGORITHMS FOR DATA ANALYSIS (6H).
- BACKGROUND ON THE PYTHON LIBRARIES FOR DATA MANIPULATION (2H).
- STUDY OF ADVANCED TECHNIQUES FOR MODELING AND ABSTRACTION OF RAW DATA: THE INFORMATION DERIVED FROM THE DATA IS PROCESSED THROUGH "STRATEGIES AND METHODS TO IDENTIFY, COLLECT, DEVELOP, STORE AND MAKE ACCESSIBLE KNOWLEDGE" USING ADVANCED INFORMATION TECHNOLOGY TOOLS (2H).
- METHODOLOGY FOR THE EXPLICIT FORMALIZATION OF KNOWLEDGE THROUGH ONTOLOGY-DRIVEN TECHNIQUES AND LANGUAGES (2H).
- SEMANTIC TECHNOLOGIES: RDF, RDF-S (2H).
- SEMANTIC TECHNOLOGIES: OWL (2H).
- WELL-KNOWN ONTOLOGICAL LANGUAGES SUCH AS SKOS AND FOAF (2H).
- APPLICATIONS IN THE TEXT MINING DOMAIN (2H).
- NATURAL LANGUAGE PROCESSING (2H).

SECOND PART
- INTRODUCTION TO DATA MINING (3H).
- SIMILARITY (3H).
- MINING DATA STREAMS (4H).
- FREQUENT ITEMSETS (4H).
- CLUSTERING (4H).
- ADVERTISING ON THE WEB (4H).
- RECOMMENDATION SYSTEMS (4H).
- LARGE-SCALE MACHINE LEARNING (4H).

Teaching Methods
THE COURSE INCLUDES CLASSROOM LECTURES AND PRACTICAL EXERCISES ON THE TOPICS COVERED IN CLASS.
Verification of learning
THE EXAM INCLUDES A WRITTEN TEST AND AN ORAL EXAMINATION.
Texts
JOEL GRUS, "DATA SCIENCE FROM SCRATCH", 2015 - O'REILLY MEDIA.
J. LESKOVEC, A. RAJARAMAN, J.D. ULLMAN, "MINING OF MASSIVE DATASETS", 2ND ED., CAMBRIDGE UNIVERSITY PRESS.
More Information
SLIDES AND OTHER MATERIAL PROVIDED BY THE TEACHER.
Data source: ESSE3 [Last synchronization: 2019-10-21]