Annibale ELIA | LINGUISTICS AND NEW MEDIA
cod. 0323100029
LINGUISTICS AND NEW MEDIA
0323100029 | |
DIPARTIMENTO DI SCIENZE POLITICHE, SOCIALI E DELLA COMUNICAZIONE | |
EQF7 | |
CORPORATE COMMUNICATION E MEDIA - CORPORATE COMMUNICATION AND MEDIA | |
2017/2018 |
YEAR OF COURSE 1 | |
YEAR OF DIDACTIC SYSTEM 2017 | |
SECOND SEMESTER |
SSD | CFU | HOURS | ACTIVITY | |
---|---|---|---|---|
L-LIN/01 | 6 | 40 | LESSONS |
Objectives | |
---|---|
THE LINGUISTICS AND NEW MEDIA COURSE INTRODUCES STUDENTS TO COMPUTATIONAL LINGUISTICS BY EXPLORING THE ALGORITHMS, TOOLS AND RESOURCES DEDICATED TO NATURAL LANGUAGE PROCESSING (NLP), WITH PARTICULAR REFERENCE TO THE SEMANTIC WEB AND, MORE GENERALLY, TO THEIR APPLICABILITY IN THE CONTEXT OF THE NEW MEDIA. IN A MULTIDISCIPLINARY CONTEXT, THE COURSE AIMS TO PROVIDE STUDENTS WITH THE BASIC ELEMENTS OF NLP SO THAT THEY UNDERSTAND HOW THE INTERACTION BETWEEN HUMAN AND MACHINE WORKS WHEN THEY USE NATURAL LANGUAGE. WITH THE PROLIFERATION OF THE NEW MEDIA, HUMAN-MACHINE INTERACTION HAS ACQUIRED STRATEGIC IMPORTANCE. IN THIS CONTEXT, COMPUTATIONAL LINGUISTICS BRINGS THE USER CLOSER TO THE LATEST GENERATION OF MEDIA. THE MAIN ISSUES ADDRESSED DURING THE COURSE WILL BE THE NLP PIPELINE (NORMALIZATION, TOKENIZATION, LEMMATIZATION, POS TAGGING) AND SOME OF THE MOST POPULAR NLP TASKS, SUCH AS TEXT PARSING, TEXT SUMMARIZATION, DISTRIBUTIONAL SEMANTICS, SEMANTIC ROLE LABELING, OPINION MINING AND DECEPTION DETECTION. BY THE END OF THE COURSE, STUDENTS MUST BE ABLE TO CARRY OUT: • THE FORMAL ANNOTATION, MANUAL AND SEMI-AUTOMATIC, OF LEXICAL RESOURCES AND CORPORA USING WELL-KNOWN MARKUP AND DATA-REPRESENTATION LANGUAGES, SUCH AS XML, JSON, OWL AND RDF; • THE FORMALIZATION OF LINGUISTIC PHENOMENA THROUGH ELECTRONIC DICTIONARIES, REGULAR EXPRESSIONS, FINITE-STATE AUTOMATA AND ONTOLOGIES; • THE CONSTRUCTION AND ANALYSIS OF TEXTUAL CORPORA. FURTHERMORE, STUDENTS MUST DEMONSTRATE THE ACQUISITION OF ADEQUATE SCIENTIFIC AND CRITICAL SKILLS BY: • DRAFTING DOCUMENTS OF HIGH SCIENTIFIC VALUE; • VERIFYING SOURCES AND CONSULTING ACADEMIC MATERIAL TO COMPILE THE STATE OF THE ART ON COMPUTATIONAL LINGUISTICS TOPICS AND TASKS; • DEEPENING AND APPLYING THE TECHNIQUES AND MODELS STUDIED DURING THE COURSE. |
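
As an illustration of the pipeline stages named in the objectives, the following is a minimal sketch and not course material. It assumes Python with the standard library only. The names `normalize`, `tokenize`, `annotate` and `TOY_LEXICON` are hypothetical, and the four-entry lexicon merely stands in for a real electronic dictionary that would supply lemmas and part-of-speech tags.

```python
import re
import unicodedata

# Hypothetical toy lexicon: surface form -> (lemma, POS tag).
# A real electronic dictionary would be far larger and handle ambiguity.
TOY_LEXICON = {
    "the": ("the", "DET"),
    "new": ("new", "ADJ"),
    "media": ("medium", "NOUN"),
    "explained": ("explain", "VERB"),
}


def normalize(text: str) -> str:
    """Apply Unicode NFKC normalization, lowercase, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list[str]:
    """Split text into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def annotate(tokens: list[str]) -> list[tuple[str, str, str]]:
    """Look up each token's lemma and POS tag; unknown tokens get the tag UNK."""
    return [(tok, *TOY_LEXICON.get(tok, (tok, "UNK"))) for tok in tokens]


tokens = tokenize(normalize("The  NEW media, explained!"))
print(tokens)            # ['the', 'new', 'media', ',', 'explained', '!']
print(annotate(tokens))  # (token, lemma, tag) triples
```

A dictionary lookup like this cannot resolve ambiguous forms, which is one reason lemmatization and POS tagging are usually handled with context-sensitive rules or statistical models.
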
Prerequisites | |
---|---|
STUDENTS SHOULD MEET THE FOLLOWING REQUIREMENTS: - APPROPRIATE KNOWLEDGE OF GENERAL LINGUISTIC THEORIES, WITH PARTICULAR REFERENCE TO LEXICON-GRAMMAR THEORY AND GENERATIVE-TRANSFORMATIONAL THEORY. - BASIC KNOWLEDGE OF COMPUTER SCIENCE. - GOOD KNOWLEDGE OF THE ENGLISH LANGUAGE. KNOWLEDGE OF AT LEAST ONE PROGRAMMING LANGUAGE IS STRONGLY RECOMMENDED. |
Contents | |
---|---|
AFTER AN INTRODUCTION TO COMPUTATIONAL LINGUISTICS AND THE CONCEPT OF NEW MEDIA, THE COURSE WILL FOCUS ON THE FOLLOWING TOPICS: · THE DEVELOPMENT AND VERIFICATION OF TEXT ANALYSIS RESOURCES: ELECTRONIC DICTIONARIES, XML AND JSON DATABASES, OWL, RDF AND RDFS ONTOLOGIES. · THE SYNTACTIC ANALYSIS OF TEXTS THROUGH THE USE OF FINITE-STATE AUTOMATA AND REGULAR EXPRESSIONS. · THE STRATEGIES FOR ADDRESSING OPEN PROBLEMS IN NLP, SUCH AS AMBIGUITY RESOLUTION, MACHINE TRANSLATION, SEMANTIC ANNOTATION, ETC. · THE MACHINE LEARNING ALGORITHMS APPLIED TO COMPUTATIONAL LINGUISTICS: POS TAGGING, PARSING. · THE SEMANTIC ANALYSIS OF TEXTS BASED ON RULES OR ON MACHINE LEARNING ALGORITHMS, INCLUDING DISTRIBUTIONAL SEMANTICS, CLUSTERING AND MUTUAL INFORMATION. · A DETAILED STUDY AND PRACTICAL APPLICATION OF THE MOST POPULAR NLP TASKS, SUCH AS TEXT SUMMARIZATION, SEMANTIC ROLE LABELING, OPINION MINING, SENTIMENT ANALYSIS, EMOTION DETECTION, DECEPTION DETECTION AND MORPHOSEMANTICS. |
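
To make the topic of finite-state automata and regular expressions concrete, here is a small sketch under stated assumptions, not a description of the tools used in the course. It assumes the input is a sequence of POS tags and recognizes a deliberately simplified noun-phrase pattern (optional determiner, any number of adjectives, one noun). The tag names, `TRANSITIONS`, `accepts` and `accepts_regex` are all hypothetical.

```python
import re

# Deterministic finite-state automaton over POS tags.
# state -> {input tag: next state}; a missing entry means "reject".
TRANSITIONS = {
    "start": {"DET": "det", "ADJ": "adj", "NOUN": "noun"},
    "det": {"ADJ": "adj", "NOUN": "noun"},
    "adj": {"ADJ": "adj", "NOUN": "noun"},
    "noun": {},
}
ACCEPTING = {"noun"}


def accepts(tags: list[str]) -> bool:
    """Run the automaton over the tag sequence and report acceptance."""
    state = "start"
    for tag in tags:
        state = TRANSITIONS[state].get(tag)
        if state is None:  # no transition defined for this tag
            return False
    return state in ACCEPTING


# The same pattern written as a regular expression over space-joined tags.
NP_REGEX = re.compile(r"(DET )?(ADJ )*NOUN")


def accepts_regex(tags: list[str]) -> bool:
    """Check the tag sequence against the equivalent regular expression."""
    return NP_REGEX.fullmatch(" ".join(tags)) is not None


print(accepts(["DET", "ADJ", "NOUN"]))        # True
print(accepts_regex(["DET", "NOUN", "ADJ"]))  # False
```

The two functions accept exactly the same sequences, which reflects the equivalence between regular expressions and finite-state automata.
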
Teaching Methods | |
---|---|
THE COURSE WILL CONSIST OF: - FRONTAL LECTURES, 28 HOURS, 14 LESSONS; - STUDENT WORKSHOPS, 6 HOURS, 3 LESSONS; - SEMINARS ON PRACTICAL APPLICATIONS, 6 HOURS, 3 LESSONS. |
Verification of learning | |
---|---|
THE EXAM CONSISTS OF TWO ASSESSMENT STAGES: A) THE FIRST STAGE, WHOSE SCORE WILL CONTRIBUTE TO THE AVERAGE OF THE FINAL SCORE, IS RESERVED FOR ATTENDING STUDENTS AND IS DIVIDED INTO TWO PARTS: 1) THE DRAFTING OF A GROUP SCIENTIFIC REPORT ON ONE OF THE SUBJECTS COVERED DURING THE FIRST HALF OF THE COURSE. GROUPS OF 3 STUDENTS WILL HAVE TWO WEEKS TO DEVELOP A RESEARCH PROJECT IN COMPUTATIONAL LINGUISTICS AND NEW MEDIA AND TO PRODUCE A REPORT IN THE FORM OF A SCIENTIFIC ARTICLE, IN ITALIAN OR ENGLISH, IN WHICH EACH STUDENT'S CONTRIBUTION MUST BE SPECIFIED. THE REPORT WILL CARRY A WEIGHT OF 2/3. 2) THE ORAL PRESENTATION OF THE RESEARCH PROJECT TO THE CLASS. EACH STUDENT IN THE GROUP WILL DELIVER PART OF THE PRESENTATION. EACH PRESENTATION WILL LAST 20 MINUTES. B) THE SECOND STAGE IS ORGANIZED AS FOLLOWS: 1) STUDENTS WHO ATTENDED THE COURSE MAY TAKE A WRITTEN TEST ON THE SUBJECTS COVERED DURING THE COURSE AND ON THE REFERENCE TEXTS; THE TEST WILL INCLUDE 31 MULTIPLE-CHOICE QUESTIONS AND WILL TAKE UP TO 60 MINUTES. 2) STUDENTS WHO DID NOT ATTEND THE COURSE WILL TAKE AN ORAL EXAM OF UP TO 40 MINUTES, FOCUSING ON BOTH THE SUBJECTS COVERED DURING THE COURSE AND THE REFERENCE TEXTS. |
Texts | |
---|---|
- IMPRESE TRA WEB 2.0 E BIG DATA, DELLA VOLPE, M., CEDAM, PADOVA, 2015 - TESTO E COMPUTER. ELEMENTI DI LINGUISTICA COMPUTAZIONALE, LENCI, A., MONTEMAGNI, S., PIRRELLI, V., CAROCCI EDITORE, 2016. - SLIDES AND PAPERS PROPOSED DURING THE COURSE |
More Information | |
---|---|
- |
BETA VERSION Data source ESSE3 [Last synchronization: 2019-05-14]