Genoveffa TORTORA | NATURAL LANGUAGE PROCESSING
cod. 0522500138
COMPUTER SCIENCE
EQF7
COMPUTER SCIENCE
2023/2024
YEAR OF DIDACTIC SYSTEM 2016
SPRING SEMESTER
SSD | CFU | HOURS | ACTIVITY |
---|---|---|---|
INF/01 | 4 | 32 | LESSONS |
INF/01 | 2 | 16 | LAB |

Objectives
THE GOAL OF THIS COURSE IS TO PROVIDE STUDENTS WITH THE METHODOLOGICAL AND TECHNOLOGICAL SKILLS NEEDED TO DESIGN AND DEVELOP NATURAL LANGUAGE PROCESSING (NLP) SYSTEMS.
KNOWLEDGE AND UNDERSTANDING:
• KNOWLEDGE OF THE LINGUISTIC PHENOMENA THAT CHARACTERIZE AND COMPLICATE THE DEVELOPMENT OF NLP APPROACHES;
• DEEP KNOWLEDGE OF THE MAIN TECHNIQUES FOR STRUCTURAL ANALYSIS AND SEMANTIC INTERPRETATION OF TEXTS;
• DEEP KNOWLEDGE OF PRACTICAL TOOLS TO PERFORM NLP TASKS;
• KNOWLEDGE OF NLP APPLICATIONS, SUCH AS MACHINE TRANSLATION, INFORMATION EXTRACTION, SENTIMENT ANALYSIS, AND INTERACTIVE DIALOG SYSTEMS (BOTH WRITTEN AND SPOKEN).
APPLYING KNOWLEDGE AND UNDERSTANDING:
• KNOW HOW TO ANALYSE GENERAL NLP PROBLEMS AND APPLY PROPER STRATEGIES TO THEM;
• KNOW HOW TO CHARACTERIZE THE ROLE OF BOTH DATA AND APPLIED MACHINE LEARNING MODELS WITHIN NLP SYSTEMS;
• KNOW HOW TO DESIGN AND IMPLEMENT APPLICATION SOLUTIONS FOR SOME NLP PROBLEMS.

Prerequisites
STUDENTS SHOULD BE FAMILIAR WITH PROBABILITY, LINEAR ALGEBRA, PROGRAMMING, AND MACHINE LEARNING METHODS. NO PREREQUISITE COURSES ARE REQUIRED.

Contents
AFTER INTRODUCING NATURAL LANGUAGE PROCESSING, INCLUDING ITS CHARACTERIZATION AS A DISCIPLINE THAT COMBINES COMPUTER SCIENCE METHODS WITH RESEARCH INSIGHTS FROM LINGUISTICS (THE STUDY OF HUMAN LANGUAGE), THE COURSE WILL FOCUS ON THE FOLLOWING TOPICS.
STRUCTURAL ANALYSIS OF TEXTS:
• WORDS, WORD COUNTING, LEXICONS (2 HOURS)
• TEXT NORMALIZATION (2 HOURS)
• DISTANCE MEASURES (2 HOURS)
• PART-OF-SPEECH TAGGING (2 HOURS)
TEXT SEMANTICS AND EMERGING ARCHITECTURES:
• VECTOR SEMANTICS AND WORD EMBEDDINGS (2 HOURS)
• TEXT CLASSIFICATION WITH NEURAL NETWORKS (3 HOURS)
• RECURRENT NEURAL NETWORKS AND LANGUAGE MODELS (3 HOURS)
• TRANSFORMER ARCHITECTURE (2 HOURS)
• PRETRAINED MODELS (2 HOURS)
NLP: THE MAIN APPLICATIONS:
• INTERACTIVE DIALOG (2 HOURS)
• MACHINE TRANSLATION (2 HOURS)
• INFORMATION RETRIEVAL (2 HOURS)
• SENTIMENT ANALYSIS (2 HOURS)
• TEXT SUMMARIZATION (2 HOURS)
• NATURAL LANGUAGE GENERATION (2 HOURS)
LABORATORY:
• TEXT PROCESSING WITH PYTHON (3 HOURS)
• CATEGORIZATION AND WORD TAGGING (3 HOURS)
• EXTRACTING INFORMATION FROM TEXT (3 HOURS)
• TRANSFORMER ARCHITECTURE: APPLICATION EXAMPLES (3 HOURS)
• DESIGN AND DEVELOPMENT OF NLP SOLUTIONS: PRESENTATION OF CASE STUDIES (4 HOURS)

Teaching Methods
THE COURSE INCLUDES:
• LECTURES COVERING THE COURSE CONTENTS (4 CFU/32 HOURS)
• LABORATORY SESSIONS AND TUTORIALS TO TRAIN STUDENTS IN PRACTICAL AND COLLABORATIVE ACTIVITIES (2 CFU/16 HOURS)
EACH LECTURE WILL INCLUDE BOTH A PRESENTATION OF THE COURSE CONTENTS BY THE TEACHERS AND TUTORIALS ON THEIR PRACTICAL APPLICATION.

Verification of learning
THE EXAM CONSISTS OF A PRELIMINARY WRITTEN TEST AND AN ORAL EXAMINATION TO VERIFY THE ACQUIRED KNOWLEDGE AND TO DISCUSS THE ACTIVITIES CARRIED OUT DURING THE COURSE. THESE ACTIVITIES INCLUDE A GROUP PROJECT. THE WRITTEN EXAM CAN BE REPLACED BY PROGRESSIVE ASSESSMENT TESTS, WHICH INCLUDE QUESTIONS ON BOTH THE KNOWLEDGE AND UNDERSTANDING OF THE LECTURE TOPICS AND THE ABILITY TO APPLY THEM THROUGH EXERCISES.
• WRITTEN EXAMINATION (2 HOURS): TO EVALUATE THE KNOWLEDGE GAINED OF NATURAL LANGUAGE PROCESSING TECHNIQUES AND SOLUTIONS, THE TEST IS COMPOSED OF OPEN QUESTIONS AND EXERCISES. EACH QUESTION OR EXERCISE IS WORTH BETWEEN 4 AND 10 POINTS, DEPENDING ON ITS COMPLEXITY. THE EVALUATION CRITERIA INCLUDE THE CORRECTNESS AND COMPLETENESS OF WHAT HAS BEEN LEARNED AND THE CLARITY OF THE PRESENTATION. THE FINAL MARK IS OUT OF 30.
• ASSESSMENT TESTS: NON-CUMULATIVE TESTS MAY BE GIVEN DURING THE COURSE. STUDENTS WHO PASS THE TESTS ARE EXEMPTED FROM THE WRITTEN EXAMINATION. THE AIM IS TO ENCOURAGE STUDENTS TO FOLLOW THE COURSE EFFECTIVELY.
• PROJECT: THE PROJECT ALLOWS THE STUDENT TO PRACTICE THE CONTENTS LEARNED DURING THE COURSE. DURING THE ORAL EXAM, THE PROJECT WILL BE DISCUSSED DIRECTLY WITH THE TEACHER, WHO WILL VERIFY THE FOLLOWING:
  • ADHERENCE TO THE REQUIREMENTS
  • COMPLETENESS AND CORRECTNESS OF THE PRODUCED SOFTWARE
  • COMPREHENSION OF THE REALIZED ARTIFACTS
  • ABILITY TO DESCRIBE THE OBTAINED RESULTS AND TO POINT OUT ANY LIMITATIONS AND OPEN PROBLEMS
• ORAL EXAMINATION: THE ORAL EXAMINATION AIMS TO EVALUATE THE STUDENT'S GENERAL KNOWLEDGE OF THE ENTIRE COURSE PROGRAM. THE EVALUATION CRITERIA INCLUDE THE COMPLETENESS AND CORRECTNESS OF WHAT HAS BEEN LEARNED AND THE CLARITY OF THE PRESENTATION.
• FINAL EVALUATION: THE FINAL MARK COMBINES THE AVERAGE SCORE OF THE ASSESSMENT TESTS (OR THE WRITTEN EXAMINATION SCORE) WITH THE POINTS OBTAINED IN THE PROJECT DISCUSSION AND THE ORAL EXAMINATION.

Texts
COURSE BOOK:
• D. JURAFSKY AND J. MARTIN. SPEECH AND LANGUAGE PROCESSING. PRENTICE HALL, THIRD EDITION (2022).
RECOMMENDED READING:
• J. EISENSTEIN. INTRODUCTION TO NATURAL LANGUAGE PROCESSING. MIT PRESS, 2019.
• N. HARDENIYA ET AL. NATURAL LANGUAGE PROCESSING: PYTHON AND NLTK. PACKT PUBLISHING, 2016.
• S. BIRD, E. KLEIN, AND E. LOPER. NATURAL LANGUAGE PROCESSING WITH PYTHON: ANALYZING TEXT WITH THE NATURAL LANGUAGE TOOLKIT. O'REILLY MEDIA, 2009.

More Information
ATTENDANCE OF LECTURES IS STRONGLY ENCOURAGED. STUDENTS MUST SPEND A CONSIDERABLE AMOUNT OF TIME STUDYING AT HOME AND DEVELOPING THE COURSE PROJECT.
INFORMATION CONCERNING THE COURSE IS AVAILABLE ON THE E-LEARNING PLATFORM OF THE DIPARTIMENTO DI INFORMATICA AT HTTP://ELEARNING.INFORMATICA.UNISA.IT/EL-PLATFORM/
CONTACTS:
• PROF. GENOVEFFA TORTORA, TORTORA@UNISA.IT
• PROF. LOREDANA CARUCCIO, LCARUCCIO@UNISA.IT

Data source: ESSE3 [Last synchronization: 2024-11-05]