ARTIFICIAL INTELLIGENCE AND LARGE LANGUAGE MODELS

Vincenzo Deufemia

Course code: 0060100058
PhD Programme in Computer Science (D.M. 226/2021)
Academic year: 2024/2025



Year of course: 1
Year of didactic system: 2024
Spring semester
CFU | Hours | Activity
3   | 18    | Lessons
Objectives
The educational objective of the course is to provide students with the theoretical and practical skills necessary to understand, develop, and apply artificial intelligence techniques using large language models (LLMs). The course covers both basic and advanced concepts related to LLMs, from the underlying architectures and training techniques to the ethical and security implications of their use.

Knowledge and understanding:
To provide students with in-depth knowledge in the following key areas:
- Fundamentals of large language models and their evolution.
- Advanced LLM architectures based on Transformers and attention mechanisms.
- Pre-training and fine-tuning methodologies for LLMs, with concrete examples.
- LLMs as knowledge bases and prompt engineering strategies to enhance reasoning capabilities.

Applying knowledge and understanding:
The course aims to develop the following practical skills in students:
- The ability to design, train, and optimize large language models using modern AI technologies.
- The ability to implement and evaluate pre-training and fine-tuning strategies for practical applications.
- The ability to apply LLMs to complex and multidisciplinary case studies.
Prerequisites
Students must have foundational knowledge of machine learning and programming skills in Python.
Contents
The course will focus on the following topics:

Introduction to Large Language Models (LLMs) (3 hours of theory)
- Definition of LLMs and historical introduction.
- Overview of the main families of LLMs (e.g., GPT, BERT, T5).

Language Models and Beyond (3 hours of theory)
- Language models in the strict sense (probability theory, n-gram language models); see the sketch after this list.
- Language models in the broad sense (BERT and beyond).
- Further reflections on language models.
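
As a taste of the "strict sense" topic, a bigram language model can be estimated by maximum likelihood in a few lines of Python. This is an illustrative sketch on a made-up toy corpus, not code from the course materials:

    from collections import defaultdict, Counter

    def train_bigram_lm(corpus):
        """Estimate P(w_i | w_{i-1}) by maximum likelihood from raw sentences."""
        counts = defaultdict(Counter)
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            for prev, curr in zip(tokens, tokens[1:]):
                counts[prev][curr] += 1
        # Normalize bigram counts into conditional probabilities.
        return {prev: {w: c / sum(nxt.values()) for w, c in nxt.items()}
                for prev, nxt in counts.items()}

    model = train_bigram_lm(["the cat sat", "the cat ran", "the dog sat"])
    print(model["cat"])  # {'sat': 0.5, 'ran': 0.5}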

Architectures behind LLMs (3 hours of theory)
- Introduction to Transformers.
- Structure and functioning: self-attention and the multi-head attention mechanism (see the sketch after this list).
- Scalability.
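
As a preview of self-attention, single-head scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, can be written directly in NumPy. An illustrative sketch, not the course's own sample code:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        """softmax(Q K^T / sqrt(d_k)) V, the core operation of the Transformer."""
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V                              # weighted mix of values

    # Toy example: 4 tokens, model dimension 8; self-attention sets Q = K = V = x.
    x = np.random.default_rng(0).normal(size=(4, 8))
    print(scaled_dot_product_attention(x, x, x).shape)  # (4, 8)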

Pre-training and Fine-tuning of LLMs (3 hours of theory)
- Main phases of LLM training: pre-training and fine-tuning.
- Unsupervised pre-training strategies and pre-training tasks (e.g., masked language modeling; see the sketch after this list).
- Reinforcement learning from human feedback (RLHF).
- LLM evaluation.
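
As a concrete example of a pre-training task, masked language modeling can be tried out with the Hugging Face Transformers library. A minimal sketch, assuming the transformers package is installed and using bert-base-uncased as an example checkpoint:

    from transformers import pipeline

    # Masked language modeling: the model ranks candidates for the [MASK] token.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")
    for pred in fill_mask("Large language models are trained on [MASK] amounts of text."):
        print(f"{pred['token_str']:>10}  score={pred['score']:.3f}")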

Efficiency of LLMs (3 hours of theory)
- Efficiency within the Transformer (see the sketch after this list).
- Efficiency beyond the Transformer.
- Efficiency after LLMs.
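
One concrete efficiency idea inside the Transformer is the key-value cache used during autoregressive decoding: past keys and values are stored so each new token attends over them in linear time instead of recomputing the full quadratic attention. A toy single-head NumPy sketch of the idea, with random vectors standing in for learned projections:

    import numpy as np

    def decode_step(q_new, K_cache, V_cache):
        """Attend the newest query over cached keys/values: O(t*d) per step."""
        d_k = q_new.shape[-1]
        scores = K_cache @ q_new / np.sqrt(d_k)  # (t,) similarities to past tokens
        w = np.exp(scores - scores.max())
        w /= w.sum()                             # softmax over cached positions
        return w @ V_cache                       # (d,) context vector

    d = 8
    K_cache, V_cache = np.empty((0, d)), np.empty((0, d))
    rng = np.random.default_rng(1)
    for _ in range(5):                           # five decoding steps
        x_t = rng.normal(size=d)                 # stand-in for the new token's K/V/Q
        K_cache = np.vstack([K_cache, x_t])      # append, never recompute
        V_cache = np.vstack([V_cache, x_t])
        ctx = decode_step(x_t, K_cache, V_cache)
    print(K_cache.shape, ctx.shape)              # (5, 8) (8,)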

Knowledge, Reasoning, and Prompt Engineering (3 hours of theory)
- LLMs as knowledge bases.
- Updating facts for LLMs.
- Why reasoning is special in LLMs.
- Techniques for better reasoning (see the sketch after this list).
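
Among the techniques for better reasoning, chain-of-thought prompting prepends a worked example so the model produces intermediate steps before the final answer. A minimal sketch of how such a prompt is assembled (the questions are made-up examples; the resulting string can be sent to any instruction-tuned LLM):

    question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

    # Plain prompt: asks for the answer directly.
    plain_prompt = f"Q: {question}\nA:"

    # Chain-of-thought prompt: one worked example plus a step-by-step cue.
    cot_prompt = (
        "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
        "How many balls does he have now?\n"
        "A: He starts with 5 balls. 2 cans of 3 balls is 6 balls. "
        "5 + 6 = 11. The answer is 11.\n"
        f"Q: {question}\n"
        "A: Let's think step by step."
    )

    print(cot_prompt)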
Teaching Methods
The course includes 18 hours of lecture-based teaching, aimed at presenting the fundamental concepts of artificial intelligence and large language models (LLMs) and at developing students' ability to design and implement LLM-based solutions for various NLP applications. The topics in the syllabus are presented using PowerPoint presentations and sample code, encouraging critical discussion with the class. For each topic covered, concrete examples and design tasks are illustrated, which can be developed as course projects by one or more students.
Verification of learning
The achievement of the course objectives is assessed through an exam graded out of thirty. The exam involves the development of a practical project aimed at assessing the ability to apply the acquired knowledge. The project can be carried out individually or in groups of up to two people, with students choosing a topic from a range of proposals made by the instructors. During the project, students must interact with the instructors to communicate progress and any challenges, agreeing on objectives and on how to proceed.
At the end, students must submit a written report containing the project documentation, along with a PowerPoint presentation lasting approximately 30 minutes, which will be discussed during the final exam.
Texts
- Jurafsky, D., & Martin, J. H. "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition with Language Models." 2024. https://web.stanford.edu/~jurafsky/slp3/
- Vaswani, A., et al. "Attention Is All You Need." 2017. http://arxiv.org/abs/1706.03762
- Devlin, J., et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." 2018. http://arxiv.org/abs/1810.04805
Other articles available on arXiv.
More Information
Attendance is strongly recommended. Students should be prepared to spend a significant amount of time studying outside of class: satisfactory preparation requires on average 1 hour of study for each hour spent in class, plus about 30 hours for the development of the project.
The lesson materials will be available on the departmental e-learning platform http://elearning.informatica.unisa.it/el-platform/.

Contacts:
Prof. Vincenzo Deufemia
deufemia@unisa.it