
Machine Learning and Data Analytics 2018-2019

See also the official syllabus.


Knowledge and understanding:
  • Know the main kinds of problems that can be tackled with ML, DM, and EC, including those concerning text, natural language, and recommendation.
  • Know main ML and DM techniques; know the high-level working scheme of EAs.
  • Know design, development, and assessment phases of a ML system; know main assessment metrics and procedures suitable for a ML system.
Applying knowledge and understanding:
  • Formulate a formal problem statement for simple practical problems in order to tackle them with ML, DM, or EC techniques.
  • Develop simple end-to-end ML or DM systems.
  • Experimentally assess a simple end-to-end ML or DM system.
Making judgements:
  • Judge the technical soundness of a ML or DM system.
  • Judge the technical soundness of the assessment of a ML or DM system.
Communication skills:
  • Describe, both in written and oral form, the motivations behind choices in the design, development, and assessment of a ML or DM system, possibly exploiting simple plots.
Learning skills:
  • Retrieve information from scientific publications about ML, DM or EC techniques not explicitly presented in this course.


Basics of statistics: basic graphical tools of data exploration; summary measures of variable distribution (mean, variance, quantiles); fundamentals of probability and of univariate and multivariate distribution of random variables; basics of linear regression analysis.
Basics of linear algebra: vectors, matrices, matrix operations; diagonalization and decomposition in singular values.
Basics of programming and data structures: algorithm, data types, loops, recursion, parallel execution, tree.

Detailed program

First chunk (3 CFU, by prof. Matilde Trevisani)

(This chunk is part of the 12CFU version of the course (mainly DSSC), not of the 9CFU version.)
  • Introduction to data science; data analytics, machine learning and statistical learning approaches: common and distinctive aspects (more and more different in name only).
  • Recap of the main concepts and tools of probability and statistical inference.
  • Elements of statistical learning; regression function; assessing model accuracy and the bias-variance trade-off; cross-validation methods.
  • Supervised learning and linear models; model validation and selection; hints at regularization and extensions.
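
As a rough illustration of the cross-validation methods mentioned above, the following sketch splits sample indices into k folds and yields one train/test pair per fold. The function names (`k_fold_indices`, `cross_validation_splits`) are hypothetical and chosen only for this example; the course may present the procedure differently (for instance, with shuffling).

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validation_splits(n_samples, k):
    """Yield (train_indices, test_indices), using each fold once as test set."""
    folds = k_fold_indices(n_samples, k)
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical usage: 10 samples, 3 folds.
splits = list(cross_validation_splits(10, 3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each sample ends up in the test set exactly once, so the k test errors can be averaged into a single estimate.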

Second chunk (3 CFU, by prof. Eric Medvet)

  • Definitions of Machine Learning and Data Mining; why ML and DM are hot topics; examples of applications of ML; phases of design, development, and assessment of a ML system; terminology.
  • Elements of data visualization.
  • Supervised learning.
    • Tree-based methods.
      • Decision and regression trees: learning and prediction; role of the parameter and overfitting.
      • Trees aggregation: bagging, Random Forest, boosting.
      • Supervised learning system assessment: cross-fold validation; accuracy and other metrics; metrics for binary classification (FPR, FNR, EER, AUC) and ROC.
    • Support Vector Machines (SVM).
      • Separating hyperplane: maximal margin classifier; support vectors; learning as an optimization problem; maximal margin classifier limitations.
      • Soft margin classifier: learning, role of the parameter C.
      • Non-linearly separable problems; kernel: brief background and main options (linear, polynomial, radial); intuition behind radial kernel; SVM.
      • Multiclass classification with SVM.
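
To make the binary classification metrics listed above more concrete, here is a minimal sketch that computes FPR, FNR, and accuracy from true and predicted labels. The function name `binary_rates` and the toy labels are assumptions made for this example, and the convention of returning 0.0 when a class is absent is an arbitrary choice.

```python
def binary_rates(y_true, y_pred):
    """Compute FPR, FNR, and accuracy for binary labels (1 = positive)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    # FPR: fraction of actual negatives predicted as positive.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR: fraction of actual positives predicted as negative.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"fpr": fpr, "fnr": fnr, "accuracy": accuracy}


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
rates = binary_rates(y_true, y_pred)
print(rates)  # {'fpr': 0.25, 'fnr': 0.25, 'accuracy': 0.75}
```

Sweeping a decision threshold and recording the resulting (FPR, 1 - FNR) pairs is what traces out a ROC curve.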

Third chunk (3 CFU, by prof. Matilde Trevisani)

  • Supervised learning for classification.
    • Training and test error rate; the Bayes classifier.
    • Logistic regression.
    • Linear and quadratic discriminant analysis.
    • The K-nearest neighbors classifier.
  • Unsupervised learning.
    • Dimensionality reduction methods: principal component analysis; biplot.
    • Cluster analysis: hierarchical methods, partitional methods (k-means algorithm).
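
The k-means algorithm named above can be sketched in a few lines. This is a plain-Python version of the usual assign-then-update loop, given only as an illustration: the helper names, the toy points, and the fixed initial centroids are all assumptions for this example, not the course's reference implementation.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: squared_distance(p, centroids[i]))
        for p in points
    ]


def k_means(points, centroids, max_iterations=100):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = list(centroids)
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if its cluster is empty.
            new_centroids.append(mean_point(members) if members else old)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Toy data: two well-separated groups, invented for illustration.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, labels = k_means(points, [(0.0, 0.0), (10.0, 10.0)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

In practice the result depends on the initial centroids, which is why the algorithm is often run several times from different starting points.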

Fourth chunk (3 CFU, by prof. Eric Medvet)

  • Text mining
    • Sentiment analysis
    • Features for text
    • Topic modeling
  • Recommender systems
    • Content-based filtering
    • Collaborative filtering
  • Evolutionary computation
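
As a hedged illustration of the high-level working scheme of an evolutionary algorithm, the sketch below evolves bit strings to maximize the number of ones (the OneMax toy problem). The (mu + lambda) selection scheme, bit-flip mutation, the parameter values, and all names are assumptions chosen for this example; they are one common option, not necessarily the scheme presented in the course.

```python
import random


def one_max(bits):
    """Fitness: number of ones in the bit string."""
    return sum(bits)


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(n_bits=20, mu=5, lam=10, n_generations=50, seed=0):
    rng = random.Random(seed)
    # Initialization: mu random individuals.
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(mu)]
    history = []
    for _ in range(n_generations):
        # Variation: each offspring is a mutated copy of a random parent.
        offspring = [
            mutate(rng.choice(population), 1.0 / n_bits, rng) for _ in range(lam)
        ]
        # Selection: keep the best mu among parents and offspring.
        population = sorted(population + offspring, key=one_max, reverse=True)[:mu]
        history.append(one_max(population[0]))
    return population[0], history


best, history = evolve()
print(one_max(best), "ones out of", len(best))
```

Because parents compete with their offspring, the best fitness recorded in `history` never decreases from one generation to the next.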


Students must register for the exam session of their interest using the online system (esse3). Note that there are deadlines for registration (usually 1 week before the session date).

Lessons timetable and course calendar

The course will start on October 8th for the 12CFU version (DSSC) and on November 6th for the 9CFU version.
Lessons will be held in Classroom 3B, H2bis building, in Piazzale Europa campus.


Suggested textbooks

  • Kenneth A. De Jong. Evolutionary computation: a unified approach. MIT Press, 2006.
  • Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning: Data Mining, Inference, and Prediction. Springer, Berlin: Springer Series in Statistics, 2009.
  • Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. An Introduction to Statistical Learning, with applications in R. Springer, Berlin: Springer Series in Statistics, 2014.

Course material

The course material (slides) for my portion (Medvet, 3+3 CFU) is attached at the bottom of this page.
The full pack of slides might be updated during the course.
The annotated slides will be provided after the lectures.
See also the University Videocenter for the recordings of the lectures.

Results of students' assessment

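                                               | D1   | D2   | D3   | D4   | D5   | D6   | D7   | D8   | D9   | D10  | D11  | D12
MACHINE LEARNING AND DATA ANALYTICS            | 6.00 | 7.00 | 7.00 | 6.00 | -    | -    | -    | -    | -    | -    | 8.50 | -
MATEMATICA LM-dm270                            | 8.63 | 8.07 | 8.44 | 9.09 | 9.16 | 8.04 | 8.06 | 8.74 | 9.17 | 9.31 | 7.93 | 7.91
INGEGNERIA ELETTRONICA E INFORMATICA LM-dm270  | 7.83 | 7.55 | 7.88 | 8.56 | 9.37 | 8.40 | 8.33 | 8.35 | 9.05 | 9.10 | 8.05 | 7.84
MACHINE LEARNING AND DATA ANALYTICS            | 8.30 | 8.39 | 8.41 | 7.97 | 9.79 | 8.15 | 8.37 | 7.73 | 9.13 | 9.11 | 9.09 | 7.39
DATA SCIENCE AND SCIENTIFIC COMPUTING LM-dm270 | 8.02 | 7.80 | 8.31 | 8.39 | 9.12 | 8.23 | 8.05 | 8.18 | 9.05 | 9.20 | 8.60 | 7.80
MACHINE LEARNING E DATA ANALYTICS              | 8.08 | 7.62 | 7.15 | 7.00 | 8.45 | 7.18 | 7.30 | 5.82 | 8.67 | 8.00 | 8.85 | 6.45
SCIENZE STATISTICHE E ATTUARIALI LM-dm270      | 7.95 | 7.24 | 7.90 | 8.77 | 8.91 | 7.99 | 7.91 | 7.84 | 8.83 | 9.02 | 8.37 | 7.61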

Description of the questions
D1  Was your prior knowledge sufficient to understand the topics covered?
D2  Is the study load of this course proportionate to the credits assigned?
D3  Is the teaching material (indicated or provided) adequate for studying the subject?
D4  Were the exam procedures defined clearly?
D5  Are the scheduled times of the teaching activities respected?
D6  Does the teacher stimulate / motivate interest in the subject?
D7  Does the teacher explain the topics clearly?
D8  Are the supplementary teaching activities (exercises, labs, seminars, etc.) useful for learning? (If no supplementary teaching activities are planned, answer "not planned".)
D9  Was the course taught consistently with what is stated on the degree programme's website?
D10  Is the teaching staff actually available for clarifications and explanations?
D11  Are you interested in the topics of the course?
D12  Are you satisfied overall with the course?