GIANCARLO MANZI, responsible for the course
Degree in FINANCE AND ECONOMICS (MEF)  Class LM-16  Enrolled from 2017/2018 academic year  Master's degree (Laurea Magistrale)  2018/2019
Compulsory course or activity  yes 

Year of course  1st 
Term or semester  3rd term 
Scientific fields (settori scientifico-disciplinari) 

ECTS credits (CFU) compulsory  9 
ECTS credits (CFU) optional   
Aims and objectives: Course objectives are:
• To introduce students to the expanding world of big data analysis.
• To introduce students to the basic concepts, techniques and applications of computational statistics and data mining used in finance and economics.
• To develop skills in using the R software to solve practical problems.
• To develop the skills needed for independent study and research.
Language of instruction: English
Teaching methods: 75% lecture-style lessons
25% classroom teaching activities focused on examples and applications in R
Syllabus: Part I
(i) Introduction to data mining and statistical learning.
(ii) Exploratory data analysis and visualization.
(iii) Supervised vs. unsupervised methods: introduction.
(iv) Parametric vs. non-parametric methods: introduction.
(v) Quick review of maximum likelihood methods.
(vi) Multiple linear regression.
(vii) Classification methods: logistic regression, linear discriminant analysis and the K-nearest neighbors method. The Bayes classifier.
(viii) Resampling methods: cross-validation and the bootstrap.
(ix) Shrinkage methods: Ridge regression, the Lasso and other shrinkage methods.
(x) Regression splines and local regression.
(xi) Tree-based methods: random forests, bagging and boosting. Introduction to Bayesian networks.
(xii) Support vector machines.
(xiii) Unsupervised learning: PCA, clustering and multidimensional scaling methods; correspondence analysis. Principal component regression.
(xiv) Introduction to Bayesian methods in data mining.
(xv) Elementary text mining.
(xvi) Data mining in finance.
Part II
(i) Computer-intensive statistical methods: overview.
(ii) Pseudorandom number and variable generation.
(iii) Monte Carlo methods for numerical integration.
(iv) Simulation-based inference.
(v) MCMC methods: overview.
(vi) MCMC methods: Metropolis-Hastings and Gibbs sampling.
Syllabus  non-attending students: Part I
(i) Introduction to data mining and statistical learning.
(ii) Exploratory data analysis and visualization.
(iii) Supervised vs. unsupervised methods: introduction.
(iv) Parametric vs. non-parametric methods: introduction.
(v) Quick review of maximum likelihood methods.
(vi) Multiple linear regression.
(vii) Classification methods: logistic regression, linear discriminant analysis and the K-nearest neighbors method. The Bayes classifier.
(viii) Resampling methods: cross-validation and the bootstrap.
(ix) Shrinkage methods: Ridge regression, the Lasso and other shrinkage methods.
(x) Regression splines and local regression.
(xi) Tree-based methods: random forests, bagging and boosting. Introduction to Bayesian networks.
(xii) Support vector machines.
(xiii) Unsupervised learning: PCA, clustering and multidimensional scaling methods; correspondence analysis. Principal component regression.
(xiv) Introduction to Bayesian methods in data mining.
(xv) Elementary text mining.
(xvi) Data mining in finance.
Part II
(i) Computer-intensive statistical methods: overview.
(ii) Pseudorandom number and variable generation.
(iii) Monte Carlo methods for numerical integration.
(iv) Simulation-based inference.
(v) MCMC methods: overview.
(vi) MCMC methods: Metropolis-Hastings and Gibbs sampling.
The aim of this course is to provide a basic understanding of supervised and unsupervised statistical learning from data. It will help students acquire the basic methodology and the most popular tools used in applications. Data mining topics include: a review of basic likelihood theory; multiple linear and nonlinear regression and other parametric and classification methods; variable selection; logistic regression; the Bayes classifier; linear and quadratic discriminant analysis; regression shrinkage methods (Ridge, Lasso and other methods); dimension reduction (principal component analysis, multidimensional scaling and correspondence analysis); k-nearest neighbors; decision trees; support vector machines; clustering. Computational statistics topics include pseudorandom number and variate generation, Monte Carlo methods for numerical integration and basic Markov chain Monte Carlo (MCMC) methods. Students' practice will focus on the use of statistical software packages (mainly R). Applications of data mining in finance (time series clustering) and text mining will also be presented.
Readings: Main textbooks:
(i) An Introduction to Statistical Learning, with Applications in R (2013) by G. James, D. Witten, T. Hastie, R. Tibshirani, Springer.
(ii) Introducing Monte Carlo Statistical Methods with R (2010) by C.P. Robert, G. Casella, Springer.
Suggested reading for further insight into some topics in the main textbooks:
(i) The Elements of Statistical Learning, 2nd edition (2009) by T. Hastie, R. Tibshirani, J. Friedman, Springer.
(ii) Machine Learning: A Probabilistic Perspective (2012) by K.P. Murphy, The MIT Press.
(iii) Monte Carlo Statistical Methods (2004) by C.P. Robert, G. Casella, Springer.
Further reading will be suggested during the course.
Readings  non-attending students: Main textbooks:
(i) An Introduction to Statistical Learning, with Applications in R (2013) by G. James, D. Witten, T. Hastie, R. Tibshirani, Springer.
(ii) Introducing Monte Carlo Statistical Methods with R (2010) by C.P. Robert, G. Casella, Springer.
Suggested reading for further insight into some topics in the main textbooks:
(i) The Elements of Statistical Learning, 2nd edition (2009) by T. Hastie, R. Tibshirani, J. Friedman, Springer.
(ii) Machine Learning: A Probabilistic Perspective (2012) by K.P. Murphy, The MIT Press.
(iii) Monte Carlo Statistical Methods (2004) by C.P. Robert, G. Casella, Springer.
Further reading will be suggested during the course.
Exam  single 

Type of assessment  Exam 
Assessment  grade recorded on a 30-point scale 
Prerequisites, exams and assessment: A basic knowledge of the fundamentals of statistics and probability is required. Familiarity with basic regression methods is useful for moving more quickly through the first part of the course.
Matrix algebra and multivariate calculus are beneficial but not strictly required.
Basic knowledge of R and some programming skills are also useful but not required.
Evaluation is by oral examination on both theoretical topics and possible applications. Homework and assignments will be given during the course.
Prerequisites, exams and assessment  non-attending students: A basic knowledge of the fundamentals of statistics and probability is required. Familiarity with basic regression methods is useful for moving more quickly through the first part of the course.
Matrix algebra and multivariate calculus are beneficial but not strictly required.
Basic knowledge of R and some programming skills are also useful but not required.
Evaluation is by oral examination on both theoretical topics and possible applications. Homework and assignments will be given during the course.
Propaedeutic courses: There are no mandatory prerequisite courses, but a good knowledge of basic statistics and mathematics is welcome.
Scientific fields
Lectures: 60 hours
Teacher  Office location  

GIANCARLO MANZI, responsible for the course  Wednesday 16.30-19.30. After the Christmas break, office hours will resume as usual from Wednesday 16 January 2019  DEMM room 37  3rd floor. 