Introduction
This course provides an overview of statistical inference modelling for computer science applications. In particular, it describes modern methods for making decisions about data. The first part of the course introduces the programming languages Octave and R through practical examples of statistical data modelling. The second part studies the fundamental ideas of Bayesian inference and maximum likelihood, together with the major algorithms that are widely used in machine learning.
Requirements
This subject must be taken jointly with the subject Information Code in order to complete the optional field on Information Theory.
Learning outcomes
- Know how to statistically describe data using R or GNU Octave.
- Be familiar with concepts in modern machine learning research.
- Have experience applying the major algorithms to simple examples.
Teachers:
- David Olivieri
- Silvana Gómez Meire
Invited Speakers:
- Mª Isabel Seruca Leal
Course Contents
- Programming Tools
  - An introduction to GNU Octave/Matlab for modelling data
  - Programming scripts in Octave/Matlab
  - Statistics of data in Octave/Matlab
  - Intro to R
  - Programming in R and representing data
  - Packages in R for modelling data
- Statistical Descriptions (R/Octave/Matlab)
  - Probability distributions and representations
  - Moments of distributions
  - Information theoretic properties of a distribution
  - Regression, least squares, nonlinear least squares
  - Multivariate analysis
- Building Models for Data
  - Building models of data
  - Concept of Maximum Likelihood
  - PCA
  - Markov Chain Monte Carlo (MCMC)
  - Gaussian Processes
  - Clustering and classification issues
- Inference and Classification
  - EM algorithm
  - Bayesian Inference Theory
  - Gaussian Mixture Models
  - Viterbi Decoding
  - Markov Processes and Hidden Markov Models
  - Hierarchical clustering
  - Support Vector Machines
  - Neural Networks
  - Bayesian Networks
- Learning, Emergence, and Self-Organization
  - Thinking about information complexity
  - Ant colonies and collective intelligence
  - Self-organization vs. entropy
  - Learning in Cybernetics
Course Activities
- Classroom Lessons (2 ECTS): These consist mainly of lectures and readings that introduce the work to be done in the other activities.
- Experiments and Practices (1 ECTS): These consist of developing small examples and exercises under the supervision of the teachers.
- Seminars (2 ECTS): These consist mainly of the presentation of a specific topic by students in small groups.
- Conferences (0.5 ECTS): Students write an original scientific article explaining a practical application of information theory and present it orally.
- Tutorials (0.5 ECTS): These consist of solving problems proposed by the teachers and following up on the seminars and conferences.
Course Assessment
Evaluation Procedure for All Students
Problem sets will be assigned each week, and students will be required to submit solutions within an allotted time period. The homework problem sets will cover the material discussed in class and are designed to give students practical experience with the concepts.
The evaluation of the course is based on how successfully the homework problems are solved. In addition to solving the assigned problems, part of the grade for each homework set will be allotted to the presentation of the results and to a related problem, with its solution, that each student may propose. There will be a penalty for late submissions. The grading system is the following:
- Successful completion: 6 points max.
- Presentation of solutions: 1 point
- Proposed original problem: 3 points max.
- Late submission: -1 point per week
At the end of each subcourse, students will be selected to defend their proposed problems or discussions. A successful defense will count as a complete homework set.
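As an illustration only, the grading scheme listed above could be computed as in the following Python sketch. The function name `homework_score` and its parameter names are hypothetical. The sketch also assumes that lateness is counted in whole weeks and that a score cannot drop below zero; the syllabus states neither.

```python
def homework_score(completion, presentation, original_problem, weeks_late=0):
    """Score one homework set out of 10 using the component maxima listed above."""
    # Cap each component at its stated maximum: 6, 1 and 3 points.
    completion = min(max(completion, 0.0), 6.0)
    presentation = min(max(presentation, 0.0), 1.0)
    original_problem = min(max(original_problem, 0.0), 3.0)

    # Late penalty: 1 point per week (assumption: whole weeks only).
    penalty = 1.0 * weeks_late

    # Assumption: the total is floored at zero rather than going negative.
    return max(completion + presentation + original_problem - penalty, 0.0)


# A set with 5 completion points, full presentation marks and a 2-point
# original problem, handed in two weeks late.
print(homework_score(5, 1, 2, weeks_late=2))  # 6.0
```

Under these assumptions, a perfect on-time submission scores 6 + 1 + 3 = 10 points.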


