"If you can’t explain it simply, you don’t understand it well enough"
Albert Einstein


This is a blog with experiences, experiments and insights in the world of Data Science.

These are really a series of personal notes and projects, but I have tried to structure them so that anyone can read them.

Articles and insights





All projects


CardinalityKit

Advanced Cardinality Estimation for Privacy-Preserving Analytics


Scomp Link

The Astromech Arm for Your Python Projects

May the code be with you


Easy tagging and automatic tagging

The goal of this project is to make tagging easier. I decided to create a Python package to speed up image and text tagging.


Getting emails from Yellow Pages

This article focuses on web scraping with two programming languages (R and Python) to try to extract emails from the Yellow Pages website.


Recycling, a statistical classification problem

Using Machine Learning, Deep Learning and Ensemble Learning techniques, we want to build a bin that combines image recognition with readings from selected sensors (for example ultrasonic response, photoresistors and weight) to decide whether a discarded object belongs with paper, plastic or another category.


Analysis of risk factors for an injury

This study analyzes which risk factors most influence the healing time of an injury, and which factors may prevent the injury from healing at all.


Film history analysis

This project carries out an exploratory analysis of the film industry to identify interesting features of the field.


Predicting the result of a football match

After analyzing the results of the last 25 years of football competitions of all teams from the 5 main European leagues, I decided to create a Bayesian model that could predict the result of a football match and then, using Machine Learning techniques, identify some matches that could be predicted with minimal error.


Pseudo-random number generation

Basic methods of random and pseudo-random number generation.


Evolution of statistics and data

From classical statistics to Machine Learning: evolution of data.


Time Series

Stationarity and non-stationarity of a time series
Multivariate stochastic processes: VARMA and VAR models
Cointegration


Monte Carlo Method

Monte Carlo Method with variance reduction techniques: control variates and antithetic variates.


Data Science?

What is data science?


Ensemble Learning

Stacked models and ensemble learning techniques.


Fisher's Exact Test

A non-parametric hypothesis test for the association between two dichotomous nominal variables.


Network Analysis

Network analysis in R
Network analysis in Python


Forecasting Theory

Fundamentals of forecasting theory.


Convolutional Neural Network

Convolutional neural networks: definition and elements.


Correlation vs Causation

The difference between correlation and causation.


Market Analysis and Marketing

An outline of the steps in market research.


Large Scale Testing

The multiple testing problem.


UCM Components

Trend, Cycle, Seasonality and Noise components of Unobserved Components Models.


Boosting

Combining weak classifiers to build a stronger one with boosting techniques.


Neural Network and Multi-Layer Perceptron (MLP)

Introduction to neural networks and the Multi-Layer Perceptron architecture.


Regression Splines

Non-parametric regression technique for modeling non-linearities and interactions.


Record Linkage

Record Linkage and Knowledge Discovery.


Bayesian Statistics

Introduction to the Bayesian approach
Prior selection
Hyperparameter selection methods
Predictive inference with the Bayesian approach
Posterior synthesis


UCM Models in State Space Form

1. Introduction to State Space models
2. Introduction to UCM models in State Space form with the KFAS library
3. State Space models estimation of unknown parameters
4. Application of State Space models estimation and auxiliary residuals
5. Forecasting, filtering and smoothing


The Steps of an Analysis

Summary of the steps of an analysis, from data to prediction.


Psychometrics: Myers Briggs Personality Test

Psychometrics and the Myers Briggs personality test as used in market analysis.


From Spatial Statistics to Geostatistics

0. Introduction to geostatistical data
1. Introduction to spatial statistics
2. Spatial data in R
3. Spatial Point Processes: The Poisson process
4. Spatial Point Processes: The test for CSR
5. Spatial Point Processes: Estimation of the intensity function
6. Introduction to Geostatistics, large and small scale variability
7. Geostatistical Model
8. Exploratory analysis EDA and ESDA
9. Spatial prediction and kriging in R


Bee Viva competitions

Ames House Price
OK Cupid
Bike sharing