Browsed by
Tag: statistics

Causal Inference cheat sheet for data scientists

Being able to make causal claims is a key source of business value for any data science team, no matter its size. Quick analytics (in other words, descriptive statistics) are the bread and butter of any good data analyst working on short cycles with their product team to understand their users. But sometimes important questions arise that need more precise answers. Business value sometimes means distinguishing true insights from incidental noise. Insights that will hold up versus temporary marketing…

Is this swimming pool well rated?

I've gotten into the (bad?) habit of using Google Maps and its rating system (each user can give a rating from one to five stars) to decide where to go: restaurants, tourist attractions, etc. Recently, I moved and looked into the nearby swimming pools, only to find that their ratings hovered around 3 stars. It then occurred to me that I didn't know whether, for a swimming pool, this was a good or…

Riddler and Voter Power Index

Oliver Roeder has a nice puzzle column: the Riddler. Just like last week, this week's puzzle has an interesting application to the US election, and I really enjoyed it, so I figured I might just write a blog post 🙂 In this article, we'll solve this week's Riddler in two different ways (just because :p) and discuss an indicator used in FiveThirtyEight's election prediction model: the Voter Power Index.

Exact solution and Stirling approximation

I won't restate the problem and notation,…

Data analysis of the French football league players with R and FactoMineR

This year we've had a great summer for sporting events! Now autumn is back, and with it the Ligue 1 championship. Last year, we created this data analysis tutorial using R and the excellent package FactoMineR for a course at ENSAE (in French). The dataset contains the physical and technical abilities of French Ligue 1 and Ligue 2 players. The goal of the tutorial is to determine with our data analysis which position is best for Mathieu Valbuena 🙂

The dataset

A small precision…

[Sampling] Talk at ISNPS – Avignon

I'm in the beautiful city of Avignon for the 3rd ISNPS conference, which is being held in the extraordinary Palace of the Popes convention center. I've been invited by Ricardo Cao to give a talk on Wednesday morning on sampling methods for big graphs.

Sampling methods for graphs from Antoine Rebecq