Posts by Year

AI in Society: 2. Algorithmic Fairness

6 minute read

Data is to machine learning as fuel is to cars. None of the AI superpowers can be gained without abundant data. But it is not only the quantity of the data t...

Tackling the Cold Start Problem in Recommender Systems

9 minute read

As part of my machine learning internship at Wish, I’m tackling a common problem in recommender systems called the “cold start problem”. Cold start happens w...

AI in Society: 1. Interpretability

7 minute read

Can we open the black box of all the complicated models out there?

AI in Society: 0. Limitations of Machine Learning Today

2 minute read

Introduction to the AI in Society blog series. The impact of AI today is already enormous, but there are still many obstacles to overcome.

Robust PCA

3 minute read

PCA is great because you can reduce a data matrix to a lower dimension without losing much. Although it is widely used, PCA doesn’t work well when there are ...

SVD -> PCA

2 minute read

I wrote a blog post about Robust PCA. As a prerequisite for readers, I will explain what SVD and PCA are. As we shall see, PCA is essentially SVD, and learnin...

EM, Mixture of Gaussians and K-means

7 minute read

This post ties together EM (Expectation Maximization), GMM (Gaussian Mixture Models), K-means and variational inference. If you have taken an introductory ma...

Multivariate Normal Cheatsheet

1 minute read

Multivariate normal (MVN) is used everywhere in machine learning, from simple regressions and linear discriminant analysis to Kalman filters and Gaussian processes...

What regression coefficients really mean

4 minute read

Nothing in statistics is as easy to use as regression, yet as hard to interpret correctly. Especially in causal inference, even the c...

Formal definition of an experiment

4 minute read

Science experiments, social experiments, thought experiments, … We use the word “experiment” somewhat often in real life. But have you ever wondered what exp...

The Theory behind AB Testing: Introduction to Causal Inference

6 minute read

If you’ve been working in the tech industry, or have thought about doing so, you’ve probably heard of AB testing. Some of you have even conducted one. The id...

Is Bell Curve really a Great Intellectual Fraud?

6 minute read

Bell Curve = Great Intellectual Fraud? I recently read a New York Times best-seller titled “The Black Swan” by Nassim Nicholas Taleb. The book discusses how hard it is ...

Must-know probability distributions from a single coin toss

6 minute read

There are countless probability distributions. Some of them are so widely used and beautiful that they deserve a name. Surprisingly, all of those ...

What is probability?

3 minute read

We come across probability not just in statistics classrooms but also in real life. But have you thought about what probability really means? I would like t...

20 questions to detect fake data scientists

7 minute read

I found an interesting blog post recently, titled: 20 Questions to Detect Fake Data Scientists. I could answer some of them, but not all. So I decided to ans...

Kojin Oshiba

2018

AI in Society: 2. Algorithmic Fairness

Tackling the Cold Start Problem in Recommender Systems

AI in Society: 1. Interpretability

AI in Society: 0. Limitations of Machine Learning Today

Robust PCA

SVD -> PCA

2017

EM, Mixture of Gaussians and K-means

Multivariate Normal Cheatsheet

What regression coefficients really mean

Formal definition of an experiment

The Theory behind AB Testing: Introduction to Causal Inference

Is Bell Curve really a Great Intellectual Fraud?

Must-know probability distributions from a single coin toss

What is probability?

20 questions to detect fake data scientists