
A random walk through HMM (2) - structure and inference

Introduction I have found that most HMM tutorials are presented in a top-down manner, where a lot of mathematical notation is thrown at learners without a concrete example. This can make the learning process very frustrating, at least for me. In this post, I will try to demonstrate the core idea of HMM, as well as a commonly used inference algorithm, using a toy example. The example is from Dr. Xiaole Liu’s YouTube channel, and I highly recommend checking out her video if you want to develop intuition for HMM rather than get killed by notation.

A random walk through HMM (1) - a background

Introduction This series of blog posts discusses the Hidden Markov Model (HMM), which is widely applied in fields including natural language processing, population genetics, and finance. Besides its usefulness, I find HMM particularly interesting to learn, since it connects multiple disciplines such as probability, linear algebra, machine learning, and computer science. In this post, I will introduce the Markov model, which serves as the backbone of the HMM.

An Intuitive Explanation of Bayesian Network

Introduction The Bayesian network, a probabilistic model that represents the causal relationships between variables, has gained popularity in various fields. In biology, for example, people have started to use this model to infer genetic regulatory networks (GRNs) because of its nice property of being directional. The aim of this blog post is to provide a gentle and less mathematical introduction to Bayesian networks.  An example Suppose we are going to take a math exam next week.

Model the Gene Expression (2): Likelihood Ratio Test

In the last post, we used a GLM framework to model gene expression. $$ y \sim NB(\mu, r) \\ \log(\mu) = b_1 x + b_0 $$ Using maximum likelihood estimation, we were able to find a set of parameters $\hat b_0, \hat b_1, \hat r$ that maximize the likelihood function. But if you send this model (the estimated parameters) to biologists, they won’t be happy. And we all know what is lacking: the p-value!

Model the Gene Expression (1): A GLM framework

Before you start reading this post, please familiarize yourself with MLE and linear models.  Background In transcriptomic research, we often want to determine whether genes are up-regulated or down-regulated under a particular perturbation. For example, suppose we have a medication that may cure type 2 diabetes. In our experiment, 6 patients are split into two groups, with 3 patients taking the medication and 3 patients taking the placebo. The patients' blood samples are then collected to measure the transcriptomic profile (mRNA abundance level for each gene) using NGS technology (RNA-seq).

Calculate SVD by hand (and decompose Spongebob)

In my previous post, I manually implemented PCA by finding the eigenvectors and eigenvalues of a covariance matrix. Today, let’s try to perform PCA using a different approach called Singular Value Decomposition. Then we are going to decompose SPONGEBOB! Note: you might find this post useful if you are new to PCA.  Algorithm Again, we are going to use the same dataset we used before.