Probability Theory and Statistics

Some of the fundamental statistics and probability theory topics needed for machine learning are:

  • Probability Rules & Axioms
  • Bayes’ Theorem
  • Prior and Posterior
  • Random Variables
  • Variance and Expectation
  • Combinatorics
  • Probability distributions
  • Conditional and Joint Distributions
  • Standard Distributions (Bernoulli, Binomial, Multinomial, Uniform and Gaussian)
  • Moment Generating Functions
  • Maximum Likelihood Estimation (MLE) and Maximum a Posteriori (MAP) Estimation
  • Sampling Methods
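
To make the MLE and MAP items above concrete, here is a minimal sketch for estimating the success probability of a Bernoulli variable (a coin flip). The function names and the Beta(2, 2) prior are illustrative assumptions, not something the list prescribes; the MAP formula used is the mode of the Beta posterior, which is valid when both prior parameters are greater than 1.

```python
def bernoulli_mle(successes: int, trials: int) -> float:
    """Maximum likelihood estimate of p: the observed success fraction."""
    return successes / trials


def bernoulli_map(successes: int, trials: int,
                  alpha: float = 2.0, beta: float = 2.0) -> float:
    """MAP estimate of p under an assumed Beta(alpha, beta) prior.

    The posterior is Beta(alpha + successes, beta + failures); its mode
    is returned. Assumes alpha > 1 and beta > 1 so the mode is interior.
    """
    return (successes + alpha - 1) / (trials + alpha + beta - 2)


heads, flips = 7, 10
print(bernoulli_mle(heads, flips))  # 0.7
print(bernoulli_map(heads, flips))  # 8 / 12, pulled toward the prior mean 0.5
```

With a flat Beta(1, 1) prior the two estimates coincide, which is one way to see that MAP is MLE plus prior information.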

Linear Algebra

Linear algebra is a cornerstone because data, model parameters and most computations in machine learning are represented as vectors or matrices.

  • Dot products
  • Distance
  • Rank and inversion
  • Eigenvalues and eigenvectors
  • Symmetric Matrices
  • Orthogonalization & Orthonormalization
  • Matrix factorizations (SVD, Eigendecomposition, LU Decomposition, QR Decomposition/Factorization, Non-negative Matrix Factorization), several of which underpin dimensionality reduction of the feature space, as in Principal Component Analysis (PCA) and Latent Semantic Analysis
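
As a small illustration of dot products, symmetric matrices and eigenvectors from the list above, the following sketch finds the dominant eigenvalue of a symmetric matrix by power iteration. It uses only the standard library to stay self-contained; in practice a library such as NumPy would normally be used. The helper names, the example matrix and the starting vector are assumptions made for illustration, and the method assumes the largest-magnitude eigenvalue is unique and the starting vector is not orthogonal to its eigenvector.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [dot(row, v) for row in A]


def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))


def power_iteration(A, iters=200):
    """Approximate the dominant eigenvalue and unit eigenvector of A."""
    v = [1.0] + [0.0] * (len(A) - 1)  # assumed starting vector
    for _ in range(iters):
        w = matvec(A, v)
        length = norm(w)
        v = [x / length for x in w]
    eigenvalue = dot(v, matvec(A, v))  # Rayleigh quotient for unit v
    return eigenvalue, v


A = [[2.0, 1.0],
     [1.0, 2.0]]  # symmetric, with eigenvalues 3 and 1
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue, eigenvector)
```

For this matrix the result approaches eigenvalue 3 with an eigenvector proportional to (1, 1).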

Multivariate Calculus

Calculus is also important for understanding learning algorithms and the optimization process, in particular how the gradient of the cost function and the learning rate are used to reduce the error at each iteration.

  • Differentiation matters because of gradient descent, which is used almost everywhere. It has even found its way into tree-based models in the form of gradient boosting, which is gradient descent in function space.
  • Integral Calculus
  • Partial Derivatives
  • Vector-Valued Functions
  • Directional Derivatives and Gradients
  • Hessian
  • Jacobian
  • Laplacian and Lagrangian
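
The role of partial derivatives and the learning rate in gradient descent can be sketched as follows. This is a minimal example under stated assumptions: the quadratic cost function, the step count and the learning rate are made up for illustration, and the partial derivatives are approximated with central differences rather than derived analytically.

```python
def numerical_gradient(f, x, h=1e-6):
    """Approximate each partial derivative of f at x by central differences."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad


def gradient_descent(f, x0, learning_rate=0.1, steps=500):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = list(x0)
    for _ in range(steps):
        g = numerical_gradient(f, x)
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
    return x


def cost(x):
    """Hypothetical cost function with its minimum at (3, -1)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


minimum = gradient_descent(cost, [0.0, 0.0])
print(minimum)  # close to [3.0, -1.0]
```

A learning rate that is too large makes the iterates overshoot and diverge, while one that is too small makes convergence slow, which is why the step size matters as much as the gradient itself.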

Algorithms and Complex Optimizations

This is important for understanding the computational efficiency and scalability of machine learning algorithms and for exploiting sparsity in datasets. Knowledge of data structures (binary trees, hashing, heaps, stacks, etc.), dynamic programming, randomized and sublinear algorithms, graphs, gradient and stochastic gradient descent, and primal-dual methods is needed.
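
As one illustration of exploiting sparsity with a hash-based data structure, the sketch below stores only the nonzero entries of a vector in a dictionary and computes a dot product by visiting the smaller set of nonzeros. The function names and example vectors are assumptions for illustration only.

```python
def to_sparse(dense):
    """Keep only nonzero entries, keyed by their index."""
    return {i: v for i, v in enumerate(dense) if v != 0}


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as index-to-value dicts.

    Iterates over the vector with fewer nonzeros, so the cost depends on
    the number of stored entries rather than the full dimension.
    """
    if len(a) > len(b):
        a, b = b, a
    return sum(v * b[i] for i, v in a.items() if i in b)


u = to_sparse([0, 0, 3, 0, 0, 0, 2, 0])
v = to_sparse([1, 0, 4, 0, 0, 0, 0, 5])
print(sparse_dot(u, v))  # 12, since only index 2 is nonzero in both
```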


I highly recommend learning statistics with a heavy focus on coding up examples, preferably in Python or R.
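
A coded-up statistics example might look like the following sketch, which simulates rolls of a fair six-sided die and compares the sample mean and variance with the theoretical values 3.5 and 35/12. The function name, sample size and fixed seed are arbitrary choices made for illustration.

```python
import random
import statistics


def simulate_die_rolls(n, seed=0):
    """Return n simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.randint(1, 6) for _ in range(n)]


rolls = simulate_die_rolls(20_000)
sample_mean = statistics.fmean(rolls)
sample_variance = statistics.pvariance(rolls)

print(sample_mean)      # should be near the expectation 3.5
print(sample_variance)  # should be near the variance 35 / 12
```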


Other math topics not covered in the four major areas described above include Real and Complex Analysis (Sets and Sequences, Topology, Metric Spaces, Single-Valued and Continuous Functions, Limits), Information Theory (Entropy, Information Gain), Function Spaces and Manifolds.
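
The two information theory terms mentioned above, entropy and information gain, can be sketched in a few lines. The function names and the toy labels are assumptions for illustration; entropy is measured in bits (log base 2), and information gain is the drop in entropy after splitting a set of labels into groups.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    probabilities = [count / n for count in Counter(labels).values()]
    return sum(-p * math.log2(p) for p in probabilities)


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


parent = ["yes"] * 4 + ["no"] * 4
perfect_split = [["yes"] * 4, ["no"] * 4]

print(entropy(parent))                          # 1.0
print(information_gain(parent, perfect_split))  # 1.0
```

A split that separates the classes perfectly removes all uncertainty, so its gain equals the parent entropy; a split that leaves each group with the same class mix as the parent has zero gain.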

