Introduction to the Kalman Filter, part I

Dealing with the real world is rough. (With an opening like that, you can probably guess how my junior high days went.) To be clear here, we’re talking about robots having to deal with the real world (not me, luckily). It’s hard enough for our little silicon brethren to have limited sensory capabilities, but on top of that, the data that’s coming in is usually noisy. For example, sonar range data can be inconsistent, throwing off distance measurements by many centimeters. Even laser data isn’t perfect; if you watch the range data coming in from stationary laser sensors, you’ll notice the range data shake indecisively around the actual range. My oh my, what’s one to do? Well, you could shell out a little more money for ultra-sensitive sensors, but even they have their limitations and are subject to error. To add insult to injury, our robot’s misjudgment isn’t limited to data coming in from sensors; it’s also prone to misjudging what it’s done from a control perspective. For example, tell your robot to move forward 1 meter and it’ll comply as best it can; but due to various surface textures (road vs. dirt vs. carpet vs. sand), the robot may report that it’s traveled 1 meter when in fact it’s gone a bit further or a bit less. What we’d like to do is to filter out the chaff from the wheat…or the noise from the useful data in this case. Likely the most widely used, researched, and proven filtering technique is the Kalman filter. (Read: this is an important concept to understand!)

In short, if not a bit curtly, the Kalman filter is a recursive data processing algorithm which processes all available measurements, regardless of their precision, to estimate the current value of variables of interest, such as a robot’s position and orientation. There is a plethora of literature written concerning the Kalman filter, some useful and some otherwise; accordingly, to help you better understand the Kalman filter, I’ll guide you through a series of readings which I’ve found pivotally assistive in understanding this technique and discuss key points from those references.

The first step on our journey into the Kalman filter rabbit hole is Peter Maybeck’s Stochastic Models, Estimation, and Control, Vol. 1, Chapter 1 [1]. There’s no writing that I’ve found that gives a more tractable and approachable overview of the concepts behind the Kalman filter. If you’re following along at home (singing along with the bouncing dot), please take the time to read Maybeck’s introduction and take the time to understand what you’re reading…I’ll wait patiently for your return. OK, I’ve gotten impatient and am now moving on.

So what did we just learn? Below are a few points to emphasize key ideas from Maybeck’s introduction…

  • A Kalman filter can only be applied to a system exhibiting three attributes:
    1. It must be a linear model. (Interestingly, if we relax this condition to accommodate non-linear models, we must use a technique known as an Extended Kalman filter which will be a subject of a successive post; this is particularly important when – e.g. – a robot starts taking a curved path, thus becoming a non-linear problem.)
    2. The noise within the model must be white noise.
    3. The noise within the model must be Gaussian (normal bell-shaped curve having a mean and standard deviation).
  • The Kalman filter is built upon the ideas encapsulated by Bayes filter. (For a great introduction to Bayes filter, see chapter 2.4 of Sebastian Thrun’s Probabilistic Robotics [2].) The Kalman filter is an optimal implementation of Bayes filter as its result is the mean and the mode of the Gaussian result, it is the maximum likelihood estimate, and it is the linear estimate whose variance is less than that of any other linear unbiased estimate; in other words, whichever angle you look at it, the result is just about as good as it gets.
  • There are two kinds of noise which needs to be considered: process (or control) noise, which results from imprecise movement and manipulation control, and measurement noise which results from imprecise sensory data, such as sonar and laser data. When calculating an estimate, the process and process noise is considered first, with the measurement and measurement noise then being applied next.
  • The “white noise” of the system being analyzed need not be constant; a “shaping filter” can be used to describe how the noise changes over time assuming that the amplitude of the noise can be described at any point in time as a Gaussian distribution.
  • In reading Maybeck’s introduction, you may have been surprised to see as described in equation 1-1 of the introduction. If you skip to equation 1-16, you’ll be reminded why this is the case. As the variance of the process/control noise approaches infinity and the variance of the process prediction approaches infinity, the gain approaches 1. Accordingly, no “weight” is put on the control input while all trust is put on the measurement result itself. Now, going back to equation 1-1, since control isn’t being considered at all, it is effectively removed from the equation, similarly to as if it had an infinite error variance.

Algorithmically, in our one-dimensional context, the Kalman filter takes the following steps to estimate the variable of interest:

  1. Calculate the predicted state based on process (control) input (1-11),
  2. Calculate the error variance of the predicted state (1-12),
  3. Calculate the gain which will be used to “correct” the predicted state when the measurement data is applied (1-15),
  4. Re-calculate the predicted state, taking into account measurement data and the gain (1-13), and
  5. Re-calculate the error variance taking into account the improvement that the measurement data has added to the prediction (1-14).

Nice, right? The Kalman filter takes into account the previous state estimation, and the confidence of that estimation, to then determine the current estimate based on control input and measurement output, all the while tweaking itself – with the gain – along the way. And since it only needs to maintain the last estimate and confidence of the estimate, it takes up very little memory to implement. While very approachable, Maybeck’s introduction to the Kalman filter is greatly simplified…for that very reason. For example, Maybeck’s introduction assumes movement on a one-dimensional plane (an x value) with one-dimensional control input (the velocity). In more realistic contexts, we store the position and orientation of a robot as a vector and the control and measurement data as vectors as well. In the next post, we’ll examine what modifications need to be made to Maybeck’s simplification to apply the Kalman filter to more real world scenarios. This is where we must get our old college textbooks out to realize that we should have paid more attention in our matrices class.

Billy McCafferty


[1] Maybeck, P. 1979. Stochastic Models, Estimation, and Control, Vol. 1.

[2] Thrun, S., Burgard, W., Fox, D. 2005. Probabilistic Robotics.

About the Author

Leave a Reply

Your email address will not be published. Required fields are marked *

You may also like these