As discussed in Part I, the Kalman filter is an optimal, prediction/correction estimation technique which minimizes estimation error and can be used in a wide array of problems. While Part I focused on a one-dimensional problem space to more easily convey the underlying concepts of the Kalman filter, Part II will now expand the discussion to higher dimensions (e.g., 2D and 3D) while still being constrained to linear problems.
What’s so badass about the Kalman filter, you ask? Let’s highlight a few areas where the Kalman filter may provide value. (This should help you remain motivated while you’re delving into a myriad of Greek symbols and matrix transformations!) The Kalman filter can help:
- Determine the true(ish) pose of a mobile robot given its control input and measurements;
- Provide sensor fusion capabilities by further correcting an estimate with each subsequent measurement input;
- Track an object within a video (e.g., face tracking); and
- Determine the attitude of a satellite using measurement of star locations.
To limit the scope of discussion, this tutorial will focus on determining the true pose of a mobile robot given noisy control inputs and measurement data.
In examining the Kalman filter a bit more, we’ll discuss:
- Applicable systems for use of the Kalman filter
- Kalman filter algorithm, inputs and outputs
- Kalman filter algorithm formalized
- Limitations of the Kalman filter
- Direction for further study
The current discussion will avoid the derivations of the associated, base equations in favor of pragmatic use of the filter itself. For a comprehensive derivation and discussion of the involved equations and mathematical roots, see Robert Stengel’s Optimal Control and Estimation.
Applicable Systems for Use of the Kalman filter
In Part I, it was discussed that a system must adhere to three constraints for Kalman filter applicability:
- It must be describable by a linear model,
- The noise within the model must be white, and
- The noise within the model must be Gaussian. More precisely, the state must be estimable via:
xk = Axk-1 + Buk + wk-1
To better understand this, the state estimation equation is broken down as follows:
- xk is the estimate of the state after taking into consideration the previous state, the control input, and process noise.
- xk-1 is the state estimate from the previous time iteration (k-1) where k is the current time iteration. The state will likely be a vector; e.g., (x y θ)T for a mobile robot on a 2D plane having a location (x, y) and orientation θ.
- uk is the control input for the current time iteration. While being sensor data in the strict sense, odometry measurements may be considered control input; and would be a good fit for our mobile robot example. Accordingly, control input could be represented as a three dimensional vector containing an initial rotation (in radians), a translation distance (distance travelled in a straight line), and a second rotation (δrot1 δtrans δrot2)T.
- wk-1 is the process noise, or the amount of error inherent in the state estimation when taking into consideration noisy control input. (The process noise is Gaussian, having mean μ of 0 and covariance Σ.) The process noise is a vector having the same dimension n as the state vector.
- A is a matrix which transforms the previous state into the current state without regard to any control input. For example, if the system being modeled were a satellite orbiting the earth, A would be a matrix which would modify the state to reflect the orbital distance traveled between time iterations k and (k – 1). In more earth-bound scenarios, A would be an identity matrix. Regardless, A is a square matrix of size (nxn) where n is the dimension of the state vector xk.
- B is a matrix which transforms the control input to be compatible for summing to the previous state to reflect the current state. Within our mobile robot context, the control input is (δrot1 δtrans δrot2)T which cannot simply be added to the previous state to determine the current state; accordingly, B must transform uk into a vector reflecting the relative state change induced by the control input. For example, if the control input were to move the robot 55 mm on the x axis, 120 mm on the y axis, and 0.8 rad, then the result of Buk would be (55mm 120mm 0.8 rad) which could then be easily summed to the previous state to get the current state. B is a matrix of size (nxl) where n is the dimension of the state vector and l is the dimension of the control vector.
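To make the pieces above concrete, here is a minimal sketch of the process equation for the robot example. The numbers are invented, and for simplicity it assumes the product Buk has already been computed as a state-space displacement:

```python
import numpy as np

# Process equation x_k = A x_{k-1} + B u_k (noise term omitted for clarity).
# State vector: (x, y, theta).  A is the identity because our earth-bound
# robot does not change state on its own between time steps.
A = np.eye(3)

# Assume (for illustration) that B has already transformed the odometry
# input (d_rot1, d_trans, d_rot2) into a state-space displacement; here we
# hard-code the example from the text: +55 mm in x, +120 mm in y, +0.8 rad.
Bu = np.array([55.0, 120.0, 0.8])

x_prev = np.array([1000.0, 2000.0, 0.0])  # previous state estimate (x, y, theta)
x_curr = A @ x_prev + Bu                  # noise-free prediction of the current state
```

Note that converting (δrot1 δtrans δrot2)T into this displacement depends on the robot’s heading, which is exactly where the linearity assumption begins to strain.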
In addition to adhering to the linear state estimation just discussed, in order to be Kalman-filter-compatible, the system being modeled must also have a measurement model estimable via:
zk = Hxk + vk
The measurement equation is broken down as follows:
- zk is the predicted measurement model after taking into account the estimated state and known measurement noise. Don’t take that at face value; ask yourself why it’s important to be able to predict what the measurement model will be for a given state. Spoiler alert… If we’re able to predict what the measurement model should be for a given state, then we can compare the predicted measurement model against the actual measurement data returned by our sensors. The difference between the two will be used for improving the state estimation within the Kalman filter algorithm. For our little robot, the measurement model may be a vector of laser scans (s1 s2 … sn)T, with each scan having a range and orientation (x θ)T.
- xk is the result of the state estimation equation discussed above.
- vk is the measurement noise, or the amount of error inherent in the measurement estimation when taking into consideration noisy sensor input. (The measurement noise is Gaussian, having mean μ of 0 and covariance Σ.) The measurement noise is a vector having the same dimension m as the resulting measurement vector.
- H is a matrix which transforms the state xk into the predicted measurement model. In other words, given the state xk, Hxk calculates what the measurement model should look like if there were no uncertainty involved. H is of size (mxn) where m is the dimension of the measurement vector zk and n is the dimension of the state vector.
And to answer your question, yes, H can be quite onerous to implement. In fact, H may actually be implemented as a function, accepting a state and returning a measurement model based on the known map, estimated location and orientation; the Extended Kalman filter or other extension would need to be leveraged if we digressed in this way from our “simple” linear model.
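As a concrete (and deliberately simple) sketch, suppose a hypothetical sensor reports the robot’s (x, y) position directly; this keeps H a constant matrix rather than a function:

```python
import numpy as np

# Measurement equation z_k = H x_k (noise term omitted for clarity).
# Hypothetical sensor: reports (x, y) directly, so H simply drops theta.
# H is (m x n) = (2 x 3): m = measurement dimension, n = state dimension.
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

x_k = np.array([1055.0, 2120.0, 0.8])  # current state estimate (x, y, theta)
z_pred = H @ x_k                       # predicted measurement for this state

# The filter will later compare z_pred against the actual sensor reading;
# the difference (the innovation) drives the correction of the state.
```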
In summary, if the system can be modeled by the process and measurement equations described above, then the Kalman filter may be used on the system to estimate state, when given control and measurement inputs. Let’s now look at the general Kalman filter algorithm, at a very high level, including specific inputs and outputs.
Kalman Filter Algorithm, Inputs and Outputs
The Kalman filter algorithm is surprisingly straightforward and breaks down into two phases. The first phase is the time update (or prediction), in which the previous state and control input are used to estimate the current state and estimate covariance. The second phase is the measurement update (or correction), in which the Kalman gain is calculated and the state estimate and covariance are improved using measurement data and the Kalman gain. Roughly, the algorithm is as follows:
- Estimate the predicted state based on process (control) input. This estimate is the a priori estimate.
- Calculate the state estimate covariance (our confidence in the state estimate).
- Calculate the Kalman gain which will be used for weighting the amount of correction the measurement data will have on the state estimate.
- Refine the estimated state using measurement input. This refined estimate is the a posteriori estimate.
- Refine the state estimate covariance (our confidence in the state estimate after taking measurement data into account).
During the first time iteration t0, the Kalman filter accepts as input the initial state and estimate covariance (which may be zero if the initial state is known with 100% certainty) along with the control input u and measurement data z. On subsequent time iterations tn, the Kalman filter accepts as input the output from the previous run (with mean and covariance – discussed more below) along with the control input u and measurement data z from tn.
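To illustrate this recursive flow, here is a toy one-dimensional sketch (all numbers invented) in which each iteration’s output feeds the next iteration along with fresh control and measurement inputs:

```python
# A toy 1D illustration of the recursive structure: the output (mu, sigma2)
# of each iteration feeds the next, along with new control and measurement
# inputs.  All values are invented for illustration.
mu, sigma2 = 0.0, 1.0            # initial state estimate and its variance
controls = [1.0, 1.0, 1.0]       # u_t: commanded displacement per time step
measurements = [1.2, 1.9, 3.1]   # z_t: noisy position readings
R, Q = 0.05, 0.4                 # process and measurement noise variances

for u, z in zip(controls, measurements):
    # Predict (a priori): A = B = H = 1 in this scalar model.
    mu_bar = mu + u
    sigma2_bar = sigma2 + R
    # Correct (a posteriori): gain weights measurement against prediction.
    K = sigma2_bar / (sigma2_bar + Q)
    mu = mu_bar + K * (z - mu_bar)
    sigma2 = (1 - K) * sigma2_bar
```

Notice that the uncertainty sigma2 grows during each prediction and shrinks during each correction, settling well below its initial value after a few iterations.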
The output of the Kalman filter is an estimate of the state represented by a normal distribution having mean μ (the estimated state) and covariance Σ (the confidence, or more accurately, the noise, in that estimate). (As a reminder, the covariance of a one-dimensional normal distribution is the standard deviation squared, σ2.) Note that μ need not be limited to a scalar value; in fact, it’ll almost always be a vector. For example, the pose of a mobile robot may be a three-dimensional vector containing the location and orientation (x y θ)T. Accordingly, this vector would be the resulting mean value, and the covariance Σ would be a (3×3) matrix whose diagonal entries are the variances of the corresponding elements of the state vector.
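For the robot example, the output might be represented as follows (the values are purely illustrative):

```python
import numpy as np

# Output of one filter iteration for the robot example (values illustrative).
mu = np.array([1055.0, 2120.0, 0.8])   # estimated pose (x, y, theta)
Sigma = np.diag([25.0, 25.0, 0.01])    # (3x3) covariance of that estimate

# Each diagonal entry is a variance sigma^2; its square root is the
# standard deviation of the corresponding state component.
std_devs = np.sqrt(np.diag(Sigma))     # here: 5 mm in x, 5 mm in y, 0.1 rad
```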
Kalman Filter Algorithm Formalized
We’ve discussed the initial Kalman filter equations for process and measurement estimation; we’ve also discussed the overall algorithm for implementation, broken down into prediction and correction phases. What’s missing are the actual calculations for concretely carrying out the estimation process itself. The concrete calculations for implementing the Kalman filter algorithm are derived from the process and measurement equations by taking the partial derivatives of them and setting them to zero for minimizing error…and jumping around three times and standing on your head for 10 minutes. (My eyes quickly begin to glaze over when I start to follow derivations of this nature…but if you like this kind of stuff, Sebastian Thrun shows the complete derivation within Probabilistic Robotics; Robert Stengel takes it to 11 within Optimal Control and Estimation with more Greek symbols than you can shake a stick at.) But I digress…
To formalize, the Kalman filter algorithm accepts four inputs:
- μt-1 – the mean state vector
- Σt-1 – the covariance of the mean (the error in the state estimate)
- ut – the process (control) input
- zt – the measurement data
With the given inputs, the Kalman filter algorithm is implemented as follows:
1. μ̄t = Aμt-1 + But
2. Σ̄t = AΣt-1AT + R
3. Kt = Σ̄tCT(CΣ̄tCT + Q)-1
4. μt = μ̄t + Kt(zt – Cμ̄t)
5. Σt = (I – KtC)Σ̄t
Line 1 should be comfortingly familiar; this is the calculation for estimating the current state given the previous state and control input. But what’s missing from the original process equation? Have you spotted it yet? I’ll give you a noisy hint. (No mean for the pun…thank you, thank you, I’ll be here all week.) That’s right, the noise parameter has been left off of the state estimation equation in line 1. Line 1 simply calculates the a priori state estimate, ignoring process noise.
Line 2 calculates the covariance of the current state estimate, taking process noise into consideration. Matrix A has already been discussed; it comes from the Kalman filter state estimate equation described earlier. R is a diagonal matrix representing the process noise covariance.
Line 3 calculates the Kalman gain which will be used to weight the effect of the measurement model when correcting the estimate. C is identical to the matrix H described earlier in the base Kalman filter measurement equation. As tricky as this line looks (and some of those matrix calculations can make your head hurt a bit), the only thing new is Q; this diagonal matrix is the measurement noise covariance. The resulting Kalman gain K is a matrix having dimensions (nxm) where n is the dimension of the state vector and m is the dimension of the measurement vector.
(As an aside, take note that in different sources the meanings of R and Q may be swapped: Q would denote the process noise and R the measurement noise, with their positions in the equations exchanged accordingly. Just be cognizant of which is which within the source you’re reading.)
Line 4 updates the state estimate taking into account the weighted measurement information. Note that the Kalman gain is multiplied by the difference between the actual measurement model and the predicted measurement model. What happens if they happen to be identical? …Jeopardy daily double sounds playing in the background… If the actual and predicted measurement models happen to be identical, then the estimated state will not be corrected at all since our sensors have verified that we’re exactly where we thought we were; i.e., don’t fix what ain’t broken. The result of line 4 is the a posteriori state estimate.
Finally, line 5 corrects the covariance, taking into account the Kalman gain used to correct the state estimate.
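Putting lines 1 through 5 together, a minimal sketch in Python/NumPy might look like the following. The matrices and noise values in the example call are invented for illustration, not taken from a real robot:

```python
import numpy as np

def kalman_filter(mu_prev, Sigma_prev, u, z, A, B, C, R, Q):
    """One predict/correct iteration, following lines 1-5 above."""
    # Line 1: a priori state estimate (process noise omitted).
    mu_bar = A @ mu_prev + B @ u
    # Line 2: a priori estimate covariance, inflated by process noise R.
    Sigma_bar = A @ Sigma_prev @ A.T + R
    # Line 3: Kalman gain, weighting measurement against prediction.
    K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + Q)
    # Line 4: a posteriori state estimate, corrected by the innovation.
    mu = mu_bar + K @ (z - C @ mu_bar)
    # Line 5: a posteriori estimate covariance.
    Sigma = (np.eye(len(mu)) - K @ C) @ Sigma_bar
    return mu, Sigma

# Example call: 3D pose state, a hypothetical 2D position sensor (C drops
# theta, as in the H example earlier), and invented noise covariances.
mu_t, Sigma_t = kalman_filter(
    mu_prev=np.zeros(3), Sigma_prev=np.eye(3),
    u=np.array([55.0, 120.0, 0.8]), z=np.array([54.0, 121.0]),
    A=np.eye(3), B=np.eye(3),
    C=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
    R=0.1 * np.eye(3), Q=0.5 * np.eye(2))
```

Note that if the actual and predicted measurements agree, the innovation z – Cμ̄ is zero and line 4 leaves the prediction untouched, exactly as described above.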
As output, the Kalman filter algorithm returns two values:
- μt – the current mean state vector.
- Σt – the current covariance of the mean (the error in the state estimate); a matrix having dimensions (nxn) where n is the dimension of the state vector.
With these outputs, it is now known with some Σ amount of error what the current state of the system is; or where our intrepid little robot is on the map.
Limitations of the Kalman Filter
The Kalman filter is incredibly powerful and can be used in a surprising number of scenarios. Its primary limitation is that it assumes a linear system. Many systems are non-linear (such as a mobile robot moving along a rotational trajectory) yet may still benefit from the Kalman filter. The applicable approach is to form a linear approximation of the non-linear system for use by the Kalman filter, similar in effect to a first-order Taylor series expansion. Popular extensions to the Kalman filter that support non-linear systems include the Extended Kalman filter and, even better, the Unscented Kalman filter. Chapter 7 of Sebastian Thrun’s Probabilistic Robotics describes in good detail how to apply both of these extensions in the context of mobile robotics.
Googling for “Kalman filter” will quickly show just how much more there is to this topic. But I hope this two part series has helped to clarify the overall algorithm with particular attention to describing the various elements of the calculations themselves. (And if nothing else, gives me a refresher to return to!)