"

13 Linear Quadratic Controller

Hongbo Zhang

1) Motivation for the Linear Quadratic Controller

The Linear Quadratic Controller (LQR) and the Model Predictive Controller (MPC) are classical optimal controllers. They are effective and efficient for controlling linear systems, and both have been validated by theory and by decades of real-world control applications, where they are among the best-performing controllers available.

In this chapter, the LQR works in the discrete state space because the control law is obtained by optimizing a cost over the discrete-time trajectory. So before we discuss the LQR controller, we first need to convert the continuous state space to the discrete state space, as shown in Figure 13.1. Why do we need this conversion? The basic idea is that the discrete values of the state-space solution allow us to pose the controller design as an optimization over those values.

 

Figure 13.1: Example calculation of state space using discrete state space. Note that, for this discrete state space, we used x[k+1]= Ax[k] + Bu[k].

The discrete state-space example shows that iterations are needed to compute and optimize the state trajectory. The optimization of a discrete state space was illustrated in the previous chapter. In this chapter, we will use another, simpler example to better illustrate the importance of understanding the numerical computation of the state space.

The following is an example of a discrete state space:

    \[x(k+1) = Ax(k) + Bu(k), \qquad y(k) = Cx(k) + Du(k)\]

where

    \[A = \begin{bmatrix} 0.5 & 0.1 \\ 0.2 & 0.4 \end{bmatrix}, \quad B = \begin{bmatrix} 0.1 \\ 0.2 \end{bmatrix}, \quad x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}\]

The state equation is iterated until the solution converges, as shown in Figure 13.2. After 22 iterations, the x value converges to the steady-state value

    \[X_{Steady} = \begin{bmatrix} 0.285 \\ 0.4282 \end{bmatrix}\]

Figure 13.2: Example of calculation of state space using discrete state space. Note that, for this discrete state space, we used x[k+1]= Ax[k] + Bu[k].
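For readers who want to reproduce this iteration, a minimal Python sketch is given below. The figure does not state the input sequence, so a constant input u(k) = 1 is assumed here, and the exact number of iterations needed depends on the convergence tolerance used.

    import numpy as np

    # Discrete state-space matrices from the example above
    A = np.array([[0.5, 0.1],
                  [0.2, 0.4]])
    B = np.array([[0.1],
                  [0.2]])
    x = np.array([[1.0],
                  [1.0]])          # x(0)
    u = np.array([[1.0]])          # assumed constant input u(k) = 1

    # Iterate x(k+1) = A x(k) + B u(k) until the state stops changing
    for k in range(100):
        x_next = A @ x + B @ u
        if np.linalg.norm(x_next - x) < 1e-6:
            break
        x = x_next

    print(x)   # settles near the steady-state value quoted above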

With this example, you may wonder how we can use the state space for a meaningful control task. It is now evident that a state space is not only a set of equations; it also corresponds to a solution. That is, given a state space with the A, B, C, and D matrices, we can compute the solution of the system of equations. This property naturally inspires us to use the discrete state space to find a control gain that drives the solution to the desired values. By doing this, we can use the state-space domain for meaningful control.

Now, we are not the only ones who have thought about this. As early as the 1950s, engineers were adopting state-space techniques to solve difficult control problems. The development of the Linear Quadratic Controller (LQR) was heavily influenced by Richard Bellman's work on dynamic programming and Lev Pontryagin's work on optimal control theory. The optimal control law and the Linear Quadratic Controller have significant applications and are therefore worthy of our attention; it is well worth learning how to use them to solve real-world control problems.

 

2) Overview of the Linear Quadratic Controller

2.1) Fundamentals of the Linear Quadratic Controller

The Linear Quadratic Controller is inspired by the optimal control law. In optimal control theory, a control problem is formulated as an optimization problem, in which a cost function guides the search for the best solution of the underlying control problem. Rudolf Kalman played a significant role in formulating the LQR using optimal control theory. He proposed using the Riccati equation to solve the LQR problem in practice. Specifically, the Riccati equation yields the optimal feedback gains that minimize a quadratic cost function. The quadratic cost function expresses the goal of the control: to drive the state to the desired state with minimal overshoot while ensuring that the control command sent to the actuator is not excessively large. As we know, an excessively large control input can damage the motors. Therefore, the cost function must penalize both the state and the input variables.

LQR controllers are robust in real-world applications and are often able to stabilize systems with predictable, well-characterized performance. These properties have made the LQR widely used in engineering applications, including aerospace, robotics, and industrial automation.

Figure 13.3: LQR cost objective function based on the discrete state space

 

(1)   \begin{equation*} J = \sum_{k=0}^{\infty} \left( x(k)^T Q x(k) + u(k)^T R u(k) \right) \end{equation*}

where:

Q is a positive semi-definite state cost matrix. This guarantees that x(k)^T Q x(k) is always greater than or equal to zero, regardless of whether the entries of the state vector x(k) are negative.

R is a positive definite control cost matrix. This guarantees that u(k)^T R u(k) is always strictly greater than zero for any nonzero input, regardless of whether the entries of the input vector u(k) are negative.
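As a concrete illustration of this cost, the hypothetical helper below (not part of the text) evaluates the sum over a finite portion of a trajectory, given sequences of states and inputs stored as NumPy column vectors.

    import numpy as np

    def lqr_cost(x_seq, u_seq, Q, R):
        """Evaluate J = sum over k of x(k)^T Q x(k) + u(k)^T R u(k) for finite sequences."""
        J = 0.0
        for x, u in zip(x_seq, u_seq):
            J += float(x.T @ Q @ x + u.T @ R @ u)
        return J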

In this cost objective function, Q and R are the two most important design matrices for achieving the desired LQR behavior. An example of a positive semi-definite state cost matrix Q is:

    \[Q = \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}\]

This matrix is symmetric and positive semi-definite, meaning that for any vector x, the quadratic form satisfies

    \[x^T Q x \geq 0 .\]
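One quick numerical check is to look at the eigenvalues of Q: a symmetric matrix is positive semi-definite exactly when all of its eigenvalues are greater than or equal to zero. A short sketch:

    import numpy as np

    Q = np.array([[ 2, -1,  0],
                  [-1,  2, -1],
                  [ 0, -1,  2]])

    # For a symmetric matrix: positive semi-definite <=> all eigenvalues >= 0
    eigvals = np.linalg.eigvalsh(Q)
    print(eigvals)                        # approximately [0.586, 2.0, 3.414]
    print(bool(np.all(eigvals >= 0)))     # True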

We should note that it is possible to have negative elements in the Q matrix. Such negative elements have a physical meaning. For example, in a mechanical system, negative values in the Q matrix can represent damping effects or forces that act in opposition to the motion of the system. In electrical circuits, negative elements can represent resistances or impedances in the circuit.

Similarly, R, the control cost matrix, is also important. A representative control cost matrix R is as follows.

    \[R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\]

The goal of the matrix R is to ensure that

    \[u^T R u > 0\]

for any nonzero input u.

In contrast to the Q matrix, the R matrix must be symmetric and positive definite. In practice, R is most often chosen as a diagonal matrix with strictly positive entries, so negative terms rarely appear in it.

It is important to note that the selection of values for the Q and R matrices is critical for LQR design. Here are practical guidelines for selecting Q and R.

Note 1: The goal of the Q matrix is to penalize the states, so its elements are weighted accordingly. Higher values in the Q matrix correspond to states that you want to keep small. In other words, if you want to minimize the deviation of a particular state from its desired value, simply assign a larger weight to the corresponding element of Q.

The design should be initiated using just the diagonal elements of the Q matrix, expanding to the off-diagonal elements only if necessary. Negative values are feasible when the corresponding state has damping or resistance associated with it.

Note 2: The goal of the R matrix is to penalize the control inputs, so the elements are weighted accordingly. Higher values in the R matrix correspond to higher penalties on the control inputs, which can help to prevent excessive control inputs.

Similar to the Q matrix, the choice of weights in the R matrix should also start with the diagonal elements. Once the diagonal elements are settled, one can consider using the off-diagonal elements.

Note 3: Bryson's rule is often used to select the values of the Q and R matrices. The general guideline based on Bryson's rule is to set

    \[Q_{ii} = \frac{1}{(\text{maximum acceptable value of } x_i)^2}, \qquad R_{jj} = \frac{1}{(\text{maximum acceptable value of } u_j)^2}\]
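As an illustration, the short sketch below builds diagonal Q and R matrices from Bryson's rule for a two-state, one-input system; the maximum acceptable values are made-up numbers chosen only for the example.

    import numpy as np

    # Hypothetical limits: |x1| <= 0.5, |x2| <= 2.0, |u1| <= 10.0
    x_max = np.array([0.5, 2.0])
    u_max = np.array([10.0])

    # Bryson's rule: Q_ii = 1 / (max acceptable x_i)^2, R_jj = 1 / (max acceptable u_j)^2
    Q = np.diag(1.0 / x_max**2)    # diag([4.0, 0.25])
    R = np.diag(1.0 / u_max**2)    # diag([0.01])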

Note 4:  The selection of Q and R matrices needs to be performed in an iterative fashion. Start with initial values and iteratively choose other values until the states and control inputs meet the control design goal.

 

3) Control Gain Computation of the Linear Quadratic Controller

Another goal of the LQR is to obtain the control gain, denoted K. The relationship between the control input, the state, and the LQR control gain K is given below. Note that the negative sign (-K) is used because this is a negative feedback system.

(2)   \begin{equation*} u(k) = -Kx(k) \end{equation*}

To solve for the LQR control gain (feedback gain) K, we need to follow a few steps introduced below:

Step 1 to Compute K: To solve for the feedback gain K, we need to introduce the Bellman equation, which grew out of research dating back to the 1950s. The Bellman equation is based on Bellman's principle of optimality, which makes the state space a natural setting for solving optimization problems. The principle states that, regardless of the initial state and initial decision, the remaining decisions must constitute an optimal policy with respect to the state resulting from the first decision. (In other words, each current decision must be chosen to optimize the remaining problem, regardless of what was chosen in previous states.)

Specifically, the Bellman equation starts by introducing the “value function” V(x), also known as the “cost-to-go function”.

(3)   \begin{equation*} V(x) = \min_u \int_0^\infty \left( x(t)^T Q x(t) + u(t)^T R u(t) \right) dt \end{equation*}

Based on the principle of optimality, the optimal policy from any state x must minimize the immediate cost plus the cost-to-go from the next state. For the LQR problem, this is written as:

(4)   \begin{equation*} V(x) = \min_u \left\{ x^T Q x + u^T R u + V(\dot{x}) \right\} \end{equation*}

(5)   \begin{equation*} \dot{x} = Ax + Bu \end{equation*}

As such, V(\dot{x}), which represents the cost-to-go from the next state \dot{x}, can be written as

(6)   \begin{equation*} V(x) = \min_u \left\{ x^T Q x + u^T R u + V(A x + B u) \right\} \end{equation*}

We also know that

(7)   \begin{equation*} x(k+1) = Ax(k) + Bu(k) \end{equation*}

Hence, the Bellman equation for the discrete-time LQR problem is written as:

(8)   \begin{equation*} V(x_k) = \min_{u_k} \left[ x_k^T Q x_k + u_k^T R u_k + V(x_{k+1}) \right] \end{equation*}

where V(x_k) is the value function representing the minimum cost-to-go from state x_k.

Step 2 to Compute K: It is typically safe to assume that the value function has a positive quadratic form (because a positive quadratic function has a global minimum). The quadratic form of V(\mathbf{x}[k]) is:

(9)   \begin{equation*} V(\mathbf{x}[k]) = \mathbf{x}[k]^T \mathbf{P} \mathbf{x}[k] \end{equation*}

where \mathbf{P} is a symmetric positive semi-definite matrix. Intuitively, P plays the role of a positive coefficient while x[k] enters quadratically, so the value function is a positive quadratic.

Substituting this quadratic form into the Bellman equation above, we obtain the following equation:

(10)   \begin{equation*} \mathbf{x}[k]^T \mathbf{P} \mathbf{x}[k] = \mathbf{x}[k]^T \mathbf{Q} \mathbf{x}[k] + \min_{\mathbf{u}[k]} \left( \mathbf{u}[k]^T \mathbf{R} \mathbf{u}[k] + (\mathbf{A} \mathbf{x}[k] + \mathbf{B} \mathbf{u}[k])^T \mathbf{P} (\mathbf{A} \mathbf{x}[k] + \mathbf{B} \mathbf{u}[k]) \right) \end{equation*}

Dropping the time index [k] for readability, this Bellman equation reads:

    \[x^T P x = \min_u \left\{ x^T Q x + u^T R u + (A x + B u)^T P (A x + B u) \right\}\]

We can expand the terms inside the minimization:

    \[x^T P x = \min_u \left\{ x^T Q x + u^T R u + x^T A^T P A x + x^T A^T P B u + u^T B^T P A x + u^T B^T P B u \right\}\]

Subsequently, we can group the terms involving u :

    \[x^T P x = \min_u \left\{ x^T Q x + x^T A^T P A x + u^T (R + B^T P B) u + 2 x^T A^T P B u \right\}\]

To find the optimal control input u, take the derivative with respect to u and set it equal to zero:

    \[\frac{\partial}{\partial u} \left( x^T Q x + x^T A^T P A x + u^T (R + B^T P B) u + 2 x^T A^T P B u \right) = 0\]

This yields:

    \[2 (R + B^T P B) u + 2 B^T P A x = 0\]

This equation determines the input u that minimizes the cost. Restoring the time index, the same optimality condition can also be written in unexpanded form by differentiating with respect to \mathbf{u}[k] and setting the result equal to zero:

(11)   \begin{equation*} 2 \mathbf{R} \mathbf{u}[k] + 2 \mathbf{B}^T \mathbf{P} (\mathbf{A} \mathbf{x}[k] + \mathbf{B} \mathbf{u}[k]) = 0 \end{equation*}

Finally, solve for \mathbf{u}[k]. This is the most important step for obtaining the control input \mathbf{u}[k]:

(12)   \begin{equation*} \mathbf{u}[k] = -(\mathbf{R} + \mathbf{B}^T \mathbf{P} \mathbf{B})^{-1} \mathbf{B}^T \mathbf{P} \mathbf{A} \mathbf{x}[k] \end{equation*}

We also know that the control input is equal to

    \[\mathbf{u}[k] = -\mathbf{K} \mathbf{x}[k]\]

  Therefore, the control gain K becomes

(13)   \begin{equation*} \mathbf{K} = (\mathbf{R} + \mathbf{B}^T \mathbf{P} \mathbf{B})^{-1} \mathbf{B}^T \mathbf{P} \mathbf{A} \end{equation*}

Step 3 to Compute K: In order to calculate the LQR control gain K, we first need to calculate the P matrix. To solve for P, we rely on the Bellman equation once more: substituting the optimal \mathbf{u}[k] back into it, we obtain

(14)   \begin{equation*} x_k^T P x_k = x_k^T Q x_k + x_k^T A^T P A x_k - x_k^T A^T P B (R + B^T P B)^{-1} B^T P A x_k \end{equation*}

Rewrite the equation in terms of the quadratic form:

    \[x_k^T (P - Q - A^T P A + A^T P B (R + B^T P B)^{-1} B^T P A) x_k = 0\]

For the equation to hold for all x_k, the matrix inside the quadratic form must equal zero:

    \[P - Q - A^T P A + A^T P B (R + B^T P B)^{-1} B^T P A = 0\]

The remaining term A^T P B (R + B^T P B)^{-1} B^T P A contains the matrix inverse (R + B^T P B)^{-1}. If desired, this inverse can be expanded using the Woodbury matrix identity (assuming P is invertible), which states:

    \[(R + B^T P B)^{-1} = R^{-1} - R^{-1} B^T (P^{-1} + B R^{-1} B^T)^{-1} B R^{-1}\]

In practice, however, no expansion or approximation is needed: because R is positive definite and B^T P B is positive semi-definite, the matrix R + B^T P B is always invertible, and the inverse can be kept exactly as it is. Rearranging the equation above for P therefore directly yields the discrete-time algebraic Riccati equation (DARE):

(15)   \begin{equation*} \mathbf{P} = \mathbf{Q} + \mathbf{A}^T \mathbf{P} \mathbf{A} - \mathbf{A}^T \mathbf{P} \mathbf{B} (\mathbf{R} + \mathbf{B}^T \mathbf{P} \mathbf{B})^{-1} \mathbf{B}^T \mathbf{P} \mathbf{A} \end{equation*}

 

Step 4 to Compute K: In order to solve for K, we must first solve for P. In the DARE, P appears on both sides of the equation and inside a matrix inverse, so it cannot be isolated algebraically, as the following rearranged form shows:

(16)   \begin{equation*} A^T P A - P - A^T P B (R + B^T P B)^{-1} B^T P A + Q = 0 \end{equation*}

However, it is still feasible to solve this equation using iterative numerical methods, such as the Schur method or the Newton-Kleinman iteration. These methods iteratively update the matrix P until convergence.
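The simplest illustration of the iterative idea is to apply the DARE itself as a fixed-point update, P \leftarrow Q + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A, starting from P = Q. The sketch below is this plain fixed-point iteration, not the Schur or Newton-Kleinman method, and it assumes the LQR problem is well posed so that the iteration converges.

    import numpy as np

    def solve_dare_by_iteration(A, B, Q, R, tol=1e-9, max_iter=10000):
        """Solve the DARE by repeating the Riccati update until P stops changing."""
        P = Q.copy()
        for _ in range(max_iter):
            # P_next = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA
            G = np.linalg.inv(R + B.T @ P @ B)
            P_next = Q + A.T @ P @ A - A.T @ P @ B @ G @ B.T @ P @ A
            if np.max(np.abs(P_next - P)) < tol:
                return P_next
            P = P_next
        raise RuntimeError("Riccati iteration did not converge")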

The Python SciPy package has a built-in function that solves the DARE: scipy.linalg.solve_discrete_are. Examples of using Python to compute P and K are shown in Figures 13.4 and 13.5.

 

Figure 13.4: Example of using Python to compute the P matrix.
Figure 13.5: Example of using Python to compute the LQR control gain K.
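Since those figures are reproduced as images, a minimal sketch of what such a computation might look like is given below. It reuses the A and B matrices from Section 1 and picks identity matrices for Q and R purely for illustration; these weights are not taken from the figures.

    import numpy as np
    from scipy.linalg import solve_discrete_are

    # System matrices from the example in Section 1
    A = np.array([[0.5, 0.1],
                  [0.2, 0.4]])
    B = np.array([[0.1],
                  [0.2]])

    # Illustrative weights (assumed, not from the text)
    Q = np.eye(2)
    R = np.eye(1)

    # Solve the DARE for P, then form the LQR gain K = (R + B'PB)^{-1} B'PA
    P = solve_discrete_are(A, B, Q, R)
    K = np.linalg.inv(R + B.T @ P @ B) @ B.T @ P @ A

    print("P =", P)
    print("K =", K)

    # Closed-loop control law u[k] = -K x[k]
    x = np.array([[1.0], [1.0]])
    u = -K @ x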

 

 

License


Robotics and Controls Engineering Copyright © by Hongbo Zhang; Elissa Ledoux; and Vishwas Bedekar is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.