Innovative automated control systems: Control-theoretic approach for fast online reinforcement learning

Reinforcement Learning (RL) is an effective way of designing model-free linear quadratic regulators (LQRs) for linear time-invariant networks with unknown state-space models. RL has wide ranging applications including industrial automation, self-driving automobiles, power grid systems, and even forecasting stock prices for financial markets.

However, conventional RL can result in unacceptably long learning times when network sizes are large. This can pose a serious challenge for real-time decision-making.

Tomonori Sadamoto at the University of Electro-Communications, Aranya Chakrabortty at North Carolina State University, USA, and Jun-ichi Imura at the Tokyo Institute of Technology have proposed a fast RL algorithm that enables online control of large-scale network systems.

Their approach is to construct a compressed state vector by projecting the measured state through a projection matrix. This matrix is constructed from online measurements of the states in a way that it captures the dominant controllable subspace of the open-loop network model. Next, a RL-controller is learned using the reduced-dimensional state instead of the original state such that the resultant cost is close to the optimal LQR cost.

The lower dimensionality of this approach enables a drastic reduction in the computational complexity for learning. Moreover, stability and optimality of the control performance are theoretically evaluated using robust control theory by treating the dimensionality-reduction error as an uncertainty. Numerical simulations through a 100-dimensional large-scale power grid model showed that the learning speed improved by almost 23 times while maintaining control performance.

The main contribution of the paper is to show how two individually well-known concepts in dynamical system theory and machine learning, namely, model reduction and reinforcement learning, can be combined to construct a highly efficient real-time control design for extreme-scale networks.

Transient response of frequency deviation of generators in the power system(a) without control, (b) by the optimal controller designed by an existing RL method, and (c) by the controller designed by the proposed algorithm.

References

Authors: Tomonori Sadamoto, Aranya Chakrabortty, and Jun-ichi Imura
Title of original paper: Fast Online Reinforcement Learning Control using State-Space Dimensionality Reduction
IEEE Transactions on Control of Network Systems (Early Access)
Digital Object Identifier (DOI): 10.1109/TCNS.2020.3027780
Affiliations: Department of Mechanical Engineering and Intelligent Systems, Graduate School of Informatics and Engineering, The University of Electro-Communications.
First Author website: http://www.sc.lab.uec.ac.jp/ts/index.html

February 2021 Issue

Research Highlights

Innovative automated control systems: Control-theoretic approach for fast online reinforcement learning

References