This work was supported in part by the National Key Research and Development Program of China (Grant No. 2016YFB0901900), the National Natural Science Foundation of China (Grant Nos. 61673229, U1301254), the 111 International Collaboration Project of China (Grant No. B06002), and the Program for New Star of Science and Technology in Beijing (Grant No. xx2014B056).
Appendixes A–C.
[1] Weiss Y, Allerhand L I, Arogeti S. Yaw stability control for a rear double-driven electric vehicle using LPV-H∞ methods. Sci China Inf Sci, 2018, 61: 070206.
[2] Jia Q S. On State Aggregation to Approximate Complex Value Functions in Large-Scale Markov Decision Processes. IEEE Trans Automat Contr, 2011, 56: 333–344.
[3] Cao X R, Ren Z, Bhatnagar S. A time aggregation approach to Markov decision processes. Automatica, 2002, 38: 929–943.
[4] Powell W B. Approximate Dynamic Programming. Hoboken: John Wiley & Sons, Inc., 2007.
[5] Jia Q-S, Yang Y, Xia L, et al. A tutorial on event-based optimization with application in energy Internet (in Chinese). Control Theor Appl, 2018, 35: 32–40.
[6] Gu Z, Huan Z, Yue D, et al. Event-triggered dynamic output feedback control for networked control systems with probabilistic nonlinearities. Inf Sci, 2018, 457: 99–112.
[7] Xia L, Jia Q S, Cao X R. A tutorial on event-based optimization: a new optimization framework. Discrete Event Dyn Syst, 2014, 24: 103–132.
[8] Wu J, Jia Q-S. A Q-learning method for scheduling shared EVs under uncertain user demand and wind power supply. In: Proceedings of the 2nd IEEE Conference on Control Technology and Applications, Copenhagen, 2018.
[9] Tang C, Li X, Wang Z. Cooperation and distributed optimization for the unreliable wireless game with indirect reciprocity. Sci China Inf Sci, 2017, 60: 110205.
Figure 1
(Color online) Policy performance. (a) Performance of the objective function; (b) average run time in each iteration.