Page - 115 - in Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources

4.2. Intelligent Control

The relationship between the state and state-action value functions is represented by the following two equations [Bar98]

\[ V^{\pi}(s) = \sum_{u} \pi(u|s)\, Q^{\pi}(s,u), \tag{4.36} \]

\[ Q^{\pi}(s,u) = \sum_{s'} P_u(s,s',r)\,\bigl[ r(s,u,s') + \gamma V^{\pi}(s') \bigr], \tag{4.37} \]

where $\pi(u|s)$ is the probability of taking action $u$ in the state $s$ under the control policy $\pi$. Combining these two equations yields the so-called Bellman equation [Bar98]

\[ V^{\pi}(s) = \sum_{u} \pi(u|s) \sum_{s'} P_u(s,s',r)\,\bigl[ r(s,u,s') + \gamma V^{\pi}(s') \bigr]. \tag{4.38} \]

Defining the optimal state and state-action value functions as

\[ V^{*}(s) = \max_{\pi} V^{\pi}(s), \tag{4.39} \]

\[ Q^{*}(s,u) = \max_{\pi} Q^{\pi}(s,u), \tag{4.40} \]

respectively, the Bellman optimality equations are expressed as [Bar98]

\[ \begin{aligned} V^{*}(s) &= \max_{u} \mathrm{E}\bigl[ R(k+1) + \gamma V^{*}(S(k+1)) \,\big|\, S(k)=s,\ U(k)=u \bigr] \\ &= \max_{u} \sum_{s'} P_u(s,s',r)\,\bigl[ r(s,u,s') + \gamma V^{*}(s') \bigr], \end{aligned} \tag{4.41} \]

\[ \begin{aligned} Q^{*}(s,u) &= \mathrm{E}\bigl[ R(k+1) + \gamma \max_{u'} Q^{*}(S(k+1),u') \,\big|\, S(k)=s,\ U(k)=u \bigr] \\ &= \sum_{s'} P_u(s,s',r)\,\bigl[ r(s,u,s') + \gamma \max_{u'} Q^{*}(s',u') \bigr]. \end{aligned} \tag{4.42} \]

The Bellman optimality equation is the foundation of RLC as well as of conventional optimal control [LV09]. It reveals the relationship between the current state (or state-action pair) and its successor state (or state-action pair), which reduces the process of determining the long-term optimal control sequence to a one-step search of equation 4.41
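As an illustration of equations (4.36) to (4.42), the following minimal Python sketch evaluates a fixed policy and then solves the Bellman optimality equation by repeated one-step backups (value iteration). The two-state, two-action process, its transition probabilities `P`, its rewards `R`, the discount factor `GAMMA`, and all function names are hypothetical values chosen for illustration; they are not taken from the book, and value iteration is only one of several ways to solve (4.41).

```python
# Hypothetical toy process (not from the source): 2 states, 2 actions.
GAMMA = 0.9          # discount factor gamma (assumed value)
STATES = [0, 1]
ACTIONS = [0, 1]

# P[u][s][s2]: probability of moving from state s to s2 under action u.
P = [
    [[0.8, 0.2], [0.1, 0.9]],   # action 0
    [[0.3, 0.7], [0.6, 0.4]],   # action 1
]
# R[u][s][s2]: reward r(s, u, s2) received on that transition.
R = [
    [[1.0, 0.0], [0.0, 2.0]],   # action 0
    [[0.0, 3.0], [1.0, 0.0]],   # action 1
]


def backup(s, u, v):
    """One-step lookahead: sum over s' of P_u(s,s') [r(s,u,s') + gamma v(s')]."""
    return sum(P[u][s][s2] * (R[u][s][s2] + GAMMA * v[s2]) for s2 in STATES)


def evaluate_policy(pi, sweeps=2000):
    """Iterate the Bellman equation (4.38); pi[s][u] plays the role of pi(u|s)."""
    v = [0.0 for _ in STATES]
    for _ in range(sweeps):
        v = [sum(pi[s][u] * backup(s, u, v) for u in ACTIONS) for s in STATES]
    return v


def value_iteration(tol=1e-12, max_sweeps=10000):
    """Iterate the Bellman optimality equation (4.41) until it stops changing."""
    v = [0.0 for _ in STATES]
    for _ in range(max_sweeps):
        new_v = [max(backup(s, u, v) for u in ACTIONS) for s in STATES]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Value functions of a uniformly random policy, linked by (4.36) and (4.37).
pi = [[0.5, 0.5], [0.5, 0.5]]
v_pi = evaluate_policy(pi)
q_pi = [[backup(s, u, v_pi) for u in ACTIONS] for s in STATES]

# Optimal value functions and the greedy (one-step search) control action.
v_star = value_iteration()
q_star = [[backup(s, u, v_star) for u in ACTIONS] for s in STATES]
greedy = [max(ACTIONS, key=lambda u: q_star[s][u]) for s in STATES]

print("V^pi  :", v_pi)
print("V^*   :", v_star)
print("greedy:", greedy)
```

Once `v_star` is available, the optimal action in each state follows from a single maximization over `backup(s, u, v_star)`, which is the one-step search referred to in the text. The sketch assumes the transition model is known; a learning controller would have to estimate these quantities from data.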

Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Author: Yiming Sun
Publisher: KIT Scientific Publishing
Location: Karlsruhe
Date: 2016
Language: English
License: CC BY-SA 3.0
ISBN: 978-3-7315-0467-2
Size: 14.8 x 21.0 cm
Pages: 260
Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
Category: Technology