Page 115 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources

4.2. Intelligent Control

The relationship between the state and state-action value functions is represented by the following two equations [Bar98]

$$V_\pi(s) = \sum_{u} \pi(u|s)\, Q_\pi(s,u), \tag{4.36}$$

$$Q_\pi(s,u) = \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V_\pi(s')\bigr], \tag{4.37}$$

where $\pi(u|s)$ is the probability of taking action $u$ in state $s$ under the control policy $\pi$. Combining these two equations yields the so-called Bellman equation [Bar98]

$$V_\pi(s) = \sum_{u} \pi(u|s) \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V_\pi(s')\bigr]. \tag{4.38}$$

Defining the optimal state and state-action value functions as

$$V^*(s) = \max_{\pi} V_\pi(s), \tag{4.39}$$

$$Q^*(s,u) = \max_{\pi} Q_\pi(s,u), \tag{4.40}$$

respectively, the Bellman optimality equations are expressed as [Bar98]

$$\begin{aligned} V^*(s) &= \max_{u}\, \mathbb{E}\bigl[R(k+1) + \gamma V^*(S(k+1)) \,\big|\, S(k)=s,\ U(k)=u\bigr] \\ &= \max_{u} \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^*(s')\bigr], \end{aligned} \tag{4.41}$$

$$\begin{aligned} Q^*(s,u) &= \mathbb{E}\bigl[R(k+1) + \gamma \max_{u'} Q^*(S(k+1),u') \,\big|\, S(k)=s,\ U(k)=u\bigr] \\ &= \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma \max_{u'} Q^*(s',u')\bigr]. \end{aligned} \tag{4.42}$$

The Bellman optimality equation is the foundation of RLC as well as of conventional optimal control [LV09]. It reveals the relationship between the current state (or state-action pair) and its successor state (or state-action pair), which transforms the process of determining the long-term optimal control sequence into a one-step search of equation 4.41
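
As a numerical illustration of the relations (4.36) to (4.41), the following minimal Python sketch applies them to a toy problem. Everything specific in it is an assumption made for illustration and does not come from the source: the two-state, two-action model, the transition table `P`, the reward table `R`, the discount factor `GAMMA`, and the function names. The transition probabilities are also assumed not to depend on the reward, so `P[u][s][s2]` stands in for $P_u(s,s',r)$.

```python
GAMMA = 0.9  # discount factor gamma (assumed value)
N_STATES, N_ACTIONS = 2, 2

# Hypothetical toy model: P[u][s][s2] is the probability of moving
# from state s to state s2 when action u is applied.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.3, 0.7], [0.6, 0.4]],  # action 1
]
# R[s][u][s2] is the reward r(s, u, s') received on that transition.
R = [
    [[0.0, 1.0], [0.0, 2.0]],  # from state 0
    [[1.0, 0.0], [0.5, 0.0]],  # from state 1
]


def q_from_v(V):
    """Eq. (4.37): state-action values from state values (one-step lookahead)."""
    return [
        [
            sum(P[u][s][s2] * (R[s][u][s2] + GAMMA * V[s2])
                for s2 in range(N_STATES))
            for u in range(N_ACTIONS)
        ]
        for s in range(N_STATES)
    ]


def evaluate_policy(pi, tol=1e-10):
    """Iterate the Bellman equation (4.38) for a fixed policy pi[s][u]."""
    V = [0.0] * N_STATES
    while True:
        Q = q_from_v(V)
        # Eq. (4.36): average Q over the action probabilities of the policy.
        V_new = [sum(pi[s][u] * Q[s][u] for u in range(N_ACTIONS))
                 for s in range(N_STATES)]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new


def value_iteration(tol=1e-10):
    """Iterate the Bellman optimality equation (4.41)."""
    V = [0.0] * N_STATES
    while True:
        # The maximum over actions replaces the policy-weighted average.
        V_new = [max(q_s) for q_s in q_from_v(V)]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new


if __name__ == "__main__":
    uniform = [[0.5, 0.5], [0.5, 0.5]]  # policy that picks actions at random
    print("V_pi (uniform policy):", evaluate_policy(uniform))
    print("V*:", value_iteration())
```

For this toy model the optimal values should be at least as large as those of the uniform random policy in every state, as definition (4.39) requires. Both loops converge because each update is a contraction with factor `GAMMA`.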

Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Author: Yiming Sun
Publisher: KIT Scientific Publishing
Place: Karlsruhe
Date: 2016
Language: English
License: CC BY-SA 3.0
ISBN: 978-3-7315-0467-2
Dimensions: 14.8 x 21.0 cm
Pages: 260
Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
Category: Technology