Page 115 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
4.2. Intelligent Control
The relationship between the state and state-action value functions is
represented by the following two equations [Bar98]
$$V^{\pi}(s) = \sum_{u} \pi(u|s)\, Q^{\pi}(s,u), \tag{4.36}$$
$$Q^{\pi}(s,u) = \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{\pi}(s')\bigr], \tag{4.37}$$
where $\pi(u|s)$ is the probability of taking action $u$ in the state $s$ under
the control policy $\pi$. Combining these two equations yields
the so-called Bellman equation [Bar98]
$$V^{\pi}(s) = \sum_{u} \pi(u|s) \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{\pi}(s')\bigr]. \tag{4.38}$$
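As an illustration of how equations (4.36) to (4.38) work together, the following minimal sketch evaluates a fixed policy on a small tabular problem by applying the Bellman equation repeatedly until the value function stops changing. The two-state, two-action model, the arrays `P`, `R` and `pi`, the discount `gamma`, and the function names are all hypothetical and chosen only for illustration; they are not taken from the source.

```python
import numpy as np

# Hypothetical 2-state, 2-action model; all numbers are illustrative assumptions.
# P[u, s, n] is the probability of moving from state s to state n under action u.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.0, 1.0]],   # action 1
])
# R[u, s, n] is the reward r(s, u, s') received on that transition.
R = np.array([
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
gamma = 0.9
# pi[s, u] is the probability of taking action u in state s.
pi = np.array([[0.5, 0.5], [0.3, 0.7]])


def q_from_v(V):
    """Eq. (4.37): Q(s,u) = sum_s' P_u(s,s') [r(s,u,s') + gamma V(s')]."""
    return np.einsum("usn,usn->su", P, R + gamma * V[None, None, :])


def v_from_q(Q):
    """Eq. (4.36): V(s) = sum_u pi(u|s) Q(s,u)."""
    return np.sum(pi * Q, axis=1)


def evaluate_policy(tol=1e-10, max_iter=10_000):
    """Iterate Eq. (4.38) until the largest change in V falls below tol."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        V_new = v_from_q(q_from_v(V))   # one sweep of the Bellman equation
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V


V_pi = evaluate_policy()
Q_pi = q_from_v(V_pi)
print("V_pi:", V_pi)
print("Q_pi:", Q_pi)
```

Because the discount factor is below one, each sweep is a contraction, so the iteration converges for this kind of finite model regardless of the starting guess.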
Defining the optimal state and state-action value functions as
$$V^{*}(s) = \max_{\pi} V^{\pi}(s), \tag{4.39}$$
$$Q^{*}(s,u) = \max_{\pi} Q^{\pi}(s,u), \tag{4.40}$$
respectively, the Bellman optimality equations are expressed as
[Bar98]
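$$\begin{aligned}
V^{*}(s) &= \max_{u} \mathbb{E}\bigl[R(k+1) + \gamma V^{*}(S(k+1)) \,\big|\, S(k)=s,\ U(k)=u\bigr] \\
&= \max_{u} \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{*}(s')\bigr],
\end{aligned} \tag{4.41}$$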
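$$\begin{aligned}
Q^{*}(s,u) &= \mathbb{E}\Bigl[R(k+1) + \gamma \max_{u'} Q^{*}(S(k+1),u') \,\Big|\, S(k)=s,\ U(k)=u\Bigr] \\
&= \sum_{s'} P_u(s,s',r)\,\Bigl[r(s,u,s') + \gamma \max_{u'} Q^{*}(s',u')\Bigr].
\end{aligned} \tag{4.42}$$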
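The Bellman optimality equations can be solved numerically for a small tabular problem by treating them as fixed-point updates, a procedure commonly known as value iteration. The sketch below is one such illustration under stated assumptions: the two-state, two-action model in `P` and `R`, the discount `gamma`, and the function names are hypothetical and not taken from the source.

```python
import numpy as np

# Hypothetical 2-state, 2-action model; all numbers are illustrative assumptions.
# P[u, s, n] is the probability of moving from state s to state n under action u.
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.0, 1.0]],   # action 1
])
# R[u, s, n] is the reward r(s, u, s') received on that transition.
R = np.array([
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.5, 0.5], [0.0, 1.0]],
])
gamma = 0.9


def optimality_backup(V):
    """Right-hand side of Eq. (4.42), with V(s') standing in for max_u' Q(s',u')."""
    return np.einsum("usn,usn->su", P, R + gamma * V[None, None, :])


def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate Eq. (4.41) until the largest change in V falls below tol."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        V_new = optimality_backup(V).max(axis=1)   # maximise over actions u
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, optimality_backup(V)


V_star, Q_star = value_iteration()
# A greedy policy picks, in each state, the action with the largest Q value.
greedy_policy = Q_star.argmax(axis=1)
print("V*:", V_star)
print("Q*:", Q_star)
print("greedy action per state:", greedy_policy)
```

At convergence the two optimal value functions are linked by taking the maximum of `Q_star` over actions, which reproduces `V_star` up to the chosen tolerance.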
The Bellman optimality equation is the foundation of RLC as well as
conventional optimal control [LV09]. It reveals the relationship
between the current state (or state-action pair) and its successor state (or
state-action pair), which transforms the process of determining the
long-term optimal control sequence into a one-step search of equation 4.41
- Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
- Author: Yiming Sun
- Publisher: KIT Scientific Publishing
- Place: Karlsruhe
- Date: 2016
- Language: English
- License: CC BY-SA 3.0
- ISBN: 978-3-7315-0467-2
- Dimensions: 14.8 x 21.0 cm
- Pages: 260
- Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
- Category: Technology