Page - 115 - in Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
4.2. Intelligent Control
The relationship between the state and state-action value functions is
represented by the following two equations [Bar98]
$$V^{\pi}(s) = \sum_{u} \pi(u \mid s)\, Q^{\pi}(s,u), \tag{4.36}$$

$$Q^{\pi}(s,u) = \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{\pi}(s')\bigr], \tag{4.37}$$
where π(u|s) is the probability of taking action u in the state s under
the control policy π. Combining these two equations yields
the so-called Bellman equation [Bar98]
$$V^{\pi}(s) = \sum_{u} \pi(u \mid s) \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{\pi}(s')\bigr]. \tag{4.38}$$
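As an illustration of equation 4.38, the following minimal sketch evaluates a fixed policy by applying the right-hand side of the Bellman equation repeatedly until the values stop changing. The two-state, two-action process `P`, the policy `PI`, the discount factor `GAMMA`, and all function names are hypothetical and chosen only for this example; they do not come from the microwave heating system. The reward is assumed to be deterministic for each triple (s, u, s'), so that P_u(s,s',r) reduces to a transition probability.

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical MDP: P[s][u] is a list of (s_next, probability, reward) triples.
P = {
    0: {0: [(0, 0.8, 0.0), (1, 0.2, 1.0)],
        1: [(1, 1.0, 0.5)]},
    1: {0: [(0, 1.0, 0.0)],
        1: [(1, 0.9, 2.0), (0, 0.1, 0.0)]},
}

# Hypothetical stochastic policy: PI[s][u] = pi(u|s).
PI = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.3, 1: 0.7}}


def q_from_v(V, s, u):
    """Right-hand side of equation 4.37 for a given value estimate V."""
    return sum(p * (r + GAMMA * V[s2]) for s2, p, r in P[s][u])


def bellman_backup(V, s):
    """Right-hand side of equation 4.38: policy-weighted sum of Q-values."""
    return sum(PI[s][u] * q_from_v(V, s, u) for u in P[s])


def evaluate_policy(tol=1e-10, max_sweeps=10_000):
    """Iterate the Bellman backup until the largest change falls below tol."""
    V = {s: 0.0 for s in P}
    for _ in range(max_sweeps):
        V_new = {s: bellman_backup(V, s) for s in P}
        delta = max(abs(V_new[s] - V[s]) for s in P)
        V = V_new
        if delta < tol:
            break
    return V


V_pi = evaluate_policy()
# Residual of equation 4.38 at the computed fixed point (should be close to zero).
residual = max(abs(V_pi[s] - bellman_backup(V_pi, s)) for s in P)
```

Because the backup is a contraction for a discount factor below one, the iteration converges from any initial guess; `residual` measures how closely the result satisfies equation 4.38.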
Defining the optimal state and state-action value functions as
$$V^{*}(s) = \max_{\pi} V^{\pi}(s), \tag{4.39}$$

$$Q^{*}(s,u) = \max_{\pi} Q^{\pi}(s,u), \tag{4.40}$$
respectively, the Bellman optimality equations are expressed as
[Bar98]
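$$\begin{aligned}
V^{*}(s) &= \max_{u} \mathbb{E}\bigl[\, R(k+1) + \gamma V^{*}(S(k+1)) \,\bigm|\, S(k)=s,\ U(k)=u \,\bigr] \\
&= \max_{u} \sum_{s'} P_u(s,s',r)\,\bigl[r(s,u,s') + \gamma V^{*}(s')\bigr],
\end{aligned} \tag{4.41}$$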
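$$\begin{aligned}
Q^{*}(s,u) &= \mathbb{E}\Bigl[\, R(k+1) + \gamma \max_{u'} Q^{*}(S(k+1),u') \,\Bigm|\, S(k)=s,\ U(k)=u \,\Bigr] \\
&= \sum_{s'} P_u(s,s',r)\,\Bigl[r(s,u,s') + \gamma \max_{u'} Q^{*}(s',u')\Bigr].
\end{aligned} \tag{4.42}$$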
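The fixed point of equation 4.42 can be approximated numerically by repeated application of its right-hand side. The sketch below does this for a small hypothetical process and then reads off a greedy action for each state by a one-step maximization over the actions. The transition table `P`, the discount factor `GAMMA`, and the function names are assumptions made for this example only, and the reward is again assumed to be deterministic for each triple (s, u, s').

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical MDP: P[s][u] is a list of (s_next, probability, reward) triples.
P = {
    0: {0: [(0, 0.8, 0.0), (1, 0.2, 1.0)],
        1: [(1, 1.0, 0.5)]},
    1: {0: [(0, 1.0, 0.0)],
        1: [(1, 0.9, 2.0), (0, 0.1, 0.0)]},
}


def q_backup(Q, s, u):
    """Right-hand side of equation 4.42 for a given estimate Q."""
    return sum(p * (r + GAMMA * max(Q[s2].values())) for s2, p, r in P[s][u])


def q_value_iteration(tol=1e-10, max_sweeps=10_000):
    """Iterate the optimality backup until the largest change falls below tol."""
    Q = {s: {u: 0.0 for u in P[s]} for s in P}
    for _ in range(max_sweeps):
        Q_new = {s: {u: q_backup(Q, s, u) for u in P[s]} for s in P}
        delta = max(abs(Q_new[s][u] - Q[s][u]) for s in P for u in P[s])
        Q = Q_new
        if delta < tol:
            break
    return Q


Q_star = q_value_iteration()

# Optimal state values: V*(s) = max_u Q*(s,u).
V_star = {s: max(Q_star[s].values()) for s in P}

# Greedy policy: a one-step search over actions in each state.
greedy = {s: max(Q_star[s], key=Q_star[s].get) for s in P}
```

Once `Q_star` is available, choosing an action requires only the maximization in the last line; no look-ahead over future control sequences is needed.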
The Bellman optimality equation is the foundation of RLC as well as
of conventional optimal control [LV09]. It reveals the relationship
between the current state (or state-action pair) and its successor state (or
state-action pair), which transforms the process of determining the
long-term optimal control sequence into a one-step search of equation 4.41
- Title
- Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
- Author
- Yiming Sun
- Publisher
- KIT Scientific Publishing
- Location
- Karlsruhe
- Date
- 2016
- Language
- English
- License
- CC BY-SA 3.0
- ISBN
- 978-3-7315-0467-2
- Size
- 14.8 x 21.0 cm
- Pages
- 260
- Keywords
- Mikrowellenerwärmung, Mehrgrößenregelung, Modellprädiktive Regelung, Künstliches neuronales Netz, Bestärkendes Lernen; microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
- Category
- Technology