Page 118 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
4. Control System Design
performed at the end of each episode rather than at each time step,
which makes MC methods inconvenient to implement in online form.
TD methods combine DP and MC [Tes95]. On the one hand, TD methods
can estimate the value function directly from raw experience without
any prior information about the plant, just like MC methods. On the
other hand, the update of the value function in TD is easy to
implement online, without waiting until the end of each episode,
which is similar to DP methods. As the name indicates, the update of
the state-action value function in TD methods is based on the
so-called TD error [Tes95]. Depending on the definition of the TD
error and on the update rule, TD methods can be classified into the
following three types [Bar98].
• Sarsa (State-Action-Reward-State-Action)
At each time k, the update rule for the state-action value function
in Sarsa is defined as [Bar98]
\[ Q(S(k),U(k)) = Q(S(k),U(k)) + \alpha(k)\bigl[R(k) + \gamma Q(S(k+1),U(k+1)) - Q(S(k),U(k))\bigr], \tag{4.44} \]
where α(k) is the time-varying learning rate and the term within
the square brackets is used as the TD error [Bar98]
\[ \delta_{td} = R(k) + \gamma Q(S(k+1),U(k+1)) - Q(S(k),U(k)). \tag{4.45} \]
To guarantee that the state-action value function converges to
the optimal value function, the time-varying learning rates have to
fulfill the conditions [Bar98]
\[ \sum_{k=1}^{\infty} \alpha(k) = \infty, \quad \text{and} \quad \sum_{k=1}^{\infty} \alpha(k)^{2} < \infty. \tag{4.46} \]
After the state-action value function is updated, the controller can
be constructed using the greedy algorithm (equation 4.43) or
the more robust ε-greedy algorithm, such as [Bar98]
\[ \pi(u|s) = \begin{cases} 1 - \varepsilon + \varepsilon/|U(s)|, & u = u^{*}, \\ \varepsilon/|U(s)|, & \text{otherwise}, \end{cases} \tag{4.47} \]
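As an illustration of equations (4.44) to (4.47), the following is a minimal tabular sketch in Python, not the implementation used in this work. It assumes a finite set of states and actions indexed by integers, with Q stored as a list of lists. The function names (`learning_rate`, `td_error`, `sarsa_update`, `epsilon_greedy_probs`, `select_action`) and the schedule α(k) = 1/k are hypothetical choices made only for this example. The schedule 1/k is one simple sequence that satisfies condition (4.46).

```python
import random


def learning_rate(k):
    """Assumed schedule alpha(k) = 1/k for k >= 1 (illustrative choice).

    The sum of 1/k diverges and the sum of 1/k^2 converges, so (4.46) holds.
    """
    return 1.0 / k


def td_error(q, s, u, r, s_next, u_next, gamma):
    """TD error of equation (4.45) for one observed transition."""
    return r + gamma * q[s_next][u_next] - q[s][u]


def sarsa_update(q, s, u, r, s_next, u_next, alpha, gamma):
    """Apply the Sarsa update (4.44) to q in place and return the TD error."""
    delta = td_error(q, s, u, r, s_next, u_next, gamma)
    q[s][u] += alpha * delta
    return delta


def epsilon_greedy_probs(q_row, epsilon):
    """Action probabilities of equation (4.47) for a single state.

    q_row holds Q(s, u) for every admissible action u of that state.
    Ties are broken in favour of the lowest action index (an assumption).
    """
    n = len(q_row)
    greedy = max(range(n), key=lambda u: q_row[u])  # index of u*
    probs = [epsilon / n] * n        # exploration share for every action
    probs[greedy] += 1.0 - epsilon   # extra mass on the greedy action
    return probs


def select_action(q_row, epsilon, rng=random):
    """Draw one action index according to the epsilon-greedy probabilities."""
    probs = epsilon_greedy_probs(q_row, epsilon)
    return rng.choices(range(len(q_row)), weights=probs)[0]


if __name__ == "__main__":
    # Hypothetical two-state, two-action table and a single transition.
    q = [[0.0, 0.0], [0.0, 1.0]]
    delta = sarsa_update(q, s=0, u=1, r=0.5, s_next=1, u_next=1,
                         alpha=learning_rate(2), gamma=0.9)
    print(delta, q[0][1])
    print(epsilon_greedy_probs(q[0], epsilon=0.1))
```

Because the update uses the action U(k+1) that the current policy actually selects, `select_action` would be called for the next state before `sarsa_update`, and the same action would then be applied to the plant at time k+1.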
- Title
- Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
- Author
- Yiming Sun
- Publisher
- KIT Scientific Publishing
- Place
- Karlsruhe
- Date
- 2016
- Language
- English
- License
- CC BY-SA 3.0
- ISBN
- 978-3-7315-0467-2
- Dimensions
- 14.8 x 21.0 cm
- Pages
- 260
- Keywords
- microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
- Category
- Technology