Page 114 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
4. Control System Design
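two most commonly used formulations are the averaged long-term expected reward [Bar98]
$$
G(k) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} R(k), \tag{4.33}
$$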
and the discounted long-term expected reward [Bar98]
$$
G(k) = \sum_{k=1}^{n} \gamma^{k-1} R(k), \tag{4.34}
$$
where 0 < γ < 1 is the discount factor. The discount factor is introduced
to avoid situations where the values of certain states grow to infinity
during the learning process. In this dissertation, the discounted
long-term expected reward (equation 4.34) is used within the
value functions.
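
The two formulations can be compared numerically with a minimal sketch. The function names `average_reward` and `discounted_return` are illustrative and do not come from the dissertation, and the limit in equation 4.33 is approximated here by a finite reward sequence.

```python
def average_reward(rewards):
    """Finite-n approximation of equation 4.33: (1/n) * sum of R(k)."""
    if not rewards:
        raise ValueError("rewards must be non-empty")
    return sum(rewards) / len(rewards)


def discounted_return(rewards, gamma):
    """Equation 4.34: sum over k = 1..n of gamma**(k-1) * R(k)."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must satisfy 0 < gamma < 1")
    # enumerate starts at 0, which matches the exponent k - 1 for k = 1..n
    return sum(gamma ** i * r for i, r in enumerate(rewards))


if __name__ == "__main__":
    rewards = [1.0] * 1000            # assumed constant reward of 1 per step
    print(sum(rewards))               # undiscounted sum grows with n
    print(average_reward(rewards))
    print(discounted_return(rewards, 0.9))
```

For a constant reward of 1, the undiscounted sum grows without bound as n increases, whereas the discounted sum is a geometric series that stays below 1 / (1 - γ). This is the boundedness property the discount factor is meant to provide.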
The other type of value function is called the state-action value function
Q^π(s,u) (Q-function), which is defined as the expected long-term reward
obtained by starting from the state s, taking the action u, and thereafter
following the control policy π. Like the state value function shown above,
it also has different formulations. Here, an expression similar to
equation 4.34 is used as the state-action value function [Bar98]
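$$
\begin{aligned}
Q^{\pi}(s,u) &= E\left[\, G(k) \mid S(k) = s,\ U(k) = u,\ \pi \,\right] \\
&= E\left[\, \sum_{k=1}^{n} \gamma^{k-1} R(k) \;\middle|\; S(k) = s,\ U(k) = u,\ \pi \,\right].
\end{aligned} \tag{4.35}
$$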
Whereas rewards determine the immediate, intrinsic desirability of
individual states of the plant, the value functions indicate the long-term
desirability of states or state-action pairs by taking into account the
succeeding states and rewards [Bar98].
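
The expectation in equation 4.35 can be approximated by averaging discounted returns over simulated episodes. The following is a minimal Monte Carlo sketch under stated assumptions: the toy plant `step`, the policy `policy`, the set point `TARGET`, and the `slip` probability are all hypothetical and are not taken from the dissertation, and the reward is assumed to be issued for the state reached after each step.

```python
import random

TARGET = 2      # hypothetical set point
N_STATES = 5    # hypothetical discrete states 0..4


def step(s, u, rng, slip):
    """Hypothetical toy plant: move by u; with probability `slip` the action has no effect."""
    if rng.random() < slip:
        u = 0
    s_next = min(max(s + u, 0), N_STATES - 1)
    reward = -abs(s_next - TARGET)  # assumed reward: negative distance to the set point
    return s_next, reward


def policy(s):
    """Hypothetical control policy pi: take one step toward the set point."""
    if s < TARGET:
        return 1
    if s > TARGET:
        return -1
    return 0


def rollout_return(s, u, gamma, horizon, rng, slip):
    """Discounted return of one episode: apply u first, then follow the policy."""
    g = 0.0
    for i in range(horizon):
        s, r = step(s, u, rng, slip)
        g += gamma ** i * r  # exponent i corresponds to k - 1 in equation 4.34
        u = policy(s)
    return g


def estimate_q(s, u, gamma=0.9, horizon=50, episodes=2000, seed=0, slip=0.1):
    """Monte Carlo estimate of Q^pi(s, u): mean discounted return over episodes."""
    rng = random.Random(seed)
    total = sum(rollout_return(s, u, gamma, horizon, rng, slip)
                for _ in range(episodes))
    return total / episodes


if __name__ == "__main__":
    for u in (-1, 0, 1):
        print(u, estimate_q(0, u))
```

In this toy setting, the action that moves toward the set point receives a higher estimated Q-value than the action that moves away from it, even though the policy followed afterwards is the same. That difference illustrates how the Q-function ranks state-action pairs by long-term desirability instead of by immediate reward alone.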
- Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
- Author: Yiming Sun
- Publisher: KIT Scientific Publishing
- Location: Karlsruhe
- Date: 2016
- Language: English
- License: CC BY-SA 3.0
- ISBN: 978-3-7315-0467-2
- Dimensions: 14.8 x 21.0 cm
- Pages: 260
- Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
- Category: Technology