Page 114 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources

4. Control System Design

two most commonly used formulations are the averaged long-term expected reward [Bar98]

\[ G(k) = \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} R(k), \tag{4.33} \]

and the discounted long-term expected reward [Bar98]

\[ G(k) = \sum_{k=1}^{n} \gamma^{k-1} R(k), \tag{4.34} \]

where $0 < \gamma < 1$ is the discount factor. The discount factor is introduced to avoid situations where the values of certain states grow to infinity during the learning process. In this dissertation, the discounted long-term expected reward (equation 4.34) is used within the value functions.

The other type of value function is called the state-action value function $Q^{\pi}(s,u)$ (Q-function), which is defined as the expected long-term reward starting from the state $s$, taking the action $u$ and thereafter following the control policy $\pi$. Like the state value function shown above, it has different formulations. Here an expression similar to equation 4.34 is used as the state-action value function [Bar98]

\[ Q^{\pi}(s,u) = E\left[ G(k) \mid S(k) = s,\, U(k) = u,\, \pi \right] = E\left[ \sum_{k=1}^{n} \gamma^{k-1} R(k) \,\Big|\, S(k) = s,\, U(k) = u,\, \pi \right]. \tag{4.35} \]

Whereas rewards determine the immediate, intrinsic desirability of individual states of the plant, value functions indicate the long-term desirability of states or state-action pairs by taking into account the succeeding states and rewards [Bar98].
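The two reward formulations in equations 4.33 and 4.34 can be illustrated numerically. The following is a minimal sketch, not code from the dissertation: the function names `averaged_return` and `discounted_return` and the sample reward sequence are assumptions made for illustration, and the average is taken over a finite sequence as a stand-in for the limit in equation 4.33.

```python
from typing import Sequence


def averaged_return(rewards: Sequence[float]) -> float:
    """Finite-n stand-in for eq. 4.33: (1/n) * sum_{k=1..n} R(k)."""
    if not rewards:
        raise ValueError("rewards must be non-empty")
    return sum(rewards) / len(rewards)


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Eq. 4.34: sum_{k=1..n} gamma**(k-1) * R(k), with 0 < gamma < 1."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    g = 0.0
    # Accumulate from the last reward backwards: g <- R(k) + gamma * g.
    # After the loop, R(1) carries weight gamma**0, R(2) weight gamma**1, ...
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0, 1.0]  # hypothetical reward sequence
    print(averaged_return(rewards))               # 1.0
    print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5 + 0.25 + 0.125 = 1.875
```

For a constant reward of 1, the undiscounted sum grows without bound as n increases, whereas the discounted sum stays below 1 / (1 - γ) by the geometric series. This is one way to see why discounting keeps state values finite.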

Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Author: Yiming Sun
Publisher: KIT Scientific Publishing
Location: Karlsruhe
Date: 2016
Language: English
License: CC BY-SA 3.0
ISBN: 978-3-7315-0467-2
Dimensions: 14.8 x 21.0 cm
Pages: 260
Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
Category: Technology