Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Page 118

Text of page 118

4. Control System Design

performed at the end of each episode rather than at each time step, which makes MC methods inconvenient to implement online.

TD methods combine DP and MC [Tes95]. On the one hand, like MC methods, TD methods can estimate the value function directly from raw experience, without any prior information about the plant. On the other hand, like DP methods, the value function update in TD is easy to implement online, without waiting until the end of each episode. As the name indicates, the update of the state-action value function in TD methods is based on the so-called TD error [Tes95]. Depending on the definition of the TD error and on the update rule, TD methods can be classified into three types, as follows [Bar98].

• Sarsa (State-Action-Reward-State-Action)

At each time $k$, the update rule for the state-action value function in Sarsa is defined as [Bar98]

$$Q(S(k),U(k)) = Q(S(k),U(k)) + \alpha(k)\left[R(k) + \gamma\, Q(S(k+1),U(k+1)) - Q(S(k),U(k))\right], \tag{4.44}$$

where $\alpha(k)$ is the time-varying learning rate and the term within the square brackets is used as the TD error [Bar98]

$$\delta_{td} = R(k) + \gamma\, Q(S(k+1),U(k+1)) - Q(S(k),U(k)). \tag{4.45}$$

To guarantee that the state-action value function converges to the optimal value function, the time-varying learning rates have to fulfill the conditions [Bar98]

$$\sum_{k=1}^{\infty} \alpha(k) = \infty, \quad \text{and} \quad \sum_{k=1}^{\infty} \alpha(k)^2 < \infty. \tag{4.46}$$

After the state-action value function is updated, the controller can be constructed by using the greedy algorithm (equation 4.43), or the more robust $\epsilon$-greedy algorithm, such as [Bar98]

$$\pi(u|s) = \begin{cases} 1 - \epsilon + \epsilon/|U(s)|, & u = u^{*}, \\ \epsilon/|U(s)|, & \text{others}, \end{cases} \tag{4.47}$$
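The Sarsa update (4.44), the TD error (4.45) and the ε-greedy policy (4.47) can be sketched in a few lines of Python. This is only a minimal tabular illustration, not the implementation used in the book: the function names, the dictionary-based Q table, and the state and action labels in the usage part are all assumptions made for the example. The learning rate α(k) = 1/k is one schedule that satisfies condition (4.46), since the harmonic series diverges while the sum of 1/k² converges.

```python
import random
from collections import defaultdict


def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of the epsilon-greedy policy, equation (4.47).

    q_values maps each admissible action u in U(s) to Q(s, u).
    """
    n = len(q_values)                       # |U(s)|
    best = max(q_values, key=q_values.get)  # greedy action u*; ties go to the first maximum
    return {
        u: (1.0 - epsilon + epsilon / n) if u == best else epsilon / n
        for u in q_values
    }


def choose_action(Q, s, actions, epsilon, rng):
    """Sample one action in state s from the epsilon-greedy policy."""
    probs = epsilon_greedy_probs({u: Q[(s, u)] for u in actions}, epsilon)
    return rng.choices(list(probs), weights=list(probs.values()))[0]


def sarsa_update(Q, s, u, r, s_next, u_next, alpha, gamma):
    """Apply one Sarsa step, equation (4.44), and return the TD error (4.45)."""
    td_error = r + gamma * Q[(s_next, u_next)] - Q[(s, u)]
    Q[(s, u)] += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Hypothetical states and actions, chosen only for illustration.
    actions = ("decrease", "hold", "increase")
    Q = defaultdict(float)          # Q(s, u), zero-initialised
    Q[("warm", "hold")] = 2.0       # made-up prior estimate

    k = 1                           # time index; alpha(k) = 1/k satisfies (4.46)
    delta = sarsa_update(Q, "cold", "increase", 1.0, "warm", "hold",
                         alpha=1.0 / k, gamma=0.9)
    print("TD error:", delta)
    print("policy at 'cold':",
          epsilon_greedy_probs({u: Q[("cold", u)] for u in actions}, 0.1))
    print("sampled action:",
          choose_action(Q, "cold", actions, 0.1, random.Random(0)))
```

Because Sarsa is on-policy, the action U(k+1) passed to `sarsa_update` is the one the ε-greedy controller actually selects in S(k+1), which is why sampling is kept in its own helper.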
Title
Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Author
Yiming Sun
Publisher
KIT Scientific Publishing
Place
Karlsruhe
Date
2016
Language
English
License
CC BY-SA 3.0
ISBN
978-3-7315-0467-2
Dimensions
14.8 x 21.0 cm
Pages
260
Keywords
microwave heating, multivariable control, multiple-input multiple-output (MIMO), model predictive control (MPC), artificial neural network, reinforcement learning
Category
Technology