Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Page 117


4.2. Intelligent Control

the value function and then using the value function to derive the control policy are the procedures followed by all RLC methods.

Temporal difference methods

According to the situations in which they apply and the way in which they update the value function, RLC methods can be classified into three types: dynamic programming (DP) [BBBB95] [Si04], Monte Carlo (MC) [BC06] methods and temporal difference (TD) [Tes92] methods.

DP is the most basic and classical RLC scheme, and it requires complete knowledge of the plant. When the complete dynamics of the plant are known in advance, the transition probability distribution is easily obtained. The value function and the corresponding optimal control policy can then be calculated directly via iterative updates according to equations 4.37 and 4.43, respectively. In DP methods, the value function and the control policy are updated together, each interacting with the other, using the policy iteration (PI) algorithm, the value iteration (VI) algorithm or other generalized policy iteration (GPI) algorithms. The differences among these algorithms and detailed introductions can be found in [KLM96] and [Si04]. DP methods are guaranteed to converge to the optimum for finite MDPs, but they are considered to be of limited use in practice because of their inefficiency on high-dimensional problems and their requirement of complete knowledge of the environment [Bar98].

When the dynamics of the plant are not completely known, both MC and TD methods can be applied to approximate the value function based on real-time experience. MC is a learning strategy based on random exploration. In MC methods, the value function is updated episode by episode [BC06]. Each episode starts from a random state, takes control actions that are chosen randomly or taken from a control policy π, and ends at a predefined terminal state. For each state that occurs in an episode, its value is simply calculated as the accumulated reward from its first (or every) appearance until the end of the episode. After multiple episodes, the mean value function can be obtained, and a greedy algorithm is applied to generate the final deterministic control policy (such as equation 4.43). The idea of MC methods is simple, but the main problem is that the update is only
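The DP procedure described above can be illustrated with a minimal value iteration sketch. It assumes a standard Bellman optimality backup with a discount factor `gamma`; equations 4.37 and 4.43 are not reproduced on this page, so the notation here may differ from theirs. The two-state plant, the arrays `P` and `R`, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Value iteration for a finite MDP with fully known dynamics.

    P[a, s, s2] is the probability of moving from state s to s2 under action a.
    R[a, s] is the expected immediate reward for taking action a in state s.
    Returns the converged value function and a greedy deterministic policy.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Action values for every (action, state) pair; P @ V has shape (A, S).
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy with respect to the final value function.
    policy = (R + gamma * (P @ V)).argmax(axis=0)
    return V, policy


# Hypothetical two-state plant: action 0 keeps the state, action 1 switches it.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
# Only "stay in state 1" is rewarded.
R = np.array([
    [0.0, 1.0],  # action 0
    [0.0, 0.0],  # action 1
])

V, policy = value_iteration(P, R)
print(V, policy)
```

For this toy plant the values approach 9 and 10 for states 0 and 1, and the greedy policy is to switch in state 0 and stay in state 1. Each sweep touches every state and action, which hints at why the approach scales poorly with the dimension of the state space.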
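The first-visit variant of the MC value estimate described above can be sketched as follows. This is an illustrative sketch, not the author's implementation: the episode format (a list of `(state, reward)` pairs, with the reward received on leaving that state), the function name `first_visit_mc` and the discount factor `gamma` are assumptions. Episode generation by random exploration is omitted, so the episodes are written out by hand.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate state values as the mean return following each first visit.

    episodes: list of episodes, each a list of (state, reward) pairs.
    Returns a dict mapping each visited state to its mean first-visit return.
    """
    returns_sum = defaultdict(float)
    visit_count = defaultdict(int)
    for episode in episodes:
        # Accumulate the return from every time step to the end of the episode.
        G = 0.0
        returns = [0.0] * len(episode)
        for t in range(len(episode) - 1, -1, -1):
            _, reward = episode[t]
            G = reward + gamma * G
            returns[t] = G
        # Record only the return that follows the first appearance of a state.
        seen = set()
        for t, (state, _) in enumerate(episode):
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns[t]
            visit_count[state] += 1
    return {s: returns_sum[s] / visit_count[s] for s in returns_sum}


# Two hand-written episodes over hypothetical states "A" and "B".
episodes = [
    [("A", 1.0), ("B", 0.0), ("A", 2.0)],
    [("B", 3.0)],
]
values = first_visit_mc(episodes)
print(values)
```

With these two episodes, state A is first visited once with a return of 3, and state B is first visited twice with returns of 2 and 3, giving mean values of 3.0 and 2.5. Deriving a greedy policy from such estimates without a plant model would additionally require averaging returns per state-action pair rather than per state.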
Title
Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
Author
Yiming Sun
Publisher
KIT Scientific Publishing
Location
Karlsruhe
Date
2016
Language
English
License
CC BY-SA 3.0
ISBN
978-3-7315-0467-2
Dimensions
14.8 x 21.0 cm
Pages
260
Keywords
microwave heating, multivariable control, multiple-input multiple-output (MIMO), model predictive control (MPC), artificial neural network, reinforcement learning
Category
Technology