Page 126 of Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
4. Control System Design
cally easier to implement. Its procedure is described in Figure
4.12.
Initialize Q(s,u) arbitrarily for all state-action pairs;
Initialize e(s,u) = 0 for all state-action pairs;
Take the first control action U(k) at time k = 1 using the ε-greedy
control policy;
At each time k (k > 1):
repeat
  1 Receive the reward R(k−1) and observe the current state S(k)
    based on the last state-action pair (S(k−1), U(k−1));
  2 Update the eligibility trace matrix by
    e(S(k−1), U(k−1)) = e(S(k−1), U(k−1)) + 1;
  3 Update the temporal difference by
    δ_td = R(k−1) + γ max_{u′} Q(S(k), u′) − Q(S(k−1), U(k−1));
  4 Update the Q values for all state-action pairs by
    Q(s,u) = Q(s,u) + α δ_td e(s,u);
  5 Choose the control action U(k) from the current state S(k)
    using the ε-greedy control policy;
  6 If U(k) is the greedy control action, then e(s,u) = γλ e(s,u)
    and otherwise e(s,u) = 0;
until the end of the control process;
Figure 4.12. Procedures in Watkins’ Q(λ) learning control.
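The steps in Figure 4.12 can be illustrated with a minimal tabular sketch. It is not the author's implementation: the function names, the nested-list layout of Q and e, the tie-breaking rule (first maximal action), and the test for "greedy" by value equality are all assumptions made for illustration, and the toy numbers at the end are arbitrary.

```python
import random

def epsilon_greedy(Q, s, epsilon, rng):
    """Return (action, is_greedy) for state s under an epsilon-greedy policy."""
    best = max(Q[s])
    if rng.random() < epsilon:
        u = rng.randrange(len(Q[s]))      # explore: uniform random action
    else:
        u = Q[s].index(best)              # exploit: first greedy action (assumed tie-break)
    # A random draw may still coincide with a greedy action.
    return u, Q[s][u] == best

def q_lambda_step(Q, e, s_prev, u_prev, reward, s, alpha, gamma, lam, epsilon, rng):
    """Steps 2-6 of Figure 4.12 for one time step; Q and e are modified in place."""
    e[s_prev][u_prev] += 1.0                                   # step 2: accumulate trace
    delta = reward + gamma * max(Q[s]) - Q[s_prev][u_prev]     # step 3: TD error
    for i in range(len(Q)):                                    # step 4: update all Q values
        for j in range(len(Q[i])):
            Q[i][j] += alpha * delta * e[i][j]
    u, is_greedy = epsilon_greedy(Q, s, epsilon, rng)          # step 5: next action
    decay = gamma * lam if is_greedy else 0.0                  # step 6: decay or reset traces
    for i in range(len(e)):
        for j in range(len(e[i])):
            e[i][j] *= decay
    return u, delta

# Toy usage (hypothetical values): 2 states, 2 actions, epsilon = 0 so the
# step is deterministic.
Q = [[0.0, 0.0], [0.0, 0.0]]
e = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)
u, delta = q_lambda_step(Q, e, s_prev=0, u_prev=1, reward=1.0, s=1,
                         alpha=0.5, gamma=0.9, lam=0.8, epsilon=0.0, rng=rng)
```

In this toy step only the visited pair (0, 1) carries a nonzero trace, so only Q[0][1] changes. Because the chosen action is greedy, the trace is scaled by γλ rather than reset to zero.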
The lookup-table-based Q(λ) is selected as the TD learning control
method here mainly for reasons of control stability.
Lookup-table-based TD learning is generally more stable than the
function-approximation-based AC methods. On the one hand, TD learning
with linear feature-based approximation functions has been proved to
converge to the optimal control policy [TVR97], but it is difficult to
approximate the dynamics of HEPHAISTOS using linear features,
especially to capture how different control actions influence
the state-action values. On the other hand, TD methods with
nonlinear function approximation can easily become unstable during the
learning process [B+95] [PSD01]. In contrast, the lookup table based
- Title: Adaptive and Intelligent Temperature Control of Microwave Heating Systems with Multiple Sources
- Author: Yiming Sun
- Publisher: KIT Scientific Publishing
- Place: Karlsruhe
- Date: 2016
- Language: English
- License: CC BY-SA 3.0
- ISBN: 978-3-7315-0467-2
- Dimensions: 14.8 x 21.0 cm
- Pages: 260
- Keywords: microwave heating, multiple-input multiple-output (MIMO), model predictive control (MPC), neural network, reinforcement learning
- Category: Technology