Energies 2018, 11, 2038
use a squared error loss function, whereas a classification problem may use logarithmic loss. Indeed, any differentiable loss function can be used.
Although boosting methods reduce bias more than bagging, they are more likely to overfit the training dataset. To overcome this problem, several regularization techniques can be applied.
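To make concrete the claim that any differentiable loss can drive the boosting updates, the following sketch (illustrative only, not from the paper) writes out the two losses mentioned above together with their gradients with respect to the current prediction:

```python
import math

# Squared error loss, typical for regression.
def squared_error(y, yhat):
    return 0.5 * (y - yhat) ** 2

def squared_error_grad(y, yhat):
    # d/dyhat of 0.5*(y - yhat)^2; the negative gradient is the residual y - yhat.
    return yhat - y

# Logarithmic loss, typical for binary classification (y in {0,1},
# yhat a predicted probability in (0,1)).
def log_loss(y, yhat):
    return -(y * math.log(yhat) + (1 - y) * math.log(1 - yhat))

def log_loss_grad(y, yhat):
    return (yhat - y) / (yhat * (1 - yhat))
```

Each boosting iteration fits the next tree to the negative gradient of whichever loss is chosen, which is why differentiability is the only real requirement.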
• Tree constraints: there are several ways to introduce constraints when constructing regression trees. For example, the following tree constraints can be considered as regularization parameters:
– The number of gradient boosting iterations N: increasing N reduces the error on the training dataset, but may lead to overfitting. An optimal value of N is often selected by monitoring the prediction error on a separate validation dataset.
– Tree depth: the size of the trees, or the number of terminal nodes, which controls the maximum allowed level of interaction between variables in the model. The weak learners need to have skill but should remain weak, so shorter trees are preferred. In general, tree depths between 4 and 8 work well, and values greater than 10 are unlikely to be required, see [35].
– The minimum number of observations per split: the minimum number of observations needed before a split can be considered. It helps to reduce the prediction variance at the leaves.
• Shrinkage or learning rate: in regularization by shrinkage, each update is scaled by the value of the learning rate parameter "eta" in (0, 1]. Shrinkage reduces the influence of each individual tree and leaves space for future trees to improve the model. As stated in [28], small learning rates improve the model's generalization ability over gradient boosting without shrinkage (eta = 1), but the computational time increases. Moreover, the number of iterations and the learning rate are tightly related: a smaller learning rate "eta" requires a greater N.
• Random sampling: to reduce the correlation between the trees in the sequence, at each step a subsample of the training data is selected without replacement to fit the base learner. This modification, first introduced in [36] under the name stochastic gradient boosting, prevents overfitting. Friedman observed an improvement in gradient boosting's accuracy with samplings of around one half of the training datasets. An alternative to row sampling is column sampling, which indeed prevents overfitting more efficiently, see [37].
• Penalize tree complexity: the complexity of a tree can be defined as a combination of the number of leaves and the L2 norm of the leaf scores. This regularization not only avoids overfitting, it also tends to select simple and predictive models. Following this approach, ref. [37] describes a scalable tree boosting system called XGBoost. In that paper, the objective to be minimized is a combination of the loss function and the complexity of the tree. In contrast to the previous ensemble methods, XGBoost requires a minimal amount of computational resources to solve real-world problems.
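The first three regularization devices above (a capped number of iterations N, shallow trees, shrinkage by eta, and row subsampling) can be seen together in a minimal from-scratch sketch of gradient boosting with squared loss. This is an illustrative toy, not the paper's implementation: depth is restricted to its extreme, depth-1 stumps, and the names `fit_stump`, `boost`, `eta` and `subsample` are our own.

```python
import random

def fit_stump(X, r):
    """Fit a depth-1 regression tree (stump) to residuals r by minimizing SSE."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X)):
            left = [r[i] for i, x in enumerate(X) if x[j] <= t]
            right = [r[i] for i, x in enumerate(X) if x[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left)
                   + sum((v - rm) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    _, j, t, lm, rm = best
    return lambda x: lm if x[j] <= t else rm

def boost(X, y, n_rounds=50, eta=0.1, subsample=0.5, seed=0):
    """Gradient boosting for squared loss: each of the N rounds fits a stump
    to the residuals on a row subsample drawn without replacement, then adds
    it to the ensemble scaled by the learning rate eta."""
    rng = random.Random(seed)
    f0 = sum(y) / len(y)               # constant initial prediction
    pred, trees = [f0] * len(y), []
    for _ in range(n_rounds):
        # negative gradient of the squared loss = current residuals
        resid = [yi - pi for yi, pi in zip(y, pred)]
        idx = rng.sample(range(len(y)), max(2, int(subsample * len(y))))
        stump = fit_stump([X[i] for i in idx], [resid[i] for i in idx])
        trees.append(stump)
        pred = [pi + eta * stump(x) for pi, x in zip(pred, X)]
    return lambda x: f0 + eta * sum(t(x) for t in trees)

# Toy 1-D dataset: the ensemble should drive the training error far below
# the variance of y around its mean.
X = [[float(i)] for i in range(10)]
y = [2.0 * i for i in range(10)]
model = boost(X, y, n_rounds=200, eta=0.1)
```

Lowering `eta` here visibly slows the fit, which is exactly the trade-off noted above: a smaller learning rate requires a greater N.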
In XGBoost, the model is trained in an additive manner and it considers a regularized objective that includes a loss function and penalizes the complexity of the model. Following [37], if we denote by $\hat{y}_i^{(t)}$ the prediction of the $i$-th instance of the response at the $t$-th iteration, we need to find the tree structure $f_t$ that minimizes the following objective:

$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\left( y_i,\; \hat{y}_i^{(t-1)} + f_t(x_i) \right) + \Omega(f_t) \qquad (2)$$
In the first term of (2), $l$ is a differentiable convex loss function that measures the difference between the observed response $y_i$ and the resulting prediction $\hat{y}_i$. The second term of (2) penalizes the complexity of the model, as follows:

$$\Omega(f) = \gamma T + \frac{1}{2} \lambda \|w\|^2 \qquad (3)$$
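The penalty in (3) is simple enough to write out directly. The sketch below (our own illustration, with `gamma` and `lam` standing for γ and λ) evaluates Ω(f) for a tree given only its vector of leaf scores:

```python
def tree_complexity(leaf_weights, gamma=1.0, lam=1.0):
    """Omega(f) = gamma * T + 0.5 * lambda * ||w||^2, as in Eq. (3):
    T is the number of leaves, w the vector of leaf scores."""
    T = len(leaf_weights)
    l2 = sum(w * w for w in leaf_weights)
    return gamma * T + 0.5 * lam * l2

# A 3-leaf tree with scores w = (0.5, -0.3, 0.2):
# Omega = 1.0 * 3 + 0.5 * 1.0 * (0.25 + 0.09 + 0.04)
omega = tree_complexity([0.5, -0.3, 0.2])
```

Larger `gamma` discourages extra leaves, while larger `lam` shrinks the leaf scores themselves; both push the search toward the simpler, more predictive trees described above.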
Short-Term Load Forecasting by Artificial Intelligent Technologies
- Authors: Wei-Chiang Hong, Ming-Wei Li, Guo-Feng Fan
- Publisher: MDPI
- Place: Basel
- Date: 2019
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-03897-583-0
- Dimensions: 17.0 x 24.4 cm
- Pages: 448
- Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
- Category: Computer Science