23 dummies for the hour of the day, six dummies for the day of the week, 11 dummies for the month of the year, five dummies for special days (FH1, ..., FH5), two predictors of historic temperatures (lags 48 h and 72 h), and six predictors of historic loads (lags 48 h, 72 h, 96 h, 120 h, 144 h, and 168 h).
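As a minimal sketch, assuming a hypothetical hourly data frame df with columns datetime, load, and temp, such a predictor matrix could be assembled as follows (the special-day dummies FH1, ..., FH5 would additionally require a holiday calendar, which is not shown):

## Sketch of the predictor set described above; df is a hypothetical
## hourly data frame with columns datetime, load and temp.
build_predictors <- function(df) {
  hour  <- factor(format(df$datetime, "%H"))   # 24 levels -> 23 dummies
  wday  <- factor(weekdays(df$datetime))       # 7 levels  -> 6 dummies
  month <- factor(format(df$datetime, "%m"))   # 12 levels -> 11 dummies
  X <- model.matrix(~ hour + wday + month)[, -1]     # drop the intercept

  lagv <- function(x, k) c(rep(NA, k), head(x, -k)) # k-hour lag
  temp_lags <- sapply(c(48, 72), function(k) lagv(df$temp, k))
  load_lags <- sapply(c(48, 72, 96, 120, 144, 168),
                      function(k) lagv(df$load, k))
  colnames(temp_lags) <- paste0("temp_lag", c(48, 72))
  colnames(load_lags) <- paste0("load_lag", c(48, 72, 96, 120, 144, 168))
  cbind(X, temp_lags, load_lags)
}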
For each ensemble method, parameter selection has been carried out and measures of variable importance have been obtained (see Table 3 for the meaning of each term). In order to have reproducible models and comparable results, the same seed was used in all procedures that require random sampling. In the case of bagging and random forest, we selected an optimal number of trees (ntree) through the OOB error estimate and ordered the predictors according to the node impurity importance measure, see [28]. For bagging, the number of predictors considered at each split must equal the total number of predictors, whereas in the case of random forest the optimal value has been selected using the OOB error estimate for different values of mtry. In the case of conditional forest, the conditional variable importance measure introduced in [40] has been considered, which better reflects the true impact of each predictor in the presence of correlated predictors.
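A minimal sketch of this tuning scheme, using the randomForest and party R packages (the data frame train with response load, and the mtry grid, are illustrative assumptions, not the authors' code):

library(randomForest)   # bagging and random forest
library(party)          # conditional forest

set.seed(1)             # same seed wherever random sampling occurs
p <- ncol(train) - 1    # total number of predictors (response is load)

## Bagging is a random forest whose mtry equals all p predictors;
## ntree is chosen by inspecting the OOB error curve.
bag <- randomForest(load ~ ., data = train, mtry = p, ntree = 500)
plot(bag)               # OOB MSE versus number of trees

## Random forest: scan mtry and keep the value with the lowest OOB error.
oob <- sapply(c(5, 10, 15, 20), function(m)
  tail(randomForest(load ~ ., data = train, mtry = m, ntree = 500)$mse, 1))

## Node impurity importance [28] for the fitted forest.
rf <- randomForest(load ~ ., data = train, mtry = 10, ntree = 500,
                   importance = TRUE)
importance(rf, type = 2)          # type 2 = increase in node purity

## Conditional forest with the conditional importance measure of [40].
cf <- cforest(load ~ ., data = train,
              controls = cforest_unbiased(ntree = 500, mtry = 10))
varimp(cf, conditional = TRUE)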
While in bagging and random forest the OOB error was used to tune the parameters, in the case of conditional forest and XGBoost the parameters were tuned by means of cross validation with five folds (approximately one year in each fold). For conditional forest, only two parameters need to be tuned (ntree and mtry), but in XGBoost there are more parameters to tune. Although one could apply cross validation over a multi-dimensional grid with all of the parameters to tune, this approach would imply a high computational cost, so we simplified the search by setting subsample = 0.5 and max_depth = 6 (appropriate in most problems) and looking for a good combination of eta and nrounds, see Table 4. The rest of the parameters of the method were left at their default values, according to the R package [38]. In the case of XGBoost, features have been ordered by decreasing importance using the gain measure defined in [36].
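As a minimal sketch of this simplified search with xgb.cv from the xgboost R package [38] (the predictor matrix X, response y, and the candidate pairs are illustrative assumptions in the spirit of Table 4):

library(xgboost)

set.seed(1)
dtrain <- xgb.DMatrix(data = X, label = y)   # hypothetical predictors/load

## Matched (eta, nrounds) pairs: a lower eta needs more iterations.
grid <- data.frame(eta = c(0.01, 0.02, 0.05),
                   nrounds = c(5700, 3400, 1500))

cv_rmse <- mapply(function(eta, nrounds) {
  ## nfold = 5 draws random folds; the paper's year-long blocks would
  ## instead be passed as index lists via the folds argument.
  cv <- xgb.cv(params = list(eta = eta, max_depth = 6, subsample = 0.5,
                             objective = "reg:squarederror"),
               data = dtrain, nrounds = nrounds, nfold = 5, verbose = 0)
  tail(cv$evaluation_log$test_rmse_mean, 1)  # CV RMSE at the last round
}, grid$eta, grid$nrounds)

best <- grid[which.min(cv_rmse), ]

## Final fit; features are then ordered by the gain measure [36].
fit <- xgboost(params = list(eta = best$eta, max_depth = 6, subsample = 0.5,
                             objective = "reg:squarederror"),
               data = dtrain, nrounds = best$nrounds, verbose = 0)
xgb.importance(model = fit)       # sorted by decreasing Gain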
Table 3. Notation.

Term            Description
ntree (N)       Number of trees or iterations in bagging, random forest and conditional forest
mtry            Number of predictors considered at each split in bagging, random forest and conditional forest
node impurity   Importance measure in random forest
max_depth       Maximum depth of a tree
subsample       Subsample ratio of the training instances
eta             Shrinkage or learning rate
nrounds         Number of boosting iterations
gain            Fractional contribution of each feature to the model
Table 4 shows the results of the parameter selection for the XGBoost method. Recall that a lower learning rate eta implies a greater number of iterations nrounds, but a too large nrounds can lead to overfitting. The combination (eta = 0.02, nrounds = 3400) provided the lowest RMSE and the highest R-squared scores for the test data, whereas (eta = 0.01, nrounds = 5700) got the lowest MAPE. However, any pair of parameters in Table 4 could be appropriate because they lead to similar accuracy.
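For reference, the three scores compared in Table 4 can be computed as follows, where y and yhat denote hypothetical vectors of observed and predicted test loads:

rmse <- sqrt(mean((y - yhat)^2))                      # root mean squared error
mape <- 100 * mean(abs((y - yhat) / y))               # mean absolute percentage error
r2   <- 1 - sum((y - yhat)^2) / sum((y - mean(y))^2)  # R-squared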