Page 161 of Short-Term Load Forecasting by Artificial Intelligent Technologies
Energies 2018, 11, 2038
the importance of each predictor in the final forecasting model and might suggest a reduced set
of predictors.
2.2. Random Forest
Random forests are a generalization of bagging. Instead of considering all of the predictors
at each split of the tree, only a random sample of “mtry” predictors is considered each time. The main
advantage of random forests over bagging appears in the case of correlated predictors, as
stated in [28]: predictions from the bagged trees will be highly correlated, so bagging will not
reduce the variance much, whereas random forests overcome this problem by forcing each split to
consider only a subset of the predictors.
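
The split rule described above can be sketched as follows. This is a minimal illustration for a regression tree, assuming a sum-of-squared-errors split criterion and midpoint thresholds; the names `sse` and `best_split` are hypothetical and this is not the code of the “randomForest” package. Setting `mtry` equal to the number of predictors recovers the bagging behaviour.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y, mtry, rng):
    """Best (feature, threshold, sse) among a random sample of mtry features.

    X is a list of rows (lists of numbers), y a list of responses.
    Returns None if none of the sampled features admits a split.
    """
    n_features = len(X[0])
    # Random-forest step: only mtry randomly chosen predictors are examined.
    candidates = rng.sample(range(n_features), mtry)
    best = None
    for j in candidates:
        levels = sorted(set(row[j] for row in X))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            score = sse(left) + sse(right)
            if best is None or score < best[2]:
                best = (j, t, score)
    return best
```

Calling `best_split(X, y, mtry, random.Random(0))` repeatedly with a small `mtry` will often return different features for the same data, which is what decorrelates the trees.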
In the case of random forest, the efficiency of the method depends on a suitable selection of the
number of trees N and the number of predictors mtry tested at each split. Again, the OOB error can be
used for searching for a suitable N as well as a suitable mtry. As with bagging, random forests will not
overfit if we increase N, so the goal is to choose a value that is sufficiently large. The random forest
method that is used in this paper has been implemented through the R package “randomForest”,
see [29].
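
As an illustration of how an OOB error could be computed for a given N, here is a minimal sketch. It assumes the usual definition, in which each observation is predicted only by the base learners whose bootstrap sample did not contain it, and it uses mean squared error. The functions `oob_error` and `fit_mean` are hypothetical names; `fit_mean` is a trivial stand-in for a tree learner.

```python
import random

def oob_error(X, y, fit, n_trees, rng):
    """Mean squared out-of-bag error of a bagged ensemble of n_trees learners.

    fit(X_boot, y_boot) must return a callable model(row) -> prediction.
    """
    n = len(X)
    sums = [0.0] * n   # sum of OOB predictions per observation
    counts = [0] * n   # number of learners for which the observation was OOB
    for _ in range(n_trees):
        in_bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        model = fit([X[i] for i in in_bag], [y[i] for i in in_bag])
        for i in set(range(n)) - set(in_bag):          # out-of-bag observations
            sums[i] += model(X[i])
            counts[i] += 1
    errors = [(y[i] - sums[i] / counts[i]) ** 2 for i in range(n) if counts[i] > 0]
    return sum(errors) / len(errors) if errors else float("nan")

def fit_mean(X_boot, y_boot):
    """Hypothetical base learner: always predicts the bootstrap-sample mean."""
    mean = sum(y_boot) / len(y_boot)
    return lambda row: mean
```

Evaluating `oob_error` over a grid of `n_trees` values (and, with a real tree learner, of mtry values) gives the kind of search the paragraph describes.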
2.3. Conditional Forest
Conditional forests are an implementation of the bagging and random forest ensemble
algorithms that uses conditional inference trees as base learners. Conditional inference trees are
not only suitable for prediction (their partitioning algorithm avoids overfitting), but also for explanation
purposes, because they select variables in an unbiased way. They are especially useful in the presence of
high-order interactions and when the number of predictors is large compared to the sample size.
In conditional forests, each tree is obtained by binary recursive partitioning, as follows (see [30]):
firstly, the algorithm tests whether any predictor is associated with the response, and it chooses the one
that has the strongest association; secondly, the algorithm makes a binary split in this variable; finally,
the previous two steps are repeated for each subset until there are no predictors that are associated
with the response. The first step uses the permutation tests for conditional inference developed in [31].
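
The first step (test for association, then pick the strongest predictor or stop) can be sketched as follows. This is a simplified stand-in, not the procedure of [31] or of the “party” package: it assumes numeric predictors, a Monte Carlo permutation test on the absolute centred cross-product, and a Bonferroni adjustment across predictors. The names `perm_pvalue` and `select_predictor`, and the parameters `alpha` and `n_perm`, are hypothetical.

```python
import random

def perm_pvalue(x, y, n_perm, rng):
    """Monte Carlo permutation p-value for association between x and y.

    Statistic: |sum((x - mean_x) * (y - mean_y))|, recomputed on shuffled y.
    """
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    xc = [xi - mx for xi in x]
    yc = [yi - my for yi in y]

    def stat(resp):
        return abs(sum(a * b for a, b in zip(xc, resp)))

    observed = stat(yc)
    shuffled = yc[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p > 0

def select_predictor(X_cols, y, alpha, n_perm, rng):
    """Index of the predictor with the smallest p-value, or None (stop splitting)
    if its Bonferroni-adjusted p-value is not below alpha."""
    pvals = [perm_pvalue(col, y, n_perm, rng) for col in X_cols]
    best = min(range(len(pvals)), key=pvals.__getitem__)
    adjusted = min(1.0, pvals[best] * len(pvals))
    return best if adjusted < alpha else None
```

Returning `None` corresponds to the stopping rule in the text: no predictor is judged to be associated with the response, so the node is not split further.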
As with random forest, in the case of conditional forest, we need a suitable selection of the number
mtry of features tested at each split (the total number of predictors might be preferred) and the number
of trees N (generally a lower value than for random forest is required). In this paper, the conditional
forest method has been implemented through the R package “party”, see [32].
2.4. Boosting
In contrast to the above ensemble methods, in boosting the N base learners are obtained
sequentially, that is, each base learner is determined while taking into account the successes and errors
of the previous base learners.
The first boosting algorithm was Adaptive Boosting (AdaBoost), as introduced in [33]. Instead
of using bootstrap sampling, the original training sample is weighted at each step, giving more
importance to those observations that produced large errors at previous steps. Besides, the prediction
for a new observation is given by a weighted average (instead of a simple average) of the responses of
the N base learners.
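
The reweighting idea can be sketched with the classic two-class form of AdaBoost, assuming labels in {-1, +1} and weights that sum to 1; regression variants differ in detail and are not shown. The function names `adaboost_round` and `adaboost_predict` are hypothetical.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step for labels in {-1, +1}.

    Returns (alpha, new_weights): alpha is the learner's weight in the final
    combination, and new_weights are normalized to sum to 1.
    """
    # Weighted error rate of the current base learner.
    eps = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    eps = min(max(eps, 1e-12), 1 - 1e-12)  # guard against log(0)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Misclassified observations (t * p == -1) are up-weighted, others down-weighted.
    raw = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(raw)
    return alpha, [r / total for r in raw]

def adaboost_predict(alphas, votes):
    """Sign of the alpha-weighted sum of the base learners' votes for one observation."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1
```

After each round the misclassified observations carry half of the total weight, so the next base learner is pushed towards the cases the previous one got wrong.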
AdaBoost was later recast in a statistical framework as a numerical optimization problem where
the objective is to minimize a loss function using a gradient descent procedure, see [34]. This new
approach was called “gradient boosting”, and it is considered one of the most powerful techniques for
building predictive models.
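
A minimal sketch of the gradient descent view, under stated assumptions: squared-error loss (whose negative gradient is the residual), one-split regression trees on a single numeric feature as base learners, and a `learning_rate` shrinkage parameter that the text does not mention. The names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(x, residuals):
    """Greedy one-split regression tree on a single feature.

    Requires at least two distinct values in x.
    """
    best = None
    levels = sorted(set(x))
    for lo, hi in zip(levels, levels[1:]):
        t = (lo + hi) / 2
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi: lm if xi <= t else rm

def gradient_boost(x, y, n_trees, learning_rate):
    """Return a predictor built by adding n_trees stumps fitted to residuals."""
    f0 = sum(y) / len(y)  # initial constant model
    stumps = []
    pred = [f0] * len(y)
    for _ in range(n_trees):
        # Negative gradient of 0.5 * (y - F)^2 with respect to F is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Additive update: the new tree's output is added to the current model.
        pred = [pi + learning_rate * stump(xi) for pi, xi in zip(pred, x)]
    return lambda xi: f0 + learning_rate * sum(s(xi) for s in stumps)
```

Each round fits the next tree to what the current ensemble still gets wrong, which is the sequential dependence that distinguishes boosting from bagging and random forests.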
Gradient boosting involves three elements: a loss function to be optimized, a weak learner to
make predictions (in this case, decision trees obtained in a greedy manner), and an additive model
to add weak learners (the output for each new tree is added to the output of the existing sequence of
trees). The loss function used depends on the type of problem. For example, a regression problem may
- Title: Short-Term Load Forecasting by Artificial Intelligent Technologies
- Authors: Wei-Chiang Hong, Ming-Wei Li, Guo-Feng Fan
- Publisher: MDPI
- Location: Basel
- Date: 2019
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-03897-583-0
- Dimensions: 17.0 x 24.4 cm
- Pages: 448
- Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
- Category: Computer science