Page 161 of Short-Term Load Forecasting by Artificial Intelligent Technologies

Energies 2018, 11, 2038

the importance of each predictor in the final forecasting model and might suggest a reduced set of predictors.

2.2. Random Forest

Random forests are a generalization of bagging. Instead of considering all of the predictors at each split of the tree, only a random sample of “mtry” predictors is considered each time. The main advantage of random forests over bagging appears in the case of correlated predictors, as stated in [28]: predictions from the bagged trees will be highly correlated, so bagging will not reduce the variance much, whereas random forests overcome this problem by forcing each split to consider only a subset of the predictors.

In the case of random forest, the efficiency of the method depends on a suitable selection of the number of trees N and the number of predictors mtry tested at each split. Again, the out-of-bag (OOB) error can be used to search for a suitable N as well as a suitable mtry. As with bagging, random forests will not overfit if we increase N, so the goal is to choose a value that is sufficiently large.

The random forest method used in this paper has been implemented through the R package “randomForest”, see [29].

2.3. Conditional Forest

Conditional forests are an implementation of the bagging and random forest ensemble algorithms that uses conditional inference trees as base learners. Conditional inference trees are suitable not only for prediction (their partitioning algorithm avoids overfitting), but also for explanation purposes, because they select variables in an unbiased way. They are especially useful in the presence of high-order interactions and when the number of predictors is large compared to the sample size.

In conditional forests, each tree is obtained by binary recursive partitioning, as follows (see [30]): firstly, the algorithm tests whether any predictor is associated with the response, and it chooses the one that has the strongest association; secondly, the algorithm makes a binary split in this variable; finally, the previous two steps are repeated for each subset until there are no predictors that are associated with the response.
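
The three partitioning steps above can be illustrated with a minimal Python sketch. It is not the algorithm of the R package “party”: the association test here is a plain Monte Carlo permutation test on a covariance-type statistic with a Bonferroni adjustment, and the split point is chosen by least squares, which is a simplification. All names (`permutation_pvalue`, `best_threshold`, `grow`) and the parameters `n_perm`, `alpha` and `min_size` are hypothetical choices made for illustration.

```python
import random


def centered(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def association(xc, y):
    """Absolute covariance-type statistic; xc is the centred predictor."""
    return abs(sum(a * b for a, b in zip(xc, y)))


def permutation_pvalue(x, y, rng, n_perm=500):
    """Monte Carlo permutation p-value for 'x is independent of y'."""
    xc = centered(x)
    observed = association(xc, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if association(xc, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def sse(values):
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def best_threshold(x, y):
    """Least-squares split point (assumption: simpler than a test-based split)."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [b for a, b in zip(x, y) if a <= t]
        right = [b for a, b in zip(x, y) if a > t]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, t)
    return None if best is None else best[1]


def grow(rows, y, rng, alpha=0.05, min_size=10):
    """rows: list of predictor tuples; y: list of responses. Returns nested dicts."""
    leaf = {"prediction": sum(y) / len(y), "n": len(y)}
    if len(y) < min_size:
        return leaf
    k = len(rows[0])
    columns = [[r[j] for r in rows] for j in range(k)]
    # Step 1: test every predictor and keep the one with the strongest association.
    pvalues = [permutation_pvalue(col, y, rng) for col in columns]
    j = min(range(k), key=pvalues.__getitem__)
    # Stop when no predictor is significantly associated (Bonferroni-adjusted).
    if pvalues[j] * k > alpha:
        return leaf
    # Step 2: binary split in the selected variable.
    t = best_threshold(columns[j], y)
    if t is None:
        return leaf
    left = [i for i, r in enumerate(rows) if r[j] <= t]
    right = [i for i, r in enumerate(rows) if r[j] > t]
    # Step 3: repeat the two previous steps on each subset.
    return {
        "feature": j,
        "threshold": t,
        "left": grow([rows[i] for i in left], [y[i] for i in left], rng, alpha, min_size),
        "right": grow([rows[i] for i in right], [y[i] for i in right], rng, alpha, min_size),
    }
```

Calling `grow(rows, y, random.Random(1))` on a small data set returns a nested dictionary whose leaves hold the mean response of their subset. Because the smallest attainable p-value is 1/(n_perm + 1), this sketch can only split when the number of predictors is small relative to `n_perm`.
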
The first step uses the permutation tests for conditional inference developed in [31].

As with random forest, in the case of conditional forest we need a suitable selection of the number mtry of features tested at each split (the total number of predictors might be preferred) and the number of trees N (generally a lower value than for random forest is required). In this paper, the conditional forest method has been implemented through the R package “party”, see [32].

2.4. Boosting

In contrast to the above ensemble methods, in boosting the N base learners are obtained sequentially, that is, each base learner is determined while taking into account the successes and errors of the previous base learners.

The first boosting algorithm was Adaptive Boosting (AdaBoost), as introduced in [33]. Instead of using bootstrap sampling, the original training sample is weighted at each step, giving more importance to those observations that produced large errors at previous steps. In addition, the prediction for a new observation is given by a weighted average (instead of a simple average) of the responses of the N base learners.

AdaBoost was later recast in a statistical framework as a numerical optimization problem where the objective is to minimize a loss function using a gradient descent procedure, see [34]. This new approach was called “gradient boosting”, and it is considered one of the most powerful techniques for building predictive models.

Gradient boosting involves three elements: a loss function to be optimized, a weak learner to make predictions (in this case, decision trees obtained in a greedy manner), and an additive model to add weak learners (the output for each new tree is added to the output of the existing sequence of trees). The loss function used depends on the type of problem. For example, a regression problem may
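
To make the three elements of gradient boosting concrete, the following Python sketch may help. It assumes a squared-error loss, a single predictor, and one-split trees (stumps) as weak learners. The shrinkage factor `learning_rate` is a common addition that the text above does not mention, and the function names are hypothetical. It is an illustration under these assumptions, not the implementation used in the paper.

```python
def avg(values):
    return sum(values) / len(values)


def fit_stump(x, targets):
    """Weak learner: a one-split regression tree found by greedy search."""
    best = None
    for t in sorted(set(x))[:-1]:
        left = [r for a, r in zip(x, targets) if a <= t]
        right = [r for a, r in zip(x, targets) if a > t]
        lm, rm = avg(left), avg(right)
        score = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or score < best[0]:
            best = (score, t, lm, rm)
    if best is None:  # x has a single distinct value: fall back to a constant
        constant = avg(targets)
        return lambda a: constant
    _, t, lm, rm = best
    return lambda a: lm if a <= t else rm


def gradient_boost(x, y, n_trees=50, learning_rate=0.1):
    """Additive model: each new stump is added to the existing sequence."""
    base = avg(y)
    stumps = []
    current = [base] * len(y)
    for _ in range(n_trees):
        # Loss function: for 0.5 * (y - F)^2 the negative gradient is the residual.
        residuals = [yi - fi for yi, fi in zip(y, current)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        current = [fi + learning_rate * stump(a) for fi, a in zip(current, x)]

    def predict(a):
        return base + learning_rate * sum(s(a) for s in stumps)

    return predict
```

For example, `predict = gradient_boost(x, y)` returns a function that can be evaluated at new predictor values. With a different loss function, only the line that computes `residuals` (the negative gradient) would change.
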

Title: Short-Term Load Forecasting by Artificial Intelligent Technologies
Authors: Wei-Chiang Hong; Ming-Wei Li; Guo-Feng Fan
Publisher: MDPI
Place: Basel
Date: 2019
Language: English
License: CC BY 4.0
ISBN: 978-3-03897-583-0
Dimensions: 17.0 x 24.4 cm
Pages: 448
Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
Category: Computer Science