Page 243 of Short-Term Load Forecasting by Artificial Intelligent Technologies

Energies 2018, 11, 1893

• a database (this is the slowest option), so that retrieval can be very quick.

Since we plan to deal with millions of series of thousands of time steps, binary files seemed like a good compromise because they can easily fit on disk, and often also in memory. Our R package uses this format internally, although it accepts input data in any of these three forms. If we were speaking of billions of series of a million time steps or more, then distributed databases would be required. In this case one would only have to fill the database and tell the R package how to access the time series.

The current version is mostly written in R, using the parallel package for efficiency [35]. A partial version written fully in C was slightly faster, but not by enough to justify the loss of code clarity. The current R version can handle the 25 million samples in an overnight computation on a standard desktop workstation, assuming the curves can be stored and accessed quickly. Our implementation, called iecclust, is available as open source software.

7. Forecasting French Electricity Dataset

7.1. Data Presentation

We work on the data provided by EDF, also used in [24], which is composed of big customers equipped with smart meters. Unfortunately, this dataset is confidential and cannot be shared. Nevertheless, we suggest that readers interested in bottom-up electricity consumption forecasting problems refer to the open datasets listed in [3].

The dataset consists of approximately 25,000 half-hourly load consumption series over two years (2009–2010). The first year is used for partitioning and for the calibration of our forecasting algorithm; the second year is then used as a test set to simulate a real forecasting use case.

The initial dataset contains over 25,000 individual load curves. To test the up-scaling ability of our implementation, we create three datasets of sizes 250,000; 2,500,000 and 25,000,000. In other words, we progressively increase the sample size by a factor of 10, 100 and 1000, respectively. The creation follows a simple scheme where each individual curve is multiplied by realizations of independent random variables uniformly distributed on [0.95, 1.05] at each time step.
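
The perturbation scheme just described can be illustrated with a minimal Python sketch. It is not taken from the iecclust package: the function name `upscale`, its parameters, and the toy curves are assumptions made for illustration only. Each curve yields `factor` perturbed copies, so the sample size grows by that factor.

```python
import random


def upscale(curves, factor, low=0.95, high=1.05, seed=0):
    """Return `factor` perturbed copies of every curve in `curves`.

    Hypothetical helper: each copy multiplies the curve, time step by
    time step, by an independent draw from Uniform(low, high).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicated = []
    for curve in curves:
        for _ in range(factor):
            # One independent multiplier per time step.
            replicated.append([value * rng.uniform(low, high) for value in curve])
    return replicated


# Toy data standing in for the confidential load curves (assumption).
base = [[1.0, 2.0, 4.0], [10.0, 10.0, 10.0]]
synthetic = upscale(base, factor=10)
print(len(base), "curves ->", len(synthetic), "curves")
```

With a factor of 10, 100 or 1000, the same call would grow a set of 25,000 curves to the three sizes quoted above; in practice the result would be streamed to disk rather than held in a Python list.
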
Each curve is then replicated using this scheme a number of times equal to the up-scaling factor.

7.2. Numerical Experiments

The first clustering task is crucial for reducing the dimension of the dataset. We give some timings in order to illustrate how our approach can deal with tens of thousands of time series. Of course, the total computation time depends on the technical specifications of the machine used to perform the computation. In our case, we restrict ourselves to a standard scientific workstation with 8 physical cores and 70 Gigabits of RAM. We use all the available cores to cluster chunks of 5000 observations, following the algorithm described in Section 6, for both the first and the second clustering task.

A very simple pretreatment is done in order to eliminate load curves with possible errors. For this, we measure the standard deviation of the contributions of each curve and keep only the central 99% of the observations, eliminating the most extreme ones. In this way, curves that are too flat (possibly constant consumption) or very wiggly are considered abnormal.

Table 1 gives the mean running times over 5 replicates for each of the different sample sizes. These figures show that, with our strategy, the computation time increases linearly with the number of time series. The maximum number of series we treat, that is, 25 million individual curves, needs about 12 h to complete the first clustering task.
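
The pretreatment step in Section 7.2 can be sketched as follows. This is a hedged illustration, not the authors' code: the function name `keep_central` is hypothetical, the standard deviation is computed directly on the values passed in (the paper computes it on each curve's contributions), and the excluded 1% is assumed to be split evenly between the two tails.

```python
import statistics


def keep_central(curves, keep=0.99):
    """Keep the curves whose standard deviation lies in the central
    `keep` fraction, dropping the flattest and the most variable ones.

    Hypothetical helper; the symmetric split of the rejected share
    between the two tails is an assumption.
    """
    spreads = [statistics.pstdev(curve) for curve in curves]
    # Curve indices ordered from flattest to most variable.
    order = sorted(range(len(curves)), key=lambda i: spreads[i])
    n_drop = round(len(curves) * (1 - keep) / 2)  # curves removed per tail
    kept = sorted(order[n_drop:len(order) - n_drop])  # restore input order
    return [curves[i] for i in kept]


# Toy data (assumption): curve i has standard deviation i / 2.
toy = [[0.0, float(i)] for i in range(200)]
clean = keep_central(toy)
print(len(toy), "curves ->", len(clean), "curves kept")
```

On the toy data, the constant curve and the most variable curve are the two that get dropped, which mirrors the intent of treating overly flat or very wiggly curves as abnormal.
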

Title: Short-Term Load Forecasting by Artificial Intelligent Technologies
Authors: Wei-Chiang Hong; Ming-Wei Li; Guo-Feng Fan
Publisher: MDPI
Location: Basel
Date: 2019
Language: English
License: CC BY 4.0
ISBN: 978-3-03897-583-0
Dimensions: 17.0 x 24.4 cm
Pages: 448
Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
Category: Computer science