Page 243 of Short-Term Load Forecasting by Artificial Intelligent Technologies
Energies 2018, 11, 1893
• a database (this is the slowest option), so that retrieval can be very quick.
Since we plan to deal with millions of series of thousands of time steps, binary files seemed like a
good compromise because they can easily fit on disk, and often in memory as well. Our R package uses
this format internally, although it accepts input data in any of these three shapes. If we were dealing
with billions of series of a million time steps or more, then distributed databases would be required.
In this case one would only have to fill the database and tell the R package how to access the time series.
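To make the binary-file option concrete, here is a minimal Python sketch of fixed-length records with random access. It is not the format used by the R package, which this section does not describe; the layout (little-endian 32-bit floats, series stored back to back) and the names `write_series` and `read_series` are assumptions chosen for illustration.

```python
import os
import struct
import tempfile


def write_series(path, series_list):
    """Write equal-length series back to back as little-endian 32-bit floats.

    The record layout is an assumption made for this sketch.
    """
    n_steps = len(series_list[0])
    with open(path, "wb") as f:
        for series in series_list:
            if len(series) != n_steps:
                raise ValueError("all series must have the same length")
            f.write(struct.pack(f"<{n_steps}f", *series))
    return n_steps


def read_series(path, index, n_steps):
    """Read the series at position `index` without loading the whole file."""
    record_size = 4 * n_steps  # 4 bytes per 32-bit float
    with open(path, "rb") as f:
        f.seek(index * record_size)
        raw = f.read(record_size)
    if len(raw) != record_size:
        raise IndexError("series index out of range")
    return list(struct.unpack(f"<{n_steps}f", raw))


# Usage: store three short series, then fetch only the second one.
demo_path = os.path.join(tempfile.mkdtemp(), "curves.bin")
demo_steps = write_series(demo_path, [[1.0, 2.0, 3.0], [0.5, 1.5, 2.5], [4.0, 5.0, 6.0]])
second = read_series(demo_path, 1, demo_steps)
```

Because every record has the same size, the position of any series follows directly from its index, which is what keeps retrieval cheap even when the file does not fit in memory.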
The current version is mostly written in R, using the parallel package for efficiency [35]. A partial
version written fully in C was slightly faster, but not by enough to justify the loss of code clarity.
The current R version can handle the 25 million samples in an overnight computation on a standard
desktop workstation, assuming the curves can be stored and accessed quickly. Our implementation,
called iecclust, is available as open source software.
7. Forecasting French Electricity Dataset
7.1. Data Presentation
We work on data provided by EDF, also used in [24], which cover large customers
equipped with smart meters. Unfortunately, this dataset is confidential and cannot be shared.
Nevertheless, we suggest that readers interested in bottom-up electricity consumption forecasting
problems refer to the open datasets listed in [3].
The dataset consists of approximately 25,000 half-hourly load consumption series over two years
(2009–2010). The first year is used for partitioning and for calibrating our forecasting algorithm;
the second year is then used as a test set to simulate a real forecasting use case.
The initial dataset contains over 25,000 individual load curves. To test the up-scaling ability of
our implementation, we create three datasets of sizes 250,000; 2,500,000 and 25,000,000. In other words,
we progressively increase the sample size by a factor of 10, 100 and 1000, respectively. The creation
follows a simple scheme where each individual curve is multiplied by the realization of independent
variables uniformly distributed on [0.95, 1.05] at each time step. Each curve is replicated using
this scheme a number of times equal to the up-scaling factor.
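The replication scheme above can be sketched in a few lines of Python. This is an illustrative reading of the description, not the authors' code: the function names `replicate_curve` and `upscale_dataset`, the fixed seed, and the toy input values are all assumptions.

```python
import random


def replicate_curve(curve, factor, rng, low=0.95, high=1.05):
    """Return `factor` perturbed copies of `curve`.

    Every value of every copy is multiplied by an independent draw
    from the uniform distribution on [low, high].
    """
    return [
        [value * rng.uniform(low, high) for value in curve]
        for _ in range(factor)
    ]


def upscale_dataset(curves, factor, seed=0):
    """Build a dataset `factor` times larger than `curves`."""
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    upscaled = []
    for curve in curves:
        upscaled.extend(replicate_curve(curve, factor, rng))
    return upscaled


# Usage: two toy curves scaled up by a factor of 10 give 20 curves.
toy_curves = [[100.0, 200.0, 150.0], [50.0, 80.0, 65.0]]
toy_upscaled = upscale_dataset(toy_curves, factor=10)
```

Each generated curve stays within 5% of its source at every time step, so the enlarged datasets keep the shape of the original curves while no two copies are identical.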
7.2. Numerical Experiments
The first clustering task is crucial for reducing the dimension of the dataset. We give some timings
in order to illustrate how our approach can deal with tens of thousands of time series. Of course,
the total computation time depends on the technical specification of the machine used to perform the
computation. In our case, we restrict ourselves to a standard scientific workstation with 8 physical cores
and 70 Gigabits of RAM. We use all the available cores to cluster chunks of 5000 observations,
following the algorithm described in Section 6, for both the first and the second clustering task.
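The chunking strategy can be illustrated with a short Python sketch. It shows only how observations might be split into chunks of 5000 and dispatched to worker processes; the clustering itself (Section 6) is not reproduced, and `cluster_chunk` is a hypothetical user-supplied function standing in for it. The function names and the `workers` parameter are likewise assumptions.

```python
from concurrent.futures import ProcessPoolExecutor


def split_into_chunks(items, chunk_size=5000):
    """Split a list into consecutive chunks of at most `chunk_size` items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunks(items, cluster_chunk, chunk_size=5000, workers=None):
    """Apply `cluster_chunk` to every chunk and collect the results in order.

    `cluster_chunk` is a hypothetical stand-in for the clustering step.
    For parallel runs it must be defined at module level so that worker
    processes can locate it.
    """
    chunks = split_into_chunks(items, chunk_size)
    if workers == 1:
        # Serial path, convenient for debugging.
        return [cluster_chunk(chunk) for chunk in chunks]
    # With workers=None the pool size defaults to the processor count
    # reported by the system.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cluster_chunk, chunks))


# Usage (serial): 12,000 observations yield chunks of 5000, 5000 and 2000.
chunk_sizes = process_chunks(list(range(12000)), len, workers=1)
```

Keeping the chunk size fixed means memory use per worker stays bounded regardless of how many series the full dataset contains.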
A very simple pretreatment is applied in order to eliminate load curves with possible errors. For this,
we measure the standard deviation of the contributions of each curve and keep only the central 99% of
observations, eliminating the most extreme ones. With this, curves that are too flat (possibly constant
consumption) or very wiggly are considered abnormal.
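A minimal Python sketch of such a filter follows. Several points are assumptions rather than details given in the text: each curve is represented by a plain list of numbers (standing in for its contributions), the spread is the population standard deviation, and the 1% removed is split evenly between the flattest and the most variable curves. The name `keep_central_curves` is hypothetical.

```python
import statistics


def keep_central_curves(features, central=0.99):
    """Return the sorted indices of curves whose spread is not extreme.

    `features[i]` is the list of values describing curve i. Curves are
    ranked by standard deviation, and an equal share of the lowest and
    highest ranks is dropped (an assumed, symmetric trimming rule).
    """
    spreads = [statistics.pstdev(values) for values in features]
    order = sorted(range(len(spreads)), key=lambda i: spreads[i])
    tail = (1.0 - central) / 2.0
    n_drop = int(round(len(order) * tail))  # curves removed at each end
    kept = order[n_drop:len(order) - n_drop]
    return sorted(kept)


# Usage: among 200 toy curves, one constant curve and one very erratic
# curve are removed, and the remaining 198 are kept.
toy_features = [[3.0, 3.0], [0.0, 100.0]]
toy_features += [[0.0, 1.0 + 0.01 * i] for i in range(2, 200)]
toy_kept = keep_central_curves(toy_features)
```

Ranking by spread handles both failure modes with one statistic: a constant curve has zero standard deviation, while an erratic one sits at the top of the ranking.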
Table 1 gives the mean running times over 5 replicates for each of the different sample sizes.
These figures show that our strategy yields a linear increase in computation time with
respect to the number of time series. The largest number of series we treat, that is, 25 million
individual curves, needs about 12 h to complete the first clustering task.
- Title: Short-Term Load Forecasting by Artificial Intelligent Technologies
- Authors: Wei-Chiang Hong, Ming-Wei Li, Guo-Feng Fan
- Publisher: MDPI
- Location: Basel
- Date: 2019
- Language: English
- License: CC BY 4.0
- ISBN: 978-3-03897-583-0
- Dimensions: 17.0 x 24.4 cm
- Pages: 448
- Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
- Category: Computer Science