Page 242 of Short-Term Load Forecasting by Artificial Intelligent Technologies

Energies 2018, 11, 1893

for larger sizes (tens of millions). Of course, all these considerations depend heavily on the specific hardware and technology. Recall that our interest is in relatively standard scientific workstations.

The algorithm we use in the first step of the clustering is described below. We then show the results of profiling our whole strategy to highlight where the bottlenecks are when one wishes to scale up the method. We end this section by discussing the solutions we propose.

6.1. Algorithm Description

The massive dataset clustering algorithm is as follows:

1. Data serialization. Time series are given in a verbose by-column format. We re-code all of them in a binary file (if suitable), or a database.
2. Dimensionality reduction. Each series of length N is replaced by the log2(N) energetic coefficients defined using a wavelet basis. Optionally, a feature selection step can be performed to further reduce the number of features.
3. Chunking. Data is chunked into groups of size at most nc, where nc is a user parameter (we use nc = 5000 in the experiments of the next section).
4. Clustering. Within each group, the PAM clustering algorithm is run to obtain K0 clusters.
5. Gathering. A final run of PAM is performed to obtain K′ medoids, K′ ≪ n, out of the nc × K0 medoids obtained on the chunks.

From these K′ medoids, the synchrone curves are computed (i.e., the sum of all curves within each group for each time step) and given as output for the prediction step.

6.2. Code Profiling

Figure 9 gives some timings obtained by profiling the runs of our initial (C) code. To give a clearer insight, we also report the size of the objects we deal with. The starting point is the ensemble of individual records of electricity demand for a whole year. Here, we treat over 25,000 clients sampled half-hourly during a year. Tabulating these data to obtain a matrix representation that fits in memory takes about 7 min and requires over 30 GB of memory.

Task                     Time     Memory   Disk
Raw (15 GB) to matrix    7 min    30 GB    2.7 GB
Compute contributions    7 min    <1 GB    7 MB
1st stage clustering     3 min    <1 GB    –
Aggregation              1 min    6 GB     30 MB
WER distance matrix      40 min   64 GB    150 KB
Forecasts                10 min   <1 GB    –

Figure 9. Code profiling by tasks.

6.3. Proposed Solutions

Two main solutions are discussed, concerning the internal data storage strategy and the use of a simple parallelization scheme. The former aims to reduce the communication time of internal operations by using serialization. The latter attacks the major bottleneck of our clustering approach, namely the construction of the WER dissimilarity matrix.

The initial format (verbose, by-column) is clearly inappropriate for efficient data processing. There are several options starting from this data format; they imply having all series stored as

• an ASCII file, one sample per line; very fast, but data retrieval will depend on line number;
• a binary format (3 or 4 octets per value); compression is inadvisable since it would increase both preprocessing time and (by a large amount) reading times;
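The binary option can be illustrated with a minimal sketch. It assumes 4-octet little-endian floats and a common length for all series; the function names and the file layout are hypothetical and are not taken from the authors' code. With fixed-width records, the series at a given position starts at a known byte offset, so it can be read with a single seek rather than by scanning lines.

```python
import os
import struct
import tempfile

BYTES_PER_VALUE = 4  # assumption: 4-octet IEEE 754 floats, little-endian


def write_series(path, series_list):
    """Write equal-length series back to back as raw 4-octet floats."""
    length = len(series_list[0])
    with open(path, "wb") as f:
        for series in series_list:
            if len(series) != length:
                raise ValueError("all series must have the same length")
            f.write(struct.pack(f"<{length}f", *series))
    return length


def read_series(path, index, length):
    """Read the series at position `index` by seeking to its byte offset."""
    record_size = length * BYTES_PER_VALUE
    with open(path, "rb") as f:
        f.seek(index * record_size)
        data = f.read(record_size)
    if len(data) != record_size:
        raise IndexError("series index out of range")
    return list(struct.unpack(f"<{length}f", data))


def demo():
    """Store three toy curves, then fetch the second one directly."""
    curves = [
        [0.5, 1.0, 1.5, 2.0],
        [3.0, 2.5, 2.0, 1.5],
        [0.25, 0.25, 0.5, 0.75],
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "series.bin")
        length = write_series(path, curves)
        size = os.path.getsize(path)
        second = read_series(path, 1, length)
    return size, second
```

Calling `demo()` writes three series of four values each (48 octets in total) and reads the second series back without touching the first. Values that are not exactly representable in 4 octets would come back rounded, which is the precision trade-off implied by choosing 3 or 4 octets per value.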
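Steps 2 to 5 of the algorithm in Section 6.1, together with the computation of the synchrone curves, can be sketched as follows. This is only an illustration under stated assumptions: the Haar basis is chosen for simplicity, the features are raw energies per scale, and a simple alternating k-medoids stands in for PAM (it is not PAM's swap search). All function names are hypothetical.

```python
import math
import random


def haar_energies(series):
    """Return log2(N) energies, one per Haar detail scale (N a power of two)."""
    n = len(series)
    if n < 2 or n & (n - 1):
        raise ValueError("series length must be a power of two")
    approx, energies = list(series), []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in details))
    return energies


def dist(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def k_medoids(points, k, rng, max_iter=50):
    """Alternating k-medoids: a simple stand-in for PAM, not PAM itself."""
    k = min(k, len(points))
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        # Each medoid starts in its own cluster, so no cluster is ever empty.
        clusters = {m: [m] for m in medoids}
        for i, p in enumerate(points):
            if i in clusters:
                continue
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # New medoid of a cluster: the member closest to all other members.
        new_medoids = [
            min(members, key=lambda c: sum(dist(points[c], points[j]) for j in members))
            for members in clusters.values()
        ]
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return [points[m] for m in medoids]


def two_stage_medoids(features, nc, k0, k_final, seed=0):
    """Cluster chunks of at most nc series, then cluster the chunk medoids."""
    rng = random.Random(seed)
    candidates = []
    for start in range(0, len(features), nc):
        candidates.extend(k_medoids(features[start:start + nc], k0, rng))
    return k_medoids(candidates, k_final, rng)


def synchrones(curves, features, medoids):
    """Sum, per time step, the curves assigned to each final medoid."""
    sums = [[0.0] * len(curves[0]) for _ in medoids]
    for curve, feat in zip(curves, features):
        g = min(range(len(medoids)), key=lambda j: dist(feat, medoids[j]))
        for t, value in enumerate(curve):
            sums[g][t] += value
    return sums
```

Because each chunk is clustered independently, only nc feature vectors need to be compared at any one time in the first stage, and the final run operates on the much smaller set of chunk medoids. The synchrone curves always add up to the total load at each time step, whatever the grouping.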

Title: Short-Term Load Forecasting by Artificial Intelligent Technologies
Authors: Wei-Chiang Hong; Ming-Wei Li; Guo-Feng Fan
Publisher: MDPI
Place: Basel
Date: 2019
Language: English
License: CC BY 4.0
ISBN: 978-3-03897-583-0
Dimensions: 17.0 x 24.4 cm
Pages: 448
Keywords: Scheduling Problems in Logistics, Transport, Timetabling, Sports, Healthcare, Engineering, Energy Management
Category: Computer science