Intelligent Environments 2019 - Workshop Proceedings of the 15th International Conference on Intelligent Environments
4.2. Unstructured Data Preprocessing

4.2.1. Sentence Segmentation

To judge whether a sentence contains words that appear in the emotional dictionary, we need to cut the sentence accurately into words, that is, to segment it automatically. After comparing the existing word segmentation tools and considering accuracy and ease of use on the Python platform, we chose the Jieba Chinese word segmentation tool [7]. Examples of word segmentation results are shown in Table 4.

4.2.2. Word Vectorization

After sentence segmentation, Word2Vec [8] is used to produce high-dimensional vectors (word embeddings) that represent the words, converting the samples into sequences of word vectors. In the experiment, we perform word vectorization by calling gensim.models.word2vec, which takes the news text as input and produces the word vectors as output. A sentence vector is the average of the vectors of all the words it contains. In this procedure, the whole text corpus was mapped into a 300-dimensional vector space in which similar words lie closer together than dissimilar ones.

Table 4. Sentence Segmentation Examples.

Date        Segmentation Example
2018/3/15   US dollar against the Canadian dollar rose above 1.3044 the highest in the last eight months
2018/3/16   Offshore Renminbi (CNH) was quoted at 6.3293 yuan against a US dollar at 04:59 Beijing time

4.3. Structured Data Preprocessing

4.3.1. Imputation

The structured dataset in this paper contains some missing values. A common practice is to delete every row and column that contains a missing value, but this easily loses important features. Imputation therefore plays a significant role in the preprocessing stage. This paper fills in the missing values with the pandas.DataFrame.fillna function, using the pad mode (filling a missing value with the previous non-missing value) and the bfill mode (filling a missing value with the next non-missing value).

4.3.2. Data Normalization

Normalization is the standardized processing of all structured data to eliminate the dimensional impact between various indicators. Its purpose is to bring the original data of the indicators to the same order of magnitude for comprehensive comparative evaluation. In this paper, the Min-Max normalization method is used to linearly transform the original data so that all values are mapped into the interval [0, 1]. This

Y. Du et al. / Predicting the Interbank Capital Adequacy Level Based on Financial Data Analysis
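The averaging step described in Section 4.2.2 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name sentence_vector, the toy three-dimensional vectors, and the handling of out-of-vocabulary words (skipped) and of sentences with no known words (zero vector) are all assumptions. The paper itself uses 300-dimensional vectors trained with gensim.

```python
import numpy as np

def sentence_vector(tokens, word_vectors, dim):
    """Average the vectors of the tokens found in word_vectors.

    word_vectors is any mapping from word to 1-D array (hypothetical
    stand-in for a trained embedding table). Unknown words are skipped,
    which is an assumption, not something the paper specifies.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        # Assumption: represent a sentence with no known words as zeros.
        return np.zeros(dim)
    return np.mean(known, axis=0)

# Toy embeddings, invented purely for illustration.
toy_vectors = {
    "dollar": np.array([1.0, 0.0, 2.0]),
    "rose": np.array([3.0, 2.0, 0.0]),
}

vec = sentence_vector(["dollar", "rose", "unseen"], toy_vectors, dim=3)
empty = sentence_vector(["unseen"], toy_vectors, dim=3)
```

With a trained gensim model, the keyed vectors object could probably be passed in place of the toy dictionary, since it supports membership tests and lookup by word; that substitution is untested here.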
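The two structured-data steps of Sections 4.3.1 and 4.3.2 can be sketched together as below. Several things are assumptions rather than details from the paper: the helper name impute_and_normalize, the toy column names and values, the order of filling (forward first, then backward), and the treatment of constant columns. Recent pandas releases deprecate the method argument of fillna, so the sketch uses DataFrame.ffill and DataFrame.bfill, which perform the same pad and bfill fills. Min-Max normalization maps each value x in a column to (x - min) / (max - min).

```python
import pandas as pd

def impute_and_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing values, then apply column-wise Min-Max scaling."""
    # "pad" mode: carry the previous non-missing value forward.
    filled = df.ffill()
    # "bfill" mode: fill what is still missing (leading gaps) with the
    # next non-missing value. The fill order is an assumption.
    filled = filled.bfill()

    col_min = filled.min()
    col_max = filled.max()
    # Assumption: a constant column would divide by zero, so its span
    # is replaced by 1 and the column maps to 0.
    span = (col_max - col_min).replace(0, 1)
    return (filled - col_min) / span

# Hypothetical toy data, not taken from the paper's dataset.
raw = pd.DataFrame({
    "rate": [float("nan"), 2.0, float("nan"), 4.0],
    "volume": [10.0, float("nan"), 30.0, float("nan")],
})

scaled = impute_and_normalize(raw)
```

In this toy frame, the leading gap in rate can only be filled by the backward pass, which is why both modes are applied.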
Title: Intelligent Environments 2019
Subtitle: Workshop Proceedings of the 15th International Conference on Intelligent Environments
Authors: Andrés Muñoz, Sofia Ouhbi, Wolfgang Minker, Loubna Echabbi, Miguel Navarro-Cía
Publisher: IOS Press BV
Date: 2019
Language: German
License: CC BY-NC 4.0
ISBN: 978-1-61499-983-6
Dimensions: 16.0 x 24.0 cm
Pages: 416
Category: Conference proceedings