Page - 43 - in Intelligent Environments 2019 - Workshop Proceedings of the 15th International Conference on Intelligent Environments

larger the prediction time window (which is used for testing), the more difficult the prediction and the lower the accuracy. The length of the time window is generally chosen based on experience and actual computing needs. In this paper, the backtracking time window is set to 28 days and the prediction time window to 7 days. That is, the known information from the preceding 28 days is used to predict the target value of the following 7 days.

In the experiment, SVM combines text data, represented as word vectors, with structured data to predict the target. The GBDT algorithm can capture the context of a word to some extent. For example, when identifying whether a news text will lead to a tight capital level, the appearance of "Cash Flow" alongside words such as "abundant" and "released" reduces the predicted probability of tight capital. When a linear base learner is used, XGBoost is equivalent to a logistic regression with L1 and L2 regularization terms, which improves the accuracy of the model. LSTM networks are suitable for processing and predicting important events separated by very long intervals and delays in a time series. The number of nodes per hidden layer is set to 10, the number of layers to 10, and the number of iterations to 5000.

The perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function that decides whether or not an input, represented by a vector of numbers, belongs to a specific class. In this experiment, the model is simpler and consists of two fully connected layers [10]. An array of keywords is also defined to strengthen the model. The experiment is divided into two steps: in the first, the model uses only structured data for training and testing; in the second, news text data is added to the structured data by defining the Keyword Array.

6. Result Analysis

When the sample distribution is imbalanced (there are very few red and yellow samples), the error rate resulting from model over-fitting is particularly large.
The accuracy for green is very high, while the accuracy for the other three categories is very low. The experimental results are shown in Table 7. As can be seen from the predicted results, because there are too few red and yellow samples, their prediction accuracy is 0.0, while the accuracy for orange is only about 0.044.

Table 7. Predicting Results of Four-Classification.

No.  Adequacy Level  Accuracy
1    Green           0.94615338
2    Yellow          0.00000000
3    Orange          0.04444444
4    Red             0.00000000

For the two-class capital adequacy level prediction, this paper trained and tested five models (SVM, GBDT, XGBoost, LSTM and Perceptron), applied ten-fold cross-validation, and used the mean value as the model accuracy rate. The experimental results are shown in Table 8. Comparing the prediction accuracy of these five models, we find that the results of the simple perceptron are much better than those of the more complex models. The main reason is that the research scenario in this paper is highly complex while the amount of data is small, which leads to over-fitting of the complex models such as SVM, GBDT, XGBoost and LSTM, and the generalization

Y. Du et al. / Predicting the Interbank Capital Adequacy Level Based on Financial Data Analysis 43
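The ten-fold cross-validation used for the two-class prediction, with the mean of the fold accuracies taken as the model accuracy rate, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`k_fold_indices`, `train_perceptron`, `predict`, `cross_val_accuracy`), the synthetic two-feature data, and the single-layer perceptron that stands in for the paper's models are all hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_perceptron(X, y, epochs=20, lr=1.0):
    """Classic single-layer perceptron update rule; labels are 0 or 1.

    Assumption: this stand-in is simpler than the paper's model,
    which consists of two fully connected layers.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
            err = yi - pred
            if err:
                w = [wj + lr * err * xj for wj, xj in zip(w, xi)]
                b += lr * err
    return w, b


def predict(model, xi):
    """Return the predicted class (0 or 1) for one feature vector."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0


def cross_val_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracies."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = train_perceptron([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return mean(scores), scores


# Synthetic, linearly separable toy data (an assumption; not the paper's data).
rng = random.Random(42)
X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
y = [1 if x1 + x2 > 0 else 0 for x1, x2 in X]

mean_acc, fold_scores = cross_val_accuracy(X, y, k=10)
print(f"mean accuracy over {len(fold_scores)} folds: {mean_acc:.3f}")
```

Each sample is held out exactly once, so the mean over the ten folds estimates accuracy on unseen data rather than on the training set. With classes as imbalanced as those in Table 7, a stratified split that preserves class proportions in every fold would be a reasonable variation.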

Title: Intelligent Environments 2019
Subtitle: Workshop Proceedings of the 15th International Conference on Intelligent Environments
Authors: Andrés Muñoz; Sofia Ouhbi; Wolfgang Minker; Loubna Echabbi; Miguel Navarro-Cía
Publisher: IOS Press BV
Date: 2019
Language: German
License: CC BY-NC 4.0
ISBN: 978-1-61499-983-6
Size: 16.0 x 24.0 cm
Pages: 416
Category: Conference proceedings