Page - 43 - in Intelligent Environments 2019 - Workshop Proceedings of the 15th International Conference on Intelligent Environments
larger the prediction time window used for testing, the more difficult the prediction and
the lower the accuracy. The length of the time window is generally chosen based on
experience and actual computing needs. In this paper, the backtracking time window
is set to 28 days and the prediction time window to 7 days. That is, the known
information from the previous 28 days is used to predict the target value for the following 7 days.
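The windowing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `make_windows`, the toy series, and the choice to take the target as the value on the last day of the 7-day prediction window are all assumptions.

```python
from typing import List, Sequence, Tuple

BACKTRACK_DAYS = 28  # backtracking window length stated in the text
PREDICT_DAYS = 7     # prediction window length stated in the text


def make_windows(
    series: Sequence[float],
    backtrack: int = BACKTRACK_DAYS,
    horizon: int = PREDICT_DAYS,
) -> List[Tuple[List[float], float]]:
    """Pair each run of `backtrack` daily values with the value `horizon` days later.

    Assumption: the target is the value on the last day of the prediction window.
    """
    samples = []
    # The last usable start index leaves room for both the history and the horizon.
    for start in range(len(series) - backtrack - horizon + 1):
        history = list(series[start:start + backtrack])
        target = series[start + backtrack + horizon - 1]
        samples.append((history, target))
    return samples


daily_values = [float(d) for d in range(40)]  # hypothetical 40-day series
windows = make_windows(daily_values)
```

A longer `horizon` leaves fewer usable samples for a series of the same length and pushes the target further from the observed history.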
In the experiment, SVM combines text data, represented as word vectors, with structured
data to predict the target. The GBDT algorithm can capture the context of a word to some
extent. For example, when identifying whether a news text signals a tight capital level,
the appearance of "Cash Flow" together with words like "abundant" and "released" reduces
the predicted probability of tight capital. XGBoost adds L1 and L2 regularization terms to
its objective, as a regularized logistic regression does, which improves the accuracy of the
model. LSTM networks are suitable for processing and predicting important events with
very long intervals and delays in a time series; here the number of nodes per hidden layer
is set to 10, the number of layers is set to 10, and the number of iterations is 5000. The
perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier
is a function that decides whether or not an input, represented by a vector of numbers,
belongs to some specific class. In this experiment, the model is simpler and consists of two
fully connected layers [10]. An array of keywords is also defined to strengthen the model.
The experiment is divided into two steps: in the first, the model uses only structured data
for training and testing; in the second, news text data is added to the structured data
by defining the Keyword Array.
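One way to picture the two-step setup is a keyword-count vector appended to the structured features. The sketch below is only an assumption about how a Keyword Array could be turned into model inputs; the keyword list (built from the three terms quoted above), the function names, and the sample values are hypothetical.

```python
from typing import List, Sequence

# Hypothetical keyword array; the three terms are the ones quoted in the text.
KEYWORD_ARRAY = ["cash flow", "abundant", "released"]


def keyword_features(news_text: str, keywords: Sequence[str] = KEYWORD_ARRAY) -> List[int]:
    """Count occurrences of each keyword in the news text (case-insensitive)."""
    lowered = news_text.lower()
    return [lowered.count(keyword) for keyword in keywords]


def combine(structured: Sequence[float], news_text: str) -> List[float]:
    """Step 2 input: structured features followed by keyword counts."""
    return list(structured) + [float(c) for c in keyword_features(news_text)]


structured_only = [0.12, 3.4]  # step 1: hypothetical structured features only
with_text = combine(structured_only, "Cash flow is abundant after funds were released.")
```

Here `structured_only` stands for the first step's input and `with_text` for the second step's input, so the same model can be trained on either.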
6. Result Analysis
In the case of an imbalanced distribution of samples (there are very few red and yellow
samples), the error rate resulting from model over-fitting is particularly large. The accuracy
for green is very high, while the accuracy for the other three categories is very low. The
experimental results are shown in Table 7. As can be seen from the predicted results,
because there are too few red and yellow samples, their prediction accuracy is 0.0, while
the accuracy for orange is only about 0.044.
Table 7. Predicting Results of Four-Classification.
No.  Adequacy Level  Accuracy
1    Green           0.94615338
2    Yellow          0.00000000
3    Orange          0.04444444
4    Red             0.00000000
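The per-level figures in Table 7 can be read as the fraction of samples of each true level that the model classified correctly. That reading is an assumption, as the paper does not define the metric. The sketch below uses hypothetical labels, not the paper's data, to show how a model that favors the majority class scores well overall while scoring 0.0 on the rare levels.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Fraction of samples of each true class that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical imbalanced labels: mostly Green, very few of the other levels.
y_true = ["Green"] * 16 + ["Orange"] * 2 + ["Yellow"] + ["Red"]
# A degenerate model that always predicts the majority class.
y_pred = ["Green"] * len(y_true)

scores = per_class_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case `overall` is 0.8 even though three of the four levels are never predicted correctly, which is why the per-level breakdown is the more informative view.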
In the two-class capital adequacy level prediction, this paper trained and tested five
models (SVM, GBDT, XGBoost, LSTM and Perceptron) using ten-fold cross-validation,
and then used the mean value as the model accuracy rate. The experimental
results are shown in Table 8.
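The evaluation procedure, averaging accuracy over ten folds, can be sketched as below. It assumes that "ten cross-validation" means ten-fold cross-validation. The contiguous, unshuffled fold split, the majority-class stand-in model, and the class labels are hypothetical and only keep the example self-contained; none of the five models from the paper is reproduced.

```python
from collections import Counter
from statistics import mean
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k contiguous folds; return (train, test) pairs."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def cross_val_accuracy(X: Sequence, y: Sequence, fit_predict: Callable, k: int = 10) -> float:
    """Mean test accuracy over k folds."""
    scores = []
    for train, test in k_fold_indices(len(y), k):
        preds = fit_predict([X[i] for i in train], [y[i] for i in train],
                            [X[i] for i in test])
        correct = sum(p == y[i] for p, i in zip(preds, test))
        scores.append(correct / len(test))
    return mean(scores)


def majority_baseline(train_X, train_y, test_X):
    """Stand-in model: always predict the most common training label."""
    label = Counter(train_y).most_common(1)[0][0]
    return [label] * len(test_X)


# Hypothetical two-class data: 5 "tight" samples followed by 15 "not_tight".
X = [[float(i)] for i in range(20)]
y = ["tight"] * 5 + ["not_tight"] * 15
accuracy = cross_val_accuracy(X, y, majority_baseline, k=10)
```

Any model exposing the same `fit_predict(train_X, train_y, test_X)` shape could be passed in place of `majority_baseline`; in practice the samples would usually be shuffled or stratified before splitting.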
By comparing and analyzing the prediction accuracy of the above five models, we
can find that the results predicted by the simple perceptron are much better than those of
the other, more complex models. The main reason is that the complexity of the research
scenario in this paper is too high and the amount of data is small, which leads to over-fitting of
complex models such as SVM, GBDT, XGBoost and LSTM, and the generalization
Y. Du et al. / Predicting the Interbank Capital Adequacy Level Based on Financial Data Analysis 43
- Title
- Intelligent Environments 2019
- Subtitle
- Workshop Proceedings of the 15th International Conference on Intelligent Environments
- Authors
- Andrés Muñoz
- Sofia Ouhbi
- Wolfgang Minker
- Loubna Echabbi
- Miguel Navarro-Cía
- Publisher
- IOS Press BV
- Date
- 2019
- Language
- German
- License
- CC BY-NC 4.0
- ISBN
- 978-1-61499-983-6
- Size
- 16.0 x 24.0 cm
- Pages
- 416
- Category
- Conference proceedings