The Future of Software Quality Assurance
Page - 133 - in The Future of Software Quality Assurance

Testing Artificial Intelligence

4.5 Test Data

What test data to use, and whether it can be created, found or manipulated, depends on the context and the availability of data from production. Data creation or manipulation (as in the case of image recognition) is hard to do and sometimes useless or even counter-productive. Using tools to manipulate or create images brings in an extra variable which might create bias of its own! How representative of real-world pictures is the test data? If the algorithm identifies aspects in created data that can only be found in test data, the value of the tests is compromised.

AI testers create a test data set from real-life data and strictly separate it from the training data. Because both the AI system and the world it is used in are dynamic, test data will have to be refreshed regularly.

4.6 Metrics

The output of AI is not Boolean: it consists of calculated scores for all possible outcomes (labels). To determine the performance of the system, it is not enough to determine which label has the highest score. Metrics will be necessary.

Take, for example, image recognition: we want to know if a picture of a cat will be recognised as a cat. In practice this means that the label “cat” will get a higher score than “dog”. If the score on cat is 0.43 and dog gets 0.41, the cat wins. But the small difference between the scores might indicate a higher probability of faults.

In a search engine we want to know if the top result is the top 1 expectation of the user. If the user's top 1 expectation is number 2 on the list, that sounds wrong, but it is still better than if it were number 3. We want to know if all relevant results are in the top 10 (strictly speaking this is recall; precision is the share of the top 10 that is relevant) or that there are no offensive results in the top 10.

Depending on the context we need metrics to process the output from the AI system into an evaluation of its performance. Testers need the skills to determine relevant metrics and incorporate them in the tests.

4.7 Weighing and Contracts

The overall evaluation of the AI system also has to incorporate relative importance. Some results are more important than others, as is the case with any testing. Think of results with high moral impact, like racial bias. As part of designing test cases, their weight for the overall evaluation should be determined based on risks and importance to users. Testers need sensitivity for these kinds of risks: they must be able to identify them and translate them into test cases and metrics. They will need an understanding of the context in which the system is used and of the psychology of its users. AI testers need empathy and world awareness.
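
The metrics of Sect. 4.6 and the weighting of Sect. 4.7 can be illustrated with a minimal Python sketch. It is not taken from the book: all function names, the choice of reciprocal rank as the measure of "number 2 is better than number 3", and the weights are assumptions made for illustration only. The score margin shows how narrowly the winning label wins, precision and recall at k evaluate a top-10 list, and the weighted pass rate lets high-impact test cases count for more in the overall evaluation.

```python
from typing import Dict, Sequence, Set, Tuple


def top_label_margin(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the winning label and its lead over the runner-up."""
    if len(scores) < 2:
        raise ValueError("need scores for at least two labels")
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_label, best_score = ranked[0]
    second_score = ranked[1][1]
    return best_label, best_score - second_score


def reciprocal_rank(results: Sequence[str], expected: str) -> float:
    """1.0 if the expected item is first, 0.5 if second, 0.0 if absent."""
    for position, item in enumerate(results, start=1):
        if item == expected:
            return 1.0 / position
    return 0.0


def precision_at_k(results: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Share of the top-k results that are relevant."""
    top = list(results[:k])
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


def recall_at_k(results: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Share of all relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    return len(set(results[:k]) & relevant) / len(relevant)


def weighted_pass_rate(outcomes: Sequence[Tuple[bool, float]]) -> float:
    """Overall score where each (passed, weight) test case counts by its weight."""
    total = sum(weight for _, weight in outcomes)
    if total <= 0:
        raise ValueError("total weight must be positive")
    return sum(weight for passed, weight in outcomes if passed) / total


# Usage with the cat/dog scores from the text.
label, margin = top_label_margin({"cat": 0.43, "dog": 0.41})
# label is "cat"; the margin of about 0.02 is small, so a tester might
# flag this prediction as uncertain (the threshold is a project decision).

# Hypothetical weights: the bias-related case counts three times as much.
overall = weighted_pass_rate([(True, 1.0), (False, 3.0)])
```

What margin counts as "too small", and how much weight a bias-related test case deserves, are judgement calls that depend on the risks and users of the system at hand.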
Title: The Future of Software Quality Assurance
Author: Stephan Goericke
Publisher: Springer Nature Switzerland AG
Location: Cham
Date: 2020
Language: English
License: CC BY 4.0
ISBN: 978-3-030-29509-7
Size: 15.5 x 24.1 cm
Pages: 276
Category: Informatik