Page - 134 - in The Future of Software Quality Assurance
134 G. Numan
In the movie Robocop, officer Murphy had a “prime directive” programmed into
his system: if he tried to arrest a managing director of his home company,
his system would shut down. AI systems could have prime directives too, or
unacceptable results, like offensive language, porn sites or driving into a pedestrian.
We call these “contracts”: possible unwanted results that should be flagged in the
test results as blocking issues or at least be given a high weight.
Required contracts have to be part of the test set. Possible negative side effects
of existing contracts should be part of the test set too.
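A contract check of this kind can be sketched in a few lines. The following is only an illustration under assumptions: the `Contract` class, the `check_contracts` function, the banned-word list and the penalty weights are hypothetical names and values chosen for the example, not part of the chapter.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Contract:
    """A possible unwanted result that the test run must flag (hypothetical structure)."""
    name: str
    violated: Callable[[str], bool]  # returns True when the output breaks the contract
    blocking: bool = True            # blocking contracts fail the run outright
    weight: float = 10.0             # penalty used only for non-blocking contracts


def check_contracts(output: str, contracts: List[Contract]) -> dict:
    """Return the names of violated blocking contracts and the summed penalty of the rest."""
    violations = [c for c in contracts if c.violated(output)]
    return {
        "blocking": [c.name for c in violations if c.blocking],
        "penalty": sum(c.weight for c in violations if not c.blocking),
    }


# Assumed, deliberately tiny word list; a real test set would need a curated one.
BANNED = {"idiot", "stupid"}

contracts = [
    Contract("no_offensive_language",
             lambda out: any(word in BANNED for word in out.lower().split())),
    Contract("no_empty_answer",
             lambda out: not out.strip(), blocking=False, weight=5.0),
]
```

Calling `check_contracts(system_output, contracts)` for every test case lets the framework report blocking violations separately from weighted ones.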
4.8 Test Automation
AI testing needs substantial automation. The number of test cases requires it, and tests
need to be run repeatedly with every new version. When the AI system is trained
constantly, continuous testing is necessary, as in the case of search engines where there are
feedback loops from real data. But even when the AI system is not trained constantly
and versions of the system are stable, a changing context demands constant training.
Even when the system does not change, the world will.
Test automation consists of a test framework where the test cases will be run on
the AI system and the output from the AI system will be processed. Below a basic
setup of such a test framework is shown.
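Independently of the figure, the core loop of such a framework (run each test case on the system, then process the output) might look like the sketch below. Everything named here is an assumption for illustration: `run_suite`, the test-case dictionary layout and the stand-in `fake_ai_v1` are hypothetical, and exact string comparison is only one of many ways to process output.

```python
from typing import Callable, Dict, List


def run_suite(system: Callable[[str], str],
              cases: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Run every test case on the system under test and record the processed output."""
    results = []
    for case in cases:
        actual = system(case["input"])
        results.append({
            "id": case["id"],
            "actual": actual,
            "passed": actual == case["expected"],  # simplest possible output check
        })
    return results


def fake_ai_v1(text: str) -> str:
    """Stand-in for one version of the AI system (hypothetical keyword classifier)."""
    return "positive" if "good" in text else "negative"


CASES = [
    {"id": "tc1", "input": "a good film", "expected": "positive"},
    {"id": "tc2", "input": "a dull film", "expected": "negative"},
    {"id": "tc3", "input": "not good at all", "expected": "negative"},
]

results = run_suite(fake_ai_v1, CASES)
```

Because the system is passed in as a plain callable, the same test cases can be rerun unchanged against every new version.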
4.9 Overall Evaluation and Input for Optimising
The product of testing is not just a list of bugs to be fixed. Bugs cannot be
fixed directly without severe regression, as stated above. The AI system has to be
evaluated as a whole since, with the many test cases and regression, no version will
be perfect. Programmers want to know which version to take and whether a new version is
better than a previous one. Therefore the test results should be amalgamated into a
total result: a quantified score. To get guidance on what to tweak
(training data, labelling, parametrisation), programmers need to know which areas need improvement.
This is as close as we can get to bug fixing. We need metrics, weighting and
contracts to achieve a meaningful overall score and clues for optimisation. Low-scoring
test cases should be analysed as to their causes: is it over-fitting,
under-fitting or any of the other risk areas?
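One way to amalgamate results into a single score and to surface areas for improvement is sketched below. The record layout, the weighted mean, the rule that a blocking contract violation forces the score to zero, and the 0.6 threshold are all assumptions made for this example, not a method prescribed by the chapter.

```python
from collections import defaultdict
from typing import Dict, List


def overall_score(results: List[dict]) -> float:
    """Weighted mean of per-case scores in [0, 1]; 0.0 if any blocking contract was violated."""
    if not results:
        raise ValueError("no test results to score")
    if any(r.get("blocking") for r in results):
        return 0.0
    total_weight = sum(r["weight"] for r in results)
    return sum(r["score"] * r["weight"] for r in results) / total_weight


def weak_areas(results: List[dict], threshold: float = 0.6) -> Dict[str, float]:
    """Mean score per risk area, keeping only areas that fall below the threshold."""
    by_area = defaultdict(list)
    for r in results:
        by_area[r["area"]].append(r["score"])
    means = {area: sum(s) / len(s) for area, s in by_area.items()}
    return {area: m for area, m in means.items() if m < threshold}


# Made-up results for two versions of the same system, for illustration only.
V1 = [
    {"area": "over-fitting", "score": 0.5, "weight": 2.0},
    {"area": "under-fitting", "score": 1.0, "weight": 1.0},
    {"area": "over-fitting", "score": 0.5, "weight": 1.0},
]
V2 = [
    {"area": "over-fitting", "score": 1.0, "weight": 2.0},
    {"area": "under-fitting", "score": 0.5, "weight": 1.0},
    {"area": "over-fitting", "score": 1.0, "weight": 1.0},
]

better = "V2" if overall_score(V2) > overall_score(V1) else "V1"
```

Here the overall score answers which version to take, while `weak_areas` points at the risk area to analyse next; with these made-up numbers the second version scores higher overall but has become weaker on under-fitting.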
4.10 Example of AI Test Framework (Fig. 2)
From left up to bottom and then right up:
1. Identifying user groups
2. Creating persona per user group
The Future of Software Quality Assurance
- Title
- The Future of Software Quality Assurance
- Author
- Stephan Goericke
- Publisher
- Springer Nature Switzerland AG
- Location
- Cham
- Date
- 2020
- Language
- English
- License
- CC BY 4.0
- ISBN
- 978-3-030-29509-7
- Size
- 15.5 x 24.1 cm
- Pages
- 276
- Category
- Computer Science