The need for AI Testing
AI offers huge potential across all areas of human activity: from personal support, to helping businesses run efficiently and deliver products and services, to helping governments and societies manage key resources and infrastructure. The last decade or so has seen substantial increases in AI capabilities across the breadth and depth of human activity. The ‘AI–human singularity’, the point at which AI reaches or surpasses human capability, has been reached or is on the near horizon for many attributes of human capability.
One interesting aspect of AI activity is the emergence and development of sophisticated autonomic systems that are able to operate autonomously in remote, dynamic environments with limited intervention from human operators. Artificial Intelligence (AI) is already deeply embedded in our lives and increasingly performs functions usually associated with humans (HL Paper 100, 2018; Morgan, 2018; White and Bell, 2011).
One area where considerable development is still needed is AI testing. This is evident when we look at general testing practice: insights from testing systems and software projects have led to the emergence of the seven software testing principles:
- Testing shows the presence of defects.
- Exhaustive testing is impossible.
- (The need for) Early testing.
- Defect clustering.
- Pesticide paradox.
- Testing is context dependent.
- Absence-of-errors fallacy.
These all provide key insights into what is practical and necessary when testing software to ensure it meets the needs of users and stakeholders. It is worth noting that these standards and principles are aimed at the full range of computer software and systems, from the very basic to highly complex integrated systems. However, the practical guidance drawn from these standards and principles does not appear to cover the attributes of AI. For instance, current trends in software testing research and practice (Mohanty et al., 2018) do not include the attributes or characteristics of AI systems. There is of course testing within AI development; however, it is mostly focused on testing the algorithms, showing that they perform better than previous versions, or showing that the AI comes up with ‘new’ solutions. It less often covers the full range of testing associated with conventional software.
This is where Mobi comes in, by providing support and insights on approaches to and practices for AI testing. There are some promising approaches, such as the use of scenarios, that can substantially broaden the testing of AI systems; a simple illustration of scenario-based testing is sketched below.
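To make the idea concrete, the following is a minimal sketch of scenario-based testing, written in Python with pytest. The classifier, the scenarios, and the accuracy thresholds are all illustrative assumptions rather than part of any particular framework; the point is that each test exercises the AI component against a named operating scenario with its own acceptance threshold, rather than against a single aggregate data set.

```python
# Illustrative sketch of scenario-based testing for an AI component.
# The classifier, the scenarios, and the thresholds are hypothetical;
# the structure (one test per operating scenario) is what matters.

import pytest


class DummyClassifier:
    """Stand-in for a trained model: labels a reading 'high' above 0.5."""

    def predict(self, reading: float) -> str:
        return "high" if reading > 0.5 else "low"


# Each scenario names an operating condition, supplies labelled examples
# drawn from that condition, and sets the minimum acceptable accuracy.
SCENARIOS = [
    ("nominal_conditions",
     [(0.9, "high"), (0.8, "high"), (0.2, "low"), (0.1, "low")], 1.0),
    ("noisy_sensor",
     [(0.55, "high"), (0.45, "low"), (0.6, "high"), (0.4, "low")], 0.75),
    ("boundary_values",
     [(0.5, "low"), (0.51, "high"), (0.49, "low")], 0.66),
]


@pytest.mark.parametrize("name, examples, min_accuracy", SCENARIOS)
def test_model_meets_scenario_threshold(name, examples, min_accuracy):
    model = DummyClassifier()
    correct = sum(model.predict(x) == y for x, y in examples)
    accuracy = correct / len(examples)
    # Fail if the model drops below the agreed threshold for this
    # operating scenario, not merely on average across all data.
    assert accuracy >= min_accuracy, (
        f"scenario '{name}': accuracy {accuracy:.2%} below {min_accuracy:.0%}"
    )
```

In practice, the dummy classifier would be replaced by the trained model under test, and the scenarios would be drawn from the system’s intended operating conditions, with thresholds agreed with stakeholders; this gives testers a way to apply the established principles above (such as context dependence and the pesticide paradox) to AI systems.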