One of the core ideas I have been discussing is testing of embedded software internals. My focus has been on modal testing - testing hypothetical cases, and checking current state against past state. I'd like to describe a technique that I believe will identify whether it is possible for the code to generate a string of calls that violates the system rules. The analysis treats the source code as a specification for generating strings of calls, and it checks those strings. This will tell me whether it is possible for the procedures to generate a bad calling sequence.
However, even the restricted form of C used in my embedded systems is a more complex language than regular expressions. How do I show that the (push-down) state machine specified by the source code generates only call sequences allowed by the API calling rules? I think I have a reasonable technique for changing the code to allow this type of analysis.
The analysis has to be able to explore all possible calling paths, including infinite loops, succinctly. The tests work by scanning the procedure(s) under test as if they were a regular expression with a messy syntax. Comments, variables, numbers, calls to uninteresting things, and so on are all stripped out.
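As a concrete (and entirely made-up) illustration of the idea, suppose the API rule of interest is that flash_write may only appear between flash_open and flash_close. A scan of the procedure below would keep the interesting calls and the branch structure and discard everything else. The driver calls and names here are placeholders I've invented, not a real API:

    /* Hypothetical example - the driver calls are placeholders, not a real API. */
    /* Calling rule, written as a regex over calls:                              */
    /*     flash_open ( flash_write )* flash_close                               */

    int  flash_open(int partition);
    int  flash_write(unsigned char byte);
    void flash_close(void);
    #define LOG_PARTITION 2

    int save_log(const unsigned char *buf, int len)
    {
        flash_open(LOG_PARTITION);           /* interesting call: kept           */

        for (int i = 0; i < len; i++) {      /* the loop becomes a regex '*'     */
            if (flash_write(buf[i]) != 0)    /* interesting call + branch: kept  */
                break;                       /* kept as a special operator       */
        }

        flash_close();                       /* interesting call: kept           */
        return 0;
    }

    /* Stripped down, this procedure specifies roughly                           */
    /*     flash_open ( flash_write )* flash_close                               */
    /* and the question is whether that stays inside the calling rule above.     */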
The branches - all of the branches, including if-else, switch, for, while, do, break, continue, and goto - are the regular expression operations. The calls to interesting things are retained.
- while and for loops are treated in the same way as a regex * (even if they should be treated as a +).
- switch is treated as an alternation (|).
- || in if's and RHS expressions is treated as alternation.
- if-then-else is treated as nested sequence and alternation, although a bit quirkily.
- continue, break, and goto are special operators that link to other nodes in the extended regex structure.

Then, like the procedures generated from the other regexes used to test the API, it creates a series of procedures for the above nodes. It retains the framework of monitoring and checking that the rules are followed, albeit without the checks on parameters and system variables. For instance, procedures that are entered and exited are checked to confirm that the proper calling state has been honored.
The type of tests I've described previously has 'false negative' (type II) errors. When the tests and checks said the software broke the rules, it really did break the rules. Nevertheless, there may be a defect even if the tests didn't report a problem. Perhaps a specific set of parameter values causes the procedure to misbehave, and those values just got missed.
The defect scanner I am outlining here would have the opposite kind of errors - 'false positive' (type I) errors. It tries to see whether a defect is even possible. If it says no, there isn't a defect. If it says there is a defect, there very likely is one (but it might be wrong).
Side note. Science, engineering, even software development have a long history of shaggy dog stories along the lines of "An outsider points at a random thing, asking what about that? Someone looks, and has an 'ah hah' moment where he sees how it might be improved." It wouldn't be hard to randomly point at the code. Or to point toward complex code, infrequently called code, frequently called code, etc. But I wanted the "pointing" to be weighted by a bit more analysis, so that it is very likely to be both right and consistent when it says something is wrong.
This doesn't locate the defect in the source code. Nor can it describe the circumstances that cause the defect to trigger the bad API call - after all, it removed the parameters and variables. It could, however, track its path through the original source code - which branches in the original source it is assuming are taken - and that might be helpful to the programmer who has to analyze the results.
To do this, it would record each point in the original code that it passes through. In the case of alternation, abandoning one alternative would roll the record back a step.
The list of where it went in the code and which branches it took might be enough for a programmer to figure out how the bad sequence could happen.
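A minimal sketch of that record, with made-up names, sizes, and report format: branches are pushed onto a trail as the walker descends and popped when it abandons one arm of an alternation.

    #include <stdio.h>

    #define TRAIL_MAX 64

    struct branch_note {
        const char *file;   /* where the branch is in the original source */
        int         line;
        int         arm;    /* which arm the walker assumed was taken     */
    };

    static struct branch_note trail[TRAIL_MAX];
    static int trail_top;

    static void trail_push(const char *file, int line, int arm)
    {
        if (trail_top < TRAIL_MAX)
            trail[trail_top++] = (struct branch_note){ file, line, arm };
    }

    static void trail_pop(void)           /* roll back one step at an alternation   */
    {
        if (trail_top > 0)
            trail_top--;
    }

    static void trail_report(void)        /* printed when a rule violation is found */
    {
        for (int i = 0; i < trail_top; i++)
            printf("  %s:%d assumed arm %d\n",
                   trail[i].file, trail[i].line, trail[i].arm);
    }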
The tool here must be automatic and single-pass, so it needs a practical approach that does nearly the same thing. (There are also methods to dynamically generate calls that explore all branch executions - constraint solvers, for example.) The values (or ranges of values) that are known to trigger branch points - or are empirically found to do so - could be fed back into the tests.
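For instance, such values might be kept in a simple table (again, every name and number here is invented for illustration) that the ordinary parameter-driven tests replay:

    /* Hypothetical table of values found to flip particular branch points. */

    struct branch_trigger {
        const char *branch;     /* which branch point, in whatever naming the report uses */
        int         value;      /* a parameter value that flips that branch               */
    };

    static const struct branch_trigger triggers[] = {
        { "save_log: for loop", 0 },   /* len == 0 skips the write loop entirely */
        { "save_log: for loop", 1 },   /* len == 1 runs it exactly once          */
    };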
Next, I'll look at some ideas I've wanted to try for performance characterization.