Testing

Good software should be tested again and again, at every stage of development.

We use the check testing framework for our unit tests. It has been extended with a couple of useful GObject and/or project-specific helpers.

First of all, we try to check every public method and every constructor of each object we provide.

Currently we have unit test suites for all libraries and the applications. We also have some integration tests, which cover e.g. command-line arguments, exit codes and so on.

Finally there are tests for the documentation (completeness and spelling).

At GUADEC 2005 we gave a unit-testing tutorial session. You can download the slides (540 kB).

Testing ABC

A few basics about writing unit tests first:

Each test should test one thing. Ideally there is a single assertion in the test. A helpful pattern is to think of a test in 3 phases: Arrange, Act, Assert (or Given, When, Then). We have started to add comments for these 3 phases (in the AAA style) to each test.

The important thing here is that in the Arrange phase one does not assert anything. Just create the objects and configure them. If anything fails, we'll get a segfault anyway; and if something there could actually go wrong, it deserves a separate test of its own. This procedure makes it a lot easier to write tests. Also, if a test fails with an assertion, it is more obvious what broke (the thing that we verify).

In the Act phase, the test should just do one thing.

The Assert phase is ideally just one assert. It could also check the return value and the GError in two asserts.

In our case we also have a 4th phase called Cleanup, but that is just due to using C and the need to free/unref things so as not to leak memory.

We still have a few tests where e.g. we loop over a set of parameters and repeat the Arrange,Act,Assert,Cleanup sequence inside the test.
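
As a minimal sketch of this style with the check framework (the object name BtThing and its functions are made up purely for illustration; real tests use the actual buzztrax objects):

 START_TEST(test_bt_thing_do_it) {
   /* arrange: create and configure the object under test */
   BtThing *thing = bt_thing_new();          /* hypothetical constructor */
 
   /* act: do the one thing this test is about */
   gboolean res = bt_thing_do_it(thing);     /* hypothetical method under test */
 
   /* assert: a single check on the outcome */
   fail_unless(res == TRUE, NULL);
 
   /* cleanup: free/unref so we don't leak memory */
   g_object_unref(thing);
 }
 END_TEST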

Test naming scheme

At first there is one test binary for each module or component (library, application). These are prefixed with m-. We use file prefixes here to distinguish them from the names of the source files being tested.

Each test binary is further structured into test suites, test cases and tests. Suites are prefixed by s-.

Inside the directory tests you'll find the files m-bt-core.c, m-bt-ic.c, m-bt-cmd.c and m-bt-edit.c. The m-bt-cmd.c file is the main entry point for checking all functions related to the buzztrax command application, bt-cmd for short. The m-bt-core.c file is the main entry point for checking all functions related to the buzztrax core lib, and so on.

If you plan to write your own application with buzztrax, please create a new test binary like m-bt-cmd (see the sketch below). All the test suites and test cases go into their own directory, which mirrors the folder structure in src.
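
A hedged sketch of what such an m- main entry point could look like with the check framework (the suite function bt_cmd_application_suite() is an illustrative name, not necessarily the real one):

 /* m-my-app.c: main test binary for a buzztrax based application (sketch) */
 #include <stdlib.h>
 #include <check.h>
 
 extern Suite *bt_cmd_application_suite(void);  /* illustrative suite function */
 
 int main(int argc, char **argv) {
   int nf;
   SRunner *sr = srunner_create(bt_cmd_application_suite());
   srunner_run_all(sr, CK_NORMAL);
   nf = srunner_ntests_failed(sr);
   srunner_free(sr);
   return (nf == 0) ? EXIT_SUCCESS : EXIT_FAILURE;
 }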

Each new test suite which you plan to create should have the prefix s- followed by the name of the file you are testing. For example:

Say you plan to test methods from song.c in the core lib. Then you should create a file named s-song.c in which you define the suite, plus files like t-song.c or e-song.c, each containing a test case and its tests. There should be one suite for each module/object. In the test cases the prefixes are:

  • e- for example: show how to use the component properly.
  • t- for test: try to break the component.

The t- test case is usually rather boring. There we check the handling of inputs and logical issues (like adding the same object twice to a collection, etc.).
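
As a sketch of how such a suite file could be wired up with the check framework (the function names bt_song_example_case(), bt_song_test_case() and bt_song_suite() are illustrative, not necessarily the real ones):

 /* s-song.c: assemble the BtSong suite from its test cases (sketch) */
 #include <check.h>
 
 extern TCase *bt_song_example_case(void);  /* defined in e-song.c (illustrative) */
 extern TCase *bt_song_test_case(void);     /* defined in t-song.c (illustrative) */
 
 Suite *bt_song_suite(void) {
   Suite *s = suite_create("BtSong");
   suite_add_tcase(s, bt_song_example_case());
   suite_add_tcase(s, bt_song_test_case());
   return s;
 }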

The tests themselves should also be nicely named. That will give you the first clue when one or more tests fail. We name the tests like test_bt_<obj>_<test_name>. This allows filtering the tests to run via the BT_CHECKS environment variable:

 BT_CHECKS="test_bt_machine_*" make bt_core.check

Other aspects that can be tested are constraints (for which we suggest using the prefix c-). Such tests would supervise the execution of the code with regard to limits like cpu usage, memory consumption, response times and so on.

Song test file access

If you, as a developer, would like to use song files for your tests, access them under the following directory structure:

buzztrax
+- test (FROM HERE!)
|  +- songs
:  |  +- simple1.xml
:  :  +- simple2.xml
:  :  :

This can be done in code as in the following example:

 // positive test
 START_TEST(test_play4) {
       BtCmdApplication *app;
       gboolean ret = FALSE;
 
       app = bt_cmd_application_new();
       ret = bt_cmd_application_play(app, check_get_test_song_path("simple1.xml"));
       if (!ret) {
          fail("play does not work with a good file name");
       }
       // cleanup
       g_object_unref(app);
 }
 END_TEST

The check_get_test_song_path() helper is needed to make the test also work in non-srcdir builds (e.g. when doing make distcheck).

Log-output capturing

The tests capture (most) log output and write it all to a log file named after the test, e.g. /tmp/buzztrax.log. Each test case is delimited by a line of '=' characters and each test by a line of '-'. This is useful to get in-depth information about test failures.

The test application further controls the log level itself. Please don't change that without reason, as in some situations we use log-parsing functions to test against expected output (method pre-conditions).

Testing failures

At some point during development you will want to verify that your checks for wrong arguments or other kinds of wrong behaviour in your software are correct. To write such tests you need to capture the log output.

For this kind of *fail* check we have created a helper function called:

 check_init_error_trapp()

With this function you can check whether the following statement in your test case produces log output with the same message that you passed to check_init_error_trapp(). Later in your code you can check the error trapping with:

 fail_unless(check_has_error_trapped(), NULL);

The following example shows how to use this. Let us create a test case for BtSequence:

 /* try to create a new sequence with NULL for song object */
 START_TEST(test_btsequence_obj1) {
   BtSequence *sequence=NULL;
   GST_INFO("--------------------------------------------------------------------------------");
   check_init_error_trapp("bt_sequence_new","BT_IS_SONG(song)");
   sequence=bt_sequence_new(NULL);
   fail_unless(sequence == NULL, NULL);
   fail_unless(check_has_error_trapped(), NULL);
 }
 END_TEST

In this example we check whether the constructor of the BtSequence class correctly handles a given NULL pointer. In our code we use the

 g_return_val_if_fail

macro to create the log output and return a defined value. In our test case we check the log output created by that macro.
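
As a sketch (the real constructor will also wire the sequence to the song; only the guard matters here), the code producing that log output could look like:

 BtSequence *bt_sequence_new(const BtSong *song) {
   /* precondition guard: logs "BT_IS_SONG(song)" and returns NULL on bad input */
   g_return_val_if_fail(BT_IS_SONG(song), NULL);
 
   /* ... normal object construction follows (sketched) ... */
   return BT_SEQUENCE(g_object_new(BT_TYPE_SEQUENCE, NULL));
 }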

GObject Testing

Our project is heavily based on the GObject object system. This allows us to do some generic tests:

Type sanity

We should make a common object type test helper like this:

 GTypeQuery query;
 g_type_query(BT_TYPE_MACHINE, &query);
 fail_if(query.type == 0, NULL);
 fail_if(query.class_size != sizeof(BtMachineClass), NULL);
 fail_if(query.instance_size != sizeof(BtMachine), NULL);

Property checks

In the common check module we have a test helper that applies some sanity checks to object properties.

 gboolean check_gobject_properties(GObject *to_check)
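
A hedged usage sketch (assuming a BtMachine instance named machine has already been created in the Arrange phase of the test):

 /* run the generic property sanity checks on the instance */
 fail_unless(check_gobject_properties(G_OBJECT(machine)), NULL);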

Chaining up

If classes override dispose() / finalize() methods they should also chain up.

  klass = BT_XXX_GET_CLASS(self);
  parent_class = g_type_class_peek_parent(klass);
 
  /* swap the parent's finalize for an interceptor to verify that it gets called */
  parent_finalize = G_OBJECT_CLASS(parent_class)->finalize;
  G_OBJECT_CLASS(parent_class)->finalize = intercept;
 
  g_object_unref(self);
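
A minimal sketch of what the intercept function could look like (the flag and function names are made up; the point is to record that the parent's finalize ran and then still chain up):

 static gboolean parent_finalize_called = FALSE;
 static void (*parent_finalize) (GObject *object) = NULL;
 
 static void intercept(GObject *object) {
   /* remember that we were called, then chain up to the real parent finalize */
   parent_finalize_called = TRUE;
   parent_finalize(object);
 }

The test can then assert fail_unless(parent_finalize_called, NULL); after the g_object_unref().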

Test Coverage

The idea is to find out which code is covered by tests and which code is not. Based on that, new tests can be added and dead code can be eliminated.

GCov + LCov

GCov is part of the gcc suite. In order to use it, you need to do three things:

  1. Build all code with CFLAGS="-fprofile-arcs -ftest-coverage" (configure with --enable-coverage=yes)
  2. Run make check
  3. Analyse coverage (make coverage)

We now have a make target included that generates the coverage report. You need to have lcov installed. To see what it looks like, check our current coverage report.

Below are some details on how to get coverage information manually.

To analyse coverage of a source file do e.g.:

  cd ./src/lib/core
  gcov -p -f -o.libs/ song.c

Repeat this for every source file. Afterwards look at the *.gcov files. Lines marked with "#####" indicate code that has not been executed.

For a little report use:

  for file in *.c; do
    gcov -p -f -o.libs/ $file;
  done | grep "in file" | grep -v "include" | sort -nr

For better reports, we have integrated lcov on top of that. It generates html pages that give a good overview and allow viewing details. You need lcov >= 1.6. To generate the report use the following commands:

  mkdir ./coverage
  lcov --directory . --zerocounters
  make check
  lcov --directory . --capture --output-file ./coverage/buzztrax.info
  genhtml -o ./coverage --num-spaces 2 ./coverage/buzztrax.info

As reported in this blog one needs to disable ccache when running gcov:

  CC=/usr/bin/gcc ./autoregen.sh

Valgrind VCov

Valgrind's exp-vcov tool could become a nice alternative to gcov. The main advantage is that one does not need to rebuild the code.

vcov needs to be checked out from the svn branch svn://svn.valgrind.org/valgrind/branches/VCOV. Then one needs to reset the VEX snapshot in that branch:

  svn log -r PREV
  ------------------------------------------------------------------------
  r10643 | njn | 2009-07-28 03:39:43 +0300 (Tue, 28 Jul 2009) | 3 lines
 
  Merged all the changes from the trunk between r7367:10642, updated VCov for
  various changes, and fixed a few other minor things.
 
  ------------------------------------------------------------------------
  cd VEX
  svn up -r{2009-07-28}
  cd ..

Once the tool is built, it can be tried without installing:

  ./vg-in-place --tool=exp-vcov /path/to/buzztrax/bin/buzztrax-cmd
  ./VCOV/vc_annotate vcov.out /path/to/sources/buzztrax/src/ui/cmd/*.{c,h} >vcov.ann

BCov

bcov also allows getting coverage without recompilation. It includes lcov-style report generation too.

Right now it does not work with check's fork mode (the forked test would stop because of the breakpoints). As a workaround, one can run it as:

CK_FORK=no make coverage

Unfortunately it also fails later in the test runs :/ and it does not even handle libraries yet.

Spell checking

The aspell tool can be quite helpful to fix spelling errors. It can even be applied to sources!

I'm still looking for a way to supply an extra word list to aspell that contains symbol names from the sources. That would help when checking sources and API docs.

Spell checking c-sources

We are trying to make a local dictionary to ease checking sources. To build it do:

 make tags dict

Then you can check sources:

 aspell -c --mode=ccpp --lang=en -p=buzztrax.aspell_dict ./src/lib/core/machine.c

Spell checking po files

There is no explicit po mode. Maybe we can use comment mode?

I don't think that comment mode would work well enough (remember that there are always both English and translated messages). acheck seems to be able to check .po files. --SvenHerzberg 16:08, 2 Apr 2006 (CEST)

Spell checking xml api documentation

 aspell -c --mode=sgml --lang=en ./docs/reference/bt-core/bt-core-docs.sgml

Spell checking xml user documentation

 aspell -c --mode=sgml --lang=en ./docs/help/bt-edit/C/bt-edit.xml.in

See a working implementation from SvenHerzberg.

Test UI Application

The major problem that remains is to make the test run invisible. Just hiding the window won't do the trick, as then things like 'mapping the widget to the screen' probably won't occur.

As a side effect of doing controlled GUI tests, we can use such tests to invoke one dialog page after the other and produce up-to-date screenshots (see the gnome-screen-shooter panel applet for how to save png images).
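
As a sketch of how such a screenshot could be taken from inside the test (GTK+ 2 / GDK API; the window handle and file name are illustrative):

 GdkWindow *window = GTK_WIDGET(main_window)->window;
 GdkPixbuf *pixbuf;
 GError *error = NULL;
 gint width, height;
 
 /* grab the window contents into a pixbuf and save it as a png */
 gdk_drawable_get_size(GDK_DRAWABLE(window), &width, &height);
 pixbuf = gdk_pixbuf_get_from_drawable(NULL, GDK_DRAWABLE(window), NULL,
                                       0, 0, 0, 0, width, height);
 gdk_pixbuf_save(pixbuf, "main-window.png", "png", &error, NULL);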

Invisible X Server

The idea to avoid this is to use an X server display based on a virtual framebuffer. One can get such a display by running:

  /usr/X11R6/bin/Xvfb -ac :9 -screen 0 1024x786x16

Then the gtk application needs to be redirected to this display. While one usually would do this via:

  ./my-app  --display=:9.0

we need to do it programmatically. See gtk+/demos/gtk-demo/changedisplay.c for information on how to do it.

To find a free display id, look in /tmp/.X11-unix/X*. Then create and shut down the server as follows:

 GPid pid;
 GSpawnFlags flags = G_SPAWN_SEARCH_PATH | G_SPAWN_STDOUT_TO_DEV_NULL | G_SPAWN_STDERR_TO_DEV_NULL;
 GError *error = NULL;
 gchar *argv[] = {
   "Xvfb",
   "-ac", ":9", "-screen", "0", "1024x786x16",
   NULL
 };
 
 /* spawn the virtual X server in the background */
 if (!(g_spawn_async(NULL, argv, NULL, flags, NULL, NULL, &pid, &error))) {
   GST_ERROR("error creating virtual x-server : \"%s\"", error->message);
   g_error_free(error);
 }
 // ...
 /* shut the server down again once the tests are done */
 kill(pid, SIGTERM);
 g_spawn_close_pid(pid);

To use such a display with gtk we need to do:

 display_manager = gdk_display_manager_get();
 display = gdk_display_open(":9");
 gdk_display_manager_set_default_display(display_manager, display);
 
 /* ... run the GUI tests ... */
 gdk_display_close(display);
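
To move an already created window over to that display one could, as a sketch, use gtk_window_set_screen() (assuming main_window is the application's toplevel window):

 gtk_window_set_screen(GTK_WINDOW(main_window),
                       gdk_display_get_default_screen(display));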

unfocused windows

As another idea, couldn't we set some window-manager hints so that these windows don't grab the focus? Unfortunately:

 gtk_window_set_focus_on_map(GTK_WINDOW(main_window),FALSE);
 gtk_window_set_accept_focus(GTK_WINDOW(main_window),FALSE);

does not work, as it seems to be too late: show_all() has already been called.

event recorders

We need to check out a few new testing technologies:

  1. the Linux Desktop Testing Project
  2. Gerd - a Gtk+ Event Recorder. The gtk+-2 version is in gnome cvs

Can we use these for gui-app testing?

Suggestion: check out GNU Xnee, an event recorder/replayer.

Monitored testing

The idea is to run the tests again while monitoring a certain aspect.

Valgrind support for test cases

See Valgrind API and Valgrind Manual.

Currently we have a valgrind target in the makefiles, so doing make valgrind runs the tests under valgrind (memcheck).

Static Code Analysis

LLVM clang

The llvm project comes with a static code analyzer called clang that works as a gcc wrapper.

 scan-build ./config.status --recheck
 make clean
 scan-build make

Mozilla Dehydra

The mozilla project provides a static code analyzer called dehydra that works as a gcc plugin.

  make CFLAGS="-fplugin=/usr/lib/gcc/i686-linux-gnu/4.5/plugin/gcc_dehydra.so -fplugin-arg=/usr/share/dehydra/libs/dumptypes.js"

Future testing

GLib features

GType debugging:

 g_type_init_with_debug_flags()

GMem debugging:

 g_mem_set_vtable(glib_mem_profiler_table);
 //...
 g_mem_profile();

Enabling/Disabling Tests

We disable tests that we can't fix right now using an #ifdef preprocessor directive in a standard form. This way we can use a script to collect them as a nice list (a testing TODO).

The format looks like below:

 /*
  * tests if ...
  *
  * needs this feature implemented first (see : http://www.xyx.org/bugid=2875)
  */
 #ifdef __CHECK_DISABLED__
 BT_START_TEST(test_xyz) {
 //... test here ....
 }
 BT_END_TEST
 #endif

The problem is to collect all the comment lines above #ifdef __CHECK_DISABLED__ down to the BT_START_TEST(test_xyz) below. Sounds like we need a little perl script for this. The shell one-liner I found so far is:

 grep -r -A6 -B1 -Hn --include="*.c" --color=auto "#ifdef __CHECK_DISABLED__" .

It still has the problem that it just grabs a fixed number of context lines around the #ifdef :(.

The list is now available via make todo in the tests subdir.

File I/O testing

It would be interesting to have a fuse-based test file system that simulates all kinds of errors. This includes:

  • read-only files
  • files go away in the middle of reads
  • files containing garbage (fuzzing)

The petardfs project looks like it fits the needs here.

Array bound checking

http://gcc.gnu.org/wiki/Mudflap_Pointer_Debugging

Buildbot

We've been running a buildbot that does latest builds + tests of the modules every 6 hours. Right now we lack clients to run this. One idea would be to use VirtualBox instances and fire them up in an automated way.
