Tuesday, 1 December 2009

Bisecting mesa - bug 446632


Bug 446632 causes Blender to segfault on start-up on machines with ATI graphics cards running Ubuntu Karmic.

Analysis showed that the segfault originates in the mesa library. Mesa provides the OpenGL implementation on Linux and is used by Blender and various other programs.

During the pre-release process of Karmic, various builds of the mesa package were made, and they are still available online. Tests showed that the last build without the bug was 7.6.0~git20090817.7c422387-0ubuntu8. The next build, 7.6.0-1ubuntu1, shows the bug.

To determine which patch between those two releases is responsible for the bug, a process called git bisecting is used. You give git the ID of one version without the bug and one version with it. Git checks out a version in the middle; you compile and test it, then tell git whether this version was good or bad. Git then picks another version halfway between the last known good and the first known bad one. This process is repeated until the bad commit is found.
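
The halving idea can be tried out without any repository. The following is a minimal sketch in plain shell, not what git does internally: revisions are simply numbered 1 to 100, and the hypothetical is_bad function stands in for "compile and test". FIRST_BAD is an assumed value for the demo; in a real bisect it is exactly what you are looking for.

```shell
# Assumed for the demo only: the first revision that shows the bug.
FIRST_BAD=37

# Stand-in for "compile and test": succeeds if the revision is bad.
is_bad() {
    [ "$1" -ge "$FIRST_BAD" ]
}

# Narrow the range between a known good and a known bad revision
# until they are direct neighbours.
bisect_sim() {
    good=$1
    bad=$2
    steps=0
    while [ $((bad - good)) -gt 1 ]; do
        mid=$(( (good + bad) / 2 ))
        if is_bad "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
        steps=$((steps + 1))
    done
    echo "first bad: $bad after $steps steps"
}

bisect_sim 1 100
```

Each test halves the remaining range, which is why a few hundred commits need only a handful of builds.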

Sounds simple enough... but it raises the following questions:
  • What Git repository does Ubuntu use for the mesa package?
  • Which commits correspond to the above mentioned builds?
  • And once you have the source, how do you compile, package and use it?
Within the Ubuntu project there is no git repository that tracks the history from 7.6.0~git20090817.7c422387-0ubuntu8 to 7.6.0-1ubuntu1. Therefore we use the upstream git repository at git://anongit.freedesktop.org/mesa/mesa (browsable: http://cgit.freedesktop.org/mesa/mesa), on which the Ubuntu version is based.

7c422387 is the abbreviated commit ID within the freedesktop repository. The full commit ID is 7c4223876b4f8a78335687c7fcd7448b5a83ad10, but the first few digits are usually sufficient to identify it.

The last commit of the 7.6.0 branch in this repository is labelled mesa_7_6.

The way to compile the source is described later in this post. As you will see, packaging is not necessary. The compiled drivers can be used directly.

"Git"ting started

You need git-core. Also install gitk, which is not strictly necessary but gives a nice graphical view of the history.

sudo apt-get install git-core gitk
Choose a directory and download the entire repository (in this tutorial I use my home directory).

cd ~
git clone git://anongit.freedesktop.org/mesa/mesa

This creates the subdirectory mesa, which contains the checked-out source and a subdirectory .git holding the content of the cloned repository.

Be patient. After counting the elements to be transferred, it takes some time before the actual download begins. In all, around 100 MB are transferred.

The code that we are going to compile needs some build dependencies and development headers:

sudo apt-get build-dep mesa
sudo apt-get install libx11-dev libxt-dev libxmu-dev libxi-dev


Further preparations

cd ~/mesa
make clean
./autogen.sh
./configure --prefix=/usr --mandir=\${prefix}/share/man \
--infodir=\${prefix}/share/info --sysconfdir=/etc \
--localstatedir=/var --build=i486-linux-gnu --disable-gallium --with-driver=dri \
--with-dri-drivers="r200 r300 radeon" --with-demos=xdemos --libdir=/usr/lib/glx \
--with-dri-driverdir=/usr/lib/dri --enable-glx-tls --enable-driglx-direct --disable-egl \
--disable-glu --disable-glut --disable-glw CFLAGS="-Wall -g -O2"
make clean removes the "debris" from previous compilations. But we haven't created any yet... Do it anyway - it's good practice :-)

./autogen.sh generates the configure script and checks that the required build tools are present. If anything is missing, it will complain.

./configure sets up what is compiled, where and how.

While testing, you can remove -O2 from CFLAGS. This disables compiler optimisations. The resulting code is a bit larger and a little slower, but it is easier to debug.

--with-dri-drivers="..." determines which drivers are compiled. As the original bug only affects ATI machines, we only build the ATI drivers. That saves a lot of compile time. If your driver is not among them, look in ~/mesa/src/mesa/drivers/dri/ and add it.

You can find out which driver you are using with:
xdriinfo driver 0
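
To see which driver directories a source tree actually offers, a small helper can list them. This is only a sketch: list_dri_drivers is a hypothetical name, and not every directory it prints is necessarily a hardware driver (some may hold shared code).

```shell
# Hypothetical helper: print the names of the directories below
# src/mesa/drivers/dri in the given mesa source tree.
#   $1 = mesa source directory
list_dri_drivers() {
    for d in "$1"/src/mesa/drivers/dri/*/; do
        # If the pattern matches nothing, $d is the literal pattern
        # and the directory test fails, so nothing is printed.
        if [ -d "$d" ]; then
            basename "$d"
        fi
    done
}
```

Calling list_dri_drivers ~/mesa and comparing the result with the output of xdriinfo driver 0 shows whether your driver name can be passed to --with-dri-drivers.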

Verify the good build

We know that commit 7c4223876b4f8a78335687c7fcd7448b5a83ad10 still works with Blender. So let's check it out, compile it and test it. If Blender does not crash, we know that the process so far is working correctly.

git checkout 7c422387
make

We could enter the entire ID, but the first few digits are usually sufficient.

make should finish without errors. Now we start Blender using:

LD_LIBRARY_PATH="$HOME/mesa/glx" LIBGL_DRIVERS_PATH="$HOME/mesa/src/mesa/drivers/dri/radeon" blender

LD_LIBRARY_PATH and LIBGL_DRIVERS_PATH make Blender (and only Blender, or whichever program you specify) use the freshly compiled libraries. There is no need to reboot or to restart X, and other programs are unaffected.

Please note that you may need to replace the radeon part of the driver path with r200 or r300, depending on the driver you use.
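
If you test several programs or drivers, the two variables can be wrapped in a small shell function. This is a sketch under assumptions: run_with_local_mesa is a hypothetical helper, and the glx and src/mesa/drivers/dri paths are taken from the command above. It expects an absolute directory, because a ~ inside double quotes is not expanded by the shell.

```shell
# Hypothetical helper: run a command against a locally built mesa tree.
#   $1 = mesa source directory (absolute path)
#   $2 = DRI driver name (radeon, r200, r300, ...)
#   remaining arguments = the command to run
run_with_local_mesa() {
    mesa_dir=$1
    driver=$2
    shift 2
    # env sets the two variables for this one command only.
    env LD_LIBRARY_PATH="$mesa_dir/glx" \
        LIBGL_DRIVERS_PATH="$mesa_dir/src/mesa/drivers/dri/$driver" \
        "$@"
}
```

A call would then look like run_with_local_mesa "$HOME/mesa" r300 blender.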

Blender should run correctly.

Bisecting

We now officially start the bisecting process:
git bisect start
... and tell git that this was a "good" build.
git bisect good


Checking out the bad build

As we cannot pinpoint which git commit corresponds to the first bad Ubuntu build (7.6.0-1ubuntu1), we simply start at the newest commit in the mesa_7_6 branch:
git checkout mesa_7_6

This replaces the files in the mesa directory and its subdirectories (except .git) with the new ones.

We compile it:
make
and test it:
LD_LIBRARY_PATH="$HOME/mesa/glx" LIBGL_DRIVERS_PATH="$HOME/mesa/src/mesa/drivers/dri/radeon" blender

This time Blender should crash. We notify git:
git bisect bad

After this command, git checks out a commit roughly in the middle:
Bisecting: 482 revisions left to test after this (roughly 9 steps)
[ee066eaf6d0dd3c771dc3e37390f3665e747af2a] llvmpipe: Allow to dump the disassembly byte code.
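
The "roughly 9 steps" follows from the halving: each test cuts the remaining range in about half. A back-of-the-envelope estimate (not git's exact formula) counts how often a number can be halved before nothing is left. estimate_steps is a hypothetical name.

```shell
# Rough estimate of the number of bisect steps for a given number of
# revisions: count the integer halvings needed to reach zero.
estimate_steps() {
    n=$1
    steps=0
    while [ "$n" -gt 0 ]; do
        n=$((n / 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

estimate_steps 482
```

For 482 revisions this agrees with the figure git reports above.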

The cycle of make, test, and git bisect good or git bisect bad is repeated until git displays the first bad commit:

bfbad4fbb7420d3b5e8761c08d197574bfcd44b2 is first bad commit
commit bfbad4fbb7420d3b5e8761c08d197574bfcd44b2
Author: Pauli Nieminen
Date: Fri Aug 28 04:58:50 2009 +0300
r100/r200: Share PolygonStripple code.
:040000 040000 1b1f09ef26e217307a5768bb9806072dc50f2a14 eb20bf89c37b2f59ce2c243b361587918d3c9021 M src
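
The repeated make, test and mark cycle can be partly automated with git bisect run, which calls a script at every step and reads its exit status: 0 means good, 125 means "skip this commit", and any other value from 1 to 127 means bad. A crashed program usually ends with a status above 127 (139 for a segfault), which git bisect run treats as a reason to abort, so the status has to be mapped first. The following is a sketch only: bisect_status and test_commit are hypothetical names, the paths are those used in this post, and it assumes that you close Blender normally yourself when it starts correctly.

```shell
# Map a raw exit status to a value that `git bisect run` understands.
bisect_status() {
    if [ "$1" -eq 0 ]; then
        echo 0      # program exited normally: good
    else
        echo 1      # any failure, including a crash (e.g. 139): bad
    fi
}

# One bisect step: build, start Blender with the local mesa, report the result.
test_commit() {
    make || return 125    # build failed: ask git to skip this commit
    env LD_LIBRARY_PATH="$HOME/mesa/glx" \
        LIBGL_DRIVERS_PATH="$HOME/mesa/src/mesa/drivers/dri/radeon" \
        blender
    status=$?
    return "$(bisect_status "$status")"
}
```

To use it, put both functions into an executable script that ends with a call to test_commit (for example a hypothetical ~/bisect-step.sh), mark the good and bad commits as described above, and start git bisect run ~/bisect-step.sh from the mesa directory.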

As an interesting side note, the driver built from the first bad commit (bfbad4fb) does crash Blender, but not with a segfault. It even prints a message on the console: "drmRadeonCmdBuffer: -22".

The next commit in this branch, 4322181e6a07ecb8891c2d1ada74fd48c996a8fc, makes Blender crash the way we have come to know.

The previous commit (e541845959761e9f47d14ade6b58a32db04ef7e4) would be a good candidate to keep Blender running until mesa is fixed:

git checkout e541845959761e9f47d14ade6b58a32db04ef7e4
make
LD_LIBRARY_PATH="$HOME/mesa/glx" LIBGL_DRIVERS_PATH="$HOME/mesa/src/mesa/drivers/dri/radeon" blender

Acknowledgements
Thanks to Tormod Volden for creating and updating https://wiki.ubuntu.com/X/BisectingMesa and for various other information.

References
https://bugs.launchpad.net/ubuntu/+source/mesa/+bug/446632
http://bugs.freedesktop.org/show_bug.cgi?id=25354
https://wiki.ubuntu.com/X/Bisecting
https://wiki.ubuntu.com/X/BisectingMesa
https://launchpad.net/~xorg-edgers/+archive/ppa
http://www.kernel.org/pub/software/scm/git/docs/user-manual.html
http://cgit.freedesktop.org/mesa/mesa
