April 20th, 2010 @ 19:15

A while back, I used clutter (a very nice and simple animation toolkit that basically lets you easily work in a 3D environment with 2D objects) to do a little photo slideshow with a lot of customisations. I never even showed it to the person it was aimed at, though, because the whole thing was not satisfying enough: it either took ages to start or was not smooth, and it is hard to add a decent soundtrack when you can’t synchronize video and audio.

A simple solution would have been to render the animation once and then just do the postproduction. I had quickly looked for a way to send the animation output directly to gstreamer (since clutter supports gstreamer input, this pretty much made sense), but there was none. Another option would have been to use screen capture software like Xvidcap, but this stuff is too heavy for my poor laptop. Consequently, I just gave up back then.

What I had completely overlooked is that clutter uses OpenGL for the rendering, so all I had to do was dump each frame myself using glReadPixels, or use something like Yukon to do the dirty work. After some quick googling, I found this clutter mailing list thread about capturing the clutter output to a video file, which mentions the clutter_stage_read_pixels function, which does all the glReadPixels magic and even puts the result in a more standard format. It also points to gnome-shell’s recorder, which does the glReadPixels work and outputs it to a gstreamer pipeline, plus some extra fancy things (since they are doing screencasts of gnome-shell features, they draw the mouse cursor on top of each frame). So all I have to do now is put things together :)

One of the bad things I figured out is that clutter_stage_read_pixels calls clutter_stage_paint, so mixing the gnome-shell recorder approach with clutter_stage_read_pixels results in a nasty infinite loop if you don’t protect the whole thing. Even though this means painting things twice, I guess this is a much easier approach than having to use python-opengl or something along those lines.
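The kind of protection meant here can be sketched with a simple re-entrancy flag. This is a toy model that does not import Clutter at all: `FakeStage`, its `paint` and `read_pixels` methods and the `Recorder` class are hypothetical stand-ins which only mimic the fact that reading the pixels triggers another paint.

```python
class FakeStage:
    """Hypothetical stand-in for a stage whose read_pixels() repaints."""

    def __init__(self):
        self.paint_handlers = []
        self.paint_count = 0

    def paint(self):
        self.paint_count += 1
        # Iterate over a copy so handlers cannot disturb the loop.
        for handler in list(self.paint_handlers):
            handler(self)

    def read_pixels(self):
        # Mimics the behaviour described above: reading triggers a paint.
        self.paint()
        return b"\x00" * 4  # dummy pixel data


class Recorder:
    """Grabs one frame per paint, ignoring the paint it causes itself."""

    def __init__(self, stage):
        self.frames = []
        self._grabbing = False
        stage.paint_handlers.append(self.on_paint)

    def on_paint(self, stage):
        if self._grabbing:
            # This paint comes from our own read_pixels() call: bail out.
            return
        self._grabbing = True
        try:
            self.frames.append(stage.read_pixels())
        finally:
            self._grabbing = False


stage = FakeStage()
recorder = Recorder(stage)
stage.paint()
print(len(recorder.frames), stage.paint_count)
```

One external paint yields one recorded frame, at the cost of a second (ignored) paint, which is the "painting things twice" trade-off.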

Another bad thing I encountered was that the Python bindings for clutter_stage_read_pixels are broken at the moment (pyclutter 1.0.2). The first problem is that the argument parsing seems to be broken: changing PyArg_ParseTupleAndKeywords to a simple PyArg_ParseTuple gets things “working”, and gdb indicates a segfault in a PyDict_Check of the keywords argument:

Program received signal SIGSEGV, Segmentation fault.
0x00000032d34ecd9c in _PyArg_ParseTupleAndKeywords_SizeT (args=(0, 0, 500, 200), keywords=, format=
0x7ffff000d9ac "dddd:ClutterStage.read_pixels", kwlist=0x7ffff022f6c0) at Python/getargs.c:1409
1409 (keywords != NULL && !PyDict_Check(keywords)) ||
(gdb) bt
#0 0x00000032d34ecd9c in _PyArg_ParseTupleAndKeywords_SizeT (args=(0, 0, 500, 200), keywords=
, format=
0x7ffff000d9ac "iiii:ClutterStage.read_pixels", kwlist=0x7ffff022f6c0) at Python/getargs.c:1409

After asking on #clutter, ebassi immediately caught the problem: there was a missing “kwargs” bit in the Python binding override definition, so the kwargs were never actually passed to the C wrapper, which was getting garbage instead.

The other problem was that the returned data was empty. This was simply because the buffer returned by the C function was interpreted as a NULL-terminated string, which is wrong for binary data: pixel data may well contain (or even start with) zero bytes. The fix was simply to specify the number of bytes to read when building the string.
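The same pitfall can be reproduced from pure Python with ctypes. This is only an illustration of the C-string versus length-based read, not the actual pyclutter fix; the pixel values are made up.

```python
import ctypes

# Four RGBA pixels; the first one is black, so the data starts with NUL bytes.
pixels = b"\x00\x00\x00\xff" + b"\x10\x20\x30\xff" * 3

# A C-side char buffer holding exactly len(pixels) bytes.
buf = (ctypes.c_char * len(pixels)).from_buffer_copy(pixels)

as_c_string = buf.value   # read as a C string: stops at the first NUL byte
as_raw_bytes = buf.raw    # read with the known length: keeps every byte

print(len(as_c_string), len(as_raw_bytes))
```

Reading the buffer as a C string gives an empty result because the very first byte is zero, while the length-based read returns the whole frame.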

Both issues are now fixed in pyclutter git, and should be available on the next stable release.

The remainder of the port was pretty straightforward. The only problem was that I had no experience with gstreamer, which cost me quite a lot of time. Here are a few things I discovered:

  • The --gst-debug-level command line argument is really, really useful, especially at levels 3 and 4: it outputs a lot of valuable information on what’s going on and what’s not working.
  • The whole caps story is really important. After spending an hour trying to figure out why my pads wouldn’t negotiate their caps, I found out that they couldn’t because one of my caps was wrong (the endianness one). After a few more hours I figured out that I had to set the caps on each buffer, and that I actually only had to set them on the buffers.
  • Timestamps are not magically inferred (at least not without extra gstreamer elements) and should be set by hand using the buffer.timestamp Python property (this is not very well documented in the Python bindings documentation imho).
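For the timestamps, the arithmetic itself is simple once you know that GStreamer clock times are counted in nanoseconds. The sketch below assumes a constant frame rate and does not import gstreamer; `frame_timing` is a hypothetical helper name, not something from the recorder source.

```python
NANOSECONDS_PER_SECOND = 10 ** 9  # GStreamer clock times count nanoseconds


def frame_timing(frame_index, fps):
    """Return (timestamp, duration) in nanoseconds for a constant-rate frame."""
    start = frame_index * NANOSECONDS_PER_SECOND // fps
    end = (frame_index + 1) * NANOSECONDS_PER_SECOND // fps
    # Deriving the duration from two rounded timestamps avoids drift when
    # fps does not divide one second evenly (e.g. 30 fps).
    return start, end - start


for n in range(3):
    print(n, frame_timing(n, 30))
```

The first value is what would go into buffer.timestamp for frame n; storing the second one as the buffer duration is an assumption on my side, not something the points above require.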

Well, that’s pretty much it. I used a Clutter Python demo from fedora-tour and here is the result: Clutter Stage Recorder demo. The whole source is available below :)

Read the rest of this entry »

April 4th, 2010 @ 00:04

I might not have mentioned it until now, but for about a year and a half I’ve been a photography addict. Basically my girlfriend got a DSLR for Christmas, and I got one (a Canon EOS 450D) for my birthday two months later. After having fun with my 50mm f/1.8 lens and its nice depth of field effects, and after being taken for a paparazzo with my 250mm telephoto zoom, I thought I’d try something larger: panoramas.

I’m using Hugin (with autopano-sift-C for the keypoint detection and matching, nona for the photometric/geometric remapping and enblend for the merging). It might sound complicated, but it’s basically just 3 clicks and a lot of processing time (if you don’t get into the details).

My first try was at the Carnegie Museum in Pittsburgh. No tripod allowed, so the camera was handheld. It looks quite beautiful imho, apart from the weird line at about three quarters of the height. I should probably rerun the stitcher.

Carnegie Museum Panorama

My next try was a panorama of the CMU campus showing, among other things, the Fence, the University Center and the Gates building. I took the pictures at about 1pm, with pretty nice sun and a few clouds. I first tried using the whole 120-picture set, but it resulted in a bunch of geometric errors, mostly the leftmost flag and the pathway being broken:

CMU Panorama - ETOOMUCHPICS

I then selected a core set of pictures, stitched them, and then added a few others to fill in the missing parts, for a total of 21 base pictures. This resulted in the following panorama:

CMU Panorama (without sky)

I cropped the bottom part of the result to drop a broken pathway and a bunch of useless grass, plus another bunch of missing (i.e. black) parts. But still, it’s far from perfect: huge parts of the sky are missing, and there are some unpleasant bits (like the white line in the sky near one of the trees). Luckily, I took an image restoration course (well, it was a more generic vision course, but it addressed this among other things), so I know that there are some pretty efficient inpainting methods (inpainting being the process of creating texture based on the surrounding pixels), and nicely enough one of those is implemented as a GIMP plugin, the GIMP Resynthesizer. I first tried the manual way, setting the plugin parameters myself, but I found out it wasn’t quite the right way to do it:

WTF Gimp Inpainting ?

Then I discovered an option called “smart remove selection” which sets everything automagically, and it worked (though I still suspect there are some pretty bad memory issues or something similar, since it was picking textures from outside of the selected radius). Using that option, I generated the missing sky parts and removed the ugly bits, and here we go:

CMU Panorama

Nice, heh?

I should probably also mention that I tried the pure CLI way (i.e. not running the Hugin GUI) and that it works great: autopano-sift-c takes a bunch of pictures and produces the keypoints and matches, autooptimiser optimizes the resulting homographies, and pto2mk creates a Makefile which produces the final panorama (by running nona and enblend/enfuse). It’s not very well documented (it took me a while to figure out that I could avoid the GUI and run most of the expensive computations remotely), but it works flawlessly.

March 31st, 2010 @ 20:01

Keyboard shortcuts are always a great matter of debate, and the whole problem is that they are most often chosen based on assumptions about the end user’s keyboard layout.

For instance, take this metacity commit: Change default cycle_group keybinding to Alt-grave. This change looks perfectly harmless, right? Well, not quite. It’s most likely based on the assumption that the end user has a qwerty keyboard layout (and it makes perfect sense there). But let’s take an azerty layout. Grave is on the è/7 key, which is even farther from alt or tab than F6 is (well, not by much I agree, but it might be even worse on other layouts). Is such a change really worth it then?

Note that this also triggers a nasty bug which makes alt+7 and alt+shift+7 trigger the binding as well, while alt+grave is actually alt+altgr+7. This has been keeping me from nicely switching to my window n°7 in irssi for months (good thing that this window holds a really low traffic channel…).

All in all, I guess that the real problem is not that this change was made, but rather that we might need a system for layout-dependent keybindings, or maybe hardware-location-based keybindings (i.e. the key above the Tab key would trigger this keybinding independently of the layout).

Initially published on Mar 24, 2010 @ 8:22

Update: this change has been reverted for the GNOME 2.30 release. Even though I’m happy that the problem is “fixed”, it’s sad that the underlying problem (Alt+Shift+7 triggering Alt+`) is still there.

March 26th, 2010 @ 18:12
The ramp between the new Gates Hillman Center and the Purnell Center for the Arts at night
March 26th, 2010 @ 03:43
Sense : this picture makes none
No clue where this image first appeared, so I can’t give proper credit.

This image reminds me of the video clip of Takin’ Back My Love by Enrique Iglesias and Ciara. I fail to see the logic behind the different attacks: she drops his paintings and clothes into the swimming pool, while he merely spills milk on the floor. WTF?

By the way, I had been wondering for a while how to make links which directly point to a specific time in a Youtube video. Well, now I know.

March 26th, 2010 @ 01:28

After migrating a bunch of stuff from one (about to expire) server which ran lenny to a new one running squeeze, a friend’s blog, which is powered by Dotclear, appeared heavily broken. His posts appeared empty: they were still there and the titles were right, but nothing else showed up (no url, no author, no date, no content). After a little bit of investigation, I figured out that the problem was that squeeze ships PHP 5.3, and that my friend’s version of Dotclear obviously didn’t support it. I checked the Dotclear website and found out that since PHP 5.3 support is planned for the upcoming Dotclear 2.2, the latest version (Dotclear 2.1.6, which my friend already had, actually) did not support it. I then checked the PHP website to find the 5.3 release date: 30 June 2009. Wow.

I looked a little deeper in the Dotclear forums, and found a patch which is actually a workaround for the problem. This workaround has been available since the 20th of July, 2009, and the Dotclear developers won’t include it even in the 2.1 branch because it’s a workaround and not a real fix:

Le patch n’est pas appliqué parce que, tu le dis toi-même, c’est une rustine. Ça peut paraître vieux jeu, mais nous préférons garder un code propre et régler les problèmes correctement.

Which translates to “The patch hasn’t been applied because, as you say yourself, it’s a stopgap. It might sound old-fashioned, but we’d rather keep the code clean and fix problems properly”. Well, it sounds like a great plan, which would be fine if it did not take them ages to produce that clean code :) (arguably, since the patch is easily available, my point might be pretty much void, but still, it’s not official: the average user grabbing the latest official release and installing it on hosting which provides PHP 5.3 might easily get confused).

March 25th, 2010 @ 23:45

There’s an outstanding bug right now which makes the cvCanny edge detection function in OpenCV segfault on x86_64 systems. This post is an open attempt to track my debugging process :)

  • Bug encountered. I know it’s x86_64 specific since I ran the same code on an i686 machine a few hours ago (with a home compiled OpenCV, though).
  • Googled it : found reports on both OpenCV and RedHat bugtrackers.
  • Installed debug symbols, ran under gdb : all values I may need are optimized out.
  • Fetched OpenCV source, compiled it in debug mode.
  • cvCanny works great in debug mode.
  • Recompiled in release (optimized) mode to check if it is a distro-specific bug (both reports are from Fedora users).
  • Whoa, release mode compilation is so slow :( But bug confirmed: it segfaults again. Time to instrument the code.
  • Filled the cvCanny function with printfs and fflushes to track its execution. Looks like it tries to access an element at index -514. Ugh. What’s even more frightening is that it successfully does so on another array.
  • After running the same instrumented code on my i686 machine, it appears that the indexes are right and that the same indexes are accessed without any problem in optimized mode in the i686 build.
  • Reading the code tells me that the accesses at negative indexes are legit since the array origin is shifted from the actual allocated memory blob start. Well, that’s good, since it explains why it works well in debug mode or on i686 setups, but that’s pretty bad because it’s going to be awful to narrow down.
  • Ok, doing the access by hand (i.e. doing _map[-514] instead of _map[j - mapstep]) works. This is getting crazy. Doing k = j - mapstep and accessing _map[k] segfaults as well. Huh.
  • After an hour of heavy fprintfs, I figured out that long k = j - mapstep; gave me a k which wasn’t the int value (-514) but rather the unsigned int value (4294966782), while doing int k = -514; long k2 = k; printf("%d %ld\n", k, k2); in a very simple program prints -514 -514, even with -O3 or -O5 and all the options used for the OpenCV release build. Since we are working with 64-bit pointers (i.e. the size of long integers), this is probably the issue: when accessing _map[k], the code dereferences _map + k, which fails since it ends up dereferencing _map + 4294966782 instead of _map - 514 (the 32-bit value gets zero-extended instead of sign-extended).
  • Doing volatile int k = j - mapstep; and accessing _map[k] works, and cvCanny runs great now. This isn’t a real fix though, just a workaround. There’s most likely a compiler bug underneath.
  • Posted a summary of my findings and the workaround on the bug report on the OpenCV tracker.
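The two numbers from the debugging session are the same 32 bits read in two different ways, which is quick to check. The snippet below only illustrates the arithmetic (in Python through ctypes, rather than in the C++ of cvcanny.cpp); it does not reproduce the suspected compiler bug.

```python
import ctypes

k = -514                                  # the index the code intends to use

# The same 32-bit pattern, read as an unsigned value.
as_uint32 = ctypes.c_uint32(k).value

# Widening to 64 bits: the correct way keeps the sign...
sign_extended = ctypes.c_int64(ctypes.c_int32(k).value).value
# ...while the broken way pads the upper 32 bits with zeros.
zero_extended = ctypes.c_int64(as_uint32).value

print(as_uint32, sign_extended, zero_extended)
```

With the zero-extended value used as a pointer offset, the access lands roughly 4 GB past the array instead of 514 elements before the shifted origin, hence the segfault.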

Patch against latest svn (it should apply nicely to the 2.0.0 release as well) :

Index: cvcanny.cpp
===================================================================
--- cvcanny.cpp	(revision 2908)
+++ cvcanny.cpp	(working copy)
@@ -239,7 +239,8 @@
                 {
                     if( m > _mag[j-1] && m >= _mag[j+1] )
                     {
-                        if( m > high && !prev_flag && _map[j-mapstep] != 2 )
+                        volatile int k = j - mapstep;
+                        if( m > high && !prev_flag && _map[k] != 2 )
                         {
                             CANNY_PUSH( _map + j );
                             prev_flag = 1;
@@ -253,7 +254,8 @@
                 {
                     if( m > _mag[j+magstep2] && m >= _mag[j+magstep1] )
                     {
-                        if( m > high && !prev_flag && _map[j-mapstep] != 2 )
+                        volatile int k = j - mapstep;
+                        if( m > high && !prev_flag && _map[k] != 2 )
                         {
                             CANNY_PUSH( _map + j );
                             prev_flag = 1;
@@ -268,7 +270,8 @@
                     s = s < 0 ? -1 : 1;
                     if( m > _mag[j+magstep2-s] && m > _mag[j+magstep1+s] )
                     {
-                        if( m > high && !prev_flag && _map[j-mapstep] != 2 )
+                        volatile int k = j - mapstep;
+                        if( m > high && !prev_flag && _map[k] != 2 )
                         {
                             CANNY_PUSH( _map + j );
                             prev_flag = 1;
March 25th, 2010 @ 01:46

Today I’ve been looking at opencv-python for a quick project (I’d like to practice OpenCV a little bit). I installed the opencv-python package on Fedora 13, headed to the samples directory (/usr/share/opencv/samples/python/), started running one of them and… boom, segfault. Tried another one (the inpainting one), and it worked. A third one… segfault. Most of the samples in there failed, mostly with SWIG errors about wrong parameters, always mentioning int64 (I’m using an x86_64 kernel & distribution).

After half an hour of failing to get opencv.i686 to work alongside my x86_64 python, I went back to the OpenCV website to check whether there were some known serious problems with x86_64 systems and… I discovered this:

Starting with release 2.0, OpenCV has a new Python interface. (The previous Python interface is described in SwigPythonInterface.)

Wait wait wait, you mean that all this time I was running the OLD, pre-2.0 Python interface? Why the hell does the opencv-python 2.0 package provide both the new and the old interfaces? (Well, I know the answer: backwards compatibility.) Meh :( Anyway, I wish the old samples would get ported to the new interface… At the moment there’s no sample using it at all :/

March 24th, 2010 @ 00:53

Since Sunday I have been testing Picasa Web face recognition on my set of pictures. After an hour of initial processing, I was presented with an interface showing a list of clusters of faces, which I have to check, clean of false positives and name. While it seemed to work great for the very first clusters (each of which correctly grouped about 50 to 100 faces of the same person), it quickly became apparent that the whole thing was not that great.

Here are a few rants :

  • It seems to be heavily influenced by facial expression and angle (i.e. it’ll often make two clusters of faces of the same person depending on whether the head is tilted to the left or to the right).
  • It doesn’t reconsider the clustering after the initial processing: I’m pretty sure that after I tag a bunch of clusters as the same person, it could easily merge the remaining clusters into a single one.
  • It keeps giving me “communication errors”. I’m used to the “click and it’s immediately there” scenario with Google services, and I have to say that this service is definitely not a good example. About two thirds of my actions result in such an error, which takes about 10 seconds to arrive; successful actions also take several seconds (5 to 10) to complete. That is not really efficient when you have pictures of about 800 different people, which yields lots of clusters of 1 or 2 faces that you just want to ignore, and it takes about 20 seconds per cluster to actually get it ignored.

I know this is still in development, so the actual recognition problems are ok, but meh, the communication errors are really really annoying…

Update (03/25/10): I don’t know if they fixed the problem or if it was just pure luck, but I was able to tag 1500 faces without a single problem in about half an hour. Yippee!

March 22nd, 2010 @ 01:56

For a project midterm presentation, we were asked to produce a bunch of slides explaining our project architecture and implementation choices. Apart from the obvious things (libraries in use, network protocol…), I had no real clue what I could put in, so I thought I’d just throw in some UML-like diagrams and that it would be fine. The only question was: how to produce these diagrams?

Since the project code was written in Python, all the inheritance relations were already present in the code and could be introspected, so it was theoretically possible to produce the inheritance diagram automatically. It actually is possible in practice too: Epydoc (a documentation generator for Python code) implements both the parsing and the diagram generation. The only thing is that I wasn’t satisfied with the Epydoc diagrams since they were limited to inheritance relationships, while I also wanted to include usage relationships and display only the main methods and variables of my objects.

I thus wrote Umlpy, a UML-like class diagram generator for Python code, which depends on Epydoc (for the parser) and python-graphviz (for the graph generation; it produces nicely spaced graphs and can output jpg, png or pdf files, and probably more). It handles the aforementioned requirements through docstring parsing and introspection. Check the Umlpy README file for more documentation on how to use it. It took me about 10 minutes to get to the result I was expecting (it’s basically about adding a little docstring for each usage relationship, and copy-pasting a docstring onto the methods or variables you want to see on the diagram).
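To give an idea of what introspecting the inheritance relations looks like, here is a minimal sketch that uses neither Epydoc nor python-graphviz and is not Umlpy’s actual code. The example classes and the `inheritance_edges` and `to_dot` helpers are hypothetical; the output is plain Graphviz DOT text, which the dot tool should be able to render.

```python
class Shape:
    pass


class Polygon(Shape):
    pass


class Square(Polygon):
    pass


def inheritance_edges(classes):
    """Yield (child, parent) name pairs found by introspection."""
    for cls in classes:
        for base in cls.__bases__:
            if base is not object:  # skip the implicit root class
                yield cls.__name__, base.__name__


def to_dot(classes):
    """Render the inheritance edges as Graphviz DOT source text."""
    lines = ["digraph classes {"]
    for child, parent in inheritance_edges(classes):
        lines.append('    "%s" -> "%s";' % (child, parent))
    lines.append("}")
    return "\n".join(lines)


print(to_dot([Shape, Polygon, Square]))
```

Usage relationships cannot be recovered this way, which is why they need an explicit hint such as the docstring annotations mentioned above.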

This wouldn’t be complete without the mandatory screenshot: this example code results in this diagram:
Umlpy result example