BBTouch,code,multitouch

CIImage and the bitmap problem (25 Jul)

Sooo...

CIImages, they're the new hotness. They go well with a smattering of CIFilters (also high on the hotness scale). However.. the thing about CIImages (and filters) is that they are lazy: they wait until you use them for something (like drawing to the screen) before they actually do the render. This is pretty cool if all you want to do is see what is going on (i.e. your final destination is a view of some kind, which I imagine it is for most applications).. however, if you are like me, and want to get at the raw bits, then it can be a bit harder.

well.. actually it isn't hard to GET to the bits 'n' bytes, but it is slooooow. (or so it seems from my various experiments)

I have tried all the ways i can think of to freeze-dry a CIImage into a useable byte buffer:

I started simple and just made an NSBitmapImageRep with - (id)initWithCIImage:(CIImage *)ciImage.
I generated a CGBitmapContext and then drew the CIImage into it.
I generated an NSGraphicsContext with a bitmap, and then drew into its CIContext.
I tried using an offscreen version of the NSViews that I was rendering the CIImages into (the ones that render so fast when you can SEE them) and then using - (void)cacheDisplayInRect:(NSRect)rect toBitmapImageRep:(NSBitmapImageRep *)bitmapImageRep to get the bits out of them.
And I tried all sorts of crazy-ass combinations of all of the above.

Sadly, every single one of them seems to take at least 15-30x as long as it takes to render into a visible view. (in fact, they all take such a similar amount of time that I am pretty sure they are all doing the same thing under the hood). I am by no means an expert on the new CIImage/CIFilter stuff, but I am presuming that this all has to do with where the image processing is taking place. and I am also presuming that in the case of a visible view, all those bits are out on the graphics processor, and the minute I try to get my grubby paws on them, they have to be moved all the way back to the main processor, hence the terrible soul-crushing overhead.

(as a reference: on my 2.33 GHz MacBook Pro with a shiteload of RAM, the CIImages, if left to their own devices in a poorly programmed NSView subclass, will render out to the screen in about 400us. once I try to make that data available to the application, it takes more like 25000us. Which is too slow)

There are still a few more options, mostly involving rendering the CIImage into an OpenGL texture and trying to get at the bits that way. (which i may try this weekend)

and the last method, which would be the holy grail of methods would be to somehow distill the blob detection algorithm into a form that could be compiled into a CIFilter kernel. and then find some way to spit out the blob tracking info... but that is probably impossible.

sigh..

in any case, I have put down the idea of switching to a fully CIImage backed algo for the BBTouch stuff (at least for the time being). I just can't get it to go fast enough. So! if anyone knows of any good ways of getting the byte buffer out of a CIImage in a speedy manner, I would love to know it.

code,CoreImage

CIFilter infinite Extent problems (24 Jul)

I am going to post this here in hopes that others who have had this issue can find it easily. I spent many many hours today fiddling with custom image units (ie CIFilters) and this one stumped me for quite some time.

OK, here is the thing. If you follow Apple's (generally very good) tutorial on creating custom filters, you go through this rough timeline:

1) total confusion.. what the hell is this kernel thing?
2) general understanding .. ahh haa!
3) coding .. this is so EASY! it is great!
4) validating .. WTF?
5) fiddling and finding all the syntax errors by hand in your kernel
6) validating: PASS! yay!
7) loading into QC for testing.. this is soo exciting!!!
8) Cannot Render: Bounds: Infinite (must crop to render).. WTF?!?!
9) spiral into deep deep dark place because after many hours of scouring the docs you still cannot figure out how in the name of Zeus's BUTTHOLE you get a nice filter like ALL THE REST that render without cropping!!

so, here is the solution:

After many hours of looking for clues (and spending an inordinate amount of time trying to figure out what the ROI had to do with the output extent (hint, it doesn't)) I stumbled across the fact that you can pass extra info along with the kernel when you apply it, via three special option keys:
kCIApplyOptionExtent, kCIApplyOptionDefinition, and kCIApplyOptionUserInfo.

the one that we care about for today is kCIApplyOptionExtent. and you use it like so:

in your - (CIImage *)outputImage method:

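// Output extent as [x, y, width, height]; here it matches the input image.
CGRect inputExtent = [inputImage extent];
NSArray * outputExtent = [NSArray arrayWithObjects:
    [NSNumber numberWithFloat:inputExtent.origin.x],
    [NSNumber numberWithFloat:inputExtent.origin.y],
    [NSNumber numberWithFloat:inputExtent.size.width],
    [NSNumber numberWithFloat:inputExtent.size.height],nil];

// src and bg are the kernel arguments, created earlier in this method (not shown).
// The option key/value pair goes after the kernel arguments, before the nil.
return [self apply:_BBSubtractionCompositeFilterKernel, src, bg,
   kCIApplyOptionExtent, outputExtent, nil];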

and that is it. that will set the output extent to be the same as the inputImage extent. You can, of course set it to whatever you want, but i imagine the general case is that you want to get out something just as big as you sent in.

I hope this helps someone!

BBTouch,meta

BBOSC updates (24 Jul)

I just sent some minor changes to the BBOSC repository (as well as the BBTouch one too.. ) to fix some memory leakage with the BBOSC stuff.. no leaks now.

BBOSC repository:
http://code.google.com/p/bbosc/
BBTouch repository:
http://code.google.com/p/opentouch/

I really need to make a nice link to these repositories on the sidebar or something so I don't have to keep including them in each post :-)

BBTouch,code,iPhone,multitouch

Some thoughts on BBTouch optimization (23 Jul)

SO, the old programming lesson, never optimize before its time, is a good lesson. However, I think that BBTouch has reached a milestone, and is in need of some optimization.

To this end I have been applying the various instruments that Xcode provides to identify the bottlenecks in the code, and then the good old standby of just printing out the time before and after various events to see if I can reduce the amount of time spent in the tight loops.
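The 'print the time before and after' trick can be sketched in plain C like this. It is only a sketch: `sum_pixels` is a made-up stand-in for whatever tight loop is being measured, and `gettimeofday` is just one of several clocks you could use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

/* Wall-clock time in microseconds, via gettimeofday(). */
int64_t now_us(void) {
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);
    assert(rc == 0);
    (void)rc;
    return (int64_t)tv.tv_sec * 1000000 + (int64_t)tv.tv_usec;
}

/* Hypothetical stand-in for the work being timed: touch every pixel once. */
uint32_t sum_pixels(const uint8_t *pixels, size_t count) {
    uint32_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += pixels[i];
    }
    return sum;
}

/* Run the work once and return how many microseconds it took. */
int64_t time_sum_pixels(const uint8_t *pixels, size_t count, uint32_t *sum_out) {
    assert(pixels != NULL && sum_out != NULL);
    int64_t start = now_us();
    *sum_out = sum_pixels(pixels, count);
    return now_us() - start;
}
```

Printing the value returned by `time_sum_pixels` for each frame gives the kind of per-stage microsecond figures quoted in this post. A single sample is noisy, so averaging over many frames is usually more trustworthy.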

For BBTouch, at this point, there are two big bottlenecks:

first, when the data comes out of the sequence grabber, there is lots of buffer moving and color-space converting going on before you get the 'final' NSImage that is used for the blob detection. this (according to shark) takes up about 12% of the processing time of the app during blob detection.

second, the blob detecting itself. There is an earlier post on that subject so I won't go into it much here, but suffice it to say that I have optimized the blob detector (at least as it applies to the data structures I am currently using) to probably within 85-90% of what I can squeeze out of it. (for the last 10-15% I would have to trade off too much code maintainability for a small performance increase, and I don't think it is worth that). in any case, shark tells me that BBTouch is spending about 14% of its time just blob detecting, which is fairly intensive, but not too bad considering that it has to look at every single pixel of every image. so we will be happy with that for now.

(there is also quite a bit of processor being wasted just drawing the raw video into the config view, but that goes away when you close the config window, so I am going to ignore that for now)

SO! that leaves us with the fairly large loss of 12% of our time just getting the bits out of the sequence grabber. In a perfect world, I would just pass the pointer to the start of the SG pixel buffer on up to the blob detector, and there would be NO time lost in the shuffling of data. Unfortunately we need those bits to be in a certain order for the blob detector to work at peak efficiency, and that order is a planar, packed 8-bit greyscale image. (i.e. a single stream of bytes, each one representing a pixel, and if the image is 640 pixels wide then the 641st byte is the first pixel of the second row)
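As a small illustration of why the packed layout is so convenient, here is the addressing math in C. The function names are made up for this sketch and are not taken from the BBTouch source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of pixel (x, y) in a packed 8-bit greyscale buffer:
   one byte per pixel, rows stored back to back with no padding. */
size_t packed_offset(size_t x, size_t y, size_t width) {
    assert(x < width);
    return y * width + x;
}

/* Read the grey value of pixel (x, y). */
uint8_t packed_pixel(const uint8_t *buf, size_t x, size_t y, size_t width) {
    assert(buf != NULL);
    return buf[packed_offset(x, y, width)];
}
```

With a 640 pixel wide image, `packed_offset(0, 1, 640)` is 640, i.e. the 641st byte is the first pixel of the second row.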

Why is this? In order for this particular algorithm to work properly, the area we are blob detecting in (known to BBTouch as the ROI, or region of interest) needs to be bounded by at LEAST a single pixel-width line of 'bad pixels' (in the generic case, black pixels, or 0x00 valued bytes) otherwise the algorithm will go out of bounds and not work. so that is important.
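One way to guarantee that border is to zero out the outermost ring of pixels before detecting. This is a sketch under the assumption that the ROI is the whole packed buffer; the real BBTouch ROI handling may well differ, and `clear_border` is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Overwrite the outermost one-pixel ring of a packed 8-bit greyscale
   buffer with 0x00 so a border-following algorithm cannot walk off the edge. */
void clear_border(uint8_t *buf, size_t width, size_t height) {
    assert(buf != NULL && width >= 2 && height >= 2);
    memset(buf, 0x00, width);                         /* top row */
    memset(buf + (height - 1) * width, 0x00, width);  /* bottom row */
    for (size_t y = 1; y + 1 < height; y++) {
        buf[y * width] = 0x00;                        /* left column */
        buf[y * width + width - 1] = 0x00;            /* right column */
    }
}
```

The cost is one pass around the perimeter, which is tiny next to the per-pixel work of the detector itself.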

the other reason is simplicity in addressing the memory block. if we have an unpacked non-planar image format, then we have to do a bunch of extra math to figure out where each pixel is in the byte-soup, and over 220,000 pixels, a few extra adds and multiplies add up to lots of extra lag.

So! to that end: I am playing around with CIImages. CIImages are really great for lots of reasons, and bad for a few reasons. But the good outweighs the bad in this case. First, the CIImage is never rendered until the very last second when it is 'needed'. this is great. Currently the SG buffer gets rendered into a CGImageRef, then re-drawn as an 8-bit greyscale CGImageRef, then those bytes are stuffed into an NSBitmapImageRep which is then attached to an NSImage object. I don't actually know how many times the entire pixel buffer gets copied in that case, but it takes about 4000 microseconds (us). And all this happens before I even start thinking about blob detecting (which also takes about 4000us for a 640x480 image). (don't forget, at 30 fps, you only have 33000us to mangle data and stuff before the next frame comes barging in, and if you take all that time just detecting blobs, then the other apps who are, say, rendering exciting openGL worlds based on your MT input will have no processing power to do that. so time is of the essence.)

OK, so here are some positive results: I replaced all the crap in the SG with a single 'createCGIMage' call (which jams the pixels from the SG into a CGImageRef format, but doesn't actually do anything, i.e. it doesn't render it right away like an NSImage does) and then I wrap that in a nice CIImage. (also not rendered, so the data hasn't actually gone anywhere, just the pointer changing hands)

Of course, i need the image to be in 8 bit planar greyscale. CIImage doesnt do that, but it comes close. I can use the CIMaximumComponent filter to make an ARGB image that is kinda planar.

(i.e. the data format for a regular ARGB pixel might look like: A:255 R:128 G:17 B:92, and after the filter it looks like A:255 R:128 G:128 B:128. this is the maximum, also known as the 'value' or the 'brightness', which is all I care about for blobs. this 'faux' planar is good because I really just need to look at a single byte (say, the red component) to get the value that I need to check against, so it is almost a planar stream)
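To make the 'faux planar' idea concrete, here is a CPU-side sketch in C of what a maximum-component pass does to each pixel, and how the detector could then read one byte per pixel. It assumes the bytes sit in memory in A, R, G, B order, which depends on how the bitmap is actually rendered. The function names are made up, and this is not how Core Image itself implements the filter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest of the three colour components (the 'value' / brightness). */
uint8_t max_component(uint8_t r, uint8_t g, uint8_t b) {
    uint8_t m = r > g ? r : g;
    return m > b ? m : b;
}

/* In place: set R, G and B of every pixel to their maximum, leave A alone.
   Assumes 4 bytes per pixel in A, R, G, B order. */
void argb_to_faux_planar(uint8_t *argb, size_t pixel_count) {
    assert(argb != NULL);
    for (size_t i = 0; i < pixel_count; i++) {
        uint8_t *p = argb + 4 * i;   /* p[0]=A p[1]=R p[2]=G p[3]=B */
        uint8_t v = max_component(p[1], p[2], p[3]);
        p[1] = v;
        p[2] = v;
        p[3] = v;
    }
}

/* Grey value of pixel i: just read the red byte. */
uint8_t faux_planar_value(const uint8_t *argb, size_t i) {
    assert(argb != NULL);
    return argb[4 * i + 1];
}
```

For the example pixel A:255 R:128 G:17 B:92, the pass leaves A:255 R:128 G:128 B:128, and `faux_planar_value` returns 128.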

what does all this cost? i run a big ole CIFilter on the image buffer, and then stick it into a CIImage?
well, that bit goes down from 4000us to around 45us. No rendering of the buffer. god I love Apple.

"But wait!" you say! you have to render it to a pixel buffer at some time, otherwise you cant blob detect. this is true. and rendering that out to a nice bitmap takes about 100us. so now we are up to about 145us instead of 4000us. pretty good. all thanks to apple's Core Image framework. neato!

Now, the downside (you knew it was coming). the downside is that the image is no longer in a nice 8 bit format, it is still in ARGB. I still need to alter the blob detecting code to handle the different bit format. (I haven't done this just yet) I am guessing that the extra stuff to deal with this will add 1000us to my blob detection processing. this should (theoretically) still yield about a 2850us gain in time (from about 8000us for the SG render + blob to about 5150us render + blob), which is still a hefty 36% decrease in processor time. this will hopefully free up a nice chunk of about 28000us per frame for any other apps on the machine.
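The arithmetic behind that estimate, written out as a small C sketch. The inputs are the rough measurements and guesses quoted in this post (145us to render, 4000us to detect, a guessed 1000us ARGB penalty, 8000us for the old path), not fresh benchmarks, and the function names are made up.

```c
#include <assert.h>

/* Microseconds available per frame at a given frame rate. */
double frame_budget_us(double fps) {
    assert(fps > 0.0);
    return 1000000.0 / fps;
}

/* Percentage of time saved going from before_us to after_us. */
double percent_saved(double before_us, double after_us) {
    assert(before_us > 0.0);
    return 100.0 * (before_us - after_us) / before_us;
}

/* Estimated per-frame cost of the CIImage path, from the figures above. */
double estimated_cost_us(void) {
    const double render_us = 145.0;        /* CIFilter + render to bitmap */
    const double blob_us = 4000.0;         /* current blob detection */
    const double argb_penalty_us = 1000.0; /* guessed cost of reading ARGB */
    return render_us + blob_us + argb_penalty_us;
}
```

`estimated_cost_us()` comes to 5145us, `percent_saved(8000.0, 5145.0)` is a little under 36, and `frame_budget_us(30.0)` minus the cost leaves a bit over 28000us of each frame for everything else.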

(not to mention if one would want to port this to a slower architecture, like the iPhone, say, if you wanted to detect blobs with the built-in camera and then send the TUIO information via wifi to your other machine that is in control of a projector or something.. wouldn't that be neat? a $200 fully contained TUIO generating camera platform.. hmmm)

anyhow, why am i writing all this? mostly because I needed a break from the bits and bytes. Maybe some other cocoa nerd will find it useful, who knows?

I will keep you all updated on the progress in any case.

(a footnote on 'packed' above:) well, we could get away with a non-packed data block (i.e. the ends of each row have junk padding data to make it fit a specific buffer size) but then we just have to add a few more cycles to each loop iteration just to figure out the address for each pixel.
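For comparison, here is a sketch of addressing a non-packed block in C, where each row is `bytes_per_row` long and only the first `width` bytes of it are real pixels. It also shows the one-off copy that turns such a buffer into a packed one. The names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte offset of pixel (x, y) when every row is padded to bytes_per_row. */
size_t padded_offset(size_t x, size_t y, size_t bytes_per_row) {
    assert(x < bytes_per_row);
    return y * bytes_per_row + x;
}

/* Copy a padded 8-bit greyscale image into a packed destination buffer,
   dropping the junk at the end of each row. Returns bytes written. */
size_t copy_to_packed(const uint8_t *src, size_t width, size_t height,
                      size_t bytes_per_row, uint8_t *dst) {
    assert(src != NULL && dst != NULL && width <= bytes_per_row);
    for (size_t y = 0; y < height; y++) {
        memcpy(dst + y * width, src + y * bytes_per_row, width);
    }
    return width * height;
}
```

The offset math is the same shape as the packed case, with `bytes_per_row` in place of `width`. The catch is that the buffer is no longer one contiguous run of pixels, so a loop that just walks a pointer across the whole image has to skip the padding at every row end.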

BBTouch,code,multitouch,openSoundControl

TUIO support is Here! (22 Jul)

And it even seems to work!

It is gratifying as a developer and code architect (not to toot my own horn, but, tooting my own horn :^) to be able to effectively use a chunk of my own code to add a fairly complex feature in just a few hours. I started fresh on TUIO about 4 hours ago, and because the BBOSC stuff is really easy to deploy and BBTouch is designed to be easily extended, I was done with the main bit of code about 2 hours ago! (spent the next 2 hours fiddling with Processing, taking screengrabs, and committing the code to the google repository, and now writing this, which is bringing me up to the 5 hours mark oh well..) (it helps that the TUIO spec is fairly straightforward, and that it closely parallels how the BBTouch internals are structured)

Anyhow, BBTouch now supports the generation of TUIO style events via OSC and UDP. w00t!
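For anyone curious what a TUIO event looks like on the wire, here is a sketch in C that encodes a single cursor 'set' message as raw OSC bytes. It follows my reading of the OSC 1.0 and TUIO 1.x specs (address `/tuio/2Dcur`, type tags `,sifffff`, strings padded to 4 bytes, numbers big-endian). It is not the BBOSC API, and all the function names are made up. A real TUIO frame also wraps this in an OSC bundle together with 'alive' and 'fseq' messages, which is left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append an OSC string: the characters, a NUL, then NULs up to a multiple of 4. */
size_t osc_put_string(uint8_t *out, size_t pos, const char *s) {
    size_t len = strlen(s);
    size_t padded = (len + 1 + 3) & ~(size_t)3;
    memset(out + pos, 0, padded);
    memcpy(out + pos, s, len);
    return pos + padded;
}

/* Append a 32-bit value in big-endian (network) byte order. */
size_t osc_put_u32(uint8_t *out, size_t pos, uint32_t v) {
    out[pos]     = (uint8_t)(v >> 24);
    out[pos + 1] = (uint8_t)(v >> 16);
    out[pos + 2] = (uint8_t)(v >> 8);
    out[pos + 3] = (uint8_t)v;
    return pos + 4;
}

/* Append a 32-bit float as its IEEE 754 bit pattern, big-endian. */
size_t osc_put_float(uint8_t *out, size_t pos, float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return osc_put_u32(out, pos, bits);
}

/* Encode "/tuio/2Dcur set s x y X Y m" into out; returns the byte count (52). */
size_t tuio_2dcur_set(uint8_t *out, size_t cap, int32_t session_id,
                      float x, float y, float vx, float vy, float accel) {
    assert(out != NULL && cap >= 52);
    size_t pos = 0;
    pos = osc_put_string(out, pos, "/tuio/2Dcur");  /* address pattern */
    pos = osc_put_string(out, pos, ",sifffff");     /* type tag string */
    pos = osc_put_string(out, pos, "set");          /* command */
    pos = osc_put_u32(out, pos, (uint32_t)session_id);
    pos = osc_put_float(out, pos, x);               /* position, 0..1 */
    pos = osc_put_float(out, pos, y);
    pos = osc_put_float(out, pos, vx);              /* velocity */
    pos = osc_put_float(out, pos, vy);
    pos = osc_put_float(out, pos, accel);           /* motion acceleration */
    return pos;
}
```

The resulting 52 bytes would go out as (part of) a UDP datagram to the TUIO host and port. Positions are normalized to the 0..1 range, so a blob in the middle of the frame has x and y of 0.5.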

OK, here are the details:

First up, since BBOSC is all ObjC 2.0 I went ahead and updated the objects I was stuck into on the BBTouch side as well (mostly just @property declarations and that sort of thing). the upside is that I saved lots of time implementing, the downside is that it is all OS X 10.5 only. So if you are still on OS X 10.4, and you really really want to run this, it should be fine, but it will take you a bit of elbow grease to replace all the @property and @synthesize lines with actual methods. If I am totally screwing you with this change, then let me know and I will try to find time to downgrade all the code. but really, if you are into this stuff, then you are a geek like me and you should be running the bleeding edge OS :-)
Also, i included all the BBOSC files in the BBTouch project, so you wont need to deal with downloading the BBOSC stuff separately. (you really only need to do that if you want to build your own app on top of BBOSC)

Second: I updated the UI in BBtouch to add a TUIO config panel. Now you can just add the hostname and port (defaults to localhost/127.0.0.1 and port 3333 which seems to be the 'default' TUIO stuff), and turn on the 'generate TUIO events' button and it will do just that. (as long as you are also detecting blobs)

Note: if you want to change the host and port, you need to shut off the TUIO event stream first.

Oh, and a quick note about BBTouch performance: I don't have any trouble running it on my MacBook Pro in full size (640x480) but if you find yourself lagging a bit, you can close the config window and that will save a bunch of processing (since it won't need to be drawing all the raw video and whatnot). this leaves BBTouch as basically just a menu bar, but you can always get the config window back from the View Menu -> Show Config Window. (I noticed about a 20% drop in processor usage when I closed that window, although most of it is the raw video, so you can just shut that off too) Another note: now that I have had a look at the processor usage: 75%!! yikes.. I guess the next thing is to do another optimization pass through the code. that is wayyy too high for what it is doing.. next week maybe :-)

I used the Processing/TUIODemo sketch to test with. (found on the reactivision software page) I plan on testing with some of the others as well, but I don't have time today, and I know you guys are chompin at the bit to get this :-)

As always the code is available on google code: http://code.google.com/p/opentouch/ you will need Xcode to compile it.
If you already have some older code checked out, you should be able to update your project and get all the changes automatically.

I will leave you with the above image, a lame finger-painting of mine during testing.. :-)

EDIT: here is a zipped binary of the app for anyone who doesn't want to compile it from the source
bbtouchapp.zip (Universal, but 10.5 only)

About

My full name is Ben Britten Smith.

I go by Ben Britten because Ben Smith is a bit too common and using my full name is a mouthful.

I live in Melbourne, Australia and service clients all over the globe.

Contact

Have some questions?

Feel free to contact me directly at support@benbritten.com with any questions you might have about any of the applications I support.

Thanks!
