I talked before about the blob detection algo that I am using. One thing I am trying very hard to do is keep it as simple (and, as a result, as fast) as possible. As of the last post on the blob tracker, I was doing a very simple one-pixel-at-a-time comparison (ie no convolving or kernel matrix multiplying etc) to get what is basically a binary image (ie ‘good’ pixels and ‘not good’ pixels). This works very well, and I haven’t abandoned it (yet). However, I have noticed that it can get bogged down if there are too many ‘good’ pixels (ie the recursion gets too deep, the stack overflows, and everything is bad etc).
One of the big problems is that my surface is fairly small (it is for testing; I plan on making a bigger one), and the scale of the blob images greatly affects the performance of the algorithm. At a camera rez of 640×480, if I put my whole hand on the surface, it would be a blob of around 60k-80k pixels, somewhere around a quarter of the image. This is where the simpleness of the algo I am using falls apart.
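To make the stack problem concrete, here is a minimal sketch in Python. It is not the tracker's actual code: the function names, the list-of-rows image format, and the cutoff value are all assumptions made for illustration. It thresholds a grayscale image one pixel at a time, then groups the ‘good’ pixels into blobs. Note that it swaps the recursion for an explicit stack (an iterative flood fill), which is one common way to avoid blowing the call stack on big blobs, though it still has to visit every ‘good’ pixel.

```python
def threshold(gray, cutoff):
    """Single-pixel comparison: 1 for 'good' pixels, 0 for 'not good'."""
    return [[1 if px >= cutoff else 0 for px in row] for row in gray]


def find_blobs(mask):
    """Group 4-connected 'good' pixels into blobs using an explicit stack.

    Returns a list of blobs; each blob is a list of (x, y) tuples.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Start a new blob and grow it without recursing.
            stack = [(x, y)]
            seen[y][x] = True
            pixels = []
            while stack:
                cx, cy = stack.pop()
                pixels.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy),
                               (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            blobs.append(pixels)
    return blobs
```

Calling find_blobs(threshold(image, 128)) on a frame gives one pixel list per blob. The work still grows with the number of ‘good’ pixels, so a whole hand on the surface is still slow, it just doesn't crash.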
Not to mention there is quite a bit of noise in the image at 640×480 (ie the raw image). The noise then resolves as ‘good pixels’ some of the time, and can bog down the recursion as well, giving lots of false positives.
The solution? It looks like many people are using noise reduction in the form of a blur of some sort, some more than others. This is good, but again, I want to keep it as simple as possible, so let’s have a quick think about what the blur is doing.
Most blur algorithms do a convolution with a weighted kernel matrix of some size (see the Wikipedia article on Gaussian blur for a good example). They basically take the pixel they are looking at and average it with all the surrounding pixels (out to a certain radius), with diminishing weights the further you get from the center. This makes every pixel look slightly more like its neighbors. So you take a 640×480 image, blur it, and end up with a nice, less noisy, and slightly blurrier 640×480 image.
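As a rough illustration of what that convolution involves, here is a small Python sketch. The 3×3 kernel below is a common small approximation of a Gaussian and is an assumption for the example (real blurs often use bigger kernels); the function name and the clamp-at-the-edges behavior are also just illustrative choices.

```python
# A small Gaussian-like kernel: heaviest weight in the center,
# diminishing weights toward the corners. The weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]
KERNEL_SUM = 16


def blur(gray):
    """Weighted average of each pixel with its 8 neighbors.

    gray is a list of rows of ints. Pixels past the border are
    treated as copies of the nearest edge pixel.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for ky in range(3):
                for kx in range(3):
                    # Clamp the sample position to stay inside the image.
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    total += KERNEL[ky][kx] * gray[sy][sx]
            out[y][x] = total // KERNEL_SUM
    return out
```

Note the cost: nine multiply-adds for every one of the 307,200 pixels in a 640×480 frame, and the output is still 640×480, so the blob tracker has just as many pixels to chew through afterwards.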
There is another (easier, and IMHO better) way to accomplish nearly the same thing: simply scale the image down. This has a similar effect to blurring but results in fewer pixels (ie take a four-pixel square, find the average color, and make it a single pixel). This reduces noise and ‘smooths’ the image, and it significantly reduces the number of pixels that you need to look at. The downside is, of course, that you lose precision. However, I would argue that precision for finger-sized blobs isn’t all that important anyhow. Mostly we are taking blobs that are hundreds or thousands of pixels in size and finding the ‘center’ of the blob, then scaling that point out to match the output resolution. (Not to mention that in order to reduce the ‘jitter’ problem you have to throw away some precision as well.)
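Here is a minimal Python sketch of that idea, assuming a grayscale image stored as a list of rows and a whole-number scale factor (both assumptions for the example, as are the function names). The first function averages each square block into one pixel; the second finds the ‘center’ of a blob found in the small image and scales it back out to full-resolution coordinates.

```python
def downscale(gray, factor):
    """Average each factor x factor block of pixels into a single pixel.

    Any leftover rows or columns that don't fill a block are dropped.
    """
    height = len(gray) // factor
    width = len(gray[0]) // factor
    out = []
    for by in range(height):
        row = []
        for bx in range(width):
            total = 0
            for dy in range(factor):
                for dx in range(factor):
                    total += gray[by * factor + dy][bx * factor + dx]
            row.append(total // (factor * factor))
        out.append(row)
    return out


def blob_center(pixels, factor):
    """Centroid of a blob from the small image, in full-size coordinates.

    pixels is a list of (x, y) positions in the small image. The result
    is measured from the top-left corner of the full-size image.
    """
    count = len(pixels)
    cx = sum(x for x, _ in pixels) / count
    cy = sum(y for _, y in pixels) / count
    # The +0.5 moves to the middle of the small pixel before scaling up.
    return ((cx + 0.5) * factor, (cy + 0.5) * factor)
```

Because the center is an average over all the pixels in the blob, it can land between pixels, which is part of why the lost precision hurts less than you might expect.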
Let’s have a quick look at how this all works out. Looking at the X axis only for a moment, my camera can resolve 640 pixels across 350mm of surface. This is roughly 0.5mm precision. If I were to run a nice big Gaussian convolve like the one in the Wikipedia article, my noise would vanish, but I would also be giving up some of that nice precision, leaving me somewhere in the 1 - 1.5mm area. 1.5mm precision is pretty damn good for my meaty digits. But I can get about the same precision with a 1/3 scale image and no extra blurring. (That leaves one ninth of the pixels to look at, thus making my blob tracker roughly nine times faster.)
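The arithmetic is easy to check. A tiny Python sketch, using the 350mm and 640 pixel figures from above (the function names are made up for the example):

```python
def precision_mm(surface_mm, camera_px, scale_divisor):
    """Millimeters of surface covered by one pixel after scaling down."""
    return surface_mm / (camera_px / scale_divisor)


def pixel_reduction(scale_divisor):
    """How many times fewer pixels there are after scaling both axes."""
    return scale_divisor ** 2


full_rez = precision_mm(350, 640, 1)     # about 0.55mm per pixel
third_scale = precision_mm(350, 640, 3)  # about 1.64mm per pixel
fewer_pixels = pixel_reduction(3)        # 9
```

So the 1/3 scale image works out to a bit over 1.6mm per pixel, close to the rough 1.5mm figure, and since both axes shrink by three the pixel count drops by a factor of nine. The speedup only tracks the pixel count if the tracker's cost is roughly proportional to the number of pixels, which is an assumption here.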
When using the surface with the new, crappier 1.5mm precision, I can’t tell any difference in smoothness or usability. In fact, I went ahead and made it 1/4 scale, so the 640×480 image gets scaled down to 160×120 (the very rough equivalent of a pretty aggressive blurring), and I haven’t noticed any degradation in ‘use performance’. At 1/4 scale, my algo is ripping fast. Plus, scaling the image is almost ‘free’ the way I am grabbing the camera image. You have to give the sequence grabber a frame size to grab the camera image into, so it uses the hardware to scale the image to the size I ask for. To get a 1/4 scale image I simply have to ask for a smaller image, and it doesn’t cost me any extra cycles. (At least I can’t seem to measure a speed difference between grabbing a 640×480 image and grabbing a 160×120 image. This probably means that my grabbing code is inefficient at native rez, but oh well.)
At 1/4 scale most of the finger blobs are less than 100 pixels in area (usually around 40ish). This seems to me to be a good size for a finger pointing device, and gives me nice smoothness and a good feel on the surface (as far as moving things around and stuff). I think that when I move to a larger surface, I will increase the rez to keep the finger blobs in roughly that same pixel range. My fingers, with a fairly light touch, make blobs of about 10mm x 15mm on the surface. So, to keep a light touch in the 40ish pixel range (roughly 5 pixels x 8 pixels), each pixel should cover about 2mm (basically what I came up with above). So at 640 x 480, my surface would be gigantic (ie 1280mm x 960mm, or 4.2ft x 3.1ft for those of you in the US).
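The same sizing math as a short Python sketch, using the 10mm x 15mm finger blob and the 2mm per pixel target from above (the function names are hypothetical):

```python
MM_PER_FOOT = 304.8


def blob_pixels(blob_w_mm, blob_h_mm, mm_per_px):
    """Approximate pixel area of a rectangular blob at a given pixel size."""
    return (blob_w_mm / mm_per_px) * (blob_h_mm / mm_per_px)


def surface_size_mm(res_w, res_h, mm_per_px):
    """Surface dimensions that a camera resolution covers at a given pixel size."""
    return (res_w * mm_per_px, res_h * mm_per_px)


finger = blob_pixels(10, 15, 2)                 # 5 x 7.5 pixels, so 37.5
width_mm, height_mm = surface_size_mm(640, 480, 2)
width_ft = round(width_mm / MM_PER_FOOT, 1)     # 1280mm
height_ft = round(height_mm / MM_PER_FOOT, 1)   # 960mm
```

This treats the blob as a rectangle, which overestimates a rounded fingertip a little, so take the pixel count as a ballpark rather than a measurement.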
Though it is of note that at full rez I am back to the problem of noise and all that, so really I would need to do some sort of smoothing/blurring to make it work, thus losing precision. So basically my algorithm limits how big I can make the surface (without going to multiple cameras, in any case). But I am OK with that. It is easy to add multiple cameras to the computer, and then all I would need to add is a layer that stitches the 4 images into a single image, which can then pass through the same blob algo.
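A stitching layer could be as simple as the Python sketch below. It assumes a lot that isn't settled above: four cameras arranged in a 2 x 2 grid, all producing images of the same size, with no overlap between them and no lens distortion to correct. A real rig would almost certainly need some alignment step on top of this. The function name is made up for the example.

```python
def stitch_2x2(top_left, top_right, bottom_left, bottom_right):
    """Join four same-sized images (lists of rows) into one big image.

    Assumes the cameras tile the surface exactly, with no overlap.
    """
    # Glue each left row to the matching right row, then stack the halves.
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom
```

The result is just a bigger image in the same format, so it could be fed to the same thresholding and blob code without changes.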