Earlier in the week I was working on a barcode library that successfully recognized a barcode from a picture. The code was pretty rough, but the first cut worked on the first barcode picture I took, so I was pretty satisfied.
I then took another picture from a different book, and of course it failed to decode. That meant revisiting the code and the algorithms I was using to try to improve the recognition rate.
Here’s the new barcode:
When I ran it through the luminance algorithm, this is what I got back:
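I haven't shown the luminance pass itself, but as a rough sketch it might look like the following. This assumes RGB input and uses the common Rec. 601 luma weights; the weights and pixel representation in the actual library may differ.

```python
def luminance_row(pixels):
    """Map a row of (r, g, b) tuples to 0-255 luminance values.

    Uses the Rec. 601 weights (0.299, 0.587, 0.114) as an illustration;
    the real code's formula may be different.
    """
    return [int(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]
```

Each row of the photo gets reduced to a one-dimensional series of byte values like this, which is what the charts in this article are plotting.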
So how do we improve the "recognizability" of this thing? Well, the first thing I see is that the useful range, that is, the range of threshold values that might yield something meaningful, is pretty small. The threshold is essentially the Y value on the chart above. If you draw a horizontal line at each Y value across the chart, you see that almost the entire bottom half is going to read as "black", meaning low luminance and essentially worthless data. So although our overall range is 0-255 (each point is a byte value), just eyeballing it, we're probably only using about 100-255 for useful data.
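To make the thresholding idea concrete, here's a minimal sketch (the function name and black/white labels are illustrative, not from the actual library): everything at or below the cutoff is treated as a bar, everything above it as a space.

```python
def threshold(values, cutoff):
    """Classify each luminance sample as a bar ('black') or space ('white').

    A sample at or below the cutoff is dark enough to count as a bar.
    """
    return ["black" if v <= cutoff else "white" for v in values]
```

With the chart above, any cutoff much below 100 classifies nearly everything as black, which is why only the top portion of the range yields a meaningful decode.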
My next step was to "scale" all of the values. Basically, find the actual range of the data (the difference between the minimum and maximum values), subtract the minimum from each value, and then multiply each by a factor that spreads that new range across the full 0-255 set. Graphically, the newly ranged data looks like this:
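The scaling step can be sketched in a few lines (integer arithmetic here is my choice to keep the example exact; the real code may work in floating point):

```python
def rescale(values):
    """Stretch a list of byte values so they span the full 0-255 range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # flat data: nothing to stretch
    # Shift down by the minimum, then spread the remaining range to 0-255.
    return [(v - lo) * 255 // (hi - lo) for v in values]
```

So data that was squeezed into, say, 100-255 now uses the whole 0-255 span, which gives the threshold far more room to find a value that separates bars from spaces.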
This is much better, but it's still got a whole lot of ugly "noise" at the transitions between bars and spaces. These are the tall but thin peaks and valleys you see at the edges of the apparent data bits. I decided to reduce that effect by doing a "nearest neighbor" average across the data. What this means is that instead of using the raw luminance value of each pixel, I'd use the average of each pixel along with the ones just to its left and right. Running that algorithm yields a luminance graph that looks like this:
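The nearest-neighbor average is just a three-sample moving average. A sketch (my own illustration; the endpoint handling here, averaging only the neighbors that exist, is an assumption about how the edges are treated):

```python
def smooth(values):
    """Replace each value with the mean of itself and its immediate neighbors."""
    out = []
    for i in range(len(values)):
        # Take up to three samples centered on i; the first and last
        # samples only have one neighbor, so their window is smaller.
        window = values[max(0, i - 1):i + 2]
        out.append(sum(window) // len(window))
    return out
```

This is a simple low-pass filter: a one-pixel spike gets averaged with its two quieter neighbors and flattens out, while the wide plateaus of real bars and spaces are barely affected.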
It looks much, much better, at least visually. I ran it through the existing recognizer algorithm, and while it got further into the decode (actually pulling a few digits out), it still failed to recognize the barcode. It seems that I have a better filtering algorithm, but my decoding algorithm is still a bit lacking. We'll look at what I did to address that in the next article.