Lab 5: Revisiting CBIR with Variances

Casey Smith

*note* The CS server doesn't like serving images, so images may have to be "refreshed" (Netscape) or "shown" (IE) individually.
*note* This report is written at the very end of the semester after I've completed all my finals and am ready to move on. I cannot be held responsible for the informality of my language.

Abstract

For my final project, I decided to extend the abilities of the CBIR system from Lab 2. As I mentioned in my section on possible future work, it seemed to me that better results could be obtained by looking at the variance in position of the pixels in each "bucket" of the histogram instead of just looking at how much was in the bucket. For example, it seemed to me that a picture of a small yellow object shouldn't match a field of daisies. They may have the same amount of yellow, but one picture has that yellow concentrated while the other has flecks of yellow all over the place. It wasn't as easy as I thought, but it is possible, and my results are promising, if not already satisfying.

The Method

The first step was to create a variance matrix to go with the histogram. Basically, the variance matrix is just another histogram. The buckets in the variance matrix correspond directly to the buckets in the histogram, representing the variance in position of the pixels that belong in each color bucket. In order to avoid gigantic intermediate numbers in calculating the variance, I scaled x and y to go from 0 to 1 instead of using pixel numbers. This wasn't really necessary; it just made debugging (which there was a lot of) easier.
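As a rough illustration of the variance matrix described above, here is a minimal Python sketch. Several details are assumptions, not taken from my actual code: the image is a list of rows of (r, g, b) tuples with 0-255 channels, the hypothetical bucket_index function quantizes each channel into 4 levels (64 buckets), and the x and y variances are simply added together to give one number per bucket. Empty buckets are left with a variance of zero.

```python
def bucket_index(rgb, levels=4):
    """Map an (r, g, b) pixel to a color bucket (assumed quantization scheme)."""
    r, g, b = (min(c * levels // 256, levels - 1) for c in rgb)
    return (r * levels + g) * levels + b


def histogram_and_variance(image, levels=4):
    """Return (count, variance) lists with one entry per color bucket."""
    height, width = len(image), len(image[0])
    n = levels ** 3
    count = [0] * n
    sum_x = [0.0] * n
    sum_y = [0.0] * n
    sum_xx = [0.0] * n
    sum_yy = [0.0] * n

    for row, pixels in enumerate(image):
        # Scale coordinates to [0, 1] so the intermediate sums stay small.
        y = row / (height - 1) if height > 1 else 0.0
        for col, rgb in enumerate(pixels):
            x = col / (width - 1) if width > 1 else 0.0
            k = bucket_index(rgb, levels)
            count[k] += 1
            sum_x[k] += x
            sum_y[k] += y
            sum_xx[k] += x * x
            sum_yy[k] += y * y

    variance = [0.0] * n  # empty buckets keep a variance of zero
    for k in range(n):
        if count[k] > 0:
            mean_x = sum_x[k] / count[k]
            mean_y = sum_y[k] / count[k]
            var_x = sum_xx[k] / count[k] - mean_x ** 2
            var_y = sum_yy[k] / count[k] - mean_y ** 2
            # Assumption: combine the two axes by summing; clamp rounding noise.
            variance[k] = max(var_x + var_y, 0.0)
    return count, variance
```

Note that in this sketch a bucket holding a single pixel also ends up with a variance of zero, so it is indistinguishable from an empty bucket as far as the variance matrix is concerned.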

Next was the task of comparing two variance matrices. I originally tried using an L2 norm on them as if they were normal histograms. This didn't really work. First of all, many of the buckets were empty: a picture of grass and trees doesn't contain any strong reds at all and may lack some greens. The way my system was set up, these buckets had a variance of zero. I tried ignoring buckets with variances of zero, and that seemed to help a little, but not much.

Finally, I thought up a complete hack of a cop-out solution, but it works well. I thought to myself, "I want something that ignores buckets with zero variance in either matrix and rewards similar variances without penalizing different variances." So I wrote just that. The distance between any two variance matrices starts off as 1. Then, for each pair of corresponding buckets, if the percentage difference between the two variances is less than some threshold, I divide the distance by two. (Actually, I divide the smaller variance by the larger variance and test to see if the value is above a certain threshold, but I don't know a good catch phrase for that like "percentage difference," and they're conceptually the same.) I didn't really have a reason for picking 2. I just like it, and it doesn't affect the ordering of the matches. As a threshold, I picked 0.5 for the same reason. It does affect the results, but 0.5 was the first value I tried, and I liked what I got.
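The comparison described above can be sketched in a few lines of Python. This is a reconstruction from the description, not my original code; the function name variance_distance and its parameter names are made up for illustration. The running distance starts at 1 and is halved once for every bucket whose smaller-to-larger variance ratio is above the threshold.

```python
def variance_distance(var_a, var_b, threshold=0.5, factor=2.0):
    """Distance between two variance matrices, given as flat lists of equal length."""
    distance = 1.0
    for a, b in zip(var_a, var_b):
        # Ignore buckets that are empty (zero variance) in either image.
        if a == 0 or b == 0:
            continue
        # Ratio of smaller to larger variance: 1.0 means identical spread.
        ratio = min(a, b) / max(a, b)
        if ratio > threshold:
            # Reward a similar variance; a dissimilar one is not penalized.
            distance /= factor
    return distance
```

For example, variance_distance([0.1, 0.0, 0.2], [0.09, 0.3, 0.05]) gives 0.5: the first bucket matches (ratio 0.9), the second is skipped because one side is zero, and the third does not match (ratio 0.25).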

I didn't end up using the histogram at all for matching. I found that it didn't really matter. The most important thing for matching is that the two images have a lot of color buckets in common, and this system captures that: the distance measure can only be decreased when both corresponding buckets have data. This system further improves on that by also requiring that the spatial distribution of the color in the images (which, to some degree, is constrained by the number of pixels in that bucket, capturing something like an L2 norm in histogram space) be similar.

Results

Below are some examples of how well it works versus the histogram matching system from Lab 2. In each table, the pictures are arranged in order of goodness of match, and the numbers indicate the distance measure between each image and the first image.

Yellow Objects

I first wanted to see if I could successfully match objects with interesting colors without being distracted by backgrounds like the Lab 2 system was.

Lab 2 System:

0.000000, 0.020227, 0.024203, 0.024414, 0.025734, 0.030498, 0.034326, 0.040627, 0.041956, 0.048634

The New ChromoDispersion Scaling System:

0.000000, 1.49012e-08, 9.53674e-07, 1.90735e-06, 3.8147e-06, 3.8147e-06, 7.62939e-06, 1.52588e-05, 0.00012207, 0.00012207, 0.00012207, 0.00012207

Here, we can see how much better the new system worked. The old method's distance measure was dominated by how full the dark buckets were, but the new system isn't distracted by that. It returns pictures of yellow things of about the same size as the tape measure.

A Flower Scene

I had noticed that there were several pictures of this flower in the database, but the old method was never able to match them. They're at slightly different exposures, and they have slightly different backgrounds, which confused the old system.

The Lab 2 System:

0.000000, 0.009291, 0.009790, 0.011084, 0.014330, 0.014595, 0.014728, 0.015178, 0.015762, 0.015997

The New ChromoDispersion Scaling System:

0.000000, 8.47033e-22, 1.0842e-19, 1.38778e-17, 3.55271e-15, 3.55271e-15, 1.42109e-14, 1.42109e-14, 2.84217e-14, 5.68434e-14

Once again, the new system performs much better. In the Lab 2 system, the matching is first driven by the gray stone, then by the combination of gray and red, then by gray and green, then by gray and red, etc. It never matches other pictures of the same thing very well because of the differences in exposure and background elements. The new system, because of its thresholding, isn't driven by an overwhelming presence of gray. That's just one bucket, and the others are just as important. Thus, the first three matches are of the same flower, and the next three matches are from the same courtyard.

Problems

Well, clearly this system is not quite ideal. It's just a hack that happens to work. The main problem is that the degree-of-match measurements are virtually meaningless except for ordering purposes. They look like nice decimal numbers, but they're just discrete states (powers of 1/2). Basically, the system could be rewritten to just have integer match values where high numbers were good. Each integer would count the buckets whose variances matched between the two variance matrices. The reason I did it this way was that I wanted to be able to use a weighted sum of both variance and the L2 norm of the histogram, and using powers of 1/2 gave me a similar scale as the L2 norm. That having been said, I still like the results. The flower matching would have been very difficult without the thresholding, so it clearly has some benefit.
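To make the "integer match values" idea concrete: since every match halves a distance that starts at 1, the number of matching buckets can be recovered from a reported distance with a base-2 logarithm. The sketch below is only an illustration; match_count is a hypothetical helper, and it assumes the halving factor of 2 used in this report. A distance of zero (as printed for the query image itself) has no finite logarithm, so the sketch rejects it.

```python
import math


def match_count(distance):
    """Recover how many buckets matched from a distance of the form (1/2)**k."""
    if distance <= 0.0:
        raise ValueError("distance must be positive")
    # Rounding absorbs the precision lost when distances are printed.
    return round(-math.log2(distance))
```

Under these assumptions, the printed value 0.00012207 in the yellow-object results corresponds to 13 matching buckets, and 1.49012e-08 corresponds to 26.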