Content-based Image Retieval
*note* The CS server doesn't like serving images, so images may have to
be "refreshed" (Netscape) or "shown" (IE) individually.
Abstract
For this lab, I developed a system for content-based image retrieval. First,
a database of histograms is created. Then, a sample image is provided, and
its histogram is matched to the histograms in the database to find similar
pictures. Histogram comparisons are done in both RGB and rg color spaces
using either histogram interseciton or L2 norms. Also, a "central object
mode" can be specified so that central and peripheral histograms can be
compared with user-specified weightings. Histograms are scaled to their
image sizes at all times.
Detailed description
Histogram Generation
In order to generate the histograms, the color space is divided into an
arbitrary number of "buckets." For these analyses, RGB space was divided into
an 8x8x8 bucket array and rg space was divided into a 16x16 bucket array.
Color values are scaled to fit into the buckets evenly across the 0-255 range.
Each pixel in an image is visited and classifed to a bucket, incrementing
that buckets counter by 1.0/(number of pixels). This scaling ensures that
images of varying sizes will be compared fairly. In the database file
creation, each image in the image directory is visited, and its filename
and histogram array is stored in the database file. Database files for the
670 images ranged between 1-5MB (a wee bit smaller than the 640MB required to
store the image files), depending upon color mode (RGB vs. rg) and comparison
mode (central object vs. whole image).
Histogram Comparison
Histograms were compared by two methods. Histogram intersection yields a
"goodness of match" value where 1.0 is a perfect match and 0.0 is the worst
match possible. L2 Norms calculate a distance measure between histogram
values where 0.0 is a perfect match and larger numbers are worse matches.
In comparing a sample image, the histogram of the image is compared with the
stored histograms and given a score. The histograms which recieve a score
within a specified percent of the second best score (the image itself is the
best score)
are listed as the output of the program. Whem the program is working in
central obect mode, weights for the importance of the central histogram
comparision and the perimeter histogram comparison can be specified.
Results
The different methods were good for matching different types of pictures.
Required image: pic.0002.ppm
The results for each method will be shown first, follwed by a discussion.
RGB space using histogram intersection
 |  |  |  |  |  |  |  |  |  |
| 1.000000 | 0.694855 | 0.688632 | 0.666263 | 0.661578 | 0.653281 | 0.649271 | 0.637827 | 0.636102 | 0.628577 |
RGB space using L2 norm
 |  |  |  |  |  |  |
| 0.000000 | 0.019080 | 0.019808 | 0.020634 | 0.023295 | 0.027127 | 0.027809 |
rg Space using histogram intersection
 |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| 1.000000 | 0.887408 | 0.876715 | 0.871185 | 0.866415 | 0.861346 | 0.859195 | 0.855505 | 0.852072 | 0.848746 | 0.842139 | 0.835629 | 0.832956 | 0.832089 | 0.821353 | 0.810666 | 0.801703 | 0.801059 | 0.799899 |
rg Space using L2 norms
 |  |  |  |  |  |  |
| 0.000000 | 0.004466 | 0.004943 | 0.005253 | 0.006174 | 0.006186 | 0.006653 |
Discussion of pic.0002.ppm
All of the methods gave fairly decent results. They all found at least one
image of the blue-shirted professor lecturing. However, none of them returned
pic.0003.ppm which was in the same room but didn't include the blue-shirted
professor. It's my guess that the lack of blue in the image caused it not
to be returned.
RGB space using histogram intersections found a lot of matches with
similar "goodness of match" values. Looking at the images, a clear match of
colors can be found between the first one and all the others, but most of
them have to be regarded as garbage. When searching for pictures of a
classroom, we don't want to find pictures of fire extinguishers, even if
both images have a lot of reds, yellows, silvers, and whites. While it
returned 100% of the images of the blue-shirted professor, it also returned
60% garbage.
RGB space using the L2 norm didn't fare as well as the histogram
intersection method. It only found one picture of the professor (and it was
the worst match of all) and returned 71% garbage. Interestingly only
one of the garbage images was different, so it looks like L2 norm kept the
"bad" matches while throwing out the "good."
rg Space using histogram intersection returned a whole mess of images
with similar match values. However, it did the best of all the methods
because it returned two of the good matches as the top two, and the
other match as the fith match. It found all blue-shirted professors
but also returned 79% garbage. However, since all the good matches were
at the beginning of the list, this may be an indication that the threshold
for matches should be smaller for rg space than for RGB space. With the
threshold set just right, the number could be reduced to 33% garbage.
rg space using L2 norms didn't do as well as the histogram intersection.
It found all but one of the good matches, but they were evenly spaced out in
the results. 57% garbage was returned.
Another image: grass matching
Here, I selected an image of young grass in bright sunlight to compare how
RGB and rg spaces match colors. Since in the previous exampl the histogram
intersection method
proved more useful than the L2 norm, I didn't bother compiling L2 norm
results.
RGB
rg
 |  |  |  |
| 1.000000 | 0.802637 | 0.745392 | 0.728497 |
Discussion of grass
The interesting thing to note here is that RGB returned pictures of grass
all with the same illumination: bright sunlight with a color-rich exposure.
rg, however, was able to find the last image which is also grass, but
appears to be under different illumination. Neither method returned any
real garbage (although the third rg image is on the edge), but it's also
impossible to judge what images were missed--the database is full of
summer pictures with grass on them.
Inside the Art gallery
The wood floor inside the art gallery provides a large degree of
consitency between all photographs taken there.
RGB
 |  |  |  |  |  |  |
1.000000 | 0.785504 | 0.773602 | 0.773114 | 0.711465 | 0.648532 | 0.644464 |
rg
 |  |  |  |  |  |  |  |  |  |  |
| 1.000000 | 0.805481 | 0.799850 | 0.786429 | 0.774591 | 0.764597 | 0.728226 | 0.727774 | 0.713638 | 0.702652 | 0.653180 |
Discussion of the art gallery
As you can see, this worked amazingly well. I increased the matching threshold
until I had a miss in rg, which is the last image on the right. RGB space
matching didn't return as many images before the matching threshold was
reached, most likely because it's sensitive to slight changes in illumination
and camera exposure which don't affect rg space comparisons. With the lowered
threshold (lowered to 20%), RGB returned no garbage images and found 7 of the
12 images taken in the gallery. rg returned 10 of the 12 gallery images
before returning a garbage image, but it's hard to really call it garbage:
the color of the wood table looks just like the floor of the gallery, and the
cream-colored walls are very similar, as well. Neither method was able
to find the three following images:
Discussion
The content-based image retrieval system here presents a simple method
of finding similar images. Calculation of histograms is a simple and
fast process, and the storage space required is minimal. For most types
of pictures, the returned pictures are good matches for the submitted image.
However, for images with a variety of colors, the results can be less
helpful. For instance, the pictures of the blue-shirted professor matched
pictures of chairs or fire extinquishers just as well as other pictures
of the blue-shirted professor. To a human, a classroom is clearly not
a dessert tray, but in RGB space histogram comparisons, they're very similar.
Thus, histogram matching is only particularly useful when the image is
dominated by a few colors.
Compared to object recognition, histogram-based matching is extremely
simple. No complex feature vectors are required (a histogram may be a
multi-valued array, but each value represents the same thing), and
tranformations such as scales, translations, and rotations are completely
ignored. Where object recognition can be a resource- and time-consuming
process, histogram matching is exceedingly fast. For the database of 670
images, finding matches takes much less than a second, and I'll admit that
my code isn't even particularly optimized. If it were running on a
nice server with multiple processors and a RAID system, something very
similar to this could be used as a search engine for images on the internet.
The time consumed in comparisons only grows linearly with the number of
images in the database, regardless of image size. Creating the histograms
for the database, however, does take quite a while--about 5 minutes for this
database. This can be done offline, though, so the user wouldn't see the
time.
Extensions
As an extension to the lab, I experimented with histogram intersection and
L2 norms in both RGB space and rg space and directly compared the results.
I also implemented a "central
object" comparison. For this, the image is divided into two regions: the
center and the perimeter. The division is arbitrarily set so that the
center is exactly 1/4 the dimensions of the image, but this can be changed.
Separate histograms are stored in the the database. Comparisons are
performed by attaching user-specified "weights" to the perimeter and
center comparisons. The results are very encouraging, although not unexpected.
The idea is that if you have a picture focused on something yellow, and you
want to find more pictues focused on something yellow, all that needs to
be compared is the color at the center of the image.
Here, the center of the object is weighted at 1.0, and the perimeter at 0.0:
 |  |  |  |  |  |
| 1.000000 | 0.575049 | 0.570850 | 0.546533 | 0.525537 | 0.496387 |
The second to last image shows up because the table color is close to the
mask, and the book is close to the chair color.
Here, yellow objects are found:
 |  |  |  |  |
| 1.000000 | 0.810059 | 0.801270 | 0.711572 | 0.689063 |
Here, objects with dark perimeters are found. Note that two of the matching
images were even taken on the same table. This was done in RGB space because
the black table would match anything gray in rg space.
 |  |  |  |  |  |  |
| 1.000000 | 0.820895 | 0.800101 | 0.796253 | 0.759258 | 0.722337 | 0.717038 |
Possible Future Work
Another thing that could be useful would be to record the variance of
position in image space of each of the buckets in the histogram. This way,
an image with a green blob would not match an image with green speckles.
This would require twice the storage space (basically another histogram for
each image, except that the second histogram would contain variances rather
than occurrences), but the storage space is small, so it wouldn't be much
of a problem.