
Photobook


  
Figure 19: Photobook result, query at far left.
./images/photobook.jpg

Developer

Vision and Modeling Group, MIT Media Laboratory, Cambridge, MA.

URL

http://vismod.www.media.mit.edu/vismod/demos/photobook/index.html. A demo is available at http://vismod.www.media.mit.edu/cgi-bin/tpminka/query?vistex,,,10.

References

[PPS96].

Features

Photobook implements three different approaches to constructing image representations for querying purposes, each for a specific type of image content: faces, 2D shapes and texture images. The first two representations are similar in that they describe an image relative to an average of a few prototypes, using the eigenvectors of a covariance matrix as an orthogonal coordinate system of the image space. First, a preprocessing step normalizes the input image for position, scale and orientation. Given a set of training images, $\Gamma_1,\Gamma_2,\ldots,\Gamma_M$, where $\Gamma_i$ is an $n \times n$ array of intensity values, their variation from the average, $\Psi=\frac{1}{M}\sum_{i=1}^{M}\Gamma_i$, is given by $\Phi_i=\Gamma_i-\Psi$. This set of vectors is then subjected to the Karhunen-Loève expansion, the result being a set of $M$ eigenvectors $u_k$ and eigenvalues $\lambda_k$ of the covariance matrix $C=\frac{1}{M}\sum_{i=1}^{M}\Phi_i\Phi_i^T$. In representing a new image region, $\Gamma$, only the $M_1<M$ eigenvectors with the largest eigenvalues are used, so the point in the eigenimage space corresponding to the new image is $\Omega=(\omega_1,\omega_2,\ldots,\omega_{M_1})$, where $\omega_k=u_k^T(\Gamma-\Psi)$ for $k=1,\ldots,M_1$.
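The eigenimage construction above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not Photobook's actual code: the function names `fit_eigenimages` and `project` are hypothetical, each $n \times n$ image is flattened to a vector of length $n^2$, and the eigenvectors of $C$ are obtained from a singular value decomposition of the centered data rather than by forming $C$ explicitly.

```python
import numpy as np

def fit_eigenimages(images, m1):
    """Return (psi, U, eigvals) for training images of shape (M, n, n).

    psi     : average image Psi, flattened to length n*n
    U       : the m1 leading eigenvectors u_k of C, one per row
    eigvals : the matching eigenvalues lambda_k, largest first
    """
    M = images.shape[0]
    if not 0 < m1 < M:
        raise ValueError("m1 must satisfy 0 < m1 < M")
    gammas = images.reshape(M, -1).astype(float)  # flatten each Gamma_i
    psi = gammas.mean(axis=0)                     # Psi = (1/M) sum Gamma_i
    phis = gammas - psi                           # Phi_i = Gamma_i - Psi
    # Assumption of this sketch: use the SVD instead of building C.
    # The rows of vt are eigenvectors of C = (1/M) sum Phi_i Phi_i^T,
    # with eigenvalues s**2 / M, already sorted in descending order.
    _, s, vt = np.linalg.svd(phis, full_matrices=False)
    return psi, vt[:m1], (s ** 2 / M)[:m1]

def project(image, psi, U):
    """Compute Omega, with omega_k = u_k^T (Gamma - Psi)."""
    return U @ (image.reshape(-1).astype(float) - psi)
```

A call such as `psi, U, lam = fit_eigenimages(train, 3)` followed by `project(new_image, psi, U)` would yield the coordinate vector $\Omega$ of a new image; here `train` and `new_image` are placeholder arrays.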
In a texture description, an image is viewed as a homogeneous 2D discrete random field which, by means of a Wold decomposition, is expressed as the sum of three orthogonal components. These components correspond to periodicity, directionality and randomness.
In creating a shape description, a silhouette is first extracted and a number of feature points on it are chosen (such as corners and high-curvature points). These feature points are then used as nodes in building a finite element model of the shape. The modes of the model are computed by solving the eigenvalue problem $K\phi_i=\omega_i^2M\phi_i$, where $M$ and $K$ are the mass and stiffness matrices, respectively (here $M$ denotes the mass matrix, not the number of training images). The resulting eigenvectors, $\phi_i$, are then used to determine a feature point correspondence between the new shape and some average shape.
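To make the generalized eigenvalue problem $K\phi_i=\omega_i^2M\phi_i$ concrete, the following sketch solves it with NumPy for small symmetric matrices. It is a toy illustration, not the finite element model used by Photobook: the function name `vibration_modes` is hypothetical, $M$ is assumed to be symmetric positive definite, and the problem is reduced to a standard symmetric one through a Cholesky factorization of $M$.

```python
import numpy as np

def vibration_modes(K, M):
    """Solve K phi = w^2 M phi for symmetric K and positive definite M.

    Returns (w2, Phi): squared frequencies in ascending order and the
    modes phi_i as the columns of Phi.
    """
    L = np.linalg.cholesky(M)          # M = L L^T
    L_inv = np.linalg.inv(L)
    A = L_inv @ K @ L_inv.T            # symmetric standard problem A y = w^2 y
    w2, Y = np.linalg.eigh(A)          # eigenvalues returned in ascending order
    Phi = L_inv.T @ Y                  # map back: phi = L^{-T} y
    return w2, Phi

# Toy example: two unit masses joined by three unit springs between two walls.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])
M = np.eye(2)
w2, Phi = vibration_modes(K, M)
```

For this toy system the squared frequencies are 1 and 3; the low-frequency mode moves both masses together and the high-frequency mode moves them in opposition.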

Querying

To perform a query, the user selects some images from the grid of still images displayed and/or enters an annotation filter. From the images displayed, the user can select other query images and repeat the search.

Matching

The distance between two eigenimage representations, $\Omega_i$ and $\Omega_j$, is $\epsilon_{ij}^2=\Vert\Omega_i-\Omega_j\Vert^2$. Two shapes are compared by calculating the amount of strain energy needed to deform one shape to match the other.
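The eigenimage distance is simply a squared Euclidean distance between coefficient vectors. A minimal sketch, in which the function name `squared_distance` and the sample vectors are illustrative only:

```python
def squared_distance(omega_i, omega_j):
    """Return ||Omega_i - Omega_j||^2 for two coefficient vectors."""
    if len(omega_i) != len(omega_j):
        raise ValueError("coefficient vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(omega_i, omega_j))

# Hypothetical coefficient vectors with M_1 = 2 components each.
epsilon_sq = squared_distance([1.0, 2.0], [4.0, 6.0])  # 3^2 + 4^2 = 25.0
```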

Indexing

Prior to any database search, a few prototypes that span the image category are selected. For each image in the database, its distance to the average of the prototypes is computed and stored for future database searches. At query time, the distance of the query image to the average is computed and the database is reordered according to this distance.
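One possible reading of this scheme is sketched below. The text does not spell out the reordering rule, so the sketch assumes that database images are ranked by how close their stored distance is to the query's distance; the names `build_index` and `reorder` and the sample data are hypothetical.

```python
import math

def mean_vector(vectors):
    """Component-wise average of equally long vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def build_index(database, prototypes):
    """Precompute each image's distance to the prototype average."""
    center = mean_vector(prototypes)
    index = {name: math.dist(vec, center) for name, vec in database.items()}
    return center, index

def reorder(query_vec, center, index):
    """Rank names by |stored distance - query distance| (assumed rule)."""
    dq = math.dist(query_vec, center)
    return sorted(index, key=lambda name: abs(index[name] - dq))

# Hypothetical 2D feature vectors.
prototypes = [[0.0, 0.0], [2.0, 0.0]]          # average is [1.0, 0.0]
database = {"a": [1.0, 1.0], "b": [1.0, 4.0], "c": [4.0, 0.0]}
center, index = build_index(database, prototypes)
ranking = reorder([1.0, 3.2], center, index)   # ['c', 'b', 'a']
```

Under this reading only one stored number per image is compared at query time, which is cheap but coarse: two images at the same distance from the average need not resemble each other, as image "c" ranking ahead of "b" shows.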

Result presentation

Images in the database are sorted by similarity with the query images and presented to the user page by page.
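Page-by-page presentation of a ranked list can be sketched as follows; the function name `paginate` and the page size are illustrative assumptions, since the text does not state how many images Photobook shows per page.

```python
def paginate(ranked, page_size):
    """Split a similarity-ranked list into consecutive pages."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return [ranked[i:i + page_size] for i in range(0, len(ranked), page_size)]

# Hypothetical ranking of five image names, two per page.
pages = paginate(["img3", "img1", "img5", "img2", "img4"], 2)
# pages == [['img3', 'img1'], ['img5', 'img2'], ['img4']]
```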

Applications

The face recognition technology of Photobook has been used by Viisage Technology in a FaceID package, which is used in several US police departments.

 
Remco Veltkamp
2001-03-08