Referee reports: Reviewer 1 and Reviewer 2
Dear Dr. Larkin:
Thank you for submitting your manuscript "A Coherent Model of Fingerprint Structure: Fingerprint Image as Hologram." We have now received the detailed reviews of your paper (see below). Unfortunately they are not positive enough to support publication of the paper in [Journal]. Although we recognize that you could possibly address some of these specific criticisms in a revised manuscript, the overall nature of the reviews is such that the paper would not be able to compete for our limited space. We hope that you find the comments helpful in preparing the manuscript for submission to another journal.
We are grateful that you gave [Journal] the opportunity to consider your work.
Reviewer 1:
This is a very interesting paper showing that demodulation concepts for pattern analysis and encoding can be usefully applied to fingerprint patterns. There is a lot of material here that is mathematically interesting and rich in its own right (e.g. the whole idea of describing patterns as the amplitude and frequency modulation of an underlying coherent process), in addition to the biometric and forensic significance of the work. I recommend publication after certain issues and paradoxes have been addressed, and some minor corrections made.
Reviewer 2:
The paper proposes a model for compressing fingerprint images. The basic model used here was proposed by the authors in their 2001 paper (ref. 6). Experimental results are shown on a single image (Fig. 4) with a compression factor of 239, which is better than the compression obtained using the FBI WSQ standard. However, the authors do not discuss how the fidelity of the reconstructed image compares with the fidelity obtained using the WSQ standard.
Reviewer 1:
1. There is an apparent paradox in the four panels of Figure 3. By far the "least busy" (or lowest-entropy) panel, which should therefore be the most compressible, is the continuous phase (Psi_C) function in the third panel. It looks like it could be represented in about 10 or 20 bytes. Yet paradoxically, it is allocated the highest information cost of all four panels: 534 bytes, or about half of the total. This needs to be explained; e.g., since the gradient of this function determines the ridge orientation and frequency, its representation must have exceptional fidelity. Some further comment about relative compressibility might also be in order for the spiral phase (fourth) panel, since its apparent "busy-ness" has much to do with the arbitrary cuts of the (actually cyclic) phase-unwrapping process.
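The compressibility point can be illustrated with a toy experiment. This is illustrative only: a generic byte-level compressor (zlib) rather than the authors' coder, and synthetic surfaces rather than the actual Psi_C panel.

```python
import zlib
import numpy as np

n = 128
y, x = np.mgrid[0:n, 0:n] / n

# Smooth, "least busy" surface: a low-order polynomial, standing in for a
# slowly varying continuous phase such as Psi_C.
smooth = 3.0 * x + 2.0 * y + 1.5 * x * y

# "Busy" surface with the same value range: white noise.
rng = np.random.default_rng(0)
noisy = rng.uniform(smooth.min(), smooth.max(), size=(n, n))

def compressed_size(surface):
    # Quantise to 16 levels so each surface costs one byte per sample
    # before compression, then measure the zlib-compressed size.
    q = np.uint8(15 * (surface - surface.min()) / (surface.max() - surface.min()))
    return len(zlib.compress(q.tobytes(), level=9))

print(n * n, compressed_size(smooth), compressed_size(noisy))
```

Even at identical raw size, the smooth surface compresses to a small fraction of the noisy one, which is exactly why the 534-byte allocation to Psi_C calls for the fidelity explanation suggested above.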
Reviewer 2:
The real challenge in fingerprint recognition (or, for that matter, in any biometric matching, say of faces) is not the compression ratio but robust matching in the presence of noise and distortion. In other words, the challenge is to reduce the false accept and false reject error rates in matching a query to the stored template. It is only in forensics applications that the fingerprint image is actually stored in compressed form. In other security applications involving biometrics, say access control or computer log-in, features extracted from the image (minutiae in the case of fingerprints), and not the image itself, are stored. The reason forensics applications store the fingerprint image is that the final match/no-match decision is often made by a forensic expert who likes to visually compare the query image with the stored template image. That is why the fidelity of the reconstructed image is crucial.
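The false accept / false reject trade-off mentioned here can be sketched in a few lines; the Gaussian score distributions and threshold values below are invented placeholders, not measurements from any real matcher.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical similarity scores (higher = more similar).
genuine = rng.normal(0.70, 0.10, 5000)    # same-finger comparisons
impostor = rng.normal(0.40, 0.10, 5000)   # different-finger comparisons

def far_frr(threshold):
    far = np.mean(impostor >= threshold)  # impostors wrongly accepted
    frr = np.mean(genuine < threshold)    # genuine users wrongly rejected
    return far, frr

for t in (0.45, 0.55, 0.65):
    far, frr = far_frr(t)
    print(f"threshold {t:.2f}: FAR {far:.3f}  FRR {frr:.3f}")
```

Raising the threshold trades false accepts for false rejects; compression artefacts that blur the two score distributions together worsen both rates at every operating point.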
Reviewer 1:
2. If the proposed methodology is applied to actual recognition of real fingerprints (rather than mere compression), then it is likely to be vulnerable at exactly the crucial point which motivates the use of classical minutiae-based methods (e.g. ridge counts between minutiae). That is the issue of obtaining a representation that is invariant to the deformations in rolled prints due to the "squishiness" of the soft tissues of the finger. Depending on the force vector when fingers are pressed down or rolled, the overall ridge flow pattern can be severely deformed. That will greatly affect the phase functions in Eqt (5), and finding a deformation-invariant representation by this method is extremely unlikely (unlike the classical, boring, ridge-counting methods).
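The distortion sensitivity is easy to see in a toy calculation. The parallel-ridge phase, ridge count, and shear magnitude below are placeholder assumptions, not the authors' data: even a 2% shear of a print with 50 ridges across it slides the phase by a full ridge period at the far edge.

```python
import numpy as np

f = 50.0     # ridge count across a unit-square print (placeholder value)
eps = 0.02   # a 2% shear: x -> x + eps * y

y = np.linspace(0.0, 1.0, 101)
# Parallel ridges have phase psi = 2*pi*f*x, so the shear changes the
# phase by delta_psi = 2*pi*f*eps*y.
delta_psi = 2 * np.pi * f * eps * y

periods = delta_psi.max() / (2 * np.pi)
print(f"max phase shift: {delta_psi.max():.2f} rad = {periods:.2f} ridge periods")
```

Because matching in the phase domain compares psi directly, a misalignment of a whole ridge period is catastrophic even though the geometric distortion that caused it is small.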
Reviewer 2:
The authors state (page 10) that "the broad similarity of the two images is clear and most of the minutiae are captured by the compressed representation". This is not acceptable in forensics applications, where there is a need for high image fidelity. While the authors correctly point out that "...fidelity can be traded against the compression and demodulation parameters", they need to conduct additional experiments on a large public-domain database (e.g., those available from NIST) to determine the loss in matching performance at different levels of compression. It might also be a good idea to ask a trained latent print examiner (forensics expert) to visually assess the fidelity of the reconstructed image.
Reviewer 1:
3. The claims and comments about the compression achieved compared with the FBI WSQ standard (100-fold versus 15-fold) are "comparing apples with oranges", since WSQ gives much higher fidelity than is seen in Figure 4. Without equating over some measure of image fidelity (even just MSE), such comparisons are meaningless.
4. Demodulation transforms have been applied to biometric patterns before, and possibly the following paper should be cited as well: J. G. Daugman and C. J. Downing, "Demodulation, predictive coding, and spatial vision," Journal of the Optical Society of America A, vol. 12, no. 4, pp. 641-660 (1995).
5. The citation date for the Alan Turing reference looks about 50 years too late! And two of the book references include a "pp.".
6. The quoting of Occam on page 3 is glib, and should be removed (for reasons that William of Occam himself would approve of).
7. A bit more clarity about the meanings of the functions embedded within Eqt (5) would help many readers. For example, it looks as if a(x,y) alone could do all the work of capturing f(x,y), so all the rest could just be forgotten about! And in Eqt (4), it looks as if b(x,y) would like to be complex-valued, not real-valued as it is in the other equations. Finally, the paper is a bit thin on results; it would be nice to see some more data showing the utility of the method for compression, analysis, recognition, and synthesis.
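For readers trying to parse the model, here is a minimal numerical sketch of an AM/FM pattern of the general form under discussion, f = a + b*cos(psi_C + psi_S). Every specific choice of a, b, and the phase terms below is a placeholder guess for illustration, not the authors' parameterisation.

```python
import numpy as np

n = 256
y, x = np.mgrid[0:n, 0:n] / n

a = 0.5 * np.ones((n, n))                                 # slowly varying offset a(x,y)
b = 0.4 * np.exp(-((x - 0.5)**2 + (y - 0.5)**2) / 0.18)   # amplitude envelope b(x,y)

psi_c = 2 * np.pi * 40 * y               # continuous phase: parallel "ridges"
# One spiral phase term centred mid-image: the phase singularity that
# produces a minutia-like feature in the rendered pattern.
psi_s = np.arctan2(y - 0.5, x - 0.55)

pattern = a + b * np.cos(psi_c + psi_s)
print(pattern.shape, round(float(pattern.min()), 2), round(float(pattern.max()), 2))
```

In this decomposition the phase carries the ridge geometry and the amplitude terms only shade it, which is why a(x,y) cannot in fact "do all the work" despite the notational impression in Eqt (5).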