Yet another rejection for Kieran Larkin circa 2006

Perhaps it goes to show that you can't please everyone? 
The comments from Reviewer 1 are spot-on.
Reviewer 2 uses the ploy of requesting more detail in a paper that was already at the length limit for a correspondence. 

With a little amendment the paper was published in an excellent optics journal and is now freely accessible here.

Dear Dr. Larkin:

Thank you for submitting your manuscript "A Coherent Model of Fingerprint Structure: Fingerprint Image as Hologram."  We have now received the detailed reviews of your paper (see below). Unfortunately they are not positive enough to support publication of the paper in [Journal].  Although we recognize that you could possibly address some of these specific criticisms in a revised manuscript, the overall nature of the reviews is such that the paper would not be able to compete for our limited space.  We hope that you find the comments helpful in preparing the manuscript for submission to another journal.

We are grateful that you gave [Journal] the opportunity to consider your work.


Senior Editor
Reviewer 1:
This is a very interesting paper showing that demodulation concepts for pattern analysis and encoding can be usefully applied to fingerprints.
There is a lot of material here that is mathematically interesting and rich in its own right (e.g. the whole idea of describing patterns as the amplitude and frequency modulation of an underlying coherent process), in addition to the biometric and forensic significance of the work.

I recommend publication after certain issues and paradoxes have been addressed, and some minor corrections made.
Reviewer 2:
The authors propose a model for compressing fingerprint images. The basic model used here was proposed by the authors in their 2001 paper (ref. 6). Experimental results are shown on a single image (Fig 4) with a compression factor of 239, which is better than the compression obtained using the FBI WSQ standard. However, the authors do not discuss how the fidelity of the reconstructed image compares with the fidelity obtained using the WSQ standard.

1.  There is an apparent paradox in the four panels of Figure 3.  By far the "least busy" (or lowest entropy) panel, which should therefore be the most compressible, is the continuous phase (Psi_C) function in the third panel.  It looks like it could be represented in about 10 or 20 bytes.  Yet paradoxically, it is allocated the highest information cost of all four panels -- 534 bytes, or about half of the total.  This needs to be explained; e.g., since the gradient of this function determines the ridge orientation and frequency, its representation must have exceptional fidelity.  Some further comment about relative compressibility might also be in order for the spiral phase (fourth) panel, since its "apparent busy-ness" has much to do with the arbitrary cuts of the (actually cyclic) phase-unwrapping process.
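The "arbitrary cuts" mentioned here come from folding a smooth phase function into the interval (-pi, pi]; the wrapped image looks busy even though the underlying function is low-entropy. A minimal 1-D illustration with NumPy (synthetic data, not the paper's actual phase maps):

```python
import numpy as np

# A smooth, monotonically increasing "continuous phase" ramp (hypothetical data).
psi_true = np.linspace(0.0, 6.0 * np.pi, 200)

# Wrapping folds it into (-pi, pi], introducing the jump discontinuities
# ("cuts") that make a wrapped phase image look far busier than the
# low-entropy function underneath it.
psi_wrapped = np.angle(np.exp(1j * psi_true))

# 1-D unwrapping recovers the smooth ramp, since adjacent samples here
# differ by less than pi.
psi_unwrapped = np.unwrap(psi_wrapped)

assert np.allclose(psi_unwrapped, psi_true, atol=1e-8)
```

The same cyclic ambiguity is what makes the apparent complexity of a wrapped-phase panel a poor proxy for its true information content.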
Reviewer 2:
The real challenge in fingerprint recognition (or, for that matter, in any biometric matching, say, face) is not the compression ratio but robust matching in the presence of noise and distortion.  In other words, the challenge is to reduce the false-accept and false-reject error rates in matching a query to the stored template.  It is only in forensic applications that the fingerprint image is actually stored in compressed form.  In other security applications involving biometrics, say, access control or computer log-in, it is features extracted from the image (minutiae in the case of fingerprints), and not the image itself, that are stored.  The reason forensic applications store the fingerprint image is that the final match/no-match decision is often made by a forensic expert who likes to visually compare the query image with the stored template image.  That is why the fidelity of the reconstructed image is crucial.

2.  If the methodology proposed is applied to actual recognition of real fingerprints (rather than mere compression), then it is likely to be vulnerable at exactly the crucial point which motivates the use of classical minutiae-based methods (e.g. ridge counts between minutiae).
That is the issue of obtaining a representation that is invariant to the deformations in rolled prints due to the "squishiness" of the soft tissues of the finger.  Depending on the force vector when fingers are pressed down or rolled, the overall ridge flow pattern can be severely deformed.  That will greatly affect the phase functions in Eqt (5), and finding a deformation-invariant representation by this method is extremely unlikely (unlike classical, boring, ridge-counting methods).
Reviewer 2:
The authors state (page 10) that "the broad similarity of the two images is clear and most of the minutiae are captured by the compressed representation".  This is not acceptable in forensic applications, where there is a need for high image fidelity.  While the authors correctly point out that "[fidelity] can be traded against the compression and demodulation parameters", they need to conduct additional experiments on a large public-domain database (e.g., those available from NIST) to determine the loss in matching performance at different levels of compression.  It might also be a good idea to ask a trained latent print examiner (a forensics expert) to visually assess the fidelity of the reconstructed image.

3.  The claims and comments about the compression achieved compared with the FBI WSQ standard (100-fold versus 15-fold) are "comparing apples with oranges" since WSQ gives a much higher quality fidelity than seen in Figure 4.  Without equating over some measure of image fidelity (even just MSE), such comparisons are incommensurable.

4.  Demodulation transforms have been applied to biometric patterns before, and possibly the following paper should be cited as well:
"Demodulation, predictive coding, and spatial vision."  Journal of the Optical Society of America, A (1995), vol. 12 (4), pp. 641-660.

5. The citation date for the Alan Turing reference [19] looks about 50 years too late!  And book references [1] and [2] include a "pp."

6.  The quoting of Occam on page 3 is glib, and should be removed (for reasons that William of Occam himself would approve of).

7.  A bit more clarity about the meanings of the functions embedded within Eqt (5) would help many readers.  For example, it looks like a(x,y) alone could do all the work of capturing f(x,y), so all the rest could just be forgotten about!  And in Eqt (4), it looks as if b(x,y) would like to be complex-valued, not real-valued, as it is in the other equations.  Finally, the paper is a bit poor on results; it would be nice to see more data showing the utility of the method for compression, analysis, recognition, and synthesis.
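For readers without the manuscript in hand: the model under discussion (per the paper's title and the review) treats a fingerprint as a hologram-like fringe pattern. Assuming Eqt (5) has the general AM-FM shape f(x,y) = a(x,y) + b(x,y) cos(psi(x,y)) — an assumption, since the exact equation is not quoted here — a synthetic pattern can be generated directly from an offset, a real-valued amplitude, and a phase whose gradient sets ridge orientation and spacing:

```python
import numpy as np

# Hypothetical AM-FM "hologram" model: f = a + b * cos(psi), where a is a
# slowly varying offset, b a real-valued amplitude envelope, and psi a
# phase whose gradient sets local ridge orientation and frequency.
n = 128
y, x = np.mgrid[0:n, 0:n] / n

a = 0.5 * np.ones((n, n))                                   # offset term
b = 0.4 * np.exp(-((x - 0.5) ** 2 + (y - 0.5) ** 2))        # amplitude envelope
psi = 2 * np.pi * 12 * x + np.arctan2(y - 0.5, x - 0.5)     # carrier plus one spiral (core-like defect)

f = a + b * np.cos(psi)   # synthetic fingerprint-like fringe pattern

# Since cos is bounded by 1, the fringes stay within [a - b, a + b];
# a alone fixes only the local mean, not the ridge structure.
assert np.all(f <= a + b + 1e-12) and np.all(f >= a - b - 1e-12)
```

The arctan2 term is what injects a spiral phase singularity, the hologram-style analogue of a fingerprint core; minutiae correspond to further such phase defects.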

fringe dweller