Tuesday, December 13, 2011

Image recognition using phase-only correlation


I have been working on an image matching program based on frequency analysis. It uses the phase correlation principle.

The concept is like this:

  • Get the frequency information from each image using the DFT
  • Do phase correlation of the source and template frequencies (obtained from the source and template images; both are n-dimensional)
  • After phase-only correlation, take the inverse Fourier transform and find the peaks in the real part. Those peaks show the matching positions.
To really understand how this works, you need to know how the DFT works, not just that magic equation.
How can it transform the signal (here, an image) into frequency data? In fact, we can form an intuitive picture of the DFT equation in our minds. Here it is, without any formulas.

You need to know the vector projection operation first. If you don't know it, my humble opinion is that you had better learn some basic vector maths right now. (Vectors are everywhere. Beware!!)
In the DFT we have an image, which can be represented as an N-dimensional vector. We take the dot product of this N-dimensional vector with another N-dimensional vector, let's call it the 'Sin' vector, and we also take the dot product with a second vector, let's call it the 'Cos' vector.

that is (ImageN . Cos) - i (ImageN . Sin)    : "." is the dot product between vectors

(The dot product measures how much of one vector lies along another. You can think of it as projecting the image data onto n-dimensional cosine and sine vectors, which gives the real and imaginary parts.) We do this operation for a number of frequencies, so the Sin and Cos vectors change, giving a different pair of n-dimensional vectors each time. That's how you get the N complex output numbers (now refer back to the original DFT equation).
So what we just said is: we project (dot) the image data onto a number of vectors obtained from different sin and cos signals. Another point worth remembering is that the Sin and Cos vectors are always orthogonal to each other (strictly, orthonormal only after scaling them to unit length). I hope I have written enough theory on the DFT for you to understand it intuitively. Visualization is very important.
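To make the projection idea concrete, here is a minimal sketch in Python with NumPy (my choice for illustration, not what the original tests used). It builds the Cos and Sin vectors for each frequency, takes the two dot products, and compares the result against NumPy's own FFT. With the usual DFT sign convention the cosine projection is the real part and the negated sine projection is the imaginary part. The function name dft_by_projection and the sample signal are made up for this example, and a 1D signal stands in for the flattened image.

```python
import numpy as np

def dft_by_projection(signal):
    """DFT computed as dot products with sampled cosine and sine vectors."""
    x = np.asarray(signal, dtype=float)
    n = np.arange(x.size)                      # sample positions 0..N-1
    out = np.empty(x.size, dtype=complex)
    for k in range(x.size):                    # one Cos/Sin pair per frequency k
        cos_vec = np.cos(2 * np.pi * k * n / x.size)
        sin_vec = np.sin(2 * np.pi * k * n / x.size)
        # Project the signal onto both vectors: real part from Cos,
        # imaginary part from the negated Sin projection.
        out[k] = np.dot(x, cos_vec) - 1j * np.dot(x, sin_vec)
    return out

# Arbitrary sample data standing in for a (flattened) image
signal = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 0.5, -2.0, 1.5])
projected = dft_by_projection(signal)
matches_fft = np.allclose(projected, np.fft.fft(signal))
print(matches_fft)
```

The loop is deliberately slow and literal; a real program would call an FFT routine, which computes the same numbers much faster.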

So, back to the matching. We take the DFT of the source and template images (both must be the same size; the template can be padded with zeros). After this, apply phase correlation using the equation

    (a+ib)(p+iq)* / (| (a+ib)(p+iq)* | )

  • (a+ib): a complex number from the source DFT (for an NxN image there are N*N complex numbers, represented as a 2D NxN array)
  • (p+iq): the corresponding complex number from the template DFT; (p+iq)* is its complex conjugate


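What this equation does is keep only the phase difference between the two signals at each frequency and discard the magnitudes. After the inverse transform, this gives high values at positions where the peaks and troughs of the two signals line up correctly.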
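As a rough illustration of the equation above, the following Python/NumPy sketch applies it first to a single pair of coefficients and then to two whole DFT arrays. The function name phase_only_spectrum and the small eps guard against division by zero are my own additions for the example, not part of the original formula.

```python
import numpy as np

def phase_only_spectrum(source_dft, template_dft, eps=1e-12):
    """Normalised cross spectrum: S * conj(T) / |S * conj(T)|."""
    cross = source_dft * np.conj(template_dft)
    # eps (an assumption of this sketch) avoids dividing by zero
    # where a coefficient happens to vanish.
    return cross / np.maximum(np.abs(cross), eps)

# One coefficient pair: (a+ib) = 3+4i from the source, (p+iq) = 1-2i from the template
single = phase_only_spectrum(np.array([3 + 4j]), np.array([1 - 2j]))

# The same operation applied element-wise to two 8x8 DFTs of random test data
rng = np.random.default_rng(0)
S = np.fft.fft2(rng.random((8, 8)))
T = np.fft.fft2(rng.random((8, 8)))
R = phase_only_spectrum(S, T)

print(np.abs(single))        # magnitude after normalisation
print(np.angle(single))      # phase difference between the two inputs
```

Every entry of the result has unit magnitude, so only the phase difference between source and template survives. That is where the name "phase-only" comes from.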


Now we do the inverse DFT on the phase-only correlated data and apply a threshold to the real part in order to get the peaks, which indicate the matched positions.
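Putting the steps together, a minimal end-to-end sketch in Python with NumPy could look like this. It is only an illustration under assumptions: the source is synthetic random texture rather than an edge-detected photo, the template is a patch cut straight out of it, and the function names and the threshold of half the maximum are choices made for this example.

```python
import numpy as np

def phase_only_correlation(source, template):
    """Return the real part of the phase-only correlation surface."""
    padded = np.zeros_like(source, dtype=float)
    th, tw = template.shape
    padded[:th, :tw] = template               # zero-pad template to source size
    S = np.fft.fft2(source)
    T = np.fft.fft2(padded)
    cross = S * np.conj(T)
    cross /= np.maximum(np.abs(cross), 1e-12)  # keep phase only
    return np.real(np.fft.ifft2(cross))

def find_peaks(surface, threshold):
    """(row, col) positions whose correlation value exceeds the threshold."""
    return [tuple(int(v) for v in rc) for rc in np.argwhere(surface > threshold)]

# Synthetic test data (an assumption of this sketch, not the images from the post)
rng = np.random.default_rng(1)
source = rng.standard_normal((64, 64))
template = source[16:48, 24:56].copy()        # patch cut at row 16, col 24

surface = phase_only_correlation(source, template)
peaks = find_peaks(surface, threshold=0.5 * surface.max())
best = tuple(int(i) for i in np.unravel_index(np.argmax(surface), surface.shape))
print(best, peaks)
```

The position of the strongest peak is the offset at which the template's top-left corner sits in the source. On real images the threshold needs tuning, since a value that is too low lets spurious peaks through.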

Images from my tests.

Original image (after edge detection)


Template image (the template must be the same size as the source, so pad the remaining area with zeros)



Result after phase-only correlation. See the peaks; you can see some invalid peaks there as well (that is another story). I did this with mathematics. With Mathematica or MATLAB the implementation is easy, but I had to spend months to really understand the theory.

This method has the advantage of tolerating small rotations and scale changes. The drawback is the longer processing time due to the DFT. The algorithm doesn't use any shape information for matching, so it can give false results when too many edges are present in the source image.

That's all for now; this was a quick post. Good luck with your projects.
