
ezSIFT: an easy-to-use standalone SIFT library


Project overview

The SIFT (scale-invariant feature transform) algorithm is widely considered one of the most robust local feature detectors and descriptors. Many open-source SIFT implementations rely on third-party libraries, and these dependencies make installation, compilation, and usage difficult.

The ezSIFT library provides a standalone, lightweight SIFT implementation written in C/C++. ezSIFT is self-contained and does not require any other libraries, so it is easy to use and modify. In addition, the implementation is straightforward and easy to read.

A few functions in this library are adapted from the implementations in OpenCV (http://opencv.org/) and VLFeat (http://www.vlfeat.org/).

If you use any code from the ezSIFT library in your research work, we expect you to cite this project:

For those interested in GPU implementations of the SIFT algorithm on mobile devices, the following paper provides more details:

  1. Blaine Rister, Guohui Wang, Michael Wu and Joseph R. Cavallaro, "A Fast and Efficient SIFT Detector using the Mobile GPU", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), May 2013. (GPU implementation using OpenGL ES; performance benchmarked on Google Nexus 7, Samsung Galaxy Note II, Qualcomm Snapdragon S4 pro, NVIDIA Tegra 2.)

Function interface

namespace ezsift {
    int sift_cpu(const ezsift::Image<uchar> &image,
                 std::list<ezsift::SiftKeypoint> &kpt_list,
                 bool bExtractDescriptors);
}

image: the input image, defined as an object of the Image class. Image is defined as follows:

namespace ezsift {
    template <typename T>
    class Image
    {
    public:
        int w;   // image width
        int h;   // image height
        T *data; // raw pixel values; sift_cpu() requires "data" to store grayscale uchar values
    };
}

The Image class provides basic image reading/writing functions for PGM and PPM images. If you want to work with more general image formats (such as JPEG, PNG, and so on), you can easily find an image decoding library to help you read images. If the original image is a color image, you need some preprocessing to convert it to grayscale.

kpt_list: a list of detected keypoints. It contains the position, scale, orientation, and feature descriptor of each keypoint.

bExtractDescriptors: indicates whether to extract feature descriptors. If bExtractDescriptors == true, the generated kpt_list contains feature descriptors; if bExtractDescriptors == false, no feature descriptors are generated, and kpt_list only contains keypoint information (position, scale, orientation).


ezSIFT usage and examples

Two application examples are included in the source code package:

  • feature_detection.cpp
  • image_match.cpp

They show how to use the ezSIFT library. Basically, you just include ezSIFT.h and then call the ezsift::sift_cpu() function to detect and extract feature points.

Example 1: keypoint/feature detection

    using namespace ezsift;

    Image<uchar> image;	
    image.read_pgm("input.pgm");

    bool bExtractDescriptor = true;
    list<SiftKeypoint> kpt_list;
    // Perform SIFT computation on CPU.
    sift_cpu(image, kpt_list, bExtractDescriptor);
    // Generate output image
    draw_keypoints_to_ppm_file("output.ppm", image, kpt_list);
    // Generate keypoints list
    export_kpt_list_to_file("output.key", kpt_list, bExtractDescriptor);

Input image (left) and result image (right). (The input image courtesy of the Affine Covariant Features project at Oxford University.)

Example 2: Feature matching

This example is a simple demonstration of feature matching, based on brute-force matching. If you want more accurate matching results, you should consider other matching algorithms.

    using namespace ezsift;

    Image<uchar> image1, image2;
    image1.read_pgm("file1.pgm");
    image2.read_pgm("file2.pgm");

    // Detect keypoints
    list<SiftKeypoint> kpt_list1, kpt_list2;
    bool bExtractDescriptor = true;
    sift_cpu(image1, kpt_list1, bExtractDescriptor);
    sift_cpu(image2, kpt_list2, bExtractDescriptor);

    // Match keypoints.
    list<MatchPair> match_list;
    match_keypoints(kpt_list1, kpt_list2, match_list);

    // Draw result image.
    draw_match_lines_to_ppm_file("output_file.ppm", image1, image2, match_list);
    printf("Number of matched keypoints: %d\n", (int)match_list.size());
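To illustrate what brute-force matching does under the hood, here is a rough standalone sketch using Lowe's nearest/second-nearest ratio test. The SimpleKeypoint struct and brute_force_match() function are hypothetical stand-ins for this example, not the ezSIFT API; ezsift::match_keypoints() performs the equivalent work on SiftKeypoint lists.

```cpp
#include <cfloat>
#include <cstddef>
#include <utility>
#include <vector>

struct SimpleKeypoint {
    float desc[128]; // SIFT descriptor (128 dimensions)
};

// Squared Euclidean distance between two descriptors.
static float dist2(const SimpleKeypoint &a, const SimpleKeypoint &b)
{
    float d = 0.0f;
    for (int i = 0; i < 128; ++i) {
        float t = a.desc[i] - b.desc[i];
        d += t * t;
    }
    return d;
}

// For each keypoint in list1, find its nearest and second-nearest
// neighbors in list2, and accept the match only if the nearest distance
// is sufficiently smaller than the second-nearest (Lowe's ratio test,
// with 0.8 as the commonly suggested threshold).
std::vector<std::pair<size_t, size_t>>
brute_force_match(const std::vector<SimpleKeypoint> &list1,
                  const std::vector<SimpleKeypoint> &list2,
                  float ratio = 0.8f)
{
    std::vector<std::pair<size_t, size_t>> matches;
    for (size_t i = 0; i < list1.size(); ++i) {
        float best = FLT_MAX, second = FLT_MAX;
        size_t best_j = 0;
        for (size_t j = 0; j < list2.size(); ++j) {
            float d = dist2(list1[i], list2[j]);
            if (d < best) { second = best; best = d; best_j = j; }
            else if (d < second) { second = d; }
        }
        // We compare squared distances, so the ratio threshold is squared too.
        if (second > 0.0f && best < ratio * ratio * second)
            matches.push_back({i, best_j});
    }
    return matches;
}
```

The ratio test rejects ambiguous matches: if a descriptor's two closest neighbors are nearly equidistant, the match is discarded as unreliable, which is why lowering the threshold yields fewer but more confident matches.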

Input images:

(The input images courtesy of Affine Covariant Features Project at Oxford Univ.)

Output images: graf1 keypoints and the feature matching result.


Comparison with Lowe's implementation, OpenCV, and VLFeat

The links for the implementations under comparison:

  1. Lowe's implementation: http://www.cs.ubc.ca/~lowe/keypoints/
  2. OpenCV: http://opencv.org/
  3. VLFeat: http://www.vlfeat.org/

Instead of using images containing real-world scenes, we generated a few images with different simple shapes. We chose these images because they show very strong, sharp contours, which are naturally good keypoints for image matching; a good SIFT detector should be able to detect most of them. Using these images, we can easily tell the performance differences among the SIFT implementations.
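Such synthetic test images are simple to produce. As an illustration of the test setup (not code from the ezSIFT package), the sketch below draws a white square on a black background and writes it as a binary PGM file, which the ezSIFT Image class can read directly.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Build a grayscale image: black background with one white square.
std::vector<uint8_t> make_square_image(int w, int h, int x0, int y0, int side)
{
    std::vector<uint8_t> img(static_cast<size_t>(w) * h, 0); // black background
    for (int y = y0; y < y0 + side && y < h; ++y)
        for (int x = x0; x < x0 + side && x < w; ++x)
            img[static_cast<size_t>(y) * w + x] = 255;       // white square
    return img;
}

// Write the buffer as a binary (P5) PGM file.
bool write_pgm(const char *path, const std::vector<uint8_t> &img, int w, int h)
{
    FILE *fp = std::fopen(path, "wb");
    if (!fp) return false;
    std::fprintf(fp, "P5\n%d %d\n255\n", w, h);
    std::fwrite(img.data(), 1, img.size(), fp);
    std::fclose(fp);
    return true;
}
```

The sharp corners and edges of the square are exactly the kind of high-contrast structures a SIFT detector is expected to find, which makes missed detections easy to spot by eye.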

Comparison test 1

Input image:

Keypoint detection results:

The results from ezSIFT and Lowe's implementation are close, and both perform better than the other two implementations. Some key features are clearly missing from the OpenCV and VLFeat results (red squares indicate the missing keypoints).

Rotated image:

Again, ezSIFT and Lowe's implementation generate similar keypoints and perform better than the other two implementations. Some keypoints are missing from the OpenCV and VLFeat results (red squares in the figure). Moreover, on closer inspection, many small-scale features on the boundaries of the blobs are missing from the OpenCV results.

Feature matching results:

The keypoint matching algorithm is not the focus of this SIFT library, so brute-force matching is used for its simplicity to demonstrate the correctness of the SIFT keypoints and descriptors. Using another matcher may produce better matching results.

ezSIFT and Lowe's implementation find a comparable number of matches, and all of their matches are correct, while OpenCV and VLFeat generate many false matches. Note that feature matching accuracy depends on the matching threshold as well as the detector parameter settings. Here, we use the default settings for all implementations; with some fine-tuning of the parameters, the results from all of these implementations may improve.


Comparison test 2

Input image:

Keypoint detection results:

ezSIFT, Lowe's implementation, and VLFeat produce similar features; in particular, ezSIFT and VLFeat generate almost the same features. Some key features are missing from the OpenCV results, and OpenCV also generates some spurious features.

Rotated image:

For this image, all four implementations generate most of the major keypoint features. Lowe's and OpenCV generate slightly more features than ezSIFT and VLFeat.

Feature matching results:


Similar to the previous comparison test, Lowe's implementation and ezSIFT give better matching results, while OpenCV and VLFeat produce some false matches. Again, feature matching accuracy depends on the matching threshold and the detector parameter settings; we use the default settings for all implementations, and fine-tuning the parameters may improve the results from all of them.

References

  1. David Lowe, Demo Software: SIFT Keypoint Detector, http://www.cs.ubc.ca/~lowe/keypoints/.
  2. David G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, 60, 2 (2004), pp. 91-110.
  3. David G. Lowe, "Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image," US Patent 6,711,293 (March 23, 2004). Provisional application filed March 8, 1999. Assignee: The University of British Columbia.
  4. VLFeat, http://www.vlfeat.org/.
  5. OpenCV, http://opencv.org/.
  6. OpenSIFT, http://robwhess.github.io/opensift/.

Patent Notice: The following patent has been issued for methods embodied in this software: "Method and apparatus for identifying scale invariant features in an image and use of same for locating an object in an image," David G. Lowe, US Patent 6,711,293 (March 23, 2004). Provisional application filed March 8, 1999. Assignee: The University of British Columbia. For further details, contact David Lowe (lowe@cs.ubc.ca) or the University-Industry Liaison Office of the University of British Columbia.