Abstract:
Binary descriptors have been widely used for real-time image retrieval and correspondence matching. However, most of the
learned descriptors are obtained using a large deep neural network (DNN) with several million parameters, and the learned
binary codes are generally not invariant to many geometric variations, an invariance that is crucial for accurate correspondence matching.
To address this problem, we propose a new learning approach using a lightweight DNN architecture built from a stack of multiple
multilayer perceptrons based on the network in network (NIN) architecture, and a restricted Boltzmann machine (RBM). The
latter is used for mapping the features to binary codes and for carrying out the geometrically invariant correspondence matching
task. Our experimental results on several benchmark datasets (e.g., Brown, Oxford, Paris, INRIA Holidays, RomePatches,
HPatches, and CIFAR-10) show that the proposed approach produces a learned binary descriptor that outperforms other
baseline self-supervised binary descriptors in terms of correspondence matching despite the smaller size of its DNN. Most
importantly, the proposed approach does not freeze the features that are obtained while pre-training the NIN model. Instead, it
fine-tunes the features while learning the features needed for binary mapping through the RBM. Additionally, its lightweight
architecture makes it suitable for resource-constrained devices.
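
As a rough illustration of the binary mapping and matching step described above, the following minimal sketch thresholds RBM-style hidden-unit activations to obtain a binary code and compares codes by Hamming distance. It is not the paper's implementation: the function names, the deterministic 0.5 threshold (in place of sampling), the use of Hamming distance for matching, and the toy weights are all assumptions made for this example.

```python
def rbm_binary_code(features, weights, hidden_bias):
    """Map a real-valued feature vector to a binary code (hypothetical helper).

    Hidden unit j has activation a_j = b_j + sum_i v_i * W[i][j]. Since
    sigmoid(a_j) > 0.5 exactly when a_j > 0, the 0.5 threshold is applied
    directly to the activation. Thresholding instead of sampling is an
    assumption of this sketch.
    """
    code = []
    for j, bias in enumerate(hidden_bias):
        activation = bias + sum(v * weights[i][j] for i, v in enumerate(features))
        code.append(1 if activation > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the positions at which two equal-length binary codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def match(query_code, candidate_codes):
    """Return the index of the candidate closest to the query in Hamming distance."""
    distances = [hamming_distance(query_code, c) for c in candidate_codes]
    return min(range(len(distances)), key=distances.__getitem__)


# Toy example: 2-dimensional features mapped to a 3-bit code (made-up weights).
toy_weights = [[1.0, -1.0, 0.5],
               [0.5, 1.0, 0.25]]
toy_bias = [0.0, 0.0, 0.1]
query = rbm_binary_code([1.0, -2.0], toy_weights, toy_bias)
best = match(query, [[1, 1, 0], [0, 0, 1], [1, 0, 1]])
```

Comparing binary codes this way needs only bit comparisons and counts, which is one reason such descriptors are attractive for real-time matching on resource-constrained devices.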