GLTree Pro Version: a fast nearest neighbour library for Matlab and C++

February 15, 2010 by Luigi Giaccari · 2 Comments
Filed under: Algorithms, Computational Geometry, Geometry, Graphics 
VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)
VN:F [1.8.8_1072]
Rating: 10.0/10 (1 vote cast)

Contents

  1. Introduction
  2. The algorithm
  3. Performances
  4. Release
  5. Matlab
  6. C++
  7. Examples

Introduction

The nearest neighbor search problem arises in numerous fields of application, including:

  • Pattern recognition – in particular for optical character recognition
  • Statistical classification- see k-nearest neighbor algorithm
  • Computer vision
  • Databases – e.g. content-based image retrieval
  • Coding theory – see maximum likelihood decoding
  • Data compression – see MPEG-2 standard
  • Recommendation systems
  • Internet marketing - see contextual advertising and behavioral targeting
  • DNA sequencing
  • Spell checking - suggesting correct spelling
  • Plagiarism detection
  • Contact searching algorithms in FEA
  • Similarity scores for predicting career paths of professional athletes.
  • Cluster analysis – assignment of a set of observations into subsets (called clusters) so that observations in the same cluster are similar in some sense, usually based on Euclidean distance

The algorithm

The reason that pushed me to develop this algorithm was the need of more speed in nearest neighbors search. The first implementation of GLTree was coded in the  September 2008, and a first, rough, but very fast version was uploaded on Matlab File Exchange as C++ code to compile into MEX file in November. These version are still available and  open-source:

After using this library for about a year I released the potential and I decided to make a step further. In this page a faster and low memory version is proposed. Many more functions have been developed like: a nearest neighbour filter radius search and a cuboid search and many more can be developed.

Performances

This are the timings for a NNGraph in 2D and 3D on my 2.4Ghz laptop. Test is run on uniform randomly points in 0-1 range. Given a set of N random points, for each one finds his nearest neighbour.

START NNGraph Test:
 
2D
 NNGraph of   10  points took: 0.0001 s
 NNGraph of  100  points took: 0.0001 s
 NNGraph of 1000  points took: 0.0004 s
 NNGraph of 10000  points took: 0.0039 s
 NNGraph of 100000  points took: 0.0629 s
 NNGraph of 1000000  points took: 1.1897 s
 
3D
 NNGraph of   10  points took: 0.0014 s
 NNGraph of  100  points took: 0.0001 s
 NNGraph of 1000  points took: 0.0007 s
 NNGraph of 10000  points took: 0.0075 s
 NNGraph of 100000  points took: 0.1219 s
 NNGraph of 1000000  points took: 2.4760 s
 
TEST SUCCESFULLY COMPLETED !!!

You can run this test by downloading the demo version (linux and win32):

Download GLTreePro Demo

Download GLTree Pro demo

For very sparse data, like logspace data, GLTree becomes slower. It may be convenient using others algorithm. For more see this post:

A comparison of GLTree with Kd-Tree and Bd-Tree

Release

  • 15/02/2010 GLTreePro_v1.0

Matlab

This release includes the following MEX files:

  • 2D
    • K-Nearest Neighbor Search
    • K-Nearest Neighbour Graph
    • Nearest Neighbour Filter
    • Radius Search
    • Cuboid Search
    • BuildTree
    • DeleteTree
  • 3D
    • K-Nearest Neighbor Search
    • K-Nearest Neigbour Graph
    • Nearest Neighbour Filter
    • Radius Search
    • Cuboid Search
    • BuildTree
    • DeleteTree

We support Win32 and Linux platform. The Full library is no more available for public domain.

C++

Under Construction. You need it? tell me and I may speed up the development.

Examples

See the examples for 2D:

VN:F [1.8.8_1072]
Rating: 10.0/10 (1 vote cast)
VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)

Popularity: 1% [?]

Share and Enjoy:
  • Print
  • Digg
  • Sphinn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Blogplay
  • Live
  • PDF
  • Technorati
  • Twitter
  • Yahoo! Bookmarks
  • Add to favorites
  • email
  • MySpace
  • RSS

GLTree Matlab Example: Nearest Neighbor Filter Function

VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)
VN:F [1.8.8_1072]
Rating: 0.0/10 (0 votes cast)

Go to main post

KNNSearch2D query a GL-tree for k nearest neighbor(kNN)

SYNTAX

[cutdist,ndel,idel]=KNNFilter2D(p,ptrtree,’r',r);% radius mode
[cutdist,ndel,idel]=KNNFilter2D(p,ptrtree,’f',f);% consolidator mode

INPUT PARAMETERS

  • p: [2xN] double array coordinates of reference points
  • ptrtree: a pointer to the previously constructed  GLtree.Warning if the pointer is incorrect it will cause a crash, there is no way to check this in the mex routine, you have to check it in your script.
  • ‘r’ or ‘f’: string. Specify the RADIUS mode or the CONSOLIDATOR mode.
  • r or f: double parameter for the mode.

OUTPUT PARAMETERS

  • cutdist: in the dataset no couple of points will be closer than cutdist. In the radius mode cutdist==radius.
  • ndel: number of points removed from the tree
  • iddel: ids of points removed from the tree

GENERAL INFORMATIONS

  • -RADIUS mode: if 2 points are closer than r only one will survive.
  • -CONSOLIDATOR mode: the nearest neighbour graph is computed and the mean distance between points is evaluated. The cutdistance is set to meandist/f.
  • -points removed form the tree are not physically deleted (memory deaollocated), they are just skipped during search.

For question, suggestion, bug reports
giaccariluigi@msn.com

Author: Luigi Giaccari

N=300;%reference points
 
Cuboid=[.2 ,.5 ,.2 ,.5 ]';
 
p=rand(N,2);%reference points
 
fprintf('RANDOM POINTS GENERATED\n\n')
 
fprintf('BUILDING THE DATA STRUCTURE:\n')
tic
ptrtree=BuildGLTree2D(p');
fprintf('\tGLTree built in %4.4f s\n\treturned pointer %4.0f:\n\n',toc,ptrtree);
 
fprintf('START FILTER:\n')
tic
[idc]=CuboidSearch2D(p',Cuboid,ptrtree);
fprintf('\t %4.0f Points found in range\n\n',length(idc));
 
fprintf('DELETING THE TREE\n\n')
DeleteGLTree2D(ptrtree);
 
fprintf('TEST SUCCESFULLY COMPLETED !!!\n\n')
 
%plot the NNG
 
figure(1)
title('Cuboid Search','fontsize',14);
axis equal
hold on
x=[Cuboid(1),Cuboid(1),Cuboid(2),Cuboid(2),Cuboid(1)];
y=[Cuboid(3),Cuboid(4),Cuboid(4),Cuboid(3),Cuboid(3)];
h1=plot(p(:,1),p(:,2),'g.');
h2=plot(x,y,'b-');
h3=plot(p(idc,1),p(idc,2),'r.');
legend([h1,h2,h3],'Reference points','Cuboid','Inside points');

RESULTS

Points inside the cut radius are found


And deleted..
Go to main post

VN:F [1.8.8_1072]
Rating: 0.0/10 (0 votes cast)
VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)

Popularity: 1% [?]

Share and Enjoy:
  • Print
  • Digg
  • Sphinn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Blogplay
  • Live
  • PDF
  • Technorati
  • Twitter
  • Yahoo! Bookmarks
  • Add to favorites
  • email
  • MySpace
  • RSS

GLTree Matlab Example: Radius Search Function

February 15, 2010 by Luigi Giaccari · Leave a Comment
Filed under: Computational Geometry, Optimization 
VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)
VN:F [1.8.8_1072]
Rating: 0.0/10 (0 votes cast)

Go to main post

RSearch2D query a GL-Tree for Radius search

SYNTAX

idc=RSearch3D(p,qp,ptrtree,r);

INPUT PARAMETERS

  • p: [2xN] double array coordinates of reference points
  • qp: [2x1] double array  coordinates of query points. NOTE: differently from NNsearch this function does not support multiple query points.
  • ptrtree: a pointer to the previously constructed  GLtree.Warning if the pointer is uncorrect it will cause a crash, there is no way to check this in the mex routine, you have to check it in your script.
  • r: double value, radius to perform the search.

OUTPUT PARAMETERS

  • idc: [?x1] column vector, each rows contains an index of a point found in the range described by the radius.

GENERAL INFORMATIONS

-Conditions for a point to be found in the radius is <=, so
points with distance from the query=r will be returned.

For question, suggestion, bug reports
giaccariluigi@msn.com

Author: Luigi Giaccari

EXAMPLE

function test()
 
N=200;%reference points
 
k=3;%k neigh
 
p=rand(N,2);%reference points
 
qp=rand(1,2);%query point
r=.1;%radius for search
 
fprintf('RANDOM POINTS GENERATED\n\n')
 
fprintf('BUILDING THE DATA STRUCTURE:\n')
tic
ptrtree=BuildGLTree2D(p');
fprintf('\tGLTree built in %4.4f s\n\treturned pointer %4.0f:\n\n',toc,ptrtree);
 
fprintf('START RADIUS SEARCH:\n')
tic
[idc]=RSearch2D(p',qp',ptrtree,r);
fprintf('\t %4.0f Points found in range\n\n',length(idc));
 
fprintf('DELETING THE TREE\n\n')
DeleteGLTree2D(ptrtree);
 
fprintf('TEST SUCCESFULLY COMPLETED !!!\n\n')
 
%plot the NNG
 
figure(1)
title([num2str(k),' Neighbours'],'fontsize',14);
axis equal
hold on
h1=plot(p(:,1),p(:,2),'g.');
h2=plot(qp(:,1),qp(:,2),'b+');
h2=circle(qp,r,30,'-b');
h3=plot(p(idc,1),p(idc,2),'r.');
legend([h1,h2,h3],'Reference points','Circle','Inside points');
 
end
 
function H=circle(center,radius,NOP,style)
%---------------------------------------------------------------------------------------------
% H=CIRCLE(CENTER,RADIUS,NOP,STYLE)
% This routine draws a circle with center defined as
% a vector CENTER, radius as a scaler RADIS. NOP is
% the number of points on the circle. As to STYLE,
% use it the same way as you use the rountine PLOT.
% Since the handle of the object is returned, you
% use routine SET to get the best result.
%
%   Usage Examples,
%
%   circle([1,3],3,1000,':');
%   circle([2,4],2,1000,'--');
%
%   Zhenhai Wang <zhenhai@ieee.org>
%   Version 1.00
%   December, 2002
%---------------------------------------------------------------------------------------------
 
if (nargin <3),
 error('Please see help for INPUT DATA.');
elseif (nargin==3)
 style='b-';
end;
THETA=linspace(0,2*pi,NOP);
RHO=ones(1,NOP)*radius;
[X,Y] = pol2cart(THETA,RHO);
X=X+center(1);
Y=Y+center(2);
H=plot(X,Y,style);
axis square;
end

Go to main post

VN:F [1.8.8_1072]
Rating: 0.0/10 (0 votes cast)
VN:F [1.8.8_1072]
Rating: 0 (from 0 votes)

Popularity: 1% [?]

Share and Enjoy:
  • Print
  • Digg
  • Sphinn
  • del.icio.us
  • Facebook
  • Mixx
  • Google Bookmarks
  • Blogplay
  • Live
  • PDF
  • Technorati
  • Twitter
  • Yahoo! Bookmarks
  • Add to favorites
  • email
  • MySpace
  • RSS

Next Page »