Dramatically faster Matlab with QLA (beta)
Filed under: Algorithms, Linear Algebra, Numerical Analysis

We recently released the beta version of QLA: The Quick Linear Algebra Library.
QLA provides fast linear algebra in MATLAB, requiring only a regular CPU. For a minuscule sacrifice in accuracy, you get a mammoth increase in speed.
* How fast? Speedups over built-in MATLAB functions range from 10x to 1,000x+ on benchmarks.
* How easy? Just add a ‘q’ in front of your normal function calls – e.g., ‘qsvd(A)’ instead of ’svd(A)’.
* Includes SVD, linear systems, least squares, PCA, and more.
QLA is currently in free public beta. Download and find more details here:
http://massiveanalytics.com
You can also join the QLA discussion here:
http://www.accelereyes.com/forums/viewforum.php?f=15
Enjoy!
Michael Holmes / Massive Analytics
John Melonakos / AccelerEyes
Popularity: 1% [?]
The fastest way to sort N numbers: sortN library

Contents
Have you ever thought about the fastest way to sort N numbers?
Last week I wrote a post about the fastest way to sort 3 numbers, this week I show you a tricky way to optimize the sort of N numbers.
The problem of sorting an array of numbers is probably the most studied in computer science. The efficiency of a sort algorithm depends essentially on the number of comparisons and swaps it needs to reach the solution. This post contains an optimized sort algorithm for small buckets (to my knowledge the fastest).
Currently most of sorting algorithms uses quick sort for large buckets and insertion sort for smaller ones. Both these algorithms uses an optimized way to swaps and compare numbers. The sortN library is a further optimization of this operations. It does first all the needed comparisons and then, only in the end, it swaps values to reach the sorted solution only in the end. It is in practice an enhanced Insertion sort algorithm.
The sortN library
The library is written in C++, but it can be written in any code, and supports arrays up to 7 elements. It can be applied , in theory, to array of any size but we will see this can not be realized in practice, only small arrays can be sorted.
The sortN libary can be considered enhanced insertion sort algorithm.
It contains a list of optimized function to sort 3,4,5,6,7 values. Each function is optimized to sort an N number of values.
Here is the approach to the solution:
N numbers can be disposed in Nc combinations where Nc is .
We may think to write a recursive code that reach the sorted solution in the way algorithms for permutations does.
The permutations represents all the possible solutions, depending on the value of each comparison (between 2 values of the array) a permutation is chosen or refused.
This allows, depending on the results of the comparisons, to build a decisionary tree that unequivocally leads to the permutation that represents the solution. Once we know the solution we can optimize swaps to reach it.
Notice that no swaps must be done done before we know the exact solution. This can be considered an enhanced insertion sort algorithm.
The number of comparisons possible are great even for a relative small array, we can not think about writing by hand the code, we need some function to do job for us. That’s what the sort library does, it prints the enhanced insertion sort algorithm.
There is a small disadvantage: the problem has size . It becomes huge for relative small buckets like 8 numbers. The sort8 function requires a 10MB file, I waited10 minutes for the compiler to build it, but no success, then I give up and decide to use the library till sort7 function.
The sortN function
You probably got confused, so here it is the code that I hope will clarify the concept.
The sort N function is a MATLAB script that prints the enhanced insertionsort algorithm for N values.
It recursively finds all the solutions and it prints optimized swaps.
function PrintSortNFunction(fid,N) %Prints a c++ sort N function. %fid is the file identifier where the function will be written. %N is the number of points the function will sort, for example: % %PrintSortNFunction(fid,4);%print the sort4 function %function definition fprintf(fid,'void sort%1.0f(double* a){\n',N); fprintf(fid,'double temp;\n'); x=1:N; PrintIS(x,1,2,fid); fprintf(fid,'}\n'); %The normal insertion sort: % void insertionSort(double* x,int length) % { % % % % double key; % int i; % % for(int j=1;j<length;j++) % { % key=x[j]; % i=j-1; % while(x[i]>key && i>=0) % { % x[i+1]=x[i]; % i--; % } % x[i+1]=key; % } % } end %print the isertion sort operations, takes in input the i an j paramater % (see code above). The x vector goes from 1 to N as matlab notation, the % prints instead goes from 0:N-1 as C++ notation function PrintIS(x,i,j,fid) N=length(x); %fprintf('%1.0f %1.0f\n',i,j); % we reached the end of the array if j>N%print solution fprintf(fid,'//'); for i=1:length(x) fprintf(fid,'%1.0f',x(i)-1); end fprintf(fid,'\n'); PrintSwaps(x,fid);%print swaps return end %if (x[i]>key) key=x(j); x0=x;%make a copy if i>=1 %if we are in the middle of the array ii=x(i);jj=x(j); fprintf(fid,'if (a[%1.0f]<a[%1.0f]){\n',ii-1,jj-1); %modify the solution x(i+2:j)=x(i+1:j-1); x(i+1)=key; %go forward PrintIS(x,j,j+1,fid); fprintf(fid,'}\n'); fprintf(fid,'else{\n'); PrintIS(x0,i-1,j,fid); fprintf(fid,'}\n') ; else %we have reached the beginning of the array %modify the solution x(2:j)=x(1:j-1); x(1)=key; %go forward PrintIS(x,j,j+1,fid) ; end end %prints swaps combination to go from the base array 1:N to the % sorted solution (permutation) function PrintSwaps(Perm,fid) N=length(Perm); %We first have to store a temp value %Perm=N:-1:1];test for debugging Base=1:N;%to record the change due to swaps swaps=zeros(N*3,2); c=1; for n=1:N i=Base(n); j=Perm(n); if j==i %do nothing already in place continue; end %we have to make Base(n) equal to Perm(n); i=find(Base==j); temp=Base(i);%phisically swap (i and n) Base(i)=Base(n); Base(n)=temp; %print swap %-1 stays for temp value swaps(c,1)=-1;swaps(c,2)=n; swaps(c+1,1)=n;swaps(c+1,2)=i; swaps(c+2,1)=i;swaps(c+2,2)=-1; c=c+3; end %detecting duplicate values in swaps (they do not need to be copied in %temp) swaps(c:end,:)=[];%remove unused values ind=false(c-1,1); for i=2:c-2%loop trough candidate swaps if swaps(i,1)==swaps(i+1,2) && swaps(i,2)==swaps(i+1,1) ind(i)=true;ind(i+1)=true; end end %remove duplicate values from swaps swaps(ind,:)=[]; for k=1:size(swaps,1) i=swaps(k,1);j=swaps(k,2); if i==-1 fprintf(fid,'temp=a[%1.0f];\n',j-1);%store the temp value elseif j==-1 fprintf(fid,'a[%1.0f]=temp;\n',i-1);%copy the temp value else fprintf(fid,'a[%1.0f]=a[%1.0f];\n',i-1,j-1);%keep copying end end end
Perfomances
I have compared the sortN functions with insertion sort and stl sort here are some results:
This program shows the perfomaces of the sortN library against
stl sort and insertions sort
Enter the number of buckets
10000000
BUCKET SIZE=3
STL sort of 3 points required 260.3610 ms
insertionSort of 3 points required 240.9796 ms
SortN sort of 3 points required 172.1386 ms
IS vs stl= 7.44406 %
SortN vs stl= 33.8846 %
SortN vs IS= 28.5671 %
BUCKET SIZE=4
STL sort of 4 points required 434.7305 ms
insertionSort of 4 points required 353.5941 ms
SortN sort of 4 points required 287.3990 ms
IS vs stl= 18.6636 %
SortN vs stl= 33.8903 %
SortN vs IS= 18.7206 %
BUCKET SIZE=5
STL sort of 5 points required 594.5557 ms
insertionSort of 5 points required 563.9854 ms
SortN sort of 5 points required 401.6870 ms
IS vs stl= 5.1417 %
SortN vs stl= 32.4391 %
SortN vs IS= 28.777 %
BUCKET SIZE=6
STL sort of 6 points required 883.9322 ms
insertionSort of 6 points required 735.4656 ms
SortN sort of 6 points required 517.2404 ms
IS vs stl= 16.7962 %
SortN vs stl= 41.4842 %
SortN vs IS= 29.6717 %
BUCKET SIZE=7
STL sort of 7 points required 1162.6356 ms
insertionSort of 7 points required 930.5345 ms
SortN sort of 7 points required 992.6320 ms
IS vs stl= 19.9634 %
SortN vs stl= 14.6223 %
SortN vs IS= -6.67331 %
You can run your own test by downloading the code below. As you can see for 7 numbers it becomes unadvantageous (dunno Why).
Downloads
The whole library can be build by downloading this:
You can download the library (up to sort7 here):
The quickest sort algorithm?
I don’t know, if you find something faster let me know.
What I known is that the sortN library can be further optimized. For example it can be made recursive (avoiding to write thousands lines of code), or it it can be sped up in the swaps: for each solution find the minimum number of swaps to allows to pass from the original vector to the sorted one (currently I can only avoid to copy values in “temp” twice).
Popularity: 1% [?]
Quick Delaunay Triangulation
Filed under: Algorithms, C, Computational Geometry, FEM, Geometry
Contents
QuickDel
It has always been a dream of mine to code a Delaunay triangulator. Last week I had 3 free days and I did it. I actually thought it was more complicated, as it often happens, when you are outside the problem it seems bigger than it really is.
In this post you will find a Delaunay triangulation algorithm. My idea is to start a project which purpose is to create a robust an efficient triangulation/meshing tool.
The first release will be Delaunay triangulation function QuickDelMex
Below a comparison with DelaunayTri from the computational geometry toolbox.
Number of points QuickDel Built-in Matlab speed-up
10 0.0002 s 0.0142 98.69% s
100 0.0002 s 0.0094 97.61% s
1000 0.0016 s 0.0099 84.13% s
10000 0.0149 s 0.0707 78.96% s
100000 0.1856 s 1.2044 84.59% s
1000000 2.5094 s 14.1102 82.22% s
t=QuickDelMex(p) (version 0.3)
returns the Delaunay triangulation of the 2D point set p;
INPUT:
- * p: double array of size 2xN. The list of unique points [x1 y1,x2 y2];
OUTPUT:
- * t: double array of size ntx3. It contains the triangles of the Delaunay triangulation. Each rows contains 3 indexes of the points forming the triangle. Indexes are sorted counterclockwise; first points is flagged as one.
NOTES:
- * -This function is a prototype so don’t trust too much, please check the results and in case send bug report.
- * -This function is optimized for uniform random dataset, sparse datasets may slow down the algorithm.
- * -This function may not return the whole convex hull. Thin boundary triangles may be discarded.
- *- v0.3 is alittle slower but much more robust than v0.2.
Author: Luigi Giaccari (giaccariluigi@msn.com)
see also DelaunayTri delaunay delaunayn
Downloads
Matlab
You can download MEX Files of QuickDel here (win32 and linux):
A test to evaluate performances is also available on Matlab file exchange (Download Now)
C++
A Win32 executable is available here (v0.2):
it asks for the number of points to be triangulated. Once the triangulation has been done it prints a text file (a matlab m-file) which allows to visualize the triangulations inside Matlab.
I can also build a win32 .dll, but I was too lazy to do it, if you want it please contact me.
Acknowledgements
- This function uses the robust predicates from Jonathan Richard Shewchuk
Contacts
Please send bug reports/suggestions at: giaccariluigi@msn.com
This utility can be greatly enlarged, please compile the following survey.
Popularity: 2% [?]



















































