Adit Cookbook Pages

A QUESTION OF SORTS

In a hurry? You can jump to the code for Bubble, Selection and Shell sorts.

Once upon a time (well back in the 60’s and 70’s) there was a lot of time and energy devoted to sorting data. Hot shot programmers vied to write the fastest sort in the least lines of code and sorting data files was a crucial element of "data  processing", as it was called then. Just remember, computing started out without the  benefit of databases and indexed files and memory was impossibly expensive. As time  passed, mainframe systems and mid range machines came equipped with effective sort routines that programmers got used to calling when they were required. You will even find a sort utility in the \COMMAND\ subdirectory of your windows development machine.

The other day we had to organise two long streams of data into the same  order to do some complex comparisons. We were worried that any attempt to retain the data in memory and sort it there would push the memory limits of the clients target machine and the Microsoft supplied sort was deemed limited and anyway who wants to show such a lack of style as shelling out to a DOS routine? So the older members of the team began thinking about sorts for the first time in many years.

Most programmers can put together a simple bubble sort when pushed, although the availability of sorted lists and the sort facilities of MsFlexGrid have rather meant that even this simple algorithm may be unknown to newer programmers.

To explain what we are up to you will find three functions below, each demonstrating a different sorting method. Each of these basic methods have variations and I am sure that a few minutes spent in contemplation could come up with some improvements and they can almost always be improved upon if you have some idea of the nature of the  data they are to be applied to.

Some minor points though before we get started. These functions are to demonstrate the approach. We would not implement code just like this. We hold strongly to the view that arguments should be passed to a function By Value and that any results should be returned by the function.

If you are working in VB3 then you will need to make some minor changes to the code but the algorithm should be clear. Oh and one more point, these examples assume that the sort is being applied to Long Numbers changes for  other data types being straightforward to apply.

Bubble Sort

Private Function BubbleSort( ArrayToSort as variant) as Boolean
   Dim SortLoop as Long
   Dim HoldValue as Long
   Dim ChangeMade as Boolean

   Changemade = True
   Do  While ChangeMade
       ChangeMade = False
       For SortLoop = LBound(ArrayToSort) to (Ubound(ArrayToSort) - 1)
           If ArrayToSort(SortLoop) > ArrayToSort(SortLoop + 1) Then
               HoldValue = ArrayToSort(SortLoop)
               ArrayToSort(SortLoop) = ArrayToSort(SortLoop + 1)
               ArrayToSort(SortLoop + 1) = HoldValue
               ChangeMade = True
           End If
       Next SortLoop
   Loop
   BubbleSort = True
End Function

The bubble sort is called a  bubble sort as high values slowly work their way "up" the array until they reach  their correct level. The one above could be adapted to sort in reverse order of course. Bubble sorts are very effective when applied to relatively small arrays and where the data  is reasonably well ordered before sorting. This technique tends to be a bit slow if the array size is large or the data is badly out of order.

I can remember once writing a  bubble sort that worked in both directions at the same time ( a sort of sink sort as well  as a bubble sort) to shave a few precious seconds of the run time but I can’t remember if it was really worth the effort.  

Selection Sort

Private  Function SelectionSort( ArrayToSort as variant) as Boolean
Dim OuterLoop as Long, InnerLoop as Long
Dim LowValue as Long, HoldValue  as Long

   For OuterLoop =  Lbound(ArrayToSort) to (Ubound(ArrayToSort) - 1)
       LowValue = OuterLoop
       For InnerLoop = (OuterLoop + 1) To Ubound(ArrayToSort)
           If ArrayToSort(InnerLoop) < ArrayToSort(LowValue) then
               LowValue = InnerLoop
           End If
       Next InnerLoop
       HoldValue = ArrayToSort(LowValue)
       ArrayToSort(LowValue)  = ArrayToSort(OuterLoop)
       ArrayToSort(OuterLoop) = HoldValue
   Next OuterLoop
   SelectionSort = True
End Function

A selection sort run time is proportional to the length of the array to be sorted and will usually outperform a bubble  sort.

Shell Sort

Private  Function ShellSort( ArrayToSort as Variant) as Boolean
Dim SortLoop as Long, ShellPoint as  Long
Dim  HoldValue as Long, SortPoint as Long

   ShellPoint = Lbound(ArrayToSort)
   Do Until ShellPoint >  Ubound(ArrayToSort)
       ShellPoint = (3 *  ShellPoint) + 1
   Loop

   Do Until ShellPoint =  Lbound(ArrayToSort)
       ShellPoint = ShellPoint / 3
       For SortLoop = (ShellPoint + LBound(ArrayToSort)) To Ubound(ArrayToSort)
           HoldValue =  ArrayToSort(SortLoop)
           SortPoint = SortLoop
           Do While (ArrayToSort(SortPoint - ShellPoint) > HoldValue)
               ArrayToSort(SortPoint) = ArrayToSort(SortPoint - ShellPoint)
               SortPoint = SortPoint - ShellPoint
               If SortPoint < ShellPoint Then
                   Exit Do
               End If
           Loop
           ArrayToSort(SortPoint) = HoldValue
       Next SortLoop
   Loop
   ShellSort = True
End Function

In most instances the shell  sort will run faster than the Bubble Sort and the Selection Sort. The run time is  proportional to - well you can work it out for yourself. The shell sort is definitely my sort of choice and it adapts well to sorting user defined types and complex keys.

There are, of course, lots of other sorting methods and variations on  these themes but I think that these examples will provide food for thought. Please feel free to contribute an example of any alternative algorithm to this page.

You can download the source code for all these sorts together with a form for testing the relative performance of each sort method.  The zip file is 4k. Visit the Free Code section of our Downloads Page. With modern gigahertze processors the time differences may need to be accentuated by increasing the size of the test data set to be sorted. The download file also includes the “QuickSort” algorithm for those who are fans of iteration.

So how did we tackle the issue of sorting files? Well, we applied the  Shell Sort method to selected parts of each data record and used the results of the sort  to re-order the file records.  If you want to see more click here

Google
 
Web www.adit.co.uk
www.aditsite.co.uk