Programming Forums
User Name Password Register
 

RSS Feed
FORUM INDEX | TODAY'S POSTS | UNANSWERED THREADS | ADVANCED SEARCH

Reply
 
Thread Tools Display Modes
Old Apr 18th, 2006, 2:08 PM   #1
MicDareall
Programmer
 
Join Date: Dec 2005
Posts: 67
Rep Power: 0 MicDareall is an unknown quantity at this point
Sorting algorithms problem.......

aight my assignment is to implement both quick sort and bubble sort, time them and make a graph of the change in time it takes both according to size of array/list of elements........... I have it done and it works on lists upto but not incuding lists with 100,000 elements..... it freezes up when trying to sort them with both algorithms at 100,000........

heres the code I have........



import time
import random
import sys
sys.setrecursionlimit(1000000)


##Main function call to run tests on 2 equal arrays of size num_elements
        ##time arrays through both algorithms
        ##and print both time of execution and sorted array to screen
def run_sort_tests(num_elements):
    num_elements=num_elements
    num_array=[]
    num_array2=[]
    num_array,num_array2=create_arrays(num_array,num_array2,num_elements)
    print "\nUnsorted Array:\n",num_array,"\nend unsorted\n\n"

######Time Bubble  and print array/time
    start = time.clock()
    bubbleSort(num_array, num_elements)
    end = time.clock()
    total_time = end - start

    #uncomment to see sorted array    
    print num_array
    print "\nBubbleSort time for ",num_elements," elements: ",total_time,"\n"

######Time Quick Sort print array/time
    start = time.clock()
    quickSort(num_array2,0,num_elements-1)
    end = time.clock()
    total_time = end - start
    
    #uncomment to see sorted array
    print num_array2
    print "\nQuickSort time for ",num_elements," elements: ",total_time,"\n"

##Function to create 2 equal arrays of size num_elements
def create_arrays(array1,array2,num_elements):
    array1=[0]*num_elements
    for i in range(0,num_elements):
        array1[i] = random.randint(0,1000)
    array2=array1
    return array1,array2


##Bubble sort algorithm
def bubbleSort(nums,elements):
    i=elements-1
    j=1
    while i>=0:        
        j=1
        while j<=i:           
            if nums[(j-1)]>nums[j]:
                temp=nums[j-1]
                nums[j-1]=nums[j]
                nums[j]=temp
                
            j+=1
        i-=1
    
#Partition Function for use within quick sort algorithm
def partition(nums, L, R):
    pivot = nums[R]
    pivot_index=R
    j = L                         
    i = R+1
    
    loop=1
    while loop:
        j+=1
        while nums[j]<pivot:                        
                             

            if j == i:                  
                loop = 0                     
                break
            j +=1
        i -=1  
        while nums[i]>pivot:                        
                                 
            if i == j:                  
                loop=0              
                break
            i -=1  
        temp=nums[j]
        nums[j]=nums[i]
        nums[i]=temp
        if i>=j:
            break
        
    temp=nums[j]
    nums[j]=nums[i]
    nums[i]=temp   
    nums[pivot_index]=nums[i]
    nums[i]=pivot                      
    return i                                 

##Quick Sort Function algorithm
def quickSort(nums, L, R):
    if L < R:                           
        piv = partition(nums, L, R)    
        quickSort(nums, L, piv-1)       
        quickSort(nums, piv+1, R)
    return
  





##run 3 tests
run_sort_tests(10)
run_sort_tests(100)
run_sort_tests(1000)
run_sort_tests(10000)
run_sort_tests(100000)


raw_input("")



any help is greatly appreciated..................
MicDareall is offline   Reply With Quote
Old Apr 18th, 2006, 3:33 PM   #2
Dietrich
Professional Programmer
 
Dietrich's Avatar
 
Join Date: Feb 2005
Posts: 434
Rep Power: 4 Dietrich is on a distinguished road
When you use
array2=array1
You are not making array2 a true copy of array1, you are making an alias, both array1 and array2 point to the same object. That means when you sort array1 then array2 is also sorted now!

To make a true copy of a simple list you can use
array2=array1[:]

There are a lot of strange things in your code, is this a direct translation from C?
__________________
I looked it up on the Intergnats!
Dietrich is offline   Reply With Quote
Old Apr 18th, 2006, 4:12 PM   #3
Dietrich
Professional Programmer
 
Dietrich's Avatar
 
Join Date: Feb 2005
Posts: 434
Rep Power: 4 Dietrich is on a distinguished road
One more thing, don't even attempt to display an array/list of 100,000 elements on your console, that will take forever!!!

Just to test if it's sorted limit it to maybe the first 100 elements with this little trick:
print num_array[:100]
__________________
I looked it up on the Intergnats!
Dietrich is offline   Reply With Quote
Old Apr 18th, 2006, 4:30 PM   #4
Arevos
Programming Guru
 
Arevos's Avatar
 
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4 Arevos is on a distinguished road
Presumably it freezes up because sorting 100'000 numbers takes up a good amount of processing time. Remember that Bubblesort is O(n^2) time, so a list of 100'000 would need around 10'000'000'000 iterations. Have you tried leaving it for a while and seeing if it comes back with anything? (Although you might have a long time to wait!)

The other option is to make your program a little more efficient. Your algorithms would suit a language like C fine, but you don't need to write so much code for a language like Python.

Take your create_arrays function. You've made a little mistake here, though it's a classic one beginners make:
array2=array1
This piece of code doesn't do what you think it does. Variables (such as array2) are like the ISBN numbers on books. An ISBN number uniquely identifies a book, so you can find it via a library or bookshop. But if I copy out an ISBN number twice on a piece of paper, that doesn't mean I have two books - the ISBN numbers still both point to the same book.

It's the same with objects in Python. You have a variable, or reference, to an object (for instance, array1 points to an array), but copying the reference (into array2) is not the same as copying the array itself. So array1 and array2 point to exactly the same array!


Fortunately, creating arrays (or rather lists) in Python is made a lot easier through the use of list comprehensions. I'll give you a few examples.
# The range function outputs a sequence of incremented numbers
range(5)                     # Outputs [0, 1, 2, 3, 4]

# We can use list comprehensions to double all the numbers in the list
[x * 2 for x in range(5)]    # Outputs [0, 2, 4, 6, 8]
Using list comprehensions, we can make a function that creates a random list of numbers in one line:
def random_list(n, lower = 0, upper = 1000):
    return [random.randint(lower, upper) for i in xrange(n)]
Note that I use xrange here - it's more efficient than range for large numbers.

To create a copy of a list, an easy trick is to use a slice. There's lots of information about slices on the Python homepage and elsewhere. For now, just remember that array1[:] is a quick way of copying a list:
array1 = random_list(1000)
array2 = array1[:]
The bubblesort algorithm you have can be shortened a little with for loops:
def bubblesort(list):
    for i in xrange(len(list) - 1, 0, -1):
        for j in xrange(1, i + 1):
            if list[j - 1] > list[j]:
                temp = list[j - 1]
                list[j - 1] = list[j]
                list[j] = temp
    return list
The use of range and xrange is fairly self-explanatory, so I won't go into details that help(range) can already answer. The quicksort algorithm can be shortened quite a lot through effective use of list comprehensions:
def quicksort(list):
    if len(list) > 1:
        pivot = list.pop(len(list) / 2)
        less_than = [x for x in list if x <= pivot]
        greater_than = [x for x in list if x > pivot]
        return quicksort(less_than) + [pivot] + quicksort(greater_than)
    else:
        return list
Note that the above quicksort effectively destroys the original list and returns a new one. In an ideal world I'd take a copy of the list before using it:
def quicksort(x):
    list = x[:]
    if len(list) > 1:
        pivot = list.pop(len(list) / 2)
        less_than = [x for x in list if x <= pivot]
        greater_than = [x for x in list if x > pivot]
        return quicksort(less_than) + [pivot] + quicksort(greater_than)
    else:
        return list
But that would probably just waste time.

The above quicksort is very Python-efficient. For 100'000 list entries, it took the quicksort function a mere 2.68 seconds to sort. The bubblesort is taking a lot longer, so long that even after 5 minutes it was still going. At first this seemed surprising, but I did some quick sums and the result I get is that bubblesort with 100'000 random elements should be on the order of 20'000 times slower than quicksort on the same. O(n^2) is a lot bigger than O(n log n) for large numbers.

More knowledgable forum members should feel free to correct me

Last edited by Arevos; Apr 18th, 2006 at 4:41 PM.
Arevos is offline   Reply With Quote
Old Apr 18th, 2006, 4:54 PM   #5
Dietrich
Professional Programmer
 
Dietrich's Avatar
 
Join Date: Feb 2005
Posts: 434
Rep Power: 4 Dietrich is on a distinguished road
Talking

Arevos,
just a question, I see you are using "list" as a variable name. Is that good practice, since "list" is a Python keyword?

I tested out your wonderfully coded sorts (using xrange is a good idea!), bubblesort does change the original list, your first quicksort does not!
__________________
I looked it up on the Intergnats!

Last edited by Dietrich; Apr 18th, 2006 at 5:14 PM.
Dietrich is offline   Reply With Quote
Old Apr 18th, 2006, 5:40 PM   #6
Arevos
Programming Guru
 
Arevos's Avatar
 
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4 Arevos is on a distinguished road
Quote:
Originally Posted by Dietrich
Arevos,
just a question, I see you are using "list" as a variable name. Is that good practice, since "list" is a Python keyword?
List isn't a keyword, but a type. No, probably not, but in practise it doesn't matter, unless you plan to use the list type in your function.
Quote:
Originally Posted by Dietrich
I tested out your wonderfully coded sorts (using xrange is a good idea!), bubblesort does change the original list, your first quicksort does not!
The quicksort does change the original list, since it pops an item from it. Ideally, copies of the list should be taken for each function, but since this is a speed test, I decided not to do this.
Arevos is offline   Reply With Quote
Old Apr 18th, 2006, 6:25 PM   #7
Kakao
Newbie
 
Join Date: Feb 2006
Posts: 9
Rep Power: 0 Kakao is on a distinguished road
I have been reading the Python Cookbook and in the 4.1 Recipe the recommended technique to copy a list is to use the list function:

Quote:
Originally Posted by Python Cookbook 4.1 Recipe
You can use other, inferior ways exist to create copies, namely building your own. Given a list L, both a "whole-object slice" L[:] and a list comprehension [x for x in L] do happen to make a (shallow) copy of L, as do adding an empty list, L+[ ], and multiplying the list by 1, L*1 . . . but each of these constructs is just wasted effort and obfuscation—calling list(L) is clearer and faster. You should, however, be familiar with the L[:] construct because for historical reasons it's widely used. So, even though you're best advised not to use it yourself, you'll see it in Python code written by others.
Kakao is offline   Reply With Quote
Old Apr 18th, 2006, 6:29 PM   #8
Dietrich
Professional Programmer
 
Dietrich's Avatar
 
Join Date: Feb 2005
Posts: 434
Rep Power: 4 Dietrich is on a distinguished road
Talking

Thanks Arevos,
tested the len() of the list and low and behold there is an item missing!

The bubblesort with 100,000 x 100,000 iterations would be very slow, particularly if the disk as to be used for memory too! At least my disk light flickers a lot!
__________________
I looked it up on the Intergnats!
Dietrich is offline   Reply With Quote
Old Apr 18th, 2006, 8:43 PM   #9
MicDareall
Programmer
 
Join Date: Dec 2005
Posts: 67
Rep Power: 0 MicDareall is an unknown quantity at this point
I talked to my proffessor after posting this and he said everything that you guys said..... he explained how pythons virtual machine has massive overhead and that will make it run longer because it eventually starts writing back and forth from disk ......So I took out the print statements except for printing the times.........I commented out bubblesort all together....I copied the array..... Im only running the test on 100,000 now..... and Im waiting.......... welll......... now after like a 1/2 hour/hour python just shuts down...... if I run it in Idle it just quits and closes everything......... If I run it in a command prompt it shuts down the command prompt window........ now this is just confusing the shit out of me.......
MicDareall is offline   Reply With Quote
Old Apr 19th, 2006, 1:34 AM   #10
Arevos
Programming Guru
 
Arevos's Avatar
 
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4 Arevos is on a distinguished road
For some reason your implementation of quicksort segfaults on my machine for large numbers if I whack up the recursion limit, and fails due to a recursion overload exception otherwise.

Your code is recursing to a greater depth than is necessary, even for a list of 100'000. Could it be that you're code isn't ending correctly and gets stuck in an infinite recusion?

Python runs slower because it's a bytecode interpreted language, rather than a compiled one. I shouldn't think that writing back and forth from disk would have any effect, and Python doesn't do that significantly more than any other language. The only time I can think of it writing to disk is to cache libraries.

Regardless, Python is slower for a computer to execute. Generally speaking, it's around 10 times slower than C (less so with optimisation modules like psycho). The advantage of Python that it also takes less time to program. It's somewhat of a trade-off between developer efficiency and computational efficiency.
Arevos is offline   Reply With Quote
Reply

Bookmarks

« Previous Thread in Forum | Next Thread in Forum »

Currently Active Users Viewing This Thread: 1 (0 members and 1 guests)
 
Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off
Forum Jump




DaniWeb IT Discussion Community
All times are GMT -5. The time now is 6:08 PM.

Powered by vBulletin® Version 3.7.0, Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
Copyright ©2007 DaniWeb® LLC