#1 |
|
Newbie
Join Date: Oct 2006
Posts: 23
Rep Power: 0
![]() |
def find_items(url):
    try:
        url = urllib.urlopen(url).read()
    except:
        return None
    while True:
        web_page = 'some url'
        fs_item = url.find(r'some text')
        fs_item += 9
        fe_item= url.find(r'"',fs_item)
        item.append(web_page[:]+url[fs_item:fe_item])
        ps_s = url.find(r'"some text">Next</a></td>')
        if ps_s is -1: no_next = 1
        else: no_next = 0
        if not no_next:
            ps_s-=25
            ps_e = url.find(r'"',ps_s)
            next_page = ''
            next_page= next_page + web_page[:] + url[ps_s:ps_e]
        for items in range(1,25):
            ss_item = url.find(r'some text',fe_item)
            if ss_item is -1: break
            ss_value = ss_item
            char_to_kill = url[ss_item-1:ss_item]
            if url[ss_item-1:ss_item] is char_to_kill: ss_item = url.find(r'some text'ss_value+9)
            if ss_item is -1:
                if no_next is 1:
                    return item # <------ i want to exit the loop here!
                else:
                    find_items(next_page)
            else: pass
            ss_item+=9
            se_item = url.find(r'"',ss_item)
            item.append(web_page[:]+url[ss_item:se_item])
            fe_item = ss_item
    return (item) # <------------- or here but i cannot
|
|
|
|
|
#2 |
|
Programming Guru
![]() Join Date: Apr 2005
Posts: 1,825
Rep Power: 5
![]() |
"if ss_item is -1: break" only breaks out of the for loop. The while loop keeps executing, so the function stays in an infinite loop.
In situations like this, I'd suggest printing the relevant values as they are assigned, to see what's happening. Furthermore, I'm not sure where you're defining "item", but you should define it in the local scope.
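
To illustrate the point about break, here is a minimal sketch. It is not the poster's code: the function names and the max_passes cap are made up for the example, and it is written to run under Python 3, whereas the thread's code is Python 2. It shows that break leaves only the innermost loop, while return leaves the function and every loop in it.

```python
def scan_with_break(values, max_passes=3):
    """Count how often the outer loop body runs when the inner loop uses break."""
    passes = 0
    while passes < max_passes:      # stand-in for `while True`, capped so the demo ends
        passes += 1
        for v in values:
            if v == -1:
                break               # leaves only the for loop
        # execution resumes here, still inside the while loop
    return passes


def scan_with_return(values, max_passes=3):
    """Same loops, but return exits the function as soon as -1 is seen."""
    passes = 0
    while passes < max_passes:
        passes += 1
        for v in values:
            if v == -1:
                return passes       # leaves both loops at once
    return passes


print(scan_with_break([1, -1, 2]))   # 3
print(scan_with_return([1, -1, 2]))  # 1
```

Without the cap, the first function would never finish, which matches the infinite loop described above.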
|
|
|
|
|
#3 |
|
Newbie
Join Date: Oct 2006
Posts: 23
Rep Power: 0
![]() |
I define item in the global scope because otherwise calling the function again would clear the list. But I also need to break out of the while loop. Please explain what you meant!
|
|
|
|
|
|
#4 |
|
Programming Guru
![]() Join Date: Apr 2005
Posts: 1,825
Rep Power: 5
![]() |
If you want to break out of both loops at "if ss_item is -1: break", then use return and not break.
Your reasons for defining item in the global scope seem irrational. There is a better alternative you could be using. Still, I guess that's just a nitpick given where this program stands at the moment.
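
As one possible alternative to the global list, here is a hedged sketch that walks the pages in a loop and keeps the results in a list local to the call. Everything in it is hypothetical: collect_items, the fake_site dictionary, and the (items, next page) layout merely stand in for the real fetching and parsing, which the thread does not show in full.

```python
def collect_items(pages, start):
    """Gather items from linked pages into a list local to this call.

    `pages` is a hypothetical stand-in for the fetched web pages: it maps
    a page name to a tuple (items_on_page, next_page_name_or_None).
    """
    items = []                  # local: a fresh list for every call
    page = start
    while page is not None:
        found, next_page = pages[page]
        items.extend(found)
        page = next_page        # None ends the loop once there is no "Next" link
    return items


fake_site = {
    "page1": (["a", "b"], "page2"),
    "page2": (["c"], None),
}
print(collect_items(fake_site, "page1"))  # ['a', 'b', 'c']
```

Because the loop moves from page to page inside a single call, the list is never cleared between pages and no global is needed.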
|
|
|
|
|
#5 |
|
Newbie
Join Date: Oct 2006
Posts: 23
Rep Power: 0
![]() |
Correct, but when I execute a return it won't exit the function. It continues the function as if nothing happened! Any help there?
That last return won't return the value!
|
|
|
|
|
#6 |
|
Programming Guru
![]() Join Date: Apr 2005
Posts: 1,825
Rep Power: 5
![]() |
The last return won't return the value because it's never reached! When you break out of the for loop, the program remains in the while loop. Since you leave the for loop with a "break" statement, execution never even gets to a return.
Try it yourself: put a print ss_item right before the line "if ss_item is -1: break". You'll see that it loops forever, breaking out of the for loop and then re-entering it, never reaching a return statement. Please learn to debug your programs effectively; it's a useful ability.
|
|
|
|
|
#7 |
|
Newbie
Join Date: Oct 2006
Posts: 23
Rep Power: 0
![]() |
I have removed the while loop, and I always use a debugger, but the issue still remains! It executes the return item twice and then carries on with the loop!
|
|
|
|
|
|
#8 |
|
Programming Guru
![]() Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 5
![]() |
It's unlikely your program is doing what you think it is doing. If the function is not returning, then the return statement is never executed. Perhaps you could provide the code to your altered program? The one without the while loop?
|
|
|
|
|
|
#9 |
|
Newbie
Join Date: Oct 2006
Posts: 23
Rep Power: 0
![]() |
item = []
def find_items(url):
    try:
        url = urllib.urlopen(url).read()
    except:
        return None
    web_page = '<sometext>'
    fs_item = url.find(r'<sometext>')
    fs_item += 9
    fe_item= url.find(r'"',fs_item)
    item.append(web_page[:]+url[fs_item:fe_item])
    ps_s = url.find(r'"<sometext>">Next</a></td>')
    if ps_s is -1: no_next = 1
    else: no_next = 0
    if not no_next:
        ps_s-=25
        ps_e = url.find(r'"',ps_s)
        next_page = ''
        next_page= next_page + web_page[:] + url[ps_s:ps_e]
    for items in range(1,25):
        ss_item = url.find(r'<sometext>',fe_item)
        if ss_item is -1: break
        ss_value = ss_item
        char_to_kill = url[ss_item-1:ss_item]
        if url[ss_item-1:ss_item] is char_to_kill: ss_item = url.find(r'<<sometext>',ss_value+9)
        if ss_item is -1:
            if no_next is 1:
                break
            else:
                find_items(next_page)
        else: pass
        ss_item+=9
        se_item = url.find(r'"',ss_item)
        item.append(web_page[:]+url[ss_item:se_item])
        fe_item = ss_item
    return item

OK, this is the code I have now. The issue is that I am stuck on the last return! It executes twice and then continues with ss_item+=9. Now, to answer a few questions:
1. This is only a problem if there is a next page (no_next is set to 0), so the function has to be called again!
2. The web page contains two of the same object, so the first one checks for errors and the second one ends the program. For example, searching for aa in the string "aa is aa" finds it twice!
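
One possible reading of the behaviour described here, offered as an assumption rather than a confirmed diagnosis: find_items calls itself, so a return in the inner call only ends that inner call, and the outer call then carries on with the rest of its own loop. The sketch below uses a hypothetical walk function (not from the thread, written for Python 3) that records the order of events so the effect can be seen without any network access.

```python
def walk(depth, log):
    """Recurse once, recording the order in which things happen in `log`."""
    for step in range(2):
        if depth == 0 and step == 0:
            walk(depth + 1, log)    # return value ignored, like find_items(next_page)
            log.append("outer continues after inner call")
        log.append("depth %d step %d" % (depth, step))
    log.append("return at depth %d" % depth)
    return log


for line in walk(0, []):
    print(line)
```

The log contains two "return" entries, one per call, with the outer loop resuming in between. If the outer call is meant to stop as well, one option would be to return the inner call's result directly, for example return find_items(next_page).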
|
|
|
|
|
#10 |
|
Programming Guru
![]() Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 5
![]() |
The problem is that your program never reaches a return statement. The first return statement is here:

for items in range(1,25):
    ss_item = url.find(r'some text',fe_item)
    if ss_item is -1: break
    ss_value = ss_item
    char_to_kill = url[ss_item-1:ss_item]
    if url[ss_item-1:ss_item] is char_to_kill: ss_item = url.find(r'some text'ss_value+9)
    if ss_item is -1:
        if no_next is 1:
            return item # <------ i want to exit the loop here!

The second return statement is outside the while loop. Since this is an infinite loop, and since you never break out of it, the second return statement is never reached either.

There is also a fair amount of code that is redundant. For example:

char_to_kill = url[ss_item-1:ss_item]
if url[ss_item-1:ss_item] is char_to_kill: ss_item = url.find(r'some text'ss_value+9)

Also, it's usual to use "==" instead of "is" when testing for equality. The "==" operator checks whether something is equal to something else. The "is" operator checks whether two variables refer to the same object. To use an analogy, the phrase "Tom and Bob are the same person" would be an example of "is", whereas "Tom is either a clone of Bob, or the same person" would be an example of "==". It's a little complex, but the rule of thumb is to use "==" by default. It may save you from a few unexpected results.
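
To make the difference between "==" and "is" concrete, here is a small sketch. The variable names are illustrative only, and the code is written for Python 3.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # another name for the very same object

print(a == b)   # True: the values are equal
print(a is b)   # False: two distinct objects
print(a is c)   # True: both names refer to one object

# str.find returns -1 when the substring is absent; compare it by value
index = "abc".find("z")
print(index == -1)  # True
```

Following the rule of thumb above, the thread's checks of the form "ss_item is -1" would be written "ss_item == -1".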