Quote:
|
Originally Posted by zem52887
I'm unfamiliar with that bit of code Cerulean, could you elaborate a bit as to what it does?
|
You already use it (or at least, it was in your code at one point - it seems to have vanished from your finished version). You can use sleep like this:
import time
time.sleep(10) # sleep for 10 seconds
Or like this:
from time import sleep
sleep(10) # sleep for 10 seconds
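Note the difference between "import" and "from ... import ...": with the first you call the function through the module name (time.sleep), with the second you call it by its own name (sleep).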
The same holds true for any module:
import some_module
some_module.some_function()
Or:
from some_module import some_function
some_function()
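If it helps to see that both forms reach the same function, here is a small sketch using the real time module. The only assumption is the short 0.01 second delay, chosen so the example finishes quickly:

```python
import time               # binds the module name "time"
from time import sleep    # binds only the function name "sleep"

# Both spellings refer to the very same function object.
same_function = time.sleep is sleep

time.sleep(0.01)  # call through the module name
sleep(0.01)       # call the imported name directly

print(same_function)
```

Which form you pick is mostly a matter of taste: the module-qualified call shows where the function comes from, the direct call is shorter to type.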
Quote:
|
Originally Posted by zem52887
Also, is there a way to have the program check the cache and skip any company links already found there thus adding only new data to an HTML as opposed to building a new data.html from scratch beginning with the data I already have?
|
Maybe something like:
for company_url in get_company_urls(company_index):
    # Only fetch companies that are not in the cache yet
    if company_url not in cache:
        data = get_company_data(company_url)  # fetch once, reuse below
        file.write(data)
        print(data)
        # And remember to pause so the server isn't overloaded:
        sleep(1)
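To try the idea without touching the real site, here is a self-contained sketch. It assumes the cache behaves like a dict keyed by URL. The two get_company_... functions are hypothetical stand-ins for the ones in your script, and add_new_companies, the example.com URLs and the delay parameter are made up for illustration:

```python
from time import sleep

# Hypothetical stand-ins for the real scraper functions in your script.
def get_company_urls(company_index):
    return ["%s/company/%d" % (company_index, n) for n in range(1, 4)]

def get_company_data(company_url):
    return "<p>data for %s</p>\n" % company_url

def add_new_companies(company_index, cache, out, delay=1):
    """Append data only for companies missing from cache; return URLs fetched."""
    fetched = []
    for company_url in get_company_urls(company_index):
        if company_url in cache:
            continue  # already have it, so no request is needed
        data = get_company_data(company_url)
        cache[company_url] = data  # remember it for the next run
        out.append(data)
        fetched.append(company_url)
        sleep(delay)  # pause so the server isn't overloaded
    return fetched

# Assumed starting state: one company is already cached.
cache = {"http://example.com/company/1": "<p>old</p>\n"}
new_html = []
fetched = add_new_companies("http://example.com", cache, new_html, delay=0)
print(fetched)
```

Here only companies 2 and 3 get fetched, because company 1 is already in the cache. In the real script you would probably write to the file instead of a list; opening data.html with mode "a" appends to what is there rather than starting from scratch.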