May 30th, 2006, 4:00 PM   #216
Arevos
Well, currently get_company_urls returns a long list of URLs, like so:
["http://...", "http://...", ...]
If we want to return whether each company is public, we'd need to pair every URL with a value telling us whether that company is public:
[ ("http://...", True), ("http://...", False), ...]
Where True means the company is public, and False means it is not.

A pair of values grouped together like this is called a tuple. In this case, we have a list of tuples, with each tuple being two values long. The first value is the URL, the second value is whether the company is public.
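To make the tuple idea concrete, here's a small sketch of building a list of tuples and pulling the values back out. The example.com URLs are made-up placeholders, not real company pages.

```python
# Hypothetical sample data: placeholder URLs, not real company pages.
company_urls = [
    ("http://example.com/company-a", True),
    ("http://example.com/company-b", False),
]

# A tuple can be indexed like a list...
first = company_urls[0]
first_url = first[0]
first_is_public = first[1]

# ...or unpacked into separate names in one step, which is what a
# for-loop with two loop variables does on each pass.
public_urls = []
for company_url, is_public in company_urls:
    if is_public:
        public_urls.append(company_url)
```

Unpacking is usually the more readable of the two, since the names say what each position in the tuple means.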

To make use of this new return value, the company_urls for-loop would have to be modified slightly, and the get_company_data function would need to take in an extra argument:
        for company_url, is_public in get_company_urls(company_index):
            print get_company_data(company_url, is_public)
            sleep(1)
I think the first step to achieving this is to rewrite get_company_urls so that it uses the map function instead of a list comprehension. See here for my brief explanation of different ways of handling list data.
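As a rough sketch of what a map-based get_company_urls might look like: I'm assuming here that company_index is already a list of (path, listing_type) pairs, and BASE_URL, to_pair and the "public" label are all invented for illustration. Your real function will get its data differently, so treat this as the shape of the idea only.

```python
BASE_URL = "http://example.com"  # hypothetical placeholder base URL


def get_company_urls(company_index):
    # Assumption: each entry is a (path, listing_type) pair. The real
    # index data in your script will likely look different.
    def to_pair(entry):
        path, listing_type = entry
        company_url = BASE_URL + path
        is_public = (listing_type == "public")
        return (company_url, is_public)

    # map calls to_pair on every entry. Wrapping it in list() gives a
    # plain list whether map returns a list or a lazy iterator.
    return list(map(to_pair, company_index))
```

The advantage over a list comprehension is that to_pair is an ordinary function, so there's room for several steps of logic when working out each tuple.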


Of course, I should also mention that it might be easier to find some way of gauging whether a company is public or not from the company page. That way you wouldn't have to alter get_company_urls, only extend get_company_data.

However, this may not be possible, as Yahoo! might not give information on whether a company is public or not on the company data page - but it's worth checking out, in case it is possible.
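If the company page does turn out to say whether the company is public, the check could live inside get_company_data, roughly like this. Note that PUBLIC_MARKER is a pure guess at what such a page might contain, and the fetch_page argument is a hypothetical stand-in for however your script downloads a page. You'd need to look at a real company page to see what text, if any, is there.

```python
# Hypothetical marker text: inspect a real company page before relying on it.
PUBLIC_MARKER = "Ticker:"


def page_says_public(page_text):
    # True if the (assumed) marker appears anywhere in the page text.
    return PUBLIC_MARKER in page_text


def get_company_data(company_url, fetch_page):
    # fetch_page is a hypothetical callable that takes a URL and
    # returns the page's text.
    page_text = fetch_page(company_url)
    return {
        "url": company_url,
        "is_public": page_says_public(page_text),
    }
```

With this approach get_company_urls stays exactly as it is, and only get_company_data changes.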