#11 |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
Ah, I like the sound of the last post... I could either upload the spreadsheet I'm working on or attach it to an email if you'd like to take a look. The formatting can be changed; I just tried to set it up to fit the most information on a page. My superior just wants the information presented in a clean, organized fashion. Thanks for the responses, guys.
|
|
|
|
|
|
#12 |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
Okay, I uploaded what I'm currently working on to esnips. I'm not sure if you need to be a member to access it, but I don't think so. Let me know if you have any problems with the link:
http://esnips.com/web/zem52887sBusinessFiles |
|
|
|
|
|
#13 |
|
Programming Guru
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4
Be careful; I said "easier", not "easy". The screen scraping appears fairly straightforward, as all the pages occupy the same format, but sometimes things have a nasty habit of turning out to be more difficult than they first appeared.
I'll take a look at the spreadsheet now, though... |
|
|
|
|
|
#14 |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
Hah, yeah, I understand. Thanks for the disclaimer; I don't want to get my hopes up too soon.
My boss just came over and asked how it was going, so I told him what I was trying to accomplish, to which he responded, "If you get it done, you'll get a great recommendation." If I can get this done, I will definitely find a way to compensate whoever I can, given my meager $10 USD/hr intern salary. Again, formatting is flexible: if I could at least get the stuff into Excel, I don't mind having to change the formatting and whatnot if it saves me time. I seriously am acquiring muscle memory; I'm convinced I was moving my fingers in my sleep as if a keyboard were in front of them. |
|
|
|
|
|
#15 |
|
Programming Guru
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4
Okay. It all appears fairly straightforward. The only real problem is the bullet-pointed list, though there may be a way of handling that in a macro, or with some special formatting. Python and Beautiful Soup are fairly good tools for tackling screen scraping like this. This problem seems interesting, so I'll knock up a few examples to set you on the right track when I get home from work in around an hour's time.
Meanwhile, you can download Python and have a play with it. Python's a programming language that's reasonably easy to get to grips with (as programming languages go). There are quite a few tutorials for beginners listed on the Python site, and the interactive interpreter is a good way to experiment with what goes where. At the end of the day, your task is a limited one, so you don't need to know all of Python's functionality. That said, it's not a bad idea to get familiar with the basics, especially since they might serve you well in future. |
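To make the screen-scraping idea concrete, here is a minimal sketch of turning an HTML table into CSV text that Excel can open. It is written for current Python 3 and uses only the standard library's `html.parser` as a stand-in for Beautiful Soup, which is not what the post proposes but needs no install. The class name `TableRowParser` and the sample page are made up for illustration; the real pages will look different.

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace so each cell is one clean string.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up sample page; a real page would be downloaded first.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Employees</th></tr>
  <tr><td>Example Corp</td><td>1,200</td></tr>
  <tr><td>Sample &amp; Sons</td><td>85</td></tr>
</table>
"""

parser = TableRowParser()
parser.feed(SAMPLE_HTML)
parser.close()

# Write the rows as CSV text; saved to a .csv file, Excel opens it directly.
buffer = io.StringIO()
csv.writer(buffer).writerows(parser.rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The `csv` module quotes any cell that contains a comma, so a value like `1,200` stays in one column when the file is opened in a spreadsheet.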
|
|
|
|
|
#16 |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
Hah, well, now I'm starting to get a bit excited. I'm working on an office computer, so I can't download anything due to admin privileges, but I'll start reading the tutorials to familiarize myself with it, and then hopefully when I get home I can get something done.
Also, the bulleted list is copied straight from the website; the key people are displayed on the website in a bulleted list within the table. Being a noob, I'm not sure why this would be a problem, but I just wanted to let you know that I did not manually bullet them. |
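One likely reason the bulleted list is awkward is that a single table cell holds several values, while a spreadsheet cell wants one. A possible approach, sketched here under assumptions, is to collect each `<li>` item and join them with a separator. The sketch uses Python 3 and the standard library's `html.parser` rather than Beautiful Soup; the class name `ListItemParser`, the separator, and the names in the sample cell are all placeholders.

```python
from html.parser import HTMLParser


class ListItemParser(HTMLParser):
    """Collect the text of each <li> element in the order it appears."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._current = None  # text fragments of the <li> being read

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            # HTML allows </li> to be omitted, so a new <li> also closes
            # the previous one.
            self._finish_item()
            self._current = []

    def handle_endtag(self, tag):
        if tag in ("li", "ul"):
            self._finish_item()

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def _finish_item(self):
        if self._current is not None:
            text = " ".join("".join(self._current).split())
            if text:
                self.items.append(text)
            self._current = None


# Made-up cell contents; the names and titles are placeholders.
CELL_HTML = """
<td>
  <ul>
    <li>Jane Doe, Chief Executive Officer
    <li>John Smith,
        Chief Financial Officer</li>
  </ul>
</td>
"""

parser = ListItemParser()
parser.feed(CELL_HTML)
parser.close()

# One spreadsheet cell: the people joined with a separator.
flattened = "; ".join(parser.items)
print(flattened)
```

Joining with `"; "` keeps everything in one cell; writing one row per person would be the alternative if each name needs its own line in the sheet.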
|
|
|
|
|
#17 | ||
|
Resident Grouch
Join Date: Jun 2005
Posts: 6,453
Rep Power: 10
Quote:
Quote:
. It still isn't trivial.
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code. Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers |
||
|
|
|
|
|
#18 | |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
Quote:
![]() |
|
|
|
|
|
|
#19 |
|
Programming Guru
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 4
I've taken a further look at this, and it seems like Yahoo! Business isn't really a fan of the semantic web, but sure does like its tables. The hardest part of this will be navigating through all of the tables Yahoo! has on its pages.
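As a rough illustration of navigating nested tables, the sketch below records each table's nesting depth and its own text, then picks the most deeply nested table that mentions a known label. It is Python 3 with the standard library's `html.parser` standing in for Beautiful Soup. The names `TableFinder` and `find_table`, the "Key People" label, and the sample layout are assumptions for illustration, not the actual structure of the Yahoo! pages.

```python
from html.parser import HTMLParser


class TableFinder(HTMLParser):
    """Record every <table> with its nesting depth and its own text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables as (depth, text) tuples
        self._stack = []   # text fragments for each currently open table

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._stack.append([])

    def handle_data(self, data):
        # Text is credited to the innermost open table only.
        if self._stack and data.strip():
            self._stack[-1].append(data.strip())

    def handle_endtag(self, tag):
        if tag == "table" and self._stack:
            depth = len(self._stack)
            text = " ".join(self._stack.pop())
            self.tables.append((depth, text))


def find_table(html, label):
    """Return the text of the most deeply nested table containing label."""
    finder = TableFinder()
    finder.feed(html)
    finder.close()
    matches = [t for t in finder.tables if label in t[1]]
    if not matches:
        return None
    return max(matches, key=lambda t: t[0])[1]


# Made-up layout: an outer table used for page layout wraps the data table.
PAGE_HTML = """
<table><tr>
  <td>Navigation</td>
  <td>
    <table>
      <tr><td>Key People</td><td>Jane Doe</td></tr>
    </table>
  </td>
</tr></table>
"""

print(find_table(PAGE_HTML, "Key People"))
```

Searching by a label that appears in the data table avoids counting tables by position, which tends to break when the page layout shifts.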
Do you know much about HTML? If not, it's best to find some tutorial online to give you a brief crash-course in it.
Meanwhile, I'll go over a bit of Python and Beautiful Soup to get you going. Once you've installed both of these, run the Python interactive prompt. Python can be run in two ways; as a fixed script, or interactively. The interactive method is generally used for experimentation. When you run the interactive prompt, you'll be presented with something like this:
Python 2.4.3 (#2, Apr 27 2006, 14:43:58)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Anyway, this is the prompt. You type in code, press enter, and Python evaluates it and returns an answer. For instance:
>>> 10 + 5
15
>>> print "Hello World"
Hello World
>>> from urllib2 import urlopen
>>> urlopen("http://www.google.com").read()
And my dinner's just about finished cooking, I think, so I'll post the rest up later. |
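The session above is Python 2.4. For anyone following along on a current Python 3 install, a rough equivalent is sketched below: `print` is a function, and `urllib2.urlopen` lives at `urllib.request.urlopen`. The helper name `fetch` is made up for illustration, and the `data:` URL is used only so the example runs without a network connection.

```python
from urllib.request import urlopen


def fetch(url):
    """Download a URL and return its body as raw bytes."""
    with urlopen(url) as response:
        return response.read()


# print is a function in Python 3, so the parentheses are required.
print(10 + 5)
print("Hello World")

# A data: URL carries its content inline, so this line runs without a
# network connection; an http:// address is fetched the same way.
body = fetch("data:text/plain,Hello%20World")
print(body.decode("ascii"))
```

Note that `read()` returns bytes in Python 3, so the result has to be decoded before it can be treated as text.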
|
|
|
|
|
#20 |
|
Hobbyist Programmer
Join Date: May 2006
Posts: 127
Rep Power: 3
I used to design HTML websites back in like 6th grade, so although I don't remember every little detail, I'll be able to pick it up quickly (not that it would be particularly hard to learn from scratch), so I'm not at a complete loss. I've also been reading the basic inputs for Python (i.e. print commands etc.), so I'm beginning to learn a little bit about Python as well. Enjoy your dinner
|
|
|
|