Programming Forums
Old Jan 31st, 2006, 3:30 PM   #1
tayspen
Hobbyist Programmer
 
Join Date: Sep 2005
Location: A House...
Posts: 191
Rep Power: 4 tayspen is on a distinguished road
Screen Scraping

Hi,

I have a question about screen scraping and regular expressions. There is a website, and from its source code I want to get this line:

PLAYERS:<b><br><font color="#ea0437"><b>Be Text Here</b></font><br><font color="#ea0437"><b>Be Text Here</b></font><br><font color="#ea0437"><b>Be Text Here</b></font><br><br><font color="#1796cb"><b>Be Text Here</b></font><br><font color="#1796cb"><b>Be Text Here</b></font><br>

Then, after I get that line, I want to go further and get all the text where "Be Text Here" is (it says "Be Text Here" in this example, but the real text will be different). Then I want to list those in a listbox. Can this be done? Also, that line might have more instances of "Be Text Here"; the number will vary.
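A minimal sketch of the regular-expression approach asked about here, written in Python purely for illustration (the thread does not name a language). The pattern is an assumption based only on the sample line above, and `extract_players` is a hypothetical helper name. The returned list is what you would then add to a listbox, item by item.

```python
import re

# Sample line copied from the post; on the real page the names will differ.
line = (
    'PLAYERS:<b><br>'
    '<font color="#ea0437"><b>Be Text Here</b></font><br>'
    '<font color="#ea0437"><b>Be Text Here</b></font><br>'
    '<font color="#ea0437"><b>Be Text Here</b></font><br><br>'
    '<font color="#1796cb"><b>Be Text Here</b></font><br>'
    '<font color="#1796cb"><b>Be Text Here</b></font><br>'
)

# Assumption: every name sits inside <font color="#rrggbb"><b>...</b></font>.
# The non-greedy (.*?) stops at the first closing </b>, so each name is
# captured separately no matter how many entries the line contains.
PLAYER_RE = re.compile(r'<font color="#[0-9a-fA-F]{6}"><b>(.*?)</b></font>')


def extract_players(html_line):
    """Return the text of every <font><b>...</b></font> entry, in page order."""
    return PLAYER_RE.findall(html_line)


players = extract_players(line)
print(len(players), "entries found")
```

Because `findall` returns every match, the varying number of entries is handled without extra code. The pattern breaks as soon as the site changes its markup, so treat it as tied to this exact layout.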
Old Jan 31st, 2006, 6:29 PM   #2
Dameon
Troll
 
Join Date: Apr 2005
Location: Texas
Posts: 732
Rep Power: 4 Dameon is on a distinguished road
If the source code is XHTML compliant, use the nifty features of the XML namespace. Load it as an XML document and use XPath to select the nodes you want.
__________________
MD5(sig) = bcef75433db02e9ad9bf81d6f7c5c270
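To make the XPath suggestion concrete, here is a rough sketch in Python, with the standard library's `xml.etree.ElementTree` standing in for whatever XML library you actually use. The `xhtml` string is a hypothetical, well-formed rewrite of the sample line (every tag closed, one root element); the line as posted would not load as XML. `select_players` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed version of the sample: <br/> is self-closed and
# everything is wrapped in a single root element.
xhtml = (
    '<div>PLAYERS:<br/>'
    '<font color="#ea0437"><b>Be Text Here</b></font><br/>'
    '<font color="#ea0437"><b>Be Text Here</b></font><br/>'
    '<font color="#1796cb"><b>Be Text Here</b></font><br/>'
    '</div>'
)


def select_players(markup):
    """Load the markup as XML and select every <b> directly inside a <font>."""
    root = ET.fromstring(markup)
    # ElementTree implements a limited XPath subset; ".//font/b" means
    # "any <b> that is a child of a <font>, anywhere below the root".
    return [b.text for b in root.findall(".//font/b")]


print(select_players(xhtml))
```

The query describes where the names live in the document tree rather than how the text happens to be laid out, which is why it is sturdier than string matching when the page really is XHTML.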
Old Jan 31st, 2006, 7:18 PM   #3
tayspen
English, please? Lol, seriously.
Quote:
Originally Posted by Dameon said this....
If the source code is XHTML compliant
How do I tell?

Quote:
Originally Posted by again...
use the nifty features of the XML namespace. Load it as an XML document and use XPath to select the nodes you want.
What are the features? And what is XPath?

*Goes to look on MSDN*

-T
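On the "how do I tell?" question, a quick first check is whether the page even parses as XML. The sketch below (Python, standard library only; `is_well_formed` is a hypothetical helper name) tests well-formedness, which is necessary for XHTML but not the same as full compliance. For that you would still run the page through a validator.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if the markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        # Unclosed tags such as <br>, a missing root element, or undefined
        # entities all end up here.
        return False
    return True


# The start of the line from the first post: no root element, <br> never closed.
posted = 'PLAYERS:<b><br><font color="#ea0437"><b>Be Text Here</b></font><br>'
# A hypothetical well-formed equivalent.
fixed = '<div>PLAYERS:<br/><font color="#ea0437"><b>Be Text Here</b></font></div>'

print(is_well_formed(posted), is_well_formed(fixed))
```

If this returns False for the page you are scraping, the XML/XPath route is out unless you clean the markup first.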
Old Feb 8th, 2006, 5:09 PM   #4
hoffmandirt
Hobbyist Programmer
 
Join Date: Jul 2005
Location: PA
Posts: 125
Rep Power: 4 hoffmandirt is on a distinguished road
The source doesn't look XHTML compliant, so it looks like you are just going to have to hope the HTML never changes. After you get the first line that you want, you will probably have to parse that line based on the second bold tag you come across. This is why screen scraping is not recommended, but sometimes there is no other choice. If someone takes the bold tags out, your code will stop working correctly. It's just a chance you gotta take.
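One way to parse non-XHTML markup based on the bold tags is a tolerant HTML parser instead of an XML loader. This is only a sketch in Python's standard `html.parser`, under the assumption (taken from the sample line) that each name is a `<b>` nested inside a `<font>`; `PlayerParser` and `scrape_players` are made-up names.

```python
from html.parser import HTMLParser


class PlayerParser(HTMLParser):
    """Collects the text of every <b> that appears inside a <font> tag."""

    def __init__(self):
        super().__init__()
        self.in_font = False
        self.in_bold = False
        self.players = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            self.in_font = True
        elif tag == "b" and self.in_font:
            # Only a <b> nested in a <font> counts; the stray <b> right
            # after "PLAYERS:" is ignored.
            self.in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self.in_bold = False
        elif tag == "font":
            self.in_font = False

    def handle_data(self, data):
        if self.in_bold and data.strip():
            self.players.append(data.strip())


def scrape_players(html_line):
    """Feed one line of HTML to the parser and return the collected names."""
    parser = PlayerParser()
    parser.feed(html_line)
    parser.close()
    return parser.players


sample = (
    'PLAYERS:<b><br><font color="#ea0437"><b>Be Text Here</b></font><br>'
    '<font color="#1796cb"><b>Be Text Here</b></font><br>'
)
print(scrape_players(sample))
```

The fragility described above is easy to see: if the site drops the bold tags, `scrape_players` quietly returns an empty list, so it is worth checking for that case and reporting it rather than showing an empty listbox.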
Old Feb 11th, 2006, 11:03 PM   #5
kurifu
Expert Programmer
 
Join Date: Jul 2004
Location: Halifax, Nova Scotia (Canada)
Posts: 784
Rep Power: 5 kurifu is on a distinguished road
That is one of the great things about writing screen scrapers: putting terms in the contract that make you more money when the web source does change. Of course, this applies when you are writing the code for someone else who is much less programming savvy.
__________________
Clifford Matthew Roche <geek@cliffordroche.com>
Web Hosting: http://www.crd-hosting.com
Consulting: http://www.crdev-consulting.com
Old Feb 13th, 2006, 8:49 AM   #6
hoffmandirt
Hmmm, I never thought about it like that.
Old Feb 15th, 2006, 12:09 PM   #7
Sridhar
Newbie
 
Join Date: Feb 2006
Posts: 1
Rep Power: 0 Sridhar is on a distinguished road
Hello everyone, can anyone help me? I am currently working on a project to create a Microsoft Outlook 2003 add-in. How do I integrate a standalone application into Microsoft Outlook 2003?
Old Feb 15th, 2006, 6:43 PM   #8
Dameon
1. Read the posting guidelines
2. Make your own thread. Hijacking is bad.
3. ???
4. Success!
Old Feb 15th, 2006, 9:36 PM   #9
J_Kay
Newbie
 
Join Date: Feb 2006
Location: Indiana
Posts: 7
Rep Power: 0 J_Kay is on a distinguished road
Quote:
Originally Posted by Dameon
1. Read the posting guidelines
2. Make your own thread. Hijacking is bad.
3. ???
4. Success!
I thought step 3 was always Profit. Just a thought......
DaniWeb IT Discussion Community
All times are GMT -5.
Copyright ©2007 DaniWeb® LLC