Programming Forums
Oct 20th, 2004, 2:46 PM   #1
goldenb0y
Newbie
 
Join Date: Oct 2004
Posts: 15
What's the best way to fetch an HTML page from a website and gather data from it in a C program? I'm having a bit of trouble with Google due to how common the string "HTML" is. Thanks a lot!
Oct 20th, 2004, 4:46 PM   #2
Daggerhex_Flynn
Programmer
 
Join Date: Oct 2004
Location: Canada
Posts: 82
You can search a text file and find all of the URLs and things like that, but the C standard library has no networking or web facilities. Maybe try C in conjunction with Python or Perl. I think Perl is probably the best language for this type of program.
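
For the "search the text for URLs" part, standard C is enough once the page is in memory. A minimal sketch, assuming the links appear as lowercase, double-quoted href attributes; the function name next_href is made up for illustration, and this is plain string scanning, not a real HTML parser:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the next href="..." value found at or after
 * *cursor into out (truncated to out_size - 1 characters) and advances
 * *cursor past the match. Returns 1 if a link was found, 0 otherwise.
 * Naive on purpose: only lowercase, double-quoted href attributes match. */
int next_href(const char **cursor, char *out, size_t out_size)
{
    static const char key[] = "href=\"";
    const char *start, *end;
    size_t len;

    if (out_size == 0)
        return 0;
    start = strstr(*cursor, key);
    if (start == NULL)
        return 0;
    start += sizeof key - 1;           /* skip past href=" */
    end = strchr(start, '"');
    if (end == NULL)
        return 0;                      /* unterminated attribute */

    len = (size_t)(end - start);
    if (len >= out_size)
        len = out_size - 1;            /* truncate rather than overflow */
    memcpy(out, start, len);
    out[len] = '\0';
    *cursor = end + 1;
    return 1;
}
```

To list every link, point a `const char *p` at the page text and loop with `while (next_href(&p, url, sizeof url)) puts(url);`. Getting the page text in the first place still needs something outside standard C.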
Oct 20th, 2004, 5:10 PM   #3
kurifu
Expert Programmer
 
Join Date: Jul 2004
Location: Halifax, Nova Scotia (Canada)
Posts: 784
You can get cURL for C (libcurl), which will connect to a server for you, initiate a request, and retrieve data over a variety of protocols, including HTTP and HTTPS.

From that point you would need another library to process the HTML; you can likely find a few DOM (Document Object Model) HTML parsers out there.

The other option is to embed an MSIE ActiveX control in your application (it has to be a Windows GUI application, though) and set the control to hidden if you do not want it to be seen. Send the request and, once it completes, use one of the control's many COM interfaces to extract the HTML. I have done this before myself (though likely not for the same purpose, and my controls were not hidden), so I know for a fact that it is possible.
__________________
Clifford Matthew Roche <geek@cliffordroche.com>
Web Hosting: http://www.crd-hosting.com
Consulting: http://www.crdev-consulting.com
Oct 20th, 2004, 7:54 PM   #4
goldenb0y
Newbie
 
Join Date: Oct 2004
Posts: 15
Problem solved, thanks guys.

http://curl.haxx.se/libcurl/c/
DaniWeb IT Discussion Community
All times are GMT -5.
Copyright ©2007 DaniWeb® LLC