Programming Forums
Old Oct 10th, 2004, 2:58 AM   #1
Tuskony
Newbie
 
Join Date: Oct 2004
Posts: 3
Rep Power: 0 Tuskony is on a distinguished road
Hey,

I'd like to create a simple C++ spider that retrieves web pages from a fixed list of news sites. I will then parse these pages for content and send that content to a database, whereupon it can be retrieved by a simple PHP script.

Basically I love reading the news and I read about 10 different sites, so I want to create my own little "news portal" so I don't have to run to 10 different sites... It'll all be displayed to me, and updated every hour. It's for personal use only so don't worry about copyright issues, and I want to see if I can actually make this work as a nice little challenge.

I know how to do everything but actually download web pages with C/C++. And yes, I have searched the internet quite a few times but I haven't really found anything too helpful.

Any help would be greatly appreciated.


Thanks
Old Oct 10th, 2004, 4:26 AM   #2
Ravilj
Programmer
 
Join Date: Sep 2004
Location: JHB , South Africa
Posts: 79
Rep Power: 4 Ravilj is on a distinguished road
If you have Visual Studio .NET, then you can use the System.Net namespace; it has everything you need. There is also some documentation in the MSDN Library on how to do this in C#, but I am sure you will be able to port it to C++.
__________________
Ravilj's OpenGL Terrain aka WinTerrain Last Updated: 17/01/2005!
Old Oct 10th, 2004, 4:12 PM   #3
Tuskony
Newbie
 
Join Date: Oct 2004
Posts: 3
Rep Power: 0 Tuskony is on a distinguished road
Sorry, I forgot to mention that I'll be running this spider on my Linux box, so only raw sockets will do (I think).
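
For anyone landing on this thread with the same question, here is a minimal sketch of downloading a page over plain sockets on Linux. It assumes "raw sockets" here means ordinary TCP stream sockets (SOCK_STREAM) rather than SOCK_RAW, and that the sites are reachable over plain HTTP on port 80; HTTPS, redirects, and timeouts are not handled. The function names build_request, response_body, and fetch are made up for this example and are not from any library.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <stdexcept>
#include <string>

// Build a minimal HTTP/1.0 GET request. HTTP/1.0 is chosen so that the
// server closes the connection after the response and does not reply
// with chunked transfer encoding.
std::string build_request(const std::string& host, const std::string& path) {
    assert(!path.empty() && path[0] == '/');  // path must be absolute, e.g. "/"
    return "GET " + path + " HTTP/1.0\r\n"
           "Host: " + host + "\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Return everything after the first blank line (the end of the headers),
// or an empty string if no blank line is present.
std::string response_body(const std::string& response) {
    const std::string::size_type pos = response.find("\r\n\r\n");
    return pos == std::string::npos ? std::string() : response.substr(pos + 4);
}

// Connect to host on port 80, send the request, and read until the
// server closes the connection. Returns status line, headers, and body.
std::string fetch(const std::string& host, const std::string& path) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* results = nullptr;
    if (getaddrinfo(host.c_str(), "80", &hints, &results) != 0)
        throw std::runtime_error("could not resolve " + host);

    // Try each resolved address until one connects.
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd == -1) throw std::runtime_error("could not connect to " + host);

    // send() may write only part of the buffer, so loop until done.
    const std::string request = build_request(host, path);
    std::string::size_type sent = 0;
    while (sent < request.size()) {
        const ssize_t n = send(fd, request.data() + sent, request.size() - sent, 0);
        if (n <= 0) {
            close(fd);
            throw std::runtime_error("send failed");
        }
        sent += static_cast<std::string::size_type>(n);
    }

    // recv() returns 0 once the server has closed the connection.
    std::string response;
    char buffer[4096];
    ssize_t n;
    while ((n = recv(fd, buffer, sizeof buffer, 0)) > 0)
        response.append(buffer, static_cast<std::string::size_type>(n));
    close(fd);
    if (n < 0) throw std::runtime_error("recv failed");
    return response;
}
```

Calling response_body(fetch("example.com", "/")) would give the HTML for parsing, where example.com is only a placeholder host. The status line and headers are kept in the string fetch returns, so a real spider could check the status code before parsing.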
DaniWeb IT Discussion Community
All times are GMT -5. The time now is 10:02 AM.