#1
Newbie
Join Date: Oct 2004
Posts: 3
Rep Power: 0
Hey,
I'd like to write a simple C++ spider that retrieves web pages from a fixed list of news sites. I'll then parse those pages for content and send that content to a database, where a simple PHP script can retrieve it. Basically, I love reading the news and I read about 10 different sites, so I want to create my own little "news portal" so I don't have to visit all 10. Everything will be displayed in one place and updated every hour. It's for personal use only, so don't worry about copyright issues, and I want to see if I can actually make this work as a nice little challenge.
I know how to do everything except actually download web pages with C/C++. And yes, I have searched the internet quite a few times, but I haven't found anything very helpful. Any help would be greatly appreciated. Thanks
#2
Programmer
Join Date: Sep 2004
Location: JHB , South Africa
Posts: 79
Rep Power: 4
If you have Visual Studio .NET, you can use the System.Net namespace; it has everything you need. There is also some documentation in the MSDN library on how to do this in C#, but I'm sure you'll be able to port it to C++.
#3
Newbie
Join Date: Oct 2004
Posts: 3
Rep Power: 0
Sorry, I forgot to mention that I'll be running this spider on my Linux box, so only raw sockets will do (I think).
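
For anyone landing on this thread with the same question, here is a rough sketch of what downloading a page over plain sockets on Linux might look like. It assumes that "raw sockets" here means ordinary TCP stream sockets (`SOCK_STREAM`) rather than `SOCK_RAW`, and that the site is reachable over plain HTTP on port 80. The function names (`build_get_request`, `status_code`, `response_body`, `http_get`) and the User-Agent string are made up for illustration and are not from any library. It sends an HTTP/1.0 request on the assumption that the server then closes the connection once the response is complete, which keeps the read loop simple.

```cpp
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Build a minimal HTTP/1.0 GET request (hypothetical helper).
std::string build_get_request(const std::string& host, const std::string& path) {
    return "GET " + path + " HTTP/1.0\r\n"
           "Host: " + host + "\r\n"
           "User-Agent: news-spider-sketch/0.1\r\n"
           "Connection: close\r\n"
           "\r\n";
}

// Extract the 3-digit status code from a raw response such as
// "HTTP/1.1 200 OK...". Returns -1 if the status line looks malformed.
int status_code(const std::string& raw) {
    const std::string::size_type sp = raw.find(' ');
    if (sp == std::string::npos || raw.size() < sp + 4) return -1;
    int code = 0;
    for (std::string::size_type i = sp + 1; i < sp + 4; ++i) {
        if (raw[i] < '0' || raw[i] > '9') return -1;
        code = code * 10 + (raw[i] - '0');
    }
    return code;
}

// Return everything after the blank line that ends the headers,
// or an empty string if that blank line is missing.
std::string response_body(const std::string& raw) {
    const std::string::size_type pos = raw.find("\r\n\r\n");
    return pos == std::string::npos ? std::string() : raw.substr(pos + 4);
}

// Connect to host:80, send the request, and read until the server closes
// the connection. Returns the raw response (status line, headers, body).
std::string http_get(const std::string& host, const std::string& path) {
    addrinfo hints{};
    hints.ai_family = AF_UNSPEC;      // IPv4 or IPv6
    hints.ai_socktype = SOCK_STREAM;  // TCP

    addrinfo* results = nullptr;
    const int rc = getaddrinfo(host.c_str(), "80", &hints, &results);
    if (rc != 0) {
        throw std::runtime_error(std::string("getaddrinfo: ") + gai_strerror(rc));
    }

    // Try each resolved address until one accepts the connection.
    int fd = -1;
    for (addrinfo* ai = results; ai != nullptr; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd == -1) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd);
        fd = -1;
    }
    freeaddrinfo(results);
    if (fd == -1) throw std::runtime_error("could not connect to " + host);

    // send() may write fewer bytes than requested, so loop until done.
    const std::string request = build_get_request(host, path);
    std::string::size_type sent = 0;
    while (sent < request.size()) {
        const ssize_t n = send(fd, request.data() + sent, request.size() - sent, 0);
        if (n <= 0) {
            close(fd);
            throw std::runtime_error("send failed");
        }
        sent += static_cast<std::string::size_type>(n);
    }

    // Read until recv() returns 0 (peer closed) or an error occurs.
    std::string response;
    char buf[4096];
    ssize_t n = 0;
    while ((n = recv(fd, buf, sizeof buf, 0)) > 0) {
        response.append(buf, static_cast<std::size_t>(n));
    }
    close(fd);
    if (n < 0) throw std::runtime_error("recv failed");
    return response;
}
```

Calling `http_get("example.com", "/")` from your own `main` and passing the result to `status_code` and `response_body` should give you the status and the HTML to parse (`example.com` is just a placeholder host). This sketch does not follow redirects and does not speak HTTPS; for sites that require TLS, a library such as libcurl is probably a more practical route than hand-rolled sockets.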