Thread: Site Crawler
Old Jun 10th, 2005, 9:25 AM   #2
skuinders
Hobbyist Programmer
I wouldn't try doing that with PHP. Since you are really dealing with the site(s) from the client side, a server-side scripting language isn't the right tool for it. I would dump the page source to a local file, then write a script (bash, Perl, Octave, whatever) that parses through it looking for href attributes, file extensions, etc., and adds the desired information to your db.
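For what it's worth, here is a rough sketch of that kind of script in Python, using only the standard library. It is just one way to do it, not a finished crawler. The names `LinkCollector`, `extract_links`, `store_links`, and the `links` table are made up for illustration, and I'm assuming SQLite as the db since the original question doesn't say which one is in use.

```python
import sqlite3
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return (href, extension) pairs for relative, http and https links.

    The extension is '' when the URL path has none.
    """
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    results = []
    for href in parser.links:
        parts = urlparse(href)
        # Skip mailto:, javascript: and other non-web schemes.
        if parts.scheme not in ("", "http", "https"):
            continue
        ext = PurePosixPath(parts.path).suffix.lower()
        results.append((href, ext))
    return results


def store_links(conn, rows):
    """Insert the pairs into an assumed 'links' table, creating it if needed."""
    conn.execute("CREATE TABLE IF NOT EXISTS links (href TEXT, extension TEXT)")
    conn.executemany("INSERT INTO links (href, extension) VALUES (?, ?)", rows)
    conn.commit()
```

To use it, read the dumped page source from your local file, pass the text to `extract_links`, then hand the result to `store_links` along with a connection from `sqlite3.connect`. A real HTML parser tends to hold up better than grepping for href with regular expressions, since it copes with odd quoting and mixed-case tags.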
__________________
"A stupid man's report of what a clever man says can never be accurate, because he unconsciously translates what he hears into something he can understand."
- B. Russell

http://web.bryant.edu/~srk2