#1
Hobbyist Programmer
Site Crawler
http://"username":"password"@www.crazedmindz.com:2082/frontend/x/index.html
would log me in to the CPanel for crazedmindz.com. Would a script be able to crawl a password-protected page like that and use something like substr() and strpos() to grab certain information from it? The idea is to first search the whole page for links to other pages on the same website, then search it for links to a certain type of file, add them all to a database, and then go through the pages that were found and do the same thing. I ain't gonna lie, I plan on using this to crawl porn sites. Please help...
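The crawl described in this post is essentially a breadth-first walk over same-site links. Here is a rough sketch in Python, not a finished crawler. It assumes a hypothetical `fetch(url)` callable that returns a page's HTML (logging in is left to that callable), and a simple regular expression stands in for the substr()/strpos() scanning. The names `crawl`, `fetch`, `wanted_ext` and the `example.test` pages are made up for illustration.

```python
import re
from collections import deque
from urllib.parse import urljoin, urlparse

# Naive href matcher; good enough for a sketch, not for messy real-world HTML.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)

def crawl(start_url, fetch, wanted_ext=".jpg", max_pages=50):
    """Breadth-first crawl of one site; returns the sorted file links found."""
    site = urlparse(start_url).netloc
    queue = deque([start_url])   # pages still to visit
    seen = {start_url}           # pages already queued, so none is fetched twice
    files = set()                # links to the wanted file type
    pages_fetched = 0
    while queue and pages_fetched < max_pages:
        url = queue.popleft()
        html = fetch(url)        # hypothetical: caller supplies the download logic
        pages_fetched += 1
        for href in HREF_RE.findall(html):
            link = urljoin(url, href)   # resolve relative links against the page
            parsed = urlparse(link)
            if parsed.path.lower().endswith(wanted_ext):
                files.add(link)
            elif parsed.netloc == site and link not in seen:
                seen.add(link)
                queue.append(link)
    return sorted(files)

# Usage with a fake two-page site held in a dict instead of real downloads.
PAGES = {
    "http://example.test/": '<a href="/a.html">A</a> <a href="pic.jpg">p</a>',
    "http://example.test/a.html": (
        '<a href="/">home</a> <a href="http://other.test/x.html">x</a> '
        '<a href="/img/two.JPG">2</a>'
    ),
}
found = crawl("http://example.test/", lambda u: PAGES.get(u, ""))
print(found)
```

The `max_pages` cap is there so a sketch like this cannot run away on a large site; storing the results in a database instead of a set is a separate step.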
#2
Hobbyist Programmer
Join Date: Jun 2005
Location: MA, US
Posts: 204
Rep Power: 4
I wouldn't try doing that with PHP. Since you are really dealing with the site(s) from the client side, a server-side scripting language isn't the right tool. I would dump the page source to a local file, then write a bash/perl/octave/whatever script that parses it for href attributes, file extensions, etc. and adds the desired information to your db.
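For illustration, the parse-and-store step could look something like the Python sketch below. It is only one way to do it: the table name `links`, the extension list and the sample markup are all assumptions, and `page_source` would normally be read from the dumped file rather than written inline.

```python
import sqlite3
from html.parser import HTMLParser

class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it is fed."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def store_links(page_source, db, extensions=(".zip", ".pdf")):
    """Parse page_source and record each link, flagging wanted file types."""
    collector = HrefCollector()
    collector.feed(page_source)
    db.execute(
        "CREATE TABLE IF NOT EXISTS links (href TEXT PRIMARY KEY, is_file INTEGER)"
    )
    for href in collector.hrefs:
        # str.endswith accepts a tuple, so one call checks every extension.
        is_file = int(href.lower().endswith(extensions))
        # INSERT OR IGNORE skips links that are already in the table.
        db.execute("INSERT OR IGNORE INTO links VALUES (?, ?)", (href, is_file))
    db.commit()

# Stand-in for the contents of the dumped file (hypothetical markup).
page_source = (
    '<p><a href="docs/guide.pdf">Guide</a> <a href="about.html">About</a> '
    '<a href="docs/guide.pdf">again</a></p>'
)
db = sqlite3.connect(":memory:")   # in-memory database, just for the example
store_links(page_source, db)
rows = db.execute("SELECT href, is_file FROM links ORDER BY href").fetchall()
print(rows)
```

Using a real HTML parser instead of string searching means odd spacing or quoting in the markup is less likely to break the extraction.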
__________________
"A stupid man's report of what a clever man says can never be accurate, because he unconsciously translates what he hears into something he can understand." - B. Russell http://web.bryant.edu/~srk2
#3
Programming Guru
Well yes, if you have the username and password, of course this is possible. What you need to do is use sockets and send an HTTP request that carries those credentials in its headers. It may take more than one request, which will require analyzing how the auth has to work. Good luck, tempest.
Edit: This should be of some help: http://www.faqs.org/rfcs/rfc2617 (RFC 2617, HTTP Basic and Digest authentication).
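As a rough illustration of the simplest case in RFC 2617, Basic authentication, the credentials are joined with a colon, base64-encoded and sent in an `Authorization` header. The Python sketch below only builds the request and does not send it. The helper names and the `example.test` URL are made up, and the username and password are the sample pair used in the RFC itself. Digest authentication is the challenge-response case that takes more than one request, and it is not covered here.

```python
import base64
import urllib.request

def basic_auth_header(username, password):
    """Return the Authorization header value for HTTP Basic auth."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

def build_request(url, username, password):
    """Build (but do not send) a GET request carrying Basic credentials."""
    req = urllib.request.Request(url)
    req.add_header("Authorization", basic_auth_header(username, password))
    return req

# Hypothetical URL; the credentials are the example pair from RFC 2617.
req = build_request("http://example.test/private/", "Aladdin", "open sesame")
print(req.get_header("Authorization"))   # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Passing `req` to `urllib.request.urlopen` would actually send it. Keep in mind that Basic auth only encodes the password and does not encrypt it, so over plain HTTP it travels in a form anyone on the path can decode.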
__________________
Last edited by tempest; Jun 10th, 2005 at 10:16 AM.
#4
Programmer
Join Date: Jun 2005
Location: Queensland
Posts: 37
Rep Power: 0
I've tried to do something similar to this, except I was trying to pull my usage stats from my ISP. I made a form with hidden inputs and pre-written values; it just submits the info to the page I wanted to go to. It's a quick way to view your stats. You can't view the stuff that's inside the secure page though; for that you need a machine-based language, gnome maybe.