How could I do this?
Hi everyone,
I visit a forum that I like to save images from. People post images on the forum, and threads usually span around 10-700 pages. A thread that I would currently like to get images from is this one: http://bombingscience.com/graffitifo...opic=4900&st=0 I was wondering which programming language would be best suited (and easiest) for downloading the .jpg images. I would like the program to recursively go through each thread page and download the images to a folder (omitting signatures and website images). Can anyone point me in the right direction on how I might go about doing this? Thanks
You could get a crawler and set it up to download every image from that page. I don't know specifics, but that may point you in the right direction as far as googling goes.
I think wget has a recursive option (-r), plus an -A option that lets you specify the file types you want to download.
For example:
wget -r -l1 --no-parent -A.gif http://www.server.com/dir/
Microsoft published a neat book called "Programming Bots, Spiders, and Intelligent Agents in Visual C++". I'm using it and its libraries for a school project right now. It would simplify the process for $50.00 or whatever they charge for a used copy on Amazon.
Using a program like wget or curl would be easiest. Other than that, Python or Perl would probably be a good choice of language for this sort of work. Visual C++ strikes me as overkill for something a scripting language could accomplish in a fifth of the time.
Since the pages of the forum have predictable URLs, you could do something like:
for i in $(seq 0 15 525); do
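The same page enumeration can be sketched in Python. This is only a rough illustration: `THREAD_URL` is a hypothetical placeholder (the real thread URL would go there, keeping its `st=` offset parameter), and `page_urls` is a made-up helper name. The step of 15 and the last offset of 525 are taken from the `seq` call above.

```python
# Hypothetical placeholder: substitute the real thread URL, keeping the st= offset.
THREAD_URL = "http://example.com/thread?st={offset}"


def page_urls(template, last_offset, step=15):
    """Return one URL per thread page, with the offset going 0, step, ..., last_offset."""
    return [template.format(offset=o) for o in range(0, last_offset + 1, step)]


if __name__ == "__main__":
    urls = page_urls(THREAD_URL, 525)
    print(len(urls), "pages")
    print(urls[0])
    print(urls[-1])
```

Each URL in the list can then be handed to wget, curl, or a download function of your own.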
Well, in Python it wouldn't be dissimilar. Perhaps:
Note that I haven't tried the above script in full. It probably needs some tweaking.
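For reference, here is a minimal Python sketch of the general approach discussed in this thread: fetch a page, collect the `<img>` tags, keep only the .jpg ones, and save them to a folder. It uses only the standard library. The names `ImageCollector`, `jpg_urls`, and `download_all` are made up for illustration, the `images` folder name is an assumption, and nothing here tries to skip signatures or site graphics, since that depends on the forum's own markup.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def jpg_urls(html, page_url):
    """Return absolute URLs of .jpg/.jpeg images in html, in page order, without duplicates."""
    collector = ImageCollector()
    collector.feed(html)
    found = []
    for src in collector.sources:
        absolute = urljoin(page_url, src)  # resolve relative paths against the page
        path = urlparse(absolute).path.lower()
        if path.endswith((".jpg", ".jpeg")) and absolute not in found:
            found.append(absolute)
    return found


def download_all(page_url, folder="images"):
    """Fetch one thread page and save its .jpg images into folder."""
    os.makedirs(folder, exist_ok=True)
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    for url in jpg_urls(html, page_url):
        filename = os.path.basename(urlparse(url).path)
        urlretrieve(url, os.path.join(folder, filename))
```

Calling `download_all` once per thread page would cover the whole thread. Note that two images with the same file name would overwrite each other in this sketch, so a real script would probably want to prefix the page offset or a counter.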
E:\Documents and Settings\Mark-James McDougall\Desktop\Script>grabber.py
Copyright ©2007 DaniWeb® LLC