#1
Newbie
Join Date: Aug 2006
Posts: 13
Rep Power: 0
urllib and save pictures
Hi.
This is my first post, and it is about an assignment I have at my college. An overall description: we have to write a function with one argument, the URL. Then we have to search the HTML code for any pictures, and to do that I will search for <img and src tags. I can do all of that, but then we have to save the pictures locally on my hard drive and make a collage with all the pictures in it. My hindrance right now is the saving part. For testing the script, I'm using this code:

    def getImageUrl(urlstring):
        import urllib
        connection=urllib.urlopen(urlstring)
        picture = connection.read()
        connection.close()
        curloc = picture.find("img")
        if curloc <> -1:
            picloc = picture.find("<src", curloc)
            picstart = picture.rfind(">",0,picloc)
            #writefile.open(picture,"wt")
            pic = open(picture, 'wb').read
            picture = urllib.urlopen(urlstring)
            pic.write(picture)
            pic.close()
        else:
            print "There is no pictures in this URL"

I know my code isn't optimized, but I just can't seem to find the function that will save my pictures... Thanks in advance. Greetings, Public2
#2
Programming Guru
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 5
You're on the right track, but there are three problems that I can see with your code. Firstly, you appear to be looking for a 'src' tag, when it's an attribute. Secondly, you're trying to open a file named picture, where picture is a variable containing your HTML page. Thirdly, you're not getting the URL of the image, you're getting the URL of the page again.
Whenever I'm doing any work with HTML in Python, I use Beautiful Soup. It's wonderfully easy to use, and comes as a single py file, so it's really rather good. Using Beautiful Soup, your function might look like: [code listing not preserved]
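A rough sketch of the three fixes described above, and not the original listing from this post: it uses only the standard library (Python 3's `html.parser` and `urllib.parse`) rather than Beautiful Soup, and the names `ImgSrcCollector` and `get_image_urls` are made up for illustration. It reads `src` as an attribute of `<img>` and turns each value into the URL of the image itself rather than the URL of the page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collects the src attribute of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # 'src' is an attribute of the <img> tag, not a tag of its own.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # Resolve relative paths against the URL of the page.
            self.image_urls.append(urljoin(self.base_url, src))


def get_image_urls(html, base_url):
    """Return the absolute URL of every image referenced in the HTML text."""
    collector = ImgSrcCollector(base_url)
    collector.feed(html)
    return collector.image_urls
```

The HTML is passed in as a string, so the page download stays separate from the parsing and the function can be tried on a small hand-written snippet first.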
#3
Newbie
Join Date: Aug 2006
Posts: 13
Rep Power: 0
Hey Arevos.
Thanks for your answer. I have just one problem: I don't think we are allowed to import external code like BeautifulSoup. My code can detect that there are pictures in the HTML code, but I just can't seem to save them to my hard drive. I'll try to make the code work, but it is more difficult than I thought it would be.
#4
Programming Guru
Join Date: Aug 2005
Location: England
Posts: 1,499
Rep Power: 5
If you've already got the "src" attribute, you can just use the inner-most indentation of the previous code:
[code listing not preserved]
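As a stand-in for the saving step, here is a minimal sketch that assumes you already have the absolute URL of one image. It is written for Python 3 (`urllib.request`), not the Python 2 `urllib` used earlier in the thread, and `filename_from_url` and `save_image` are hypothetical names. The two points that matter are that the image URL, not the page URL, is opened, and that the output file is named after the last path segment and written in binary mode.

```python
import os
import posixpath
from urllib.parse import urlsplit
from urllib.request import urlopen


def filename_from_url(image_url):
    """Use the last path segment of the URL as the local file name."""
    name = posixpath.basename(urlsplit(image_url).path)
    # Fall back to a fixed name when the URL path ends in "/".
    return name or "image"


def save_image(image_url, directory="."):
    """Download one image and write its raw bytes to a file in directory."""
    with urlopen(image_url) as response:
        data = response.read()
    target = os.path.join(directory, filename_from_url(image_url))
    # 'wb' keeps the bytes intact; text mode could corrupt image data.
    with open(target, "wb") as out:
        out.write(data)
    return target
```

Calling `save_image` once per URL found in the page gives the list of local file names needed for the collage step.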
#5
Newbie
Join Date: Aug 2006
Posts: 13
Rep Power: 0
Hey again.
I finally finished my assignment and thought I would post the code here. It turned out that we had to write most of the code in Jython, so some of the modules couldn't be used, but I managed anyway. Here is the complete code:

    import urllib
    from urlparse import urljoin
    import random

    def makeCollageFromUrl(urlString):
        # Download every image on the page, then paste each one at a random position.
        listOfImages = getImagesUrl(urlString)
        imageNames = []
        for imageUrl in listOfImages:
            filename = saveImage(imageUrl)
            imageNames.append(filename)
        width = 640
        height = 480
        picture = makeEmptyPicture(width,height)
        for imageName in imageNames:
            p = makePicture(imageName)
            # Skip images that would not fit on the canvas.
            if p.getWidth()<width and p.getHeight()<height:
                copyPictureToPicture(p,picture,random.randint(0,width-p.getWidth()),random.randint(0,height-p.getHeight()),0.5)
        picture.show()
        writePictureTo(picture,r"C:\HTMLCollage.jpg")

    def getImagesUrl(urlString):
        # Return the absolute .jpg/.gif URLs found in the <img> tags of the page.
        connection=urllib.urlopen(urlString)
        getPictures = connection.read()
        connection.close()
        executeIndex = 0
        PicHTMLlist = []
        while getPictures.find("<img",executeIndex) != -1:
            currentPicIndex = getPictures.find("<img",executeIndex)
            currentSrcIndex = getPictures.find("src=",currentPicIndex)
            nxtIndex = getPictures.find(">",currentSrcIndex)
            executeIndex = nxtIndex
            # Only absolute URLs (containing "http") are handled.
            if getPictures.find("http",currentSrcIndex,nxtIndex)!=-1:
                end = getPictures.find(" ",currentSrcIndex,nxtIndex)
                if end == -1:
                    # src is the last attribute, so stop at the closing ">".
                    end = nxtIndex
                currentPic = getPictures[currentSrcIndex+4:end]
                # Strip the quotes around the attribute value.
                currentPic = currentPic.replace('"'," ")
                currentPic = currentPic.replace("'"," ")
                repCurrPic = currentPic.lstrip()
                repCurrPic = repCurrPic.rstrip()
                if repCurrPic.rfind(".jpg") != -1 or repCurrPic.rfind(".gif") != -1:
                    PicHTMLlist.append(repCurrPic)
        return PicHTMLlist

    def saveImage(urlString):
        # Download one image and save it under the last segment of its URL.
        connection = urllib.urlopen(urlString)
        getPictures = connection.read()
        connection.close()
        sepIndex = urlString.rfind("/")
        filnavn = urlString[(sepIndex+1):]
        file = open(filnavn,"wb")
        file.write(getPictures)
        file.close()
        return filnavn

    def copyPictureToPicture(sourcePic,targetPic,offsetX,offsetY, blend):
        # The loops start at 1, matching the 1-based pixel coordinates this code assumes.
        for x in range(1,sourcePic.getWidth()+1):
            for y in range(1,sourcePic.getHeight()+1):
                color = sourcePic.getPixel(x,y).getColor()
                targetPixel = targetPic.getPixel(x+offsetX,y+offsetY)
                targetColor = targetPixel.getColor()
                # Weight the target by (1-blend) so the result stays within 0..255;
                # with blend=0.5 this is the same as averaging the two colors.
                targetPixel.setRed(int(color.getRed()*blend+targetColor.getRed()*(1-blend)))
                targetPixel.setGreen(int(color.getGreen()*blend+targetColor.getGreen()*(1-blend)))
                targetPixel.setBlue(int(color.getBlue()*blend+targetColor.getBlue()*(1-blend)))

Have a great evening. Greetings, Public2

Last edited by public2; Oct 31st, 2006 at 4:03 PM.
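The per-channel blend in `copyPictureToPicture` can be checked in isolation. Below is a small sketch in Python 3 that works on plain `(red, green, blue)` tuples instead of the picture objects used above. It assumes the conventional weighting of `blend` for the source and `1 - blend` for the target, which equals the 0.5/0.5 mix that `makeCollageFromUrl` asks for; `blend_channel` and `blend_pixel` are hypothetical names.

```python
def blend_channel(source, target, blend):
    """Mix one 0..255 channel value of the source into the target."""
    value = int(source * blend + target * (1.0 - blend))
    # Clamp in case blend falls outside the 0..1 range.
    return max(0, min(255, value))


def blend_pixel(source_rgb, target_rgb, blend=0.5):
    """Blend two (red, green, blue) tuples channel by channel."""
    return tuple(
        blend_channel(s, t, blend) for s, t in zip(source_rgb, target_rgb)
    )
```

With `blend=1.0` the source pixel replaces the target completely, and with `blend=0.0` the target is left unchanged.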