Python Image Scraping Errors

Admin | 2023-08-28 08:02:07 | Software Development

I recently wrote a Python crawler to download images and ran into several errors while debugging it. Below are those errors and how I fixed them.

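# 1. SSL error
import requests
url = 'https://www.example.com/'
# Raises requests.exceptions.SSLError if the certificate cannot be verified
response = requests.get(url)
image_content = response.content
# Fix
import urllib3
# Silence the warning that urllib3 emits for every unverified request
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
# verify=False skips certificate verification for this request
response = requests.get(url, verify=False)
image_content = response.content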

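This error occurs because the site is served over HTTPS, so requests verifies its certificate. When the certificate has expired or is not trusted, the request fails, and verification has to be disabled to get through. Since verify=False also accepts forged certificates, it is best limited to sites you trust.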
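Disabling verification is not the only option: the `verify` argument of `requests.get` also accepts a path to a CA bundle file, which keeps verification on for sites signed by a private CA. The following is a minimal sketch, not a definitive implementation. `choose_verify` is a hypothetical helper (it is not part of requests) that only decides what to pass as `verify`, falling back to `False` when that is explicitly allowed.

```python
import os


def choose_verify(ca_bundle=None, allow_insecure=False):
    """Pick a value for the `verify=` argument of requests.get.

    choose_verify is a hypothetical helper, not part of requests.
    """
    if ca_bundle and os.path.isfile(ca_bundle):
        # A file path tells requests to verify against this CA bundle.
        return ca_bundle
    if allow_insecure:
        # Last resort: same effect as passing verify=False directly.
        return False
    # Default: normal verification against the default CA store.
    return True
```

The result could then be passed along as `requests.get(url, verify=choose_verify('my-ca.pem'))`, where `my-ca.pem` is an assumed file name used only for illustration.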

# 2. Image cannot be downloaded
import urllib.request
url = 'https://www.example.com/image.jpg'
image_name = 'image.jpg'
response = urllib.request.urlretrieve(url, image_name)
# Fix
import requests
from requests.exceptions import RequestException
try:
    # stream=True defers the body so it can be read in chunks
    response = requests.get(url, stream=True)
    # Turn 4xx/5xx responses into an exception (a RequestException subclass)
    response.raise_for_status()
    with open(image_name, 'wb') as file:
        for chunk in response.iter_content(chunk_size=1024):
            file.write(chunk)
except RequestException:
    print('Download failed')

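This error is usually caused by a network failure during the download or by the server refusing the request. The fix is to stream the response with the requests library and catch the resulting exception.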
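One gap in the streaming approach is that an interrupted download can leave a truncated `image.jpg` on disk. A possible way around this is to stream into a temporary file in the same directory and rename it only after every chunk has arrived. This is a sketch under assumptions: `write_chunks` is a hypothetical helper that takes any iterable of byte chunks, so `response.iter_content(chunk_size=1024)` would fit.

```python
import os
import tempfile


def write_chunks(chunks, dest):
    """Write an iterable of bytes to dest, leaving no partial file on failure.

    write_chunks is a hypothetical helper; chunks could be
    response.iter_content(chunk_size=1024).
    """
    directory = os.path.dirname(os.path.abspath(dest))
    # Temp file in the same directory, so the final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix='.part')
    try:
        with os.fdopen(fd, 'wb') as tmp:
            for chunk in chunks:
                if chunk:  # skip empty chunks
                    tmp.write(chunk)
        # Move the finished file into place, replacing dest if it exists.
        os.replace(tmp_path, dest)
    except BaseException:
        # Any error (network, disk, Ctrl-C): drop the partial file and re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Called as `write_chunks(response.iter_content(chunk_size=1024), image_name)` inside the existing `try` block, a failure while reading the response should still surface as a `RequestException`.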

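# 3. Image cannot be saved
import os
import requests
url = 'https://www.example.com/image.jpg'
image_name = 'image.jpg'
response = requests.get(url)
# 'wb' silently overwrites an existing file with the same name
with open(image_name, 'wb') as file:
    file.write(response.content)
# Fix
try:
    # Skip images that are already on disk
    if not os.path.exists(image_name):
        # 'xb' raises FileExistsError instead of overwriting
        with open(image_name, 'xb') as file:
            file.write(response.content)
except FileExistsError:
    print('Image already exists')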

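This problem usually occurs when a file with the same name already exists. Opening it in 'wb' mode would overwrite it, so the code should check first and skip images that are already on disk.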
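The check can also be folded into the `'xb'` open alone, which avoids the small window between `os.path.exists` and `open` in which another process could create the file. A minimal sketch, assuming a hypothetical helper `save_if_absent` that reports whether it wrote anything:

```python
def save_if_absent(data, path):
    """Write data to path only if no file exists there yet.

    save_if_absent is a hypothetical helper. It returns True if the
    file was written and False if an existing file was left untouched.
    """
    try:
        # 'x' makes open() fail instead of overwriting an existing file.
        with open(path, 'xb') as file:
            file.write(data)
    except FileExistsError:
        return False
    return True
```

In a crawl loop, the return value of `save_if_absent(response.content, image_name)` could be used to count new versus skipped images.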

With the fixes above, I resolved the errors I ran into in my Python crawler. I hope they help other beginners as well.

Source: 丸子建站

Article title: Python Image Scraping Errors

https://www.wanzijz.com/view/74483.html
