I fought with this on and off for many days, so I'll skip the preamble: code first, analysis afterwards.
import aiohttp
import asyncio
import aiofiles

# Request headers; the Referer points at the main site.
header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1',
    'Referer': 'https://www.mzitu.com/',
    'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    'Accept-Encoding': 'gzip',
}


async def fetch(session, url):
    # Read the body inside the `async with` block, while the response is still open.
    async with session.get(url, proxy='http://59.62.164.252:9999') as response:
        return await response.read()


async def main():
    # The headers are passed once, when the session is created.
    async with aiohttp.ClientSession(headers=header) as session:
        content = await fetch(session, 'https://i.meizitu.net/thumbs/2019/03/174061_01e35_236.jpg')
        print(content)
        async with aiofiles.open('D:/a.jpg', 'wb') as f:
            # aiofiles' write() is a coroutine and must be awaited.
            await f.write(content)


asyncio.run(main())
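Now for how I got there:
1. I read the coroutine chapter of Liao Xuefeng's Python tutorial, the coroutine chapter of "Fluent Python", and a pile of material I found online along the way. No matter how I changed the code it kept raising errors, and I was close to losing it.
In the end I Googled ClientSession, found the Client Reference, and tried the example from it unchanged: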
import aiohttp
import asyncio

async def fetch(client):
    async with client.get('http://python.org') as resp:
        assert resp.status == 200
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as client:
        html = await fetch(client)
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
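This ran successfully.
2. Next I changed it to download an image, and wondered whether the fetch function could simply return the response itself: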
import aiohttp
import asyncio
import aiofiles

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/22.0.1207.1 Safari/537.1',
    'Referer': 'https://www.mzitu.com/',
    'Accept': "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
    'Accept-Encoding': 'gzip',
}

async def fetch(session, url):
    async with session.get(url) as response:
        return response

async def main():
    async with aiohttp.ClientSession() as session:
        response = await fetch(session, 'https://i.meizitu.net/thumbs/2019/03/174061_01e35_236.jpg')
        print(response.read())
        with open('D:/a.jpg', 'wb') as f:
            f.write(response.read())

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
loop.close()
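Running this fails immediately with an error.
So it seems fetch cannot return the response itself? I could not work out why however long I stared at it, so I am parking the question here to solve later.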
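A likely explanation for the failure in step 2, offered as a sketch rather than a confirmed diagnosis: leaving the `async with session.get(...)` block releases the response, so a body that has not been read yet can no longer be read afterwards, and `response.read()` is a coroutine that has to be awaited before its result is written to a file. The version below reads the bytes inside the block and returns those instead of the response object. `fetch_bytes` is a name chosen for this sketch, and `session` is assumed to be an `aiohttp.ClientSession`.

```python
async def fetch_bytes(session, url):
    """Return the response body as bytes.

    `session` is assumed to be an aiohttp.ClientSession; `fetch_bytes`
    is a hypothetical name used only in this sketch.
    """
    async with session.get(url) as response:
        # Await the read while the response is still open, then hand
        # back plain bytes rather than the response object itself.
        return await response.read()
```

Inside `main()` it would be called as `content = await fetch_bytes(session, url)`, and `content` can then be written to disk as in the listing at the top of the post.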
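3. According to the ClientSession documentation mentioned above:
the request headers should be passed when ClientSession is instantiated, i.e. aiohttp.ClientSession(headers=header) as in the code at the top. The attempt in step 2 defined header but never passed it to the session.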
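4. The documentation says "aiohttp supports HTTP/HTTPS proxies".
In practice, though, I found that it does not support https proxies at all.
For reference, see "Python3 异步代理爬虫池" (a Python 3 asynchronous proxy crawler pool).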
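Because a single hard-coded proxy can stop working at any time, one way to make the download less fragile is to try several candidates in order. This is only a minimal sketch: `fetch_once`, `fetch_with_fallback` and the `retry_on` parameter are names invented here, `session` is assumed to be an `aiohttp.ClientSession`, and the time limit uses the standard library's `asyncio.wait_for` rather than any aiohttp setting.

```python
import asyncio


async def fetch_once(session, url, proxy):
    """Fetch `url` through one proxy, or directly when `proxy` is None."""
    async with session.get(url, proxy=proxy) as response:
        return await response.read()


async def fetch_with_fallback(session, url, proxies, timeout=10.0,
                              retry_on=(OSError, asyncio.TimeoutError)):
    """Try each proxy in turn and return the first body that arrives.

    `proxies` is a list of proxy URLs; a None entry means "no proxy".
    Returns None when every candidate fails with one of `retry_on`.
    """
    for proxy in proxies:
        try:
            # Give up on a slow or dead proxy after `timeout` seconds.
            return await asyncio.wait_for(fetch_once(session, url, proxy), timeout)
        except retry_on as exc:
            print(f'proxy {proxy!r} failed: {exc!r}')
    return None
```

With a real session, passing `retry_on=(aiohttp.ClientError, asyncio.TimeoutError)` would also cover aiohttp's own client errors, and a trailing `None` in `proxies` falls back to a direct request.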
My head hurts, so that is all for now.
In my last attempt the proxy IP itself seemed to be having problems again. Sigh.