![Image](https://img-blog.csdnimg.cn/bcdca58d04204ebe843d90b157e1ab00.png)
The goal is to collect data such as the odometer mileage (表显里程) for each car model listed on the current page.
![Image](https://img-blog.csdnimg.cn/dd859eea16514dc9a8ad6c4f19761ad3.png)
The result looks like this:
![Image](https://img-blog.csdnimg.cn/0b05ab1b7750419493aa1d6a61c01093.png)
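Let's go straight to the implementation.
The code breaks down into four steps:
1. Send the request
2. Get the data
3. Parse the data
4. Save the data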
1. Send the request
import requests
url = 'https://www.XXX.com/china/list/'
2. Get the data
# headers and cookies are dicts; paste their values straight from the browser's F12 developer tools
response = requests.get(url, cookies=cookies, headers=headers)
html_data = response.text
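The `headers` and `cookies` dicts are not shown above. As a minimal sketch, assuming the values are copied as raw text from the browser's developer tools, two small helpers can turn that text into dicts that `requests.get` accepts. The helper names `parse_raw_headers` and `parse_cookie_string` and all sample values are hypothetical placeholders, not values from the real site.

```python
def parse_raw_headers(raw):
    """Turn 'Name: value' lines copied from DevTools into a dict."""
    headers = {}
    for line in raw.strip().splitlines():
        line = line.strip()
        # Skip blank lines and HTTP/2 pseudo-headers such as ":authority"
        if not line or line.startswith(":"):
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        headers[name.strip()] = value.strip()
    return headers


def parse_cookie_string(raw):
    """Turn a 'a=1; b=2' Cookie header value into a dict."""
    cookies = {}
    for pair in raw.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Placeholder text; replace with what your own browser shows.
RAW_HEADERS = """
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64)
Referer: https://www.XXX.com/
Accept-Language: zh-CN,zh;q=0.9
"""
RAW_COOKIE = "sessionid=abc123; userarea=0"

headers = parse_raw_headers(RAW_HEADERS)
cookies = parse_cookie_string(RAW_COOKIE)
print(headers)
print(cookies)
```

The resulting dicts can be passed directly as the `headers=` and `cookies=` arguments of `requests.get`.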
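Inspecting the requests in the browser's Network tab shows that the pages are static HTML, so the data can be extracted with XPath, here through the parsel library.
import parsel
select = parsel.Selector(html_data)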
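Next we need the URL of each car's detail page. After locating the links in the page source, they can be extracted with this XPath expression (not a regular expression):
//ul[@class="viewlist_ul"]/li/a[@class="carinfo"]/@href
detail_url_list = select.xpath('//ul[@class="viewlist_ul"]/li/a[@class="carinfo"]/@href').getall()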
3. Parse the data
Skipping the details, here is the code:
# [:-1] skips the last link in the list
for detail_url in detail_url_list[:-1]:
    # Protocol-relative links (//host/path) only need a scheme;
    # site-relative links (/path) also need the host.
    if detail_url.startswith('//'):
        detail_url = 'http:' + detail_url
    else:
        detail_url = 'http://www.XXX.com' + detail_url
    # print(detail_url)
    detail_html = requests.get(detail_url, headers=headers).text
    detail_select = parsel.Selector(detail_html)
    brand_name = detail_select.xpath("string(//h3[@class='car-brand-name'])").get("").strip()
    # Spec list items by position; meanings other than mileage are inferred from the variable names
    biaoxian = detail_select.xpath("//ul[@class='brand-unit-item fn-clear']/li[1]/h4/text()").get("").strip()  # odometer mileage
    shangpai = detail_select.xpath("//ul[@class='brand-unit-item fn-clear']/li[2]/h4/text()").get("").strip()  # registration
    dangwei = detail_select.xpath("//ul[@class='brand-unit-item fn-clear']/li[3]/h4/text()").get("").strip()  # gear
    location = detail_select.xpath("//ul[@class='brand-unit-item fn-clear']/li[4]/h4/text()").get("").strip()
    standard = detail_select.xpath("//ul[@class='brand-unit-item fn-clear']/li[5]/h4/text()").get("").strip()
    price = detail_select.xpath('string(//span[@id="overlayPrice"])').get("").strip()
    # Struck-through price shown in an <s> element; extracted but not printed
    indict_price = detail_select.xpath('//s[@class="price-nom"]/text()').get("")
    print(brand_name, biaoxian, shangpai, dangwei, location, standard, price)
This produces the following data:
![Image](https://img-blog.csdnimg.cn/c81b8946e5464dce879543d78af2df4c.png)
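The loop in step 3 completes relative links by string concatenation. One alternative, shown here only as a sketch, is `urllib.parse.urljoin` from the standard library, which handles protocol-relative, site-relative and already absolute links in one call. The base URL `http://www.example.com/china/list/` and the function name `normalize_detail_url` are placeholders standing in for the real list page.

```python
from urllib.parse import urljoin

# Placeholder for the list page URL the links were scraped from
BASE_URL = "http://www.example.com/china/list/"


def normalize_detail_url(href, base_url=BASE_URL):
    """Resolve a scraped href against the page it was found on."""
    return urljoin(base_url, href)


for href in ["//www.example.com/dealer/1.html",   # protocol-relative
             "/dealer/2.html",                    # site-relative
             "https://other.example.com/x"]:      # already absolute
    print(normalize_detail_url(href))
```

Unlike plain concatenation, this leaves an already absolute link unchanged instead of prefixing it a second time.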
4. Save the data
The results can be saved in CSV format.
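The post does not show the saving code, so the following is only one possible sketch using the standard `csv` module. It assumes each car is collected into a dict keyed by the variable names from step 3; the function name `write_rows` and the sample rows are invented placeholders, not scraped data.

```python
import csv
import io

# Column order, reusing the variable names from the parsing step
FIELDNAMES = ["brand_name", "biaoxian", "shangpai", "dangwei",
              "location", "standard", "price", "indict_price"]


def write_rows(rows, fileobj):
    """Write a header line plus one CSV line per car dict."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder rows, only to show the shape of the data
sample_rows = [
    {"brand_name": "Example Car A", "biaoxian": "3.5", "shangpai": "2020-01",
     "dangwei": "auto", "location": "City A", "standard": "VI",
     "price": "10.50", "indict_price": "15.00"},
    {"brand_name": "Example Car B", "biaoxian": "8.0", "shangpai": "2017-06",
     "dangwei": "manual", "location": "City B", "standard": "V",
     "price": "4.20", "indict_price": "9.00"},
]

buffer = io.StringIO()
write_rows(sample_rows, buffer)
print(buffer.getvalue())
```

To write a real file, pass a file opened with `open('cars.csv', 'w', newline='', encoding='utf-8-sig')` instead of the `StringIO` buffer. `newline=''` is what the `csv` module documentation recommends, and the `utf-8-sig` encoding helps Excel display Chinese text correctly.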