Preface
- This series is based on real-world requests.
- You suggest the topics, and I build them.
- This series is for learning and reference only.
1. Leave a message at the end of the article.
2. Get in touch via WeChat.
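I. Requirements
http://sc.cqnync.cn/marketSta/
Data to collect: date, price, product, region, and variety.
II. Analysis
Inspect the page source. Two things need to be extracted from it:
1. Region IDs
2. Agricultural product data
The data is embedded directly in the page source, inside the <tbody> tag.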
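To make the "data lives in <tbody>" observation concrete, here is a minimal sketch that reads table cells using only the standard library. The sample markup, the product names, and the `TbodyCellParser` class are assumptions made for illustration; the real page's markup may differ, and the article itself uses PyQuery for this step.

```python
from html.parser import HTMLParser

# Hypothetical sample that mimics the structure described above.
SAMPLE_HTML = """
<table id="ctl00_list__list">
  <tbody>
    <tr><td class="variety">Cabbage</td><td class="price">2.50</td></tr>
    <tr><td class="variety">Pork</td><td class="price">24.00</td></tr>
  </tbody>
</table>
"""


class TbodyCellParser(HTMLParser):
    """Collect the text of every <td> inside <tbody>, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []            # finished rows, each a {css_class: text} dict
        self._in_tbody = False
        self._row = None          # row currently being built
        self._cell_class = None   # class of the <td> currently open

    def handle_starttag(self, tag, attrs):
        if tag == "tbody":
            self._in_tbody = True
        elif tag == "tr" and self._in_tbody:
            self._row = {}
        elif tag == "td" and self._row is not None:
            self._cell_class = dict(attrs).get("class") or ""
            self._row[self._cell_class] = ""

    def handle_data(self, data):
        # Only text that sits inside an open <td> is kept.
        if self._row is not None and self._cell_class is not None:
            self._row[self._cell_class] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._cell_class = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "tbody":
            self._in_tbody = False


parser = TbodyCellParser()
parser.feed(SAMPLE_HTML)
rows = parser.rows
```

Because the values are already present in the HTML, no JavaScript rendering or separate API call should be needed; a plain GET followed by parsing is enough.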
Fetching the page source
import requests

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36'
}


def get_agriculture_info(url):
    """
    Fetch the page source.
    @param url: URL of the page to request
    @return: the page source as text
    """
    # A timeout keeps the script from hanging if the server never answers.
    response = requests.get(url, headers=headers, timeout=10)
    # Fail loudly on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    return response.text
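The run section later requests one page per region by appending a `mexp` query parameter to the base URL. As a small sketch, that URL can be built with the standard library so the ID is always escaped properly. `build_region_url` and `BASE_URL` are hypothetical names introduced here, not part of the original script.

```python
from urllib.parse import urlencode

BASE_URL = "http://sc.cqnync.cn/marketSta/"


def build_region_url(region_id, base_url=BASE_URL):
    """Return the listing URL for one region (hypothetical helper)."""
    # urlencode escapes any characters that are not safe in a query string.
    return f"{base_url}?{urlencode({'mexp': region_id})}"


example_url = build_region_url("a9")
```

For the simple alphanumeric IDs seen in the page source, this produces the same URL as plain string formatting.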
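Processing the data
import re
import time
from pyquery import PyQuery as pq


def process_data(code):
    """
    Parse the page source.
    @param code: page source
    @return: PyQuery document
    """
    doc = pq(code)
    return doc


def get_marks(code):
    """
    Extract the region information.
    @param code: page source
    @return: list of (value, name) tuples, one per region
    """
    # First isolate the run of <option> tags that starts at value "a9",
    # then pull the (value, name) pair out of each option.
    deal = re.compile(r'<option value ="a9">.*</option>', re.S)
    deal1 = re.compile(r'<option value ="(?P<value>.*?)">(?P<name>.*?)</option>')
    value = []  # returned as-is if the region block is not found
    for area in deal.findall(code):
        value = deal1.findall(area)
    return value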
def get_goods_info(code):
    """
    Extract the product information.
    @param code: page source
    @return: tuple of (name, price, date, sale type, unit) tuples, one per row
    """
    doc = process_data(code)
    info_list = doc.find("table#ctl00_list__list")
    goods = []
    # Read each row on its own so that a cell containing a space
    # cannot shift the columns out of alignment.
    for row in info_list.find("tbody tr").items():
        goods_name = row.find("td.variety").text()
        if not goods_name:
            continue  # skip rows that carry no product cells
        goods.append((
            goods_name,
            row.find("td.price").text(),
            row.find("td.gatherTime").text(),
            row.find("td.saleType").text(),
            row.find("td.unit").text(),
        ))
    return tuple(goods)
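The two-stage regular expression in `get_marks` is easier to follow on a small input. The sketch below runs the same two patterns against hypothetical markup; the `<select>` snippet, the market names, and the `extract_regions` helper are assumptions for illustration. Note that the patterns rely on the space before `=` in `value ="..."` and on `a9` being the value of the first region option, as in the original code.

```python
import re

# Hypothetical markup shaped like the region dropdown.
SAMPLE_SELECT = """
<select id="market">
<option value ="a9">Market A</option>
<option value ="b3">Market B</option>
</select>
"""

# Stage 1: grab everything from the first region option to the last </option>.
block_pattern = re.compile(r'<option value ="a9">.*</option>', re.S)
# Stage 2: split that block into (value, name) pairs.
option_pattern = re.compile(r'<option value ="(?P<value>.*?)">(?P<name>.*?)</option>')


def extract_regions(html):
    """Return a list of (value, name) tuples, or [] if the block is missing."""
    block = block_pattern.search(html)
    if block is None:
        return []
    return option_pattern.findall(block.group(0))


regions = extract_regions(SAMPLE_SELECT)
```

Returning an empty list when nothing matches lets the caller loop over the result without a special case if the page layout changes.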
III. Running
if __name__ == '__main__':
    url = "http://sc.cqnync.cn/marketSta/"
    code = get_agriculture_info(url)
    with open("重庆市农场品市场数据.txt", mode="w", encoding="utf-8") as f:
        area_list = get_marks(code)
        for area in area_list:
            number = area[0]
            name = area[1]
            # Each region has its own page, selected by the mexp parameter.
            url = f"http://sc.cqnync.cn/marketSta/?mexp={number}"
            new_code = get_agriculture_info(url)
            goods_info = get_goods_info(new_code)
            f.write(f"---------------{name}-----------------\n")
            for goods in goods_info:
                f.write(f"{str(goods)}\n")
            print(f"Finished writing data for {name}!")
            time.sleep(2)  # pause between requests to go easy on the server
    print("All data has been written.")
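The script above writes each record as the string form of a tuple, which is awkward to load back later. As an alternative sketch, the same tuples could be written as CSV with the standard library. The column names, the `write_rows_as_csv` helper, and the sample record are assumptions made for illustration; the column order follows the tuple built in `get_goods_info` (name, price, date, sale type, unit).

```python
import csv
import io

# Hypothetical column names; "region" is added in front of each record.
FIELDS = ["region", "variety", "price", "date", "sale_type", "unit"]


def write_rows_as_csv(stream, region_name, goods_info):
    """Append one CSV line per product record, prefixed with the region name."""
    writer = csv.writer(stream)
    for goods in goods_info:
        writer.writerow([region_name, *goods])


# Demonstration with an in-memory buffer and a made-up record.
buffer = io.StringIO()
csv.writer(buffer).writerow(FIELDS)
write_rows_as_csv(
    buffer,
    "Market A",
    [("Cabbage", "2.50", "2023-02-01", "wholesale", "yuan/kg")],
)
csv_text = buffer.getvalue()
```

When writing to a real file instead of a buffer, open it with `newline=""` as the `csv` module documentation recommends, so line endings are not doubled on Windows.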