Python Series | Scraping Blog Data with Requests and Visualizing It on a PyEcharts Dashboard

2023-05-16

The finished blog data analysis dashboard looks like this:
[Screenshot of the dashboard]

I. Core Feature Design

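Study notes:
The blog author data analysis has two parts: a crawler (the user enters the blog profile URL and the article list URL at the console) and a dashboard that visualizes the results.

The work breaks down into the following steps:

Define the lists that will hold the scraped fields
Read the blog profile URL and the article list URL entered by the user
Parse the page with BeautifulSoup: use find() and find_all() to locate the elements that carry each field, and append() the extracted values to the lists
In the visualization part, read the scraped data back from the Excel files
Use PyEcharts to draw the author detail cards and the article charts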

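II. Preparation

1. Requests

Official Requests documentation: requests

Requests is a practical third-party HTTP client library for Python, dedicated to sending HTTP requests. It is commonly used in crawlers and when testing server responses.

2. PyEcharts

Official PyEcharts documentation: pyecharts

ECharts is a commercial-grade charting library open-sourced by Baidu. It is written in pure JavaScript and provides intuitive, interactive, highly customizable data visualization charts that help users explore and consolidate their data. PyEcharts is the Python library for generating ECharts charts.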

III. Implementation Steps

(I) Crawler

1. Fetch a page and return its content

The core code is as follows:

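# woshinsy
import random
import re
import time

import pandas
import requests
from bs4 import BeautifulSoup


def get_html(url):
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0 (MSIE 10.0; Windows NT 6.1; Trident/5.0)',
        }
        r = requests.get(url, headers=headers, timeout=10)  # fetch the page with GET
        r.raise_for_status()              # raise an exception for 4xx/5xx status codes
        r.encoding = r.apparent_encoding  # use the encoding detected from the content
        return r.text                     # return the page source
    except requests.RequestException:
        return 'error'                    # sentinel string returned when the request fails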

2. Scrape the author and article data

The core code is as follows:

# woshinsy
def author_info():
    # Lists that will hold the scraped fields
    head_img = []     # avatar URL
    author_name = []  # user name
    visitor_num = []  # visits
    article_num = []  # article count
    rank_num = []     # rank
    fans_num = []     # followers
    like_num = []     # likes
    comment_num = []  # comments
    fav_num = []      # favorites

    url = input("Enter the blog profile URL: ")
    print(url)
    # url = 'https://blog.csdn.net/woshinsy'    # example URL
    html = get_html(url)                         # page source
    # print(html)
    # find() and find_all() take the tag name as the first argument; the class is
    # passed as the keyword argument class_ (note the trailing underscore, e.g. class_='info').
    soup = BeautifulSoup(html, 'html.parser')    # use the built-in HTML parser

    # Avatar
    tx = soup.find('div', class_='user-profile-avatar').find('img')['src']
    head_img.append(str(tx))
    # print(head_img)
    # User name
    yhm = soup.find('div', class_='user-profile-head-name').find('div').get_text()
    author_name.append(str(yhm))
    # print(author_name)

    # The four profile statistics share one class and are read by position
    stats = soup.find_all('div', class_='user-profile-statistics-num')
    visitor_num.append(stats[0].get_text())  # visits
    print(visitor_num)
    article_num.append(stats[1].get_text())  # article count
    print(article_num)
    rank_num.append(stats[2].get_text())     # rank
    print(rank_num)
    fans_num.append(stats[3].get_text())     # followers
    print(fans_num)

    # Likes, comments and favorites come from the achievement box
    spans = soup.find('ul', class_='aside-common-box-achievement').find_all('span')
    like_num.append(spans[0].get_text())     # likes
    print(like_num)
    comment_num.append(spans[1].get_text())  # comments
    print(comment_num)
    fav_num.append(spans[-1].get_text())     # favorites
    print(fav_num)

    # Save to an Excel workbook
    info = {'头像': head_img, '用户名': author_name, '访问数': visitor_num,'文章数': article_num, '排行榜': rank_num, '粉丝数': fans_num,'点赞数': like_num, '评论数': comment_num, '收藏数': fav_num}
    info_blog_file = pandas.DataFrame(info)
    info_blog_file.to_excel('info_blog_author.xlsx', sheet_name="博客数据分析")
    # Return all lists
    return head_img, author_name, visitor_num,article_num,rank_num,fans_num,like_num,comment_num,fav_num
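
The four profile statistics in author_info() are read by position from a single find_all() result. A minimal sketch of pairing those texts with field names in one step, assuming (as the indexing in author_info() implies) that they appear in the order visits, articles, rank, followers. `label_stats` and `STAT_FIELDS` are hypothetical names, and the sample values are made up:

```python
# Field names in the order author_info() indexes the statistics (assumed order).
STAT_FIELDS = ["访问数", "文章数", "排行榜", "粉丝数"]


def label_stats(texts, fields=STAT_FIELDS):
    """Pair scraped statistic texts with their field names.

    Raises ValueError when the page yields fewer values than expected,
    instead of failing later with an IndexError.
    """
    if len(texts) < len(fields):
        raise ValueError(f"expected {len(fields)} statistics, got {len(texts)}")
    # Strip surrounding whitespace that get_text() may leave behind.
    return {field: text.strip() for field, text in zip(fields, texts)}


# Made-up sample values standing in for the scraped texts.
stats = label_stats([" 12,345 ", "67", "8,901", "234"])
print(stats["访问数"])
```

Checking the length up front gives a clearer error message if the page layout changes and fewer statistics are found.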

def blog_info():
    names = []      # article titles
    looks = []      # read counts
    writedown = []  # comment counts
    blog_type = []  # article types
    blog_time = []  # publish times
    headers = {
        'User-Agent': 'Mozilla/5.0 (MSIE 10.0; Windows NT 6.1; Trident/5.0)',
    }
    base_url = input("Enter the blog article list URL: ")
    # base_url = 'https://blog.csdn.net/woshinsy/article/list/'    # example URL

    r = requests.get(base_url + "1", headers=headers, timeout=3)
    # Total article count from the page script; the list shows 40 articles per page
    total = int(re.findall(r'var listTotal = (\d+);', r.text)[0])
    max_page = (total + 39) // 40  # ceiling division: no extra page when total is a multiple of 40
    article_types = ['C语言', '大数据', 'Python', 'Linux']  # keywords used to classify titles
    count = 0  # number of articles processed
    for i in range(1, max_page + 1):
        url = base_url + str(i)
        r = requests.get(url, headers=headers, timeout=3)
        soup = BeautifulSoup(r.text, 'html.parser')
        articles = soup.find("div", class_='article-list').find_all('div', class_='article-item-box csdn-tracking-statistics')
        for tag in articles:
            # [2:] drops the two-character prefix in front of the title text
            title = tag.find('h4').find('a').get_text(strip=True)[2:]
            names.append(str(title))

            # The type is the first keyword found in the title, otherwise '其他' (other)
            the_type = '其他'
            for article_type in article_types:
                if article_type in title:
                    the_type = article_type
                    break
            blog_type.append(str(the_type))
            issuing_time = tag.find('span', class_="date").get_text(strip=True)
            blog_time.append(issuing_time)
            # The first read-num span is the read count; the second, if present, is the comment count
            num_list = tag.find_all('span', class_="read-num")
            read_num = num_list[0].get_text(strip=True)
            looks.append(read_num)
            comment_num = num_list[1].get_text(strip=True) if len(num_list) > 1 else 0
            writedown.append(comment_num)

            count += 1
        # Debug output after each page
        print(names)
        print(blog_type)
        print(looks)
        print(writedown)
        time.sleep(random.choice([1, 1.1, 1.3]))  # short random pause between pages
    # Save to an Excel workbook
    info = {'文章名': names,'文章类型': blog_type, '发博时间': blog_time, '阅读量': looks, '评论数': writedown}
    info_blog_file = pandas.DataFrame(info)
    info_blog_file.to_excel('info_blog.xlsx', sheet_name="博客文章数据分析")
    # Return all lists
    return names,blog_type,blog_time, looks, writedown
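
blog_info() derives the number of list pages from the `listTotal` value in the page source and a page size of 40. A minimal sketch of that calculation with ceiling division, so that a total which is an exact multiple of the page size does not produce an extra, empty page. `page_count` and `extract_total` are hypothetical helper names, and the sample fragment is made up:

```python
import re

PAGE_SIZE = 40  # articles per list page, as assumed by blog_info()


def page_count(total, page_size=PAGE_SIZE):
    """Number of list pages needed for `total` articles (ceiling division)."""
    return (total + page_size - 1) // page_size


def extract_total(page_source):
    """Read the article total from the `var listTotal = N;` script line."""
    match = re.search(r"var listTotal = (\d+);", page_source)
    if match is None:
        raise ValueError("listTotal not found in page source")
    return int(match.group(1))


sample = "<script>var listTotal = 80;</script>"  # made-up page fragment
print(page_count(extract_total(sample)))  # 80 articles -> 2 pages
```

With 80 articles, `total // 40 + 1` would give 3, and the third page would contain no article list to parse.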

if __name__ == '__main__':
    author_info()
    print('Author data fetched')
    blog_info()
    print('Article data fetched')
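
The article type in blog_info() is assigned by checking each title against a fixed keyword list and falling back to '其他' (other). A minimal sketch of that rule as a standalone function, which makes it easy to try on a few titles; `classify_title` is a hypothetical name and the sample titles are only illustrations:

```python
ARTICLE_TYPES = ["C语言", "大数据", "Python", "Linux"]  # keywords from blog_info()


def classify_title(title, keywords=ARTICLE_TYPES, default="其他"):
    """Return the first keyword contained in the title, else the default."""
    for keyword in keywords:
        if keyword in title:
            return keyword
    return default


# Illustrative titles, not scraped data.
print(classify_title("Python系列 | Turtle绘图"))
print(classify_title("Vue2前端跨域问题"))
```

The match is case-sensitive and order-dependent: a title containing "python" in lowercase falls into the default type, and a title containing two keywords gets the one listed first.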

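(II) Visualization

1. Read the data saved in the Excel files

# woshinsy
import pandas as pd
from pyecharts import options as opts
from pyecharts.charts import Bar, Line, Page, Pie
excel_data = pd.read_excel("info_blog.xlsx")
excel_data_author = pd.read_excel("info_blog_author.xlsx")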

2. Draw the author detail cards in the top row

# woshinsy
def _title_card(text, color, font_size=20):
    # An empty Pie whose centered title serves as a text card
    c = (Pie().
        set_global_opts(
        title_opts=opts.TitleOpts(title=text, pos_left='center', pos_top='center',
                                  title_textstyle_opts=opts.TextStyleOpts(color=color, font_size=font_size))))
    return c

def tab0(name, color):  # card 1: blog name
    return _title_card('Blog name:\n\n ' + name, color)

def tab2(name, color):  # card 2: visits
    return _title_card('Visits:\n\n ' + name, color)

def tab3(name, color):  # card 3: article count
    return _title_card('Articles:\n\n ' + name, color)

def tab4(name, color):  # card 4: rank
    return _title_card('Rank:\n\n ' + name, color)

def tab5(name, color):  # card 5: followers
    return _title_card('Followers:\n\n ' + name, color)

def tab6(name, color):  # card 6: likes
    return _title_card('Likes:\n\n ' + name, color)

def tab7(name, color):  # card 7: comments
    return _title_card('Comments:\n\n ' + name, color)

def tab8(name, color):  # card 8: favorites
    return _title_card('Favorites:\n\n ' + name, color)

def tab1(name, color):  # main dashboard title
    return _title_card(name, color, font_size=50)


3. Pie chart: share of each article type

# woshinsy
# Pie chart of article types
def blog_type_radius():
    type_cate = excel_data["文章类型"].value_counts()  # articles per type
    cate = type_cate.index.tolist()  # type labels
    data = type_cate.tolist()        # counts as native Python ints

    c = (
        Pie()
            .add("", [list(z) for z in zip(cate, data)])  # zip pairs each label with its count: [label, count]
            .set_global_opts(title_opts=opts.TitleOpts(title="Share of articles by type"))  # chart title
            .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))  # label text: name and count
    )
    # c.render("blog_type_radius.html")
    return c
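
blog_type_radius() counts the articles per type and zips labels and counts into the [label, count] pairs that the pie series receives. A minimal sketch of that data preparation on its own, with `collections.Counter` standing in for the pandas `value_counts()` call so it runs without the Excel file; the sample types are made up:

```python
from collections import Counter

# Made-up article types standing in for the "文章类型" column.
types = ["Python", "Linux", "Python", "其他", "Python", "Linux"]

# most_common() orders labels by descending count, like value_counts().
counts = Counter(types).most_common()
labels = [label for label, _ in counts]
values = [count for _, count in counts]

# Same pairing step as in blog_type_radius().
pairs = [list(z) for z in zip(labels, values)]
print(pairs)  # [['Python', 3], ['Linux', 2], ['其他', 1]]
```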


4. Bar chart: reads and comments per article

# woshinsy
def blog_Bar():
    # Data for the left-hand axis
    y_data_1 = excel_data["阅读量"].tolist()
    # Data for the right-hand axis
    y_data_2 = excel_data["评论数"].tolist()
    # Second approach: separate calls instead of one chained expression
    chart = Bar(init_opts=opts.InitOpts(width="1600px"))
    chart.add_xaxis(excel_data["文章名"].tolist())
    chart.add_yaxis(
        'Reads',
        y_data_1,
        yaxis_index=0
    )
    chart.add_yaxis(
        'Comments',
        y_data_2,
        yaxis_index=1
    )
    # Add the second y axis that yaxis_index=1 refers to
    chart.extend_axis(yaxis=opts.AxisOpts())
    chart.set_global_opts(
        title_opts=opts.TitleOpts(title="Reads and comments per article"),
        datazoom_opts=opts.DataZoomOpts(type_="slider"),
        xaxis_opts=opts.AxisOpts(axislabel_opts={"rotate": 30, "interval": "0"})
    )
    # Series options only affect series that already exist, so set them after add_yaxis
    chart.set_series_opts(label_opts=opts.LabelOpts(position="right"))
    # chart.render("blog_Bar.html")
    return chart


5. Line chart: articles published per month

# woshinsy
def blog_line():
    # Turn each publish time into a "YYYY年MM月" label and count the articles per label
    month_blog = excel_data["发博时间"].apply(lambda x: x[:7].split('-')[0] + "年" + x[:7].split('-')[-1] + "月").value_counts(sort=False)
    month_blog.sort_index(inplace=True)  # chronological order
    x_data = month_blog.index.tolist()
    y_data = month_blog.tolist()
    c = (
        Line()
            .add_xaxis(x_data)
            .add_yaxis("Articles this month", y_data, is_connect_nones=True,markpoint_opts=opts.MarkPointOpts(data=[opts.MarkPointItem(type_="min"),opts.MarkPointItem(type_="max")]))
            .set_global_opts(title_opts=opts.TitleOpts(title="Articles published per month"))
            # .render("line_connect_null.html")
    )
    return c
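
blog_line() turns each publish time into a year-month label and counts how many articles fall into each month. A minimal sketch of that bucketing without pandas, assuming the timestamps start with `YYYY-MM` (which the `x[:7]` slice in blog_line() implies). `month_label` is a hypothetical helper name and the sample times are made up:

```python
from collections import Counter


def month_label(timestamp):
    """Turn 'YYYY-MM-DD ...' into 'YYYY年MM月' (the label format used in blog_line())."""
    year, month = timestamp[:7].split("-")
    return f"{year}年{month}月"


# Made-up publish times standing in for the "发博时间" column.
times = ["2023-05-16 09:30:00", "2023-04-02 21:10:00", "2023-05-01 08:00:00"]

per_month = Counter(month_label(t) for t in times)
x_data = sorted(per_month)               # zero-padded labels sort chronologically
y_data = [per_month[m] for m in x_data]  # counts aligned with x_data
print(x_data, y_data)  # ['2023年04月', '2023年05月'] [1, 2]
```

Sorting the labels as strings only gives chronological order because the year has four digits and the month is zero-padded.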


6. Combine the charts into the dashboard

# woshinsy
page = Page()
page.add(
    tab0(excel_data_author["用户名"][0],"#2CB34A"),
    tab2(str(excel_data_author["访问数"][0]),"#2CB34A"),
    tab3(str(excel_data_author["文章数"][0]),"#2CB34A"),
    tab4(str(excel_data_author["排行榜"][0]),"#2CB34A"),
    tab5(str(excel_data_author["粉丝数"][0]),"#2CB34A"),
    tab6(str(excel_data_author["点赞数"][0]),"#2CB34A"),
    tab7(str(excel_data_author["评论数"][0]),"#2CB34A"),
    tab8(str(excel_data_author["收藏数"][0]),"#2CB34A"),
    blog_line(),
    tab1("Blog Author Data Analysis", "#2CB34A"),
    blog_type_radius(),
    blog_Bar(),
)
page.render("博客数据分析大屏可视化.html")
print("Dashboard generated")

from bs4 import BeautifulSoup

# Reposition the rendered chart containers with absolute CSS to form the dashboard layout
with open("博客数据分析大屏可视化.html", "r+", encoding='utf-8') as html:
    html_bf = BeautifulSoup(html, 'lxml')
    divs = html_bf.select('.chart-container')  # one div per chart, in page.add() order
    # Top row: the eight statistic cards, each 10% wide, starting at left:10%
    for i in range(8):
        divs[i]["style"] = f"width:10%;height:10%;position:absolute;top:12%;left:{10 + 10 * i}%;"

    divs[8]["style"] = "width:40%;height:50%;position:absolute;top:30%;left:5%;"    # line chart
    divs[9]["style"] = "width:35%;height:10%;position:absolute;top:2%;left:30%;"    # main title
    divs[10]["style"] = "width:40%;height:50%;position:absolute;top:30%;left:55%;"  # pie chart
    divs[11]["style"] = "width:90%;height:50%;position:absolute;top:90%;left:5%;"   # bar chart
    body = html_bf.find("body")
    body["style"] = "background-image: "  # page background (the value is empty in the original)
    html_new = str(html_bf)
    # Overwrite the file in place; the with block closes it afterwards
    html.seek(0, 0)
    html.truncate()
    html.write(html_new)
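
The eight statistic cards in the top row share the same size and vertical position and differ only in their `left` offset, which grows by 10% per card. A minimal sketch that generates those style strings instead of writing each one by hand; `card_style` is a hypothetical helper, and its defaults are taken from the values used in the layout code:

```python
def card_style(index, top=12, width=10, height=10, first_left=10):
    """CSS for the index-th statistic card in the top row (percent units)."""
    left = first_left + width * index
    return (f"width:{width}%;height:{height}%;"
            f"position:absolute;top:{top}%;left:{left}%;")


styles = [card_style(i) for i in range(8)]
print(styles[0])  # width:10%;height:10%;position:absolute;top:12%;left:10%;
print(styles[7])  # width:10%;height:10%;position:absolute;top:12%;left:80%;
```

Each string can then be assigned to the matching container, for example `divs[i]["style"] = styles[i]`.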
