Downloading GitHub data with Python

2023-05-16

Configuration file

[Source]
Source_path = /Users/xxx/PycharmProjects/Source/list
Source_path_original = /Users/xxx/PycharmProjects/Source/original
Source_path_sg = /Users/xxx/PycharmProjects/Source/sgmodule
Source_backup_dir = /Users/xxx/PycharmProjects/Source/backup
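
As a minimal sketch of how these values can be read back, the snippet below parses the same [Source] section with the standard configparser module. It uses an inline string instead of the real file so that it runs on its own; the helper name load_source_paths is hypothetical and not part of the original code.

```python
import configparser

# Inline copy of the [Source] section so the sketch runs without the file;
# the paths are the placeholder values from the configuration above.
CONF_TEXT = """
[Source]
Source_path = /Users/xxx/PycharmProjects/Source/list
Source_path_original = /Users/xxx/PycharmProjects/Source/original
Source_path_sg = /Users/xxx/PycharmProjects/Source/sgmodule
Source_backup_dir = /Users/xxx/PycharmProjects/Source/backup
"""


def load_source_paths(text):
    """Return the [Source] options as a plain dict (hypothetical helper)."""
    cf = configparser.ConfigParser()
    cf.read_string(text)
    # configparser lower-cases option names by default.
    return dict(cf.items("Source"))


paths = load_source_paths(CONF_TEXT)
print(paths["source_path_original"])
```

Note that the keys of the returned dict are lower-case, while cf.get("Source", "Source_path") accepts either case because the lookup is normalized the same way.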

Download module

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2022/11/6 09:22
# @Author  : lumia98
# @File    : DowanloadUrls
# @Software: PyCharm
# Download data from URLs
import os
import sys
import ssl
import time
import urllib.request

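# Download function
def downloanUrl(urlPath, filePath, mPath):
    """
    urlPath: dict mapping a file name (key) to the URL of the raw GitHub data
    filePath: directory where the GitHub data is stored (keys without a suffix)
    mPath: directory for files with other suffixes (keys that carry a suffix)
    """
    # Disable certificate verification globally (left commented out)
    #ssl._create_default_https_context = ssl._create_unverified_context

    # Suffix variables
    list_suffix = ".list"
    yaml_suffix = ".yaml"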
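    # The URLs to download are passed as a dictionary
    # Check that the argument really is a dictionary
    if isinstance(urlPath, dict):
        # Loop over the dictionary items
        for i in urlPath.items():
            # Split the key to tell whether it already carries a suffix
            a = str(i[0]).split(".")  # split on "."
            if len(a) == 1:  # a single part means the key has no suffix (unlike aa.js)
                # print(a)
                # Use the default .list suffix
                urllib.request.urlretrieve(i[1], os.path.join(filePath, i[0] + list_suffix))
                time.sleep(2)
            else:
                # The key already has a suffix
                urllib.request.urlretrieve(i[1], os.path.join(mPath, i[0]))
                time.sleep(2)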

            # Check whether the download looks complete
            if len(a) == 1:
                # Size of the downloaded .list file
                list_size = os.path.getsize(os.path.join(filePath, i[0] + list_suffix))

                # Convert to an integer
                int_list_size = int(list_size)

                if int_list_size <= 90:  # 90 bytes or fewer: delete the file and stop
                    print("{}: downloaded file is too small, please check whether it is complete.....".format(i[0]))
                    os.remove(os.path.join(filePath, i[0] + list_suffix))
                    break
                else:
                    print("{} downloaded.............".format(i[0]))
            else:
                # Size of a file that does not use the .list suffix (modules and the like)
                exists_path = os.path.exists(os.path.join(mPath, i[0]))  # check that the file exists
                #print(i[0])
                if exists_path:
                    other_size = os.path.getsize(os.path.join(mPath, i[0]))

                    int_other_size = int(other_size)

                    if int_other_size < 10:  # fewer than 10 bytes: delete the file and stop
                        print("{}: downloaded file is too small, please check whether it is complete.....".format(i[0]))
                        os.remove(os.path.join(mPath, i[0]))
                        break
                    print("{} downloaded.............".format(i[0]))
    else:
        # The argument is not a dictionary: tell the caller and exit with an error code
        print("""
                  Pass the URLs as a dictionary, for example:
                  test = {"xxx": "https://url/xxx.py"}
                  """)
        sys.exit(1)
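
The two decisions the function makes for every key (where to save the file, and whether the result is too small to keep) can be isolated into small helpers. The following is only a sketch: target_path and is_too_small are hypothetical names, and unlike the module above, the size helper also treats a missing file as too small.

```python
import os
import tempfile


def target_path(key, file_dir, module_dir, default_suffix=".list"):
    """Hypothetical helper: choose the save path from the dictionary key."""
    if "." in key:
        # Key already carries a suffix (e.g. aa.js): keep the name as-is.
        return os.path.join(module_dir, key)
    # Bare key (e.g. xxx): append the default suffix.
    return os.path.join(file_dir, key + default_suffix)


def is_too_small(path, min_bytes):
    """Hypothetical helper: True if the file is missing or below min_bytes."""
    return not os.path.exists(path) or os.path.getsize(path) < min_bytes


list_target = target_path("xxx", "original", "sgmodule")
module_target = target_path("aa.js", "original", "sgmodule")

with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "xxx.list")
    with open(sample, "wb") as fh:
        fh.write(b"stub")  # 4 bytes, far below either threshold
    small = is_too_small(sample, 91)  # 91 mirrors the module's "<= 90" check
```

Testing "." in key gives the same answer as checking that split(".") returns a single part, which is the test the module uses.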

Module that defines the data to download

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Time    : 2022/11/7 09:11
# @Author  : lumia98
# @File    : Source
# @Software: PyCharm
import time, os
from Functions import DowanloadUrls
import configparser

class githubObj(object):
    """
    Code related to Source
    """

    def __init__(self):
        # Locate FilePath.conf (relative to the current working directory)
        RootPath = os.path.abspath(".")
        ConfPath = os.path.join(RootPath, "Functions", "FilePath.conf")
        # Read FilePath.conf
        cf = configparser.ConfigParser()
        cf.read(ConfPath)
        # Fetch the configured values
        self._SourcePath = cf.get("Source", "Source_path") # directory for the finished data
        self._Source_original = cf.get("Source", "Source_path_original") # directory for files downloaded from GitHub
        self._Source_sgmodule = cf.get("Source", "Source_path_sg") # directory for modules downloaded from GitHub
        self._Source_backup = cf.get("Source", "Source_backup_dir") # directory for the backup of the previous data

        # Collection of URLs
        self._urls = {
            # GitHub addresses
            "xxx": "https://raw.githubusercontent.com/xxx.py",
            "aa.js": "https://raw.githubusercontent.com/aa.js"
        }

    # Download the GitHub data
    def StartDownload(self):
        DowanloadUrls.downloanUrl(urlPath=self._urls, filePath=self._Source_original, mPath=self._Source_sgmodule)

Starting the download

githubObj().StartDownload()
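
Because githubObj builds the path to FilePath.conf from os.path.abspath("."), this call finds the configuration only when the script is launched from the directory that contains Functions. One possible alternative, shown here purely as a sketch, is to anchor the path to a file's own location. The helper conf_path_from and the /project/Source.py layout are assumptions made for illustration.

```python
import os


def conf_path_from(anchor_file, *parts):
    """Hypothetical helper: build a path relative to anchor_file's directory."""
    base = os.path.dirname(os.path.abspath(anchor_file))
    return os.path.join(base, *parts)


# Assumed layout: <project>/Source.py next to <project>/Functions/FilePath.conf.
# In real code the first argument would typically be __file__.
example = conf_path_from("/project/Source.py", "Functions", "FilePath.conf")
print(example)
```

With this approach the working directory at launch time no longer matters, at the cost of tying the configuration location to the module layout.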

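To use this code, you only need to add, remove, or change entries in the URL dictionary of the githubObj class.

If the directories change, edit the configuration file.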
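
When the paths in the configuration file are changed, the new directories also have to exist, since urllib.request.urlretrieve writes to the given file name and does not create missing parent directories. A minimal sketch of creating them up front follows; ensure_dirs is a hypothetical helper, and a temporary directory stands in for the real project root.

```python
import os
import tempfile


def ensure_dirs(paths):
    """Hypothetical helper: create each directory if it is missing."""
    for p in paths:
        os.makedirs(p, exist_ok=True)  # no error if the directory already exists
    return list(paths)


with tempfile.TemporaryDirectory() as root:
    # Directory names taken from the configuration file above.
    wanted = [os.path.join(root, name)
              for name in ("list", "original", "sgmodule", "backup")]
    created = ensure_dirs(wanted)
    all_exist = all(os.path.isdir(p) for p in created)
```

In the real project the list would come from the four values read out of the [Source] section.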
