本宝宝有个工作中,需要报送个人所得税,其实以前我写过一个集成报送各年金税金vba的小代码,但是因为新接的任务数据量太大,vba可能会假死,即使能处理,也因为最近在自学python,所以想尝试用python
我的思路是:把报税模板复制出来,然后每月自动根据年金领取信息填写保存,吐槽一下,税表模板是xls,只支持6万数据,我尝试用一个xlsx文件上传,不过也成功了,嘻嘻,所以我手工把模板改成xlsx格式
第一步:复制模板
野生代码如下:
import pandas as pd
import numpy as np
import shutil
import os
import datetime
data_today = datetime.date.today()
print(data_today)
person_template = 'D:\报税工作\报税模板\人员信息.xlsx'
income_template = 'D:\报税工作\报税模板\年金领取.xlsx'
person_form='D:\报税工作\\xx计划年金人员'+str(data_today)+'.xlsx'
income_form='D:\报税工作\\xx计划年金收入'+str(data_today)+'.xlsx'
shutil.copyfile(person_template, person_form)
shutil.copyfile(income_template, income_form)
成功运行
然后使用了 dataframe和numpy数组进行把原始数据整理以后分别填入两个税表保存,
不管怎样,程序可以用,但是笨猪我用了遍历,好慢
12万数据简直卡死
重新搜索,利用dataframe的特性
7539人大概5秒钟就可以运行完毕
13万7500人大概使用35秒,完美!!!
import pandas as pd
import numpy as np
import shutil
import datetime
import xlwings as xw
app=xw.App(visible=True,add_book=False)
app.display_alerts=False
app.screen_updating=False
date_today = datetime.date.today()
starttime = datetime.datetime.now()
person_template = 'D:\报税工作\报税模板\人员信息.xlsx'
income_template = 'D:\报税工作\报税模板\年金领取.xlsx'
person_form='D:\报税工作\某计划年金人员'+str(date_today)+'.xlsx'
income_form='D:\报税工作\某计划年金收入'+str(date_today)+'.xlsx'
shutil.copyfile(person_template, person_form)
shutil.copyfile(income_template, income_form)
raw_data = pd.DataFrame(pd.read_excel('D:\报税工作\\201912\年金发放信息7539人.xlsm'))
raw_data['身份证'] = raw_data['身份证'].astype(str)
data_person = raw_data[['姓名','身份证']]
data_income = raw_data[['身份证','待遇支付金额']]
list_person_index=[[0,'A',''],[2,'证照类型','居民身份证'],[4,'E',''],[5,'F',''],[6,'G',''],[7,'人员状态','正常'],[8,'任职受雇类型','其他'],[9,'J',''],[10,'K',''],[11,'L',''],[12,'M',''],[13,'N',''],[14,'O',''],[15,'P',''],[16,'Q',''],[17,'R',''],[18,'S',''],[19,'备注','职业年金']]
for each in list_person_index:
data_person.insert(each[0],each[1],each[2])
list_income_index=[[0,'A',''],[1,'B',''],[2,'证照类型','居民身份证'],[4,'是否一次性领取',''],[6,'已完税金额',0.00],[7,'全部缴税额',data_income['待遇支付金额']],[8,'分摊月份数',''],[9,'J',''],[10,'K',''],[11,'L',''],[12,'M',''],[13,'N',''],[14,'O',''],[15,'备注','职业年金']]
for each in list_income_index:
data_income.insert(each[0],each[1],each[2])
data_person_arr=np.array(data_person)
data_income_arr=np.array(data_income)
wb_person=xw.Book(person_form)
wb_income=xw.Book(income_form)
sht_person=wb_person.sheets['人员信息']
sht_income=wb_income.sheets['年金领取收入']
rowmax=data_person_arr.shape[0]
rng_person = sht_person[1:rowmax+1,:20]
rng_person.value=data_person_arr
wb_person.save()
rng_income = sht_income[1:rowmax+1,:16]
rng_income.value=data_income_arr
wb_income.save()
wb_income.app.quit()
endtime = datetime.datetime.now()
print((endtime - starttime).seconds)