2022 Teddy Cup Data Analysis, Problem B: Bank Customer Loyalty Analysis Competition Data, Task 5

2023-10-27

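This task builds a predictive model of long-term bank customer loyalty. Loyalty is measured by customer churn and is a binary label: 1 denotes a long-term loyal customer and 0 a customer who is not loyal.

The model used is a random forest classifier, a machine-learning classification method.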

import pandas as pd

long_data26=pd.read_excel('result4.xlsx')

long_data26.Age

0       52
1       41
2       42
3       61
4       39
        ..
9175    37
9176    37
9177    39
9178    34
9179    40
Name: Age, Length: 9180, dtype: int64

long_data26

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Exited	Status	AssetStage	IsActiveStatus	IsActiveAssetStage	CrCardAssetStage
0	15553251	713	1	52	0	185891.54	1	1	1	46369.57	1	1	3	3	9	9
1	15553256	619	1	41	8	0.00	3	1	1	79866.73	1	2	2	5	6	6
2	15553283	603	1	42	8	91611.12	1	0	0	144675.30	1	2	0	2	2	5
3	15553308	589	1	61	1	0.00	1	1	0	61108.56	1	1	2	0	0	6
4	15553387	687	1	39	2	0.00	3	0	0	188150.60	1	1	2	0	0	0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
9175	15815628	711	1	37	8	113899.92	1	0	0	80215.20	0	2	0	2	2	5
9176	15815645	481	0	37	8	152303.66	2	1	1	175082.20	0	2	3	5	9	9
9177	15815656	541	1	39	9	100116.67	1	1	1	199808.10	1	2	0	5	8	9
9178	15815660	758	1	34	1	154139.45	1	1	1	60728.89	0	1	3	3	9	9
9179	15815690	614	1	40	3	113348.50	1	1	1	77789.01	0	1	0	3	8	9

9180 rows × 16 columns

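The data show that some columns vary over a wide range, so selected columns are binned or normalized here.

Age is discretized into 7 age bands (7 categories), stored in the new column 离散化年龄 (discretized age).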

long_data0=long_data26.copy()
long_data0.loc[(long_data0['Age']<18),'离散化年龄']=1
long_data0.loc[(long_data0['Age']>=18) & (long_data0['Age']<30),'离散化年龄']=2
long_data0.loc[(long_data0['Age']>=30) & (long_data0['Age']<40),'离散化年龄']=3
long_data0.loc[(long_data0['Age']>=40) & (long_data0['Age']<50),'离散化年龄']=4
long_data0.loc[(long_data0['Age']>=50) & (long_data0['Age']<60),'离散化年龄']=5
long_data0.loc[(long_data0['Age']>=60) & (long_data0['Age']<70),'离散化年龄']=6
long_data0.loc[(long_data0['Age']>=70),'离散化年龄']=7
long_data0

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Exited	Status	AssetStage	IsActiveStatus	IsActiveAssetStage	CrCardAssetStage	离散化年龄
0	15553251	713	1	52	0	185891.54	1	1	1	46369.57	1	1	3	3	9	9	5.0
1	15553256	619	1	41	8	0.00	3	1	1	79866.73	1	2	2	5	6	6	4.0
2	15553283	603	1	42	8	91611.12	1	0	0	144675.30	1	2	0	2	2	5	4.0
3	15553308	589	1	61	1	0.00	1	1	0	61108.56	1	1	2	0	0	6	6.0
4	15553387	687	1	39	2	0.00	3	0	0	188150.60	1	1	2	0	0	0	3.0
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
9175	15815628	711	1	37	8	113899.92	1	0	0	80215.20	0	2	0	2	2	5	3.0
9176	15815645	481	0	37	8	152303.66	2	1	1	175082.20	0	2	3	5	9	9	3.0
9177	15815656	541	1	39	9	100116.67	1	1	1	199808.10	1	2	0	5	8	9	3.0
9178	15815660	758	1	34	1	154139.45	1	1	1	60728.89	0	1	3	3	9	9	3.0
9179	15815690	614	1	40	3	113348.50	1	1	1	77789.01	0	1	0	3	8	9	4.0

9180 rows × 17 columns
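The seven chained `.loc` assignments above can also be written as a single `pandas.cut` call. The following is a minimal sketch, not the author's code: the helper name `bin_age` is hypothetical, and the bin edges are copied from the conditions above (left-closed intervals, so an age of 30 falls in band 3). Unlike the original column, which holds floats such as 5.0, this version returns integers.

```python
import numpy as np
import pandas as pd

def bin_age(ages: pd.Series) -> pd.Series:
    """Map ages to bands 1..7 using the same edges as the .loc assignments."""
    edges = [-np.inf, 18, 30, 40, 50, 60, 70, np.inf]
    # right=False makes each interval [low, high), matching `>= low` and `< high`
    bands = pd.cut(ages, bins=edges, right=False, labels=[1, 2, 3, 4, 5, 6, 7])
    return bands.astype(int)

sample = pd.Series([17, 18, 29, 30, 52, 61, 70])
print(bin_age(sample).tolist())  # [1, 2, 2, 3, 5, 6, 7]
```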

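The CreditScore and EstimatedSalary columns are min-max normalized and scaled to the range 0 to 10, giving the columns 标准化信用 (normalized credit score) and 标准化个人年收入 (normalized annual income). The same normalization of Balance is left commented out in the code.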

the_min1=long_data0['CreditScore'].min()
the_max1=long_data0['CreditScore'].max()

the_min2=long_data0['EstimatedSalary'].min()
the_max2=long_data0['EstimatedSalary'].max()

# the_min3=long_data0['Balance'].min()
# the_max3=long_data0['Balance'].max()

long_data0['标准化信用']=((long_data0['CreditScore']-the_min1)/(the_max1-the_min1))*10
long_data0['标准化个人年收入']=((long_data0['EstimatedSalary']-the_min2)/(the_max2-the_min2))*10
#long_data0['标准化金融资产']=((long_data0['Balance']-the_min3)/(the_max3-the_min3))*10
long_data0

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	Exited	Status	AssetStage	IsActiveStatus	IsActiveAssetStage	CrCardAssetStage	离散化年龄	标准化信用	标准化个人年收入
0	15553251	713	1	52	0	185891.54	1	1	1	46369.57	1	1	3	3	9	9	5.0	7.26	2.318373
1	15553256	619	1	41	8	0.00	3	1	1	79866.73	1	2	2	5	6	6	4.0	5.38	3.993573
2	15553283	603	1	42	8	91611.12	1	0	0	144675.30	1	2	0	2	2	5	4.0	5.06	7.234663
3	15553308	589	1	61	1	0.00	1	1	0	61108.56	1	1	2	0	0	6	6.0	4.78	3.055473
4	15553387	687	1	39	2	0.00	3	0	0	188150.60	1	1	2	0	0	0	3.0	6.74	9.408872
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
9175	15815628	711	1	37	8	113899.92	1	0	0	80215.20	0	2	0	2	2	5	3.0	7.22	4.011000
9176	15815645	481	0	37	8	152303.66	2	1	1	175082.20	0	2	3	5	9	9	3.0	2.62	8.755319
9177	15815656	541	1	39	9	100116.67	1	1	1	199808.10	1	2	0	5	8	9	3.0	3.82	9.991866
9178	15815660	758	1	34	1	154139.45	1	1	1	60728.89	0	1	3	3	9	9	3.0	8.16	3.036486
9179	15815690	614	1	40	3	113348.50	1	1	1	77789.01	0	1	0	3	8	9	4.0	5.28	3.889666

9180 rows × 19 columns
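The scaling applied above is x' = 10 (x - min) / (max - min). As a quick check, the displayed 标准化信用 values are consistent with a CreditScore minimum of 350 and maximum of 850; these two bounds are inferred from the output table and are not stated in the source. The helper `min_max_scale` below is hypothetical.

```python
def min_max_scale(x, lo, hi, factor=10):
    """Rescale x from [lo, hi] to [0, factor]; works on numbers and pandas Series."""
    if hi == lo:
        raise ValueError("hi and lo must differ")
    return (x - lo) / (hi - lo) * factor

# Bounds 350 and 850 are inferred from the table above (assumption).
print(round(min_max_scale(713, 350, 850), 2))  # 7.26
```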

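A correlation-coefficient analysis in SPSS, plotted as a heatmap, shows that the HasCrCard column (whether the customer holds a credit card) has very little correlation with the long-term loyalty label, so this column is dropped before training.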
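The correlation screening was done in SPSS, so no code for it appears in the source. A rough pandas equivalent is sketched below, assuming a Spearman rank correlation (the source's wording is ambiguous between Spearman and Pearson) and using a small made-up frame rather than the competition data. `rank_features_by_correlation` is a hypothetical helper.

```python
import pandas as pd

def rank_features_by_correlation(df: pd.DataFrame, target: str) -> pd.Series:
    """Absolute Spearman correlation of every other column with the target, descending."""
    corr = df.corr(method="spearman")[target].drop(target)
    return corr.abs().sort_values(ascending=False)

# Toy data for illustration only (not taken from result4.xlsx).
toy = pd.DataFrame({
    "NumOfProducts": [1, 2, 3, 4, 5, 6],
    "HasCrCard":     [1, 0, 1, 0, 0, 1],
    "Exited":        [0, 0, 0, 1, 1, 1],
})
print(rank_features_by_correlation(toy, "Exited"))
```

Columns that end up at the bottom of such a ranking are candidates for removal, which is the reasoning applied to HasCrCard above.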

long_data99=long_data0.loc[:,['CustomerId','标准化信用','Gender','离散化年龄','Status','AssetStage','NumOfProducts','IsActiveMember','标准化个人年收入']]
long_data99['Exited']=long_data0['Exited']
long_data99

	CustomerId	标准化信用	Gender	离散化年龄	Status	AssetStage	NumOfProducts	IsActiveMember	标准化个人年收入	Exited
0	15553251	7.26	1	5.0	1	3	1	1	2.318373	1
1	15553256	5.38	1	4.0	2	2	3	1	3.993573	1
2	15553283	5.06	1	4.0	2	0	1	0	7.234663	1
3	15553308	4.78	1	6.0	1	2	1	0	3.055473	1
4	15553387	6.74	1	3.0	1	2	3	0	9.408872	1
...	...	...	...	...	...	...	...	...	...	...
9175	15815628	7.22	1	3.0	2	0	1	0	4.011000	0
9176	15815645	2.62	0	3.0	2	3	2	1	8.755319	0
9177	15815656	3.82	1	3.0	2	0	1	1	9.991866	1
9178	15815660	8.16	1	3.0	1	3	1	1	3.036486	0
9179	15815690	5.28	1	4.0	1	0	1	1	3.889666	0

9180 rows × 10 columns

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier  # random forest classifier class from sklearn
from sklearn import metrics  # functions for evaluating classification results
from matplotlib import pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.metrics import auc

x=long_data99.iloc[:,1:-1]     # features (every column except CustomerId and Exited)
y=long_data99.iloc[:,-1]       # label (Exited)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0, train_size=0.8)
model = RandomForestClassifier()  # instantiate the model with default hyperparameters
model.fit(x_train, y_train)  # fit the model on the training set
# evaluate the model on the test set
expected = y_test  # true labels of the test samples
predicted = model.predict(x_test)  # predicted labels of the test samples

print(metrics.classification_report(expected, predicted))  # precision, recall and F1 score

              precision    recall  f1-score   support

           0       0.86      0.94      0.90      1457
           1       0.66      0.43      0.52       379

    accuracy                           0.84      1836
   macro avg       0.76      0.69      0.71      1836
weighted avg       0.82      0.84      0.82      1836

print(metrics.confusion_matrix(expected, predicted))  # confusion matrix

[[1372   85]
 [ 215  164]]

roc_auc = metrics.roc_auc_score(y_test, predicted)  # renamed so it does not shadow the imported sklearn.metrics.auc
accuracy = metrics.accuracy_score(y_test, predicted)  # accuracy
print("Accuracy: %.2f%%" % (accuracy * 100.0))

Accuracy: 83.66%
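The `roc_auc_score` call above receives hard 0/1 predictions rather than class probabilities. In that situation the ROC curve has a single operating point, and the area under it reduces to (TPR + TNR) / 2, the balanced accuracy. The sketch below works this out in plain Python from the confusion-matrix counts printed earlier; the function name is hypothetical, and the resulting value is derived here, not reported in the source.

```python
def auc_from_hard_labels(tn: int, fp: int, fn: int, tp: int) -> float:
    """Area under the one-point ROC curve produced by hard 0/1 predictions."""
    tpr = tp / (tp + fn)  # recall on class 1
    tnr = tn / (tn + fp)  # specificity, i.e. recall on class 0
    return (tpr + tnr) / 2

# Counts taken from the confusion matrix [[1372, 85], [215, 164]].
print(round(auc_from_hard_labels(tn=1372, fp=85, fn=215, tp=164), 4))  # 0.6872
```

Passing `model.predict_proba(x_test)[:, 1]` to `roc_auc_score` instead would score how well the model ranks customers, not just one threshold.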

# y_test.shape
# x_test.shape
predicted

array([0, 0, 0, ..., 0, 0, 0], dtype=int64)

5.2

The model is evaluated with a confusion matrix and the F1 score.

def plot_confusion_matrix(cm, classes,normalize=False,title='Confusion matrix',cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)
    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.metrics import confusion_matrix, recall_score, classification_report
import itertools
cnf_matrix = confusion_matrix(expected, predicted)  # compute the confusion matrix
class_names = [0,1]
plt.figure()
plot_confusion_matrix(cnf_matrix, classes = class_names, title = 'Confusion matrix')  # plot the confusion matrix
np.set_printoptions(precision=2)
print('Accuracy:', (cnf_matrix[1,1]+cnf_matrix[0,0])/(cnf_matrix[1,1]+cnf_matrix[0,1]+cnf_matrix[0,0]+cnf_matrix[1,0]))
print('Recall:', cnf_matrix[1,1]/(cnf_matrix[1,1]+cnf_matrix[1,0]))
print('Precision:', cnf_matrix[1,1]/(cnf_matrix[1,1]+cnf_matrix[0,1]))
print('Specificity:', cnf_matrix[0,0]/(cnf_matrix[0,1]+cnf_matrix[0,0]))
plt.show()

Confusion matrix, without normalization
[[1372   85]
 [ 215  164]]
Accuracy: 0.8366013071895425
Recall: 0.43271767810026385
Precision: 0.6586345381526104
Specificity: 0.9416609471516816

[Figure: confusion matrix heatmap (output_24_1.png)]

long_data99

	CustomerId	标准化信用	Gender	离散化年龄	Status	AssetStage	NumOfProducts	IsActiveMember	标准化个人年收入	Exited
0	15553251	7.26	1	5.0	1	3	1	1	2.318373	1
1	15553256	5.38	1	4.0	2	2	3	1	3.993573	1
2	15553283	5.06	1	4.0	2	0	1	0	7.234663	1
3	15553308	4.78	1	6.0	1	2	1	0	3.055473	1
4	15553387	6.74	1	3.0	1	2	3	0	9.408872	1
...	...	...	...	...	...	...	...	...	...	...
9175	15815628	7.22	1	3.0	2	0	1	0	4.011000	0
9176	15815645	2.62	0	3.0	2	3	2	1	8.755319	0
9177	15815656	3.82	1	3.0	2	0	1	1	9.991866	1
9178	15815660	8.16	1	3.0	1	3	1	1	3.036486	0
9179	15815690	5.28	1	4.0	1	0	1	1	3.889666	0

9180 rows × 10 columns

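# Because the classes are imbalanced, a class-weighted average is needed, hence average='weighted'.
# The F1 score is a standard metric for classification problems and is often used as the final evaluation measure.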

from sklearn.metrics import f1_score
 
print(f1_score(expected, predicted, average='weighted'))

0.8231781531939228
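The weighted F1 value printed above can be recomputed by hand from the confusion matrix, which shows how much each class contributes to it. The sketch below does this in plain Python; `weighted_f1` is a hypothetical helper, and the matrix is the one printed earlier, with rows as true labels and columns as predicted labels.

```python
def weighted_f1(cm):
    """Support-weighted mean of per-class F1 for a square confusion matrix
    laid out as cm[true][predicted]."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    score = 0.0
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                       # true k, predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly another class
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        score += f1 * sum(cm[k]) / total           # weight by class support
    return score

cm = [[1372, 85], [215, 164]]
print(round(weighted_f1(cm), 4))  # 0.8232
```

The per-class F1 scores are about 0.90 for class 0 and 0.52 for class 1; because class 0 has 1457 of the 1836 test samples, it dominates the weighted average.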

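The confusion-matrix heatmap and the weighted F1 score of about 0.82 show that the model's overall accuracy (83.66%) and its precision and recall on class 0 are good, although recall on class 1 is only 0.43. Taken together, the model's overall predictive performance is judged to be good.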

5.3

To predict long-term loyalty on the prediction data, that data must first go through the same normalization, discretization and binning steps.

long_data_test=pd.read_csv('long-customer-test.csv')

long_data0=long_data_test.copy()
long_data0.loc[(long_data0['Age']<18),'离散化年龄']=1
long_data0.loc[(long_data0['Age']>=18) & (long_data0['Age']<30),'离散化年龄']=2
long_data0.loc[(long_data0['Age']>=30) & (long_data0['Age']<40),'离散化年龄']=3
long_data0.loc[(long_data0['Age']>=40) & (long_data0['Age']<50),'离散化年龄']=4
long_data0.loc[(long_data0['Age']>=50) & (long_data0['Age']<60),'离散化年龄']=5
long_data0.loc[(long_data0['Age']>=60) & (long_data0['Age']<70),'离散化年龄']=6
long_data0.loc[(long_data0['Age']>=70),'离散化年龄']=7
long_data0

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	离散化年龄
0	15647311	608	1	41	1	83807.86	1	0	1	112542.58	4.0
1	15737452	653	0	58	1	132602.88	1	1	0	5097.67	5.0
2	15577657	732	0	41	8	0.00	2	1	1	170886.17	4.0
3	15589475	591	1	39	3	0.00	3	1	0	140469.38	3.0
4	15687946	556	1	61	2	117419.35	1	1	1	94153.83	6.0
...	...	...	...	...	...	...	...	...	...	...	...
995	15732202	615	0	34	1	83503.11	2	1	1	73124.53	3.0
996	15735078	724	1	53	1	139687.66	2	1	1	12913.92	5.0
997	15707861	520	1	46	10	85216.61	1	1	0	117369.52	4.0
998	15594612	702	0	44	9	0.00	1	0	0	59207.41	4.0
999	15806360	609	0	41	6	0.00	1	0	1	112585.19	4.0

1000 rows × 11 columns

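# Status labels: 老客户 = long-standing customer, 稳定客户 = stable customer, 新客户 = new customer.
# The Chinese strings are kept as-is because the LabelEncoder codes assigned below depend on their sort order.
long_data0.loc[(long_data0.Tenure>6),'Status']='老客户'
long_data0.loc[(long_data0.Tenure<=3),'Status']='稳定客户'
long_data0.loc[(long_data0.Status.isna()),'Status']='新客户'

# AssetStage labels: 高资产 = high, 中上资产 = upper-middle, 中下资产 = lower-middle, 低资产 = low assets.
long_data0.loc[(long_data0.Balance>120000),'AssetStage']='高资产'
long_data0.loc[(long_data0.Balance>90000) & (long_data0.Balance<=120000),'AssetStage']='中上资产'
long_data0.loc[(long_data0.Balance>50000) & (long_data0.Balance<=90000),'AssetStage']='中下资产'
long_data0.loc[(long_data0.Balance<=50000),'AssetStage']='低资产'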
long_data0

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	离散化年龄	Status	AssetStage
0	15647311	608	1	41	1	83807.86	1	0	1	112542.58	4.0	稳定客户	中下资产
1	15737452	653	0	58	1	132602.88	1	1	0	5097.67	5.0	稳定客户	高资产
2	15577657	732	0	41	8	0.00	2	1	1	170886.17	4.0	老客户	低资产
3	15589475	591	1	39	3	0.00	3	1	0	140469.38	3.0	稳定客户	低资产
4	15687946	556	1	61	2	117419.35	1	1	1	94153.83	6.0	稳定客户	中上资产
...	...	...	...	...	...	...	...	...	...	...	...	...	...
995	15732202	615	0	34	1	83503.11	2	1	1	73124.53	3.0	稳定客户	中下资产
996	15735078	724	1	53	1	139687.66	2	1	1	12913.92	5.0	稳定客户	高资产
997	15707861	520	1	46	10	85216.61	1	1	0	117369.52	4.0	老客户	中下资产
998	15594612	702	0	44	9	0.00	1	0	0	59207.41	4.0	老客户	低资产
999	15806360	609	0	41	6	0.00	1	0	1	112585.19	4.0	新客户	低资产

1000 rows × 13 columns

from sklearn.preprocessing import LabelEncoder
t1=long_data0.loc[:,'Status']    # LabelEncoder takes labels, not a feature matrix, so 1-D input is allowed
t2=long_data0.loc[:,'AssetStage']

le1 = LabelEncoder()     # instantiate the encoder
le1 = le1.fit(t1) # fit on the labels
label1 = le1.transform(t1) # transform returns the integer codes
long_data0.loc[:,"Status"] = label1
print(long_data0['Status'].unique())

le2 = LabelEncoder()     # instantiate the encoder
le2 = le2.fit(t2) # fit on the labels
label2 = le2.transform(t2) # transform returns the integer codes
long_data0.loc[:,"AssetStage"] = label2
long_data0['AssetStage'].unique()

[1 2 0]

array([1, 3, 2, 0])
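`LabelEncoder` numbers the classes in sorted order, which for strings is Unicode code-point order. That ordering explains the integer codes seen in the tables and can be reproduced without scikit-learn, as in this sketch (`sorted_label_codes` is a hypothetical helper). Writing the mapping out also makes it possible to apply identical codes to the training and prediction files even if one of them happens to lack a category.

```python
def sorted_label_codes(labels):
    """Assign 0..n-1 to the distinct labels in sorted order, as LabelEncoder does."""
    return {label: code for code, label in enumerate(sorted(set(labels)))}

status_codes = sorted_label_codes(["老客户", "稳定客户", "新客户"])
asset_codes = sorted_label_codes(["高资产", "中上资产", "中下资产", "低资产"])
print(status_codes)  # {'新客户': 0, '稳定客户': 1, '老客户': 2}
print(asset_codes)   # {'中上资产': 0, '中下资产': 1, '低资产': 2, '高资产': 3}
```

With an explicit dictionary, a pandas column can be encoded with `Series.map(status_codes)` instead of fitting a new encoder on each file.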

the_min1=long_data0['CreditScore'].min()
the_max1=long_data0['CreditScore'].max()

the_min2=long_data0['EstimatedSalary'].min()
the_max2=long_data0['EstimatedSalary'].max()

# the_min3=long_data0['Balance'].min()
# the_max3=long_data0['Balance'].max()

long_data0['标准化信用']=((long_data0['CreditScore']-the_min1)/(the_max1-the_min1))*10
long_data0['标准化个人年收入']=((long_data0['EstimatedSalary']-the_min2)/(the_max2-the_min2))*10
#long_data0['标准化金融资产']=((long_data0['Balance']-the_min3)/(the_max3-the_min3))*10
long_data0

	CustomerId	CreditScore	Gender	Age	Tenure	Balance	NumOfProducts	HasCrCard	IsActiveMember	EstimatedSalary	离散化年龄	Status	AssetStage	标准化信用	标准化个人年收入
0	15647311	608	1	41	1	83807.86	1	0	1	112542.58	4.0	1	1	5.16	5.631343
1	15737452	653	0	58	1	132602.88	1	1	0	5097.67	5.0	1	3	6.06	0.250472
2	15577657	732	0	41	8	0.00	2	1	1	170886.17	4.0	2	2	7.64	8.553206
3	15589475	591	1	39	3	0.00	3	1	0	140469.38	3.0	1	2	4.82	7.029924
4	15687946	556	1	61	2	117419.35	1	1	1	94153.83	6.0	1	0	4.12	4.710429
...	...	...	...	...	...	...	...	...	...	...	...	...	...	...	...
995	15732202	615	0	34	1	83503.11	2	1	1	73124.53	3.0	1	1	5.30	3.657276
996	15735078	724	1	53	1	139687.66	2	1	1	12913.92	5.0	1	3	7.48	0.641911
997	15707861	520	1	46	10	85216.61	1	1	0	117369.52	4.0	2	1	3.40	5.873077
998	15594612	702	0	44	9	0.00	1	0	0	59207.41	4.0	2	2	7.04	2.960302
999	15806360	609	0	41	6	0.00	1	0	1	112585.19	4.0	0	2	5.18	5.633476

1000 rows × 15 columns
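Note that the block above recomputes the minimum and maximum on the prediction file, so the same raw CreditScore can map to a slightly different scaled value than it did during training if the two files cover different ranges. One common alternative, sketched here as an assumption and not as the author's method, is to record the bounds on the training frame and reuse them. `fit_bounds` and `apply_bounds` are hypothetical helpers, and the two small frames are made-up data.

```python
import pandas as pd

def fit_bounds(train: pd.DataFrame, columns):
    """Record (min, max) of each listed column on the training frame."""
    return {c: (train[c].min(), train[c].max()) for c in columns}

def apply_bounds(df: pd.DataFrame, bounds, factor=10):
    """Scale columns with previously recorded bounds; returns a new frame."""
    out = df.copy()
    for c, (lo, hi) in bounds.items():
        out[c + "_scaled"] = (df[c] - lo) / (hi - lo) * factor
    return out

# Made-up frames for illustration only.
train = pd.DataFrame({"CreditScore": [350, 600, 850]})
new = pd.DataFrame({"CreditScore": [600, 725]})
bounds = fit_bounds(train, ["CreditScore"])
print(apply_bounds(new, bounds)["CreditScore_scaled"].tolist())  # [5.0, 7.5]
```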

The processed result is as follows:

long_data99=long_data0.loc[:,['CustomerId','标准化信用','Gender','离散化年龄','Status','AssetStage','NumOfProducts','IsActiveMember','标准化个人年收入']]
long_data99

	CustomerId	标准化信用	Gender	离散化年龄	Status	AssetStage	NumOfProducts	IsActiveMember	标准化个人年收入
0	15647311	5.16	1	4.0	1	1	1	1	5.631343
1	15737452	6.06	0	5.0	1	3	1	0	0.250472
2	15577657	7.64	0	4.0	2	2	2	1	8.553206
3	15589475	4.82	1	3.0	1	2	3	0	7.029924
4	15687946	4.12	1	6.0	1	0	1	1	4.710429
...	...	...	...	...	...	...	...	...	...
995	15732202	5.30	0	3.0	1	1	2	1	3.657276
996	15735078	7.48	1	5.0	1	3	2	1	0.641911
997	15707861	3.40	1	4.0	2	1	1	0	5.873077
998	15594612	7.04	0	4.0	2	2	1	0	2.960302
999	15806360	5.18	0	4.0	0	2	1	1	5.633476

1000 rows × 9 columns

Use the model to predict on the data:

x_test=long_data99.iloc[:,1:]     # features (every column except CustomerId)
predicted = model.predict(x_test)  # predict labels for the prediction data
len(predicted)

long_data99['Exited']=predicted
long_data99

	CustomerId	标准化信用	Gender	离散化年龄	Status	AssetStage	NumOfProducts	IsActiveMember	标准化个人年收入	Exited
0	15647311	5.16	1	4.0	1	1	1	1	5.631343	1
1	15737452	6.06	0	5.0	1	3	1	0	0.250472	1
2	15577657	7.64	0	4.0	2	2	2	1	8.553206	0
3	15589475	4.82	1	3.0	1	2	3	0	7.029924	1
4	15687946	4.12	1	6.0	1	0	1	1	4.710429	0
...	...	...	...	...	...	...	...	...	...	...
995	15732202	5.30	0	3.0	1	1	2	1	3.657276	0
996	15735078	7.48	1	5.0	1	3	2	1	0.641911	0
997	15707861	3.40	1	4.0	2	1	1	0	5.873077	0
998	15594612	7.04	0	4.0	2	2	1	0	2.960302	1
999	15806360	5.18	0	4.0	0	2	1	1	5.633476	0

1000 rows × 10 columns

result=long_data99.loc[:,['CustomerId','Exited']]
result.set_index("CustomerId",inplace=True)
result.to_excel("result5.xlsx",engine = 'openpyxl')  # engine, not encoding, selects the openpyxl writer

The predictions for the 5 specified customer IDs are shown below:

result1=result[result.index.isin([15579131,15674442,15719508,15730076,15792228])].sort_index()
result1

	Exited
CustomerId
15579131	0
15674442	0
15719508	1
15730076	0
15792228	1
