Preprocessing Historical Stock Data with Python (Part 1)
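When programming quantitative trading strategies, we need historical stock data as the basis for analysis. This post shows how to fetch historical stock data with Python and store the result as a DataFrame. The processed historical data can be downloaded from: http://download.csdn.net/detail/suiyingy/9688505.
The steps are as follows: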
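(1) Build a stock pool. Here, stocks are selected by market capitalization.
(2) Read the historical price change rate of every stock in the pool.
(3) Store each stock's historical price change rates in a DataFrame, one column per stock. For dates in the chosen period on which a stock has no data (for example, because it was not yet listed), set its change rate to 0.
(4) Set the last row of the DataFrame to the number of trading days of each stock.
(5) Save the DataFrame to a csv file.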
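Steps (3) to (5) can be tried out on synthetic data, without access to any data platform. The following is only a minimal sketch under assumptions: the series names (`ref`, `stock_a`), the dates, and the values are all made up, and `reindex` with `fill_value` is used here as one possible way to do the alignment, not necessarily the way the original script does it.

```python
import io
import pandas as pd

# Hypothetical reference calendar: 5 trading days of an index (made-up values).
dates = pd.to_datetime(
    ["2016-11-14", "2016-11-15", "2016-11-16", "2016-11-17", "2016-11-18"]
)
ref = pd.Series([0.001, -0.002, 0.003, 0.0, 0.004], index=dates, name="ref")

# Hypothetical stock that only has data for the last 3 of those days.
stock_a = pd.Series([0.01, -0.02, 0.03], index=dates[2:], name="stock_a")

# Step (3): align the stock to the reference dates; missing dates become 0.
out = pd.DataFrame({"ref": ref})
out["stock_a"] = stock_a.reindex(ref.index, fill_value=0.0).round(4)

# Step (4): one extra row holding the number of trading days per column.
counts = pd.DataFrame(
    {"ref": [len(ref)], "stock_a": [len(stock_a)]}, index=["days"]
)
result = pd.concat([out, counts])

# Step (5): write as csv (to an in-memory buffer here instead of a file).
buf = io.StringIO()
result.to_csv(buf)
csv_text = buf.getvalue()
```

Aligning every stock to a single reference calendar keeps all columns the same length, which is what allows them to be stored side by side in one DataFrame.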
The full code is as follows:
# -*- coding: utf-8 -*-
"""
Created on Thu Nov 17 23:04:33 2016
Fetch the historical price change rates of stocks, merge them into a DataFrame, then save it as csv
@author: yehxqq1513760265
"""
import numpy as np
import pandas as pd
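# Get the codes of the 50 stocks with the smallest market capitalization (ascending order).
# get_fundamentals, query and fundamentals are not imported here; they are assumed to be
# provided by the quant platform environment this script runs in.
df = get_fundamentals(
    query(fundamentals.eod_derivative_indicator.market_cap)
    .order_by(fundamentals.eod_derivative_indicator.market_cap.asc())
    .limit(50),
    '2016-11-17', '1y')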
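# Use the CSI 300 index (000300.XSHG) as the reference trading calendar
priceChangeRate_300 = get_price_change_rate('000300.XSHG', '20060101', '20161118')
df300 = pd.DataFrame(priceChangeRate_300)
lenReference = len(priceChangeRate_300)
# Copy so that adding stock columns to dfout does not modify df300
dfout = df300.copy()
# dflen holds a single row: the number of trading days for each column
dflen = pd.DataFrame()
dflen['000300.XSHG'] = [lenReference]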
# Process each of the 50 stocks in turn:
# fetch its price change rates (the call below starts at 2015-01-01 and ends at 2016-11-18),
# align them to the reference dates of the index (dates without data are filled with 0),
# and store the result as a new column of the DataFrame
for stock in range(50):
    code = df['market_cap'].columns[stock]
    priceChangeRate = get_price_change_rate(code, '20150101', '20161118')
    if priceChangeRate is None:
        openDays = 0
    else:
        openDays = len(priceChangeRate)
    dftempPrice = pd.DataFrame(priceChangeRate)
    tempArr = []
    for i in range(lenReference):
        if df300.index[i] in dftempPrice.index:
            # Format to 4 decimal places
            tempArr.append("%.4f" % (dftempPrice.loc[str(df300.index[i])][0]))
        else:
            tempArr.append(float(0.0))
    # Column name is the stock code with the '.' removed
    fileName = ''.join(code.split('.'))
    dfout[fileName] = tempArr
    # Use openDays so that a stock with no data (None) is recorded as 0 days
    dflen[fileName] = [openDays]
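# Append the trading-day counts as the last row, then save everything to csv.
# pd.concat is used because DataFrame.append no longer exists in pandas 2.0 and later.
dfout = pd.concat([dfout, dflen])
dfout.to_csv('00050.csv')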
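Because the last row of the saved file holds trading-day counts rather than change rates, it has to be split off before the data is analyzed. Otherwise the counts would be treated as daily returns. The sketch below shows one way to do that using only the standard library `csv` module. The sample text, its column names, and the helper `split_rates_and_counts` are hypothetical and only mimic the layout described in the steps above; the real file may differ in its column and index labels.

```python
import csv
import io

# Made-up sample in the layout described above: a date column, one column per
# stock, and a final row holding each column's trading-day count.
sample = """,000300XSHG,000001XSHE
2016-11-16,0.0010,0.0
2016-11-17,-0.0020,0.0123
2016-11-18,0.0030,-0.0045
0,3,2
"""


def split_rates_and_counts(text):
    """Return (dates, rates, counts) parsed from csv text in that layout."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body, last = rows[0][1:], rows[1:-1], rows[-1]
    dates = [row[0] for row in body]
    # rates[code] is the list of daily change rates, in date order
    rates = {
        code: [float(row[i + 1]) for row in body]
        for i, code in enumerate(header)
    }
    # counts[code] is the trading-day count taken from the final row
    counts = {code: int(float(last[i + 1])) for i, code in enumerate(header)}
    return dates, rates, counts


dates, rates, counts = split_rates_and_counts(sample)
```

Here `rates` contains only genuine daily values, while `counts` tells how many of those values come from actual trading days rather than from the zero fill.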