Use numpy.r_ https://docs.scipy.org/doc/numpy/reference/generated/numpy.r_.html to concatenate the first and last positions, then set the values with iloc http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.iloc.html. To get the position of column B,
use Index.get_loc http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.get_loc.html:
import numpy as np
N = .2
total = len(df.index)
# cast to int so the slice bounds are always integers
i = int(total * N)
idx = np.r_[0:i, total-i:total]
df.iloc[idx, df.columns.get_loc('B')] = 0
Or:
N = .2
total = len(df.index)
i = int(total * N)
pos = df.columns.get_loc('B')
df.iloc[:i, pos] = 0
df.iloc[total - i:, pos] = 0
print (df)
   A  B
0  1  0
1  2  7
2  3  8
3  4  4
4  5  0
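For reference, here is a minimal self-contained sketch of the first approach. The sample DataFrame is an assumption (the original values of B in the first and last rows are not shown above), and zero_edges is a hypothetical helper name used only for illustration:

```python
import numpy as np
import pandas as pd


def zero_edges(df, column, frac):
    """Set the first and last int(len(df) * frac) rows of `column` to 0, in place."""
    total = len(df.index)
    # cast to int so the slice bounds are always integers
    i = int(total * frac)
    # positions of the first i rows and the last i rows
    idx = np.r_[0:i, total - i:total]
    df.iloc[idx, df.columns.get_loc(column)] = 0
    return df


# hypothetical sample data, chosen to match the output shown above
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [9, 7, 8, 4, 6]})
zero_edges(df, "B", 0.2)
print(df)
```

With 5 rows and N = .2, i is 1, so only the first and the last row of B are set to 0.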
EDIT:
If df is a SparseDataFrame http://pandas.pydata.org/pandas-docs/stable/sparse.html#sparsedataframe and all values have the same type, you can convert it to a numpy array, set the values, and convert back:
arr = df.values
N = .2
total = len(df.index)
i = int(total * N)
pos = df.columns.get_loc('B')
idx = np.r_[0:i, total-i:total]
arr[idx, pos] = 0
print (arr)
[[1 0]
 [2 7]
 [3 8]
 [4 4]
 [5 0]]
df = pd.SparseDataFrame(arr, columns=df.columns)
print (df)
   A  B
0  1  0
1  2  7
2  3  8
3  4  4
4  5  0
print (type(df))
<class 'pandas.core.sparse.frame.SparseDataFrame'>
EDIT1:
Another solution is to convert to dense first, then convert back:
df = df.to_dense()
# apply one of the solutions above
df = df.to_sparse()
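Note that SparseDataFrame, DataFrame.to_sparse and DataFrame.to_dense were removed in pandas 1.0. A rough sketch of the same dense round trip on newer versions, assuming every column is stored with a SparseDtype (the sample data and the fill value 0 are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# hypothetical sample data, stored as sparse columns with fill value 0
df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [9, 7, 8, 4, 6]})
sparse_df = df.astype(pd.SparseDtype("int64", 0))

# convert to dense via the .sparse accessor (requires all columns to be sparse)
dense = sparse_df.sparse.to_dense()

# apply the positional assignment from above
N = .2
total = len(dense.index)
i = int(total * N)
idx = np.r_[0:i, total - i:total]
dense.iloc[idx, dense.columns.get_loc("B")] = 0

# convert back to sparse columns
result = dense.astype(pd.SparseDtype("int64", 0))
print(result.dtypes)
```

The .sparse accessor raises an AttributeError if any column is not sparse, so mixed frames would need the conversion applied column by column.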