我有一个名为finalscores.txt
我想创建一个 python 脚本,它将打开它并从两个单独的列中读取值。
这是我的finalscores.txt
file
Atom nVa predppm avgppm stdev delta QPred QMulti qTotal
7.H2 2 7.674 7.853 0.000 0.000 0.968 1.000 0.993
9.H2 2 7.434 7.458 0.000 0.001 0.996 1.000 0.999
20.H2 1 7.602 7.898 0.000 0.000 0.945 1.000 0.982
21.H2 1 7.959 8.113 0.000 0.000 0.972 1.000 0.991
8.H1' 2 5.363 5.238 0.002 0.003 0.978 0.997 0.993
22.H1' 2 5.593 5.523 0.002 0.003 0.988 0.997 0.995
10.H1' 1 5.378 5.426 0.000 0.000 0.992 1.000 0.997
19.H1' 1 5.691 5.681 0.000 0.000 0.998 1.000 0.999
score: 0.9941270604681679
我想要获取的值来自第一列“Atom”和第四列“avgppm”。我不想接受第一行:
Atom nVa predppm avgppm stdev delta QPred QMulti qTotal
或最后一行:score: 0.9941270604681679
我还有另一个文件叫pinkH1_ppm.txt
我想打开它并附加它。这就是我的pinkH1_ppm.txt
好像:
2.H8 7.61004 0.3
1.H8 8.13712 0.3
3.H6 7.53261 0.3
4.H8 7.49932 0.3
5.H6 7.72158 0.3
7.H8 8.16859 0.3
6.H6 7.70272 0.3
9.H8 8.1053 0.3
8.H6 7.65014 0.3
10.H6 7.5231 0.3
11.H6 7.58213 0.3
12.H6 7.72805 0.3
13.H6 8.02977 0.3
14.H6 7.69624 0.3
15.H8 7.82994 0.3
17.H8 7.24899 0.3
18.H6 7.6439 0.3
20.H8 7.78512 0.3
19.H8 7.65501 0.3
22.H8 7.47677 0.3
23.H6 7.7306 0.3
24.H6 7.80104 0.3
25.H8 7.67295 0.3
26.H6 7.67463 0.3
27.H6 7.64807 0.3
1.H1' 5.8202 0.3
2.H1' 5.90291 0.3
4.H1' 5.74125 0.3
3.H1' 5.54935 0.3
6.H1' 5.54297 0.3
8.H1' 5.36287 0.3
11.H1' 5.50093 0.3
10.H1' 5.37814 0.3
14.H1' 5.96177 0.3
15.H1' 5.959 0.3
17.H1' 5.75214 0.3
19.H1' 5.69108 0.3
22.H1' 5.59257 0.3
24.H1' 5.55313 0.3
25.H1' 5.70819 0.3
27.H1' 5.74236 0.3
26.H1' 5.48061 0.3
我想检查我的“原子列”中是否有任何值finalscores.txt
,匹配第一列中的任何值pinkH1_ppm.txt
如果他们这样做,我想替换我的第二列pinkH1_ppm.txt
与我的 Atom 的值finalscores.txt
file.
例如,在finalscores.txt
, 19.H1' 是原子,也可以在pinkH1_ppm.txt
所以我想替换第二列中的值pinkH1_ppm.txt
,也对应于 19.H1',即 5.69108 和 5.681。
到目前为止,这是我的代码:
import pandas as pd
import os
import sys
import re
filename = 'finalscore.txt'
ppmColor = 'pinkH1_ppm.txt'
df = pd.read_cv(filename,sep = " ", skiprows = 1)
col1 = df["Atom"]
col2 = df["avgppm"]
df2 = pd.read_cv(ppmColor,sep = " ", skiprows=0)
name = df2[0]
ppm = df2[1]
with open(ppmColor, "a") as ppmAppend:
for line in ppmAppend
if col1 == name:
我正在尝试使用熊猫。我对第二个数据帧 df2 非常不确定,因为 ppmColor 文件中没有标题,我想从第一行开始读取它。我认为使用 pandas 将是最好的主意,但我不确定如何准确地解决这个问题。
Error:
replaceppm.py:10: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support skipfooter; you can avoid this warning by specifying engine='python'.
df=pd.read_csv('finalscore.txt',sep=r'\s+',skipfooter=1)
Traceback (most recent call last):
File"replaceppm.py", line 18, in <module>
pink.set_index("Atom",inplace=True)
NameError: name 'pink' is not defined