ACDAEACCDDAEFFAFCCDFABAADACCCDDAFEDGABEGFCADEHBHACAGHACDEDECDAFAFAFCACDCDEAFCDFCGHDDFF
1 ACD.E..........F......................................................................
2 ...A...CD...F.........................................................................
3 .....AC..D.E.F........................................................................
4 ..........A.....C.DF..................................................................
5 ..............A..C......D.............EG.......H......................................
6 ....................AB................................................................
7 ......................A.....C.D..E.G.........H........................................
8 .......................A..C..D..F.....................................................
9 .........................A.C......D.....F.............................................
10 ...............................A.....B................................................
11 ....................................A....C.DE......GH.................................
12 ..........................................A...B.......................................
13 ................................................AC.....DE.....F.......................
14 ..................................................A........CD.....F...................
15 .....................................................AC..DE.....F.....................
16 .............................................................A.....C..D....F..........
17 ...............................................................A.....C..DE......GH....
18 .................................................................A.....C.....DF.......
19 ....................................................................A.......C.....D.F.
20 ..........................................................................A....C...D.F
ACDAEACCDDAEFFAFCCDFABAADACCCDDAFEDGABEGFCADEHBHACAGHACDEDECDAFAFAFCACDCDEAFCDFCGHDDFF
1 11 1111 1 11111111111111111111111111211112111212
11121332234323514544667859897870879710559121172533411553355446375846976877069880779090
11122333334443332222222334444445544455444344443222333333333334444443444444554443332221
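I need to do this because this is a figure from a research paper, and the paper refers to columns by column number.
So ideally the column labels would be numbers rather than letters.
I tried Python Pandas, exporting to csv and importing it, but the result is not quite right:
the column numbers start at zero, and the row numbers start at 2 instead of 1.
This means I have to do a little arithmetic to index the table the way the research paper does.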
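To make the offset concrete, here is a minimal sketch of that arithmetic. It assumes the exported header labels the columns from 0 and that the header itself occupies spreadsheet row 1, so data row 1 lands on spreadsheet row 2. The function name `paper_to_export` is hypothetical and only used for illustration.

```python
def paper_to_export(paper_row, paper_col):
    """Map the paper's 1-based (row, column) to the exported table's coordinates.

    Assumption: column labels start at 0 and the header row takes
    spreadsheet row 1, pushing every data row down by one.
    """
    sheet_row = paper_row + 1      # header row shifts data rows down by one
    column_label = paper_col - 1   # exported labels start at 0 instead of 1
    return sheet_row, column_label


print(paper_to_export(1, 1))
print(paper_to_export(20, 86))
```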
I also tried the pandas option index=False when converting the DataFrame to .csv.
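I removed the 0 index from the column names, then copied, pasted, and shifted them over by one position, so the column numbers are now correct, but the data still starts on row 2. How can I index the row numbers from zero, or change the column names to numbers instead of letters?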
Answer 1
First, I copied and pasted the figure from the paper into a Jupyter Notebook and added quotes, commas, and so on to create a data structure in Python:
colNames = "ACDAEACCDDAEFFAFCCDFABAADACCCDDAFEDGABEGFCADEHBHACAGHACDEDECDAFAFAFCACDCDEAFCDFCGHDDFF"
figure2 = {
1 : "ACD.E..........F......................................................................",
2 : "...A...CD...F.........................................................................",
3 : ".....AC..D.E.F........................................................................",
4 : "..........A.....C.DF..................................................................",
5 : "..............A..C......D.............EG.......H......................................",
6 : "....................AB................................................................",
7 : "......................A.....C.D..E.G.........H........................................",
8 : ".......................A..C..D..F.....................................................",
9 : ".........................A.C......D.....F.............................................",
10 : "...............................A.....B................................................",
11 : "....................................A....C.DE......GH.................................",
12 : "..........................................A...B.......................................",
13 : "................................................AC.....DE.....F.......................",
14 : "..................................................A........CD.....F...................",
15 : ".....................................................AC..DE.....F.....................",
16 : ".............................................................A.....C..D....F..........",
17 : "...............................................................A.....C..DE......GH....",
18 : ".................................................................A.....C.....DF.......",
19 : "....................................................................A.......C.....D.F.",
20 : "..........................................................................A....C...D.F"}
srcDig1=" 1 11 1111 1 11111111111111111111111111211112111212"
srcDig2="11121332234323514544667859897870879710559121172533411553355446375846976877069880779090"
numSrcs="11122333334443332222222334444445544455444344443222333333333334444443444444554443332221"
# Split the column names and each row string into lists of single characters
colNamesArr = list(colNames)
figureD = {k: list(v) for k, v in figure2.items()}
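Because the rows were pasted and quoted by hand, it may be worth checking that every row string has the same number of characters as the column-name string before building a table from it. The sketch below uses a shortened stand-in for the data (only the first five characters of the header and of rows 1 and 2); the names `col_names`, `figure`, and `bad_rows` are illustrative, not part of the original code.

```python
# Shortened stand-in data: first five characters of the header and of rows 1 and 2
col_names = "ACDAE"
figure = {
    1: "ACD.E",
    2: "...A.",
}

# Collect any row whose length differs from the number of columns
bad_rows = {k: len(v) for k, v in figure.items() if len(v) != len(col_names)}

# An empty dict means every row lines up with the header
print(bad_rows)
```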
Next I used Pandas to output the .csv:
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame.from_dict(figureD, orient="index")
Then I followed the steps outlined in my question (removing the 0 and shifting), and also used @David Yockey's suggestion.
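As a possible alternative to removing the 0 and shifting by hand, the columns could be relabelled in pandas before exporting. This is only a sketch on a shortened stand-in for the figure (first five cells of rows 1 and 2), and it writes to an in-memory buffer rather than a real file. It changes the labels stored in the CSV header, not the spreadsheet's own column letters or row numbers, so the header would still occupy spreadsheet row 1.

```python
import io

import pandas as pd

# Shortened stand-in for figureD: first five cells of rows 1 and 2
rows = {1: list("ACD.E"), 2: list("...A.")}
df = pd.DataFrame.from_dict(rows, orient="index")

# Relabel the columns 1..n so they match the paper's 1-based column numbers
df.columns = range(1, df.shape[1] + 1)

# Label-based lookup now uses the paper's (row, column) numbers directly
value = df.loc[2, 4]
print(value)

# Write the CSV to a string buffer; the row labels become the first field
buffer = io.StringIO()
df.to_csv(buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With the columns relabelled this way, `df.loc[row, column]` takes the same numbers the paper uses, so no offset arithmetic is needed while working inside pandas.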