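How can Python extract a sub-table embedded in an Excel sheet?
Problem description
I want to use Python to extract a sub-table from an Excel sheet whose contents are not known in advance.
As in the part of the screenshot outlined in red: after reading the Excel sheet, how can I extract the red sub-table based on keywords configured ahead of time?
Answer:
Data table
Code used to read it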
import xlrd

# Note: xlrd 2.0 and later only reads .xls files; reading .xlsx needs xlrd < 2.0
workbook = xlrd.open_workbook('1.xlsx')
sheet_names = workbook.sheet_names()
for sheet_name in sheet_names:
    sheet2 = workbook.sheet_by_name(sheet_name)
    hangshu = sheet2.nrows  # number of rows
    lieshu = sheet2.ncols   # number of columns
    for i in range(hangshu):
        print(sheet2.row_values(i))
Output (合并单元格 means "merged cell", 字段 means "field"):
['', '', '', '', '', '', '']
['', '', '', '', '', '', '']
['合并单元格1', '', 'ID', '字段1', '字段2', '字段3', '字段4']
['', '', 1.0, 1.0, 1.0, 1.0, 1.0]
['', '', 2.0, 2.0, 2.0, 2.0, 2.0]
['', '', 3.0, 3.0, 3.0, 3.0, 3.0]
['', '', 4.0, 4.0, 4.0, 4.0, 4.0]
['', '', '', '', '', '', '']
['合并单元格2', '', 'ID', '字段1', '字段2', '字段3', '字段4']
['', '', '', '', '', '', '']
['', '', '', '', '', '', '']
['', '', '', '', '', '', '']
['', '', '', '', '', '', '']
['', '', '', '', '', '', '']
['', '', 0.0, 0.0, 0.0, 0.0, 0.0]
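To get from the raw rows above to the individual sub-tables, one option is to treat every row whose first cell matches a configured keyword as the start of a block, take the non-empty cells on that row as the header, and collect the rows that follow. The code below is only a minimal sketch and was not part of the original answer. The function name split_subtables is hypothetical, and it assumes that the label sits in the first column, that the header is on the same row as the label, and that fully blank rows can simply be skipped, which is what the output above suggests.

```python
def split_subtables(rows, keywords):
    """Group rows into sub-tables keyed by the label in the first column.

    rows: list of row lists, e.g. collected from sheet.row_values(i)
    keywords: labels configured in advance that mark the start of a sub-table
    """
    tables = {}
    current = None  # label of the sub-table being filled
    cols = []       # column positions taken from the current header row
    for row in rows:
        label = row[0] if row else ""
        if label in keywords:
            current = label
            # Non-empty cells after the label define the header columns
            cols = [i for i in range(1, len(row)) if row[i] != ""]
            tables[current] = {"header": [row[i] for i in cols], "rows": []}
            continue
        if current is None:
            continue  # rows before the first keyword are ignored
        values = [row[i] if i < len(row) else "" for i in cols]
        # Compare against "" rather than using truthiness, so 0.0 is kept
        if any(v != "" for v in values):
            tables[current]["rows"].append(values)
    return tables


# Shortened version of the rows printed above
rows = [
    ["", "", "", ""],
    ["合并单元格1", "", "ID", "字段1"],
    ["", "", 1.0, 1.0],
    ["", "", 2.0, 2.0],
    ["", "", "", ""],
    ["合并单元格2", "", "ID", "字段1"],
    ["", "", "", ""],
    ["", "", 0.0, 0.0],
]
tables = split_subtables(rows, ["合并单元格1", "合并单元格2"])
```

With xlrd, rows could be built as [sheet2.row_values(i) for i in range(sheet2.nrows)]. If a blank row should end a sub-table instead of being skipped, reset current to None in that branch.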
Answer:
Sample data:
Suppose you want the data between "用户填写" (filled in by the user) and "公司填写" (filled in by the company).
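import pandas as pd

df = pd.read_excel("test.xlsx")
# Row labels where the first column holds one of the two marker keywords
mask = df.iloc[:, 0].isin(["用户填写", "公司填写"])
target_index = df.index[mask]
# Rows from the first marker up to (not including) the second one
res = df.iloc[target_index[0]:target_index[1], :]
# Use the first row as the header, then drop it from the data
res = res.rename(columns=res.iloc[0]).drop(res.index[0])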
Result:
Out[42]:
0  用户填写  col1  col2  col3
1   NaN   NaN   NaN   NaN
2   NaN   NaN   NaN   NaN
3   NaN   NaN   NaN   NaN
4   NaN   NaN   NaN   NaN
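The same idea can be wrapped in a small helper that looks up the two markers by position and fails loudly when one of them is missing. This is a sketch under assumptions, not the answerer's code: the name extract_between is hypothetical, the markers are assumed to sit in a single column (the first by default), and the in-memory DataFrame stands in for a sheet read with pd.read_excel(..., header=None).

```python
import pandas as pd


def extract_between(df, start_kw, end_kw, col=0):
    """Return the rows after start_kw up to (not including) end_kw.

    The row holding start_kw becomes the header of the result.
    """
    first_col = df.iloc[:, col]
    # Positional row numbers where each marker occurs
    hits_start = (first_col == start_kw).to_numpy().nonzero()[0]
    hits_end = (first_col == end_kw).to_numpy().nonzero()[0]
    if len(hits_start) == 0 or len(hits_end) == 0:
        raise ValueError("marker keyword not found")
    start, end = int(hits_start[0]), int(hits_end[0])
    if start >= end:
        raise ValueError("start marker must come before end marker")
    block = df.iloc[start:end]
    body = block.iloc[1:].reset_index(drop=True)
    body.columns = block.iloc[0].tolist()
    return body


# Hypothetical stand-in for the sheet contents
raw = pd.DataFrame([
    ["title", None, None],
    ["用户填写", "col1", "col2"],
    ["a", 1, 2],
    ["b", 3, 4],
    ["公司填写", "col1", "col2"],
    ["c", 5, 6],
])
user_table = extract_between(raw, "用户填写", "公司填写")
```

Working with positions instead of index labels keeps the slice correct even when the DataFrame does not have the default 0, 1, 2, ... index.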
Source: utcz.com/a/158389.html