def iterpage(istream, pagesize):
    """Yield lists of up to `pagesize` items read from `istream`."""
    buffer = []
    for data in istream:
        buffer.append(data)
        if len(buffer) >= pagesize:
            yield buffer
            buffer = []
    # Emit the final, possibly shorter, page.
    if buffer:
        yield buffer

with open("source.txt", 'rt') as handle:
    for page in iterpage(handle, 1000):
        print(page)      # or your business logic
        print("-" * 32)  # page break
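For comparison, the same paging can be sketched with `itertools.islice` from the standard library. This is an alternative technique, not the version above; the name `iterpage_islice` and the sample input are made up for illustration.

```python
from itertools import islice

def iterpage_islice(istream, pagesize):
    """Yield lists of up to `pagesize` items, using islice to cut each page."""
    iterator = iter(istream)
    while True:
        page = list(islice(iterator, pagesize))
        if not page:
            # The input is exhausted.
            return
        yield page

# Hypothetical sample input: seven items in pages of three.
pages = list(iterpage_islice(range(7), 3))
print(pages)  # [[0, 1, 2], [3, 4, 5], [6]]
```

As with `iterpage`, the last page may be shorter than `pagesize`.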
Delete the first N lines of a text file:
def removehead(filename, headlines):
    buffer = []
    with open(filename, 'rt') as handle:
        for i, ln in enumerate(handle):
            # Skip the first `headlines` lines (compare the index, not the text).
            if i < headlines:
                continue
            buffer.append(ln)
    with open(filename, 'wt') as handle:
        handle.writelines(buffer)
Or:
def getandremovehead(filename, headlines):
    with open(filename, 'rt') as handle:
        buffer = handle.readlines()
    with open(filename, 'wt') as handle:
        handle.writelines(buffer[headlines:])
    return buffer[:headlines]
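However, for large text files, deleting the first N lines this way is not an ideal approach, because both versions hold most or all of the file in memory and rewrite it in full.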
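If the file is too large to hold in memory, one option is to stream the lines to keep into a temporary file and then swap it into place. The following is only a rough sketch under that assumption: `removehead_streaming` is a made-up name, the file is handled in binary mode to sidestep encoding questions, and the whole file is still rewritten once on disk.

```python
import os
import tempfile

def removehead_streaming(filename, headlines):
    """Drop the first `headlines` lines, reading one line at a time."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Create the temporary file next to the original so that
    # os.replace() does not have to cross filesystems.
    with open(filename, 'rb') as source, tempfile.NamedTemporaryFile(
            'wb', dir=directory, delete=False) as target:
        for i, ln in enumerate(source):
            if i >= headlines:
                target.write(ln)
    # Swap the trimmed copy in place of the original file.
    os.replace(target.name, filename)
```

Memory use stays at roughly one line at a time, at the cost of needing enough free disk space for a second copy of the file.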
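Since I haven't seen a sample of the exported table, I'll get straight to it: to iterate, I suggest using pandas' itertuples() directly, and lstrip() is enough to remove the leading whitespace.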
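import pandas as pd

df = pd.read_excel("test.xlsx")
for row in df.itertuples():
    # itertuples() yields immutable namedtuples, so write the result back via df.at.
    # 行名称 is a placeholder for the actual column name.
    df.at[row.Index, "行名称"] = row.行名称.lstrip()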
That should do it; mind the indentation.
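If the row-by-row loop turns out to be slow on a large sheet, the same stripping can probably be done on the whole column at once with the pandas `.str` accessor. A small sketch, assuming a string column; the column name `name` and the sample values are hypothetical stand-ins for the exported table:

```python
import pandas as pd

# Hypothetical sample data standing in for the exported sheet.
df = pd.DataFrame({"name": ["  alice", "\tbob", "carol"]})

# Strip leading whitespace from every cell in the column in one step.
df["name"] = df["name"].str.lstrip()
print(df["name"].tolist())  # ['alice', 'bob', 'carol']
```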