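Practical Use of Python to Work with Page Numbers in Word Documents

Updated: 2025-09-26 15:12:45  Author: 睿思達DBA_WGX

Long documents often have to be split into several sections, each with its own page numbering. This article shows how to work with Word page numbers in Python and walks through the code in detail, for readers who need a reference.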

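Task:

Fix the page numbers of 24 documents in one pass.

Document details:

1. Each Word document has about 800 pages, and each page contains one heading and one image.

2. Because the images come in both landscape and portrait orientation, every page was made its own section.

3. However, the page number format differs from section to section, and the numbering is not continuous.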
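Point 3 can be confirmed before anything is changed. The sketch below is not part of the original program. It uses only the standard library and assumes the main document part is stored at word/document.xml, which is the usual location. The function name summarize_sections is made up for illustration. It counts the section properties (w:sectPr) in a file and how many of them set a w:start attribute on w:pgNumType, which is what makes a section restart its numbering.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def summarize_sections(docx_path):
    """Return (section_count, restart_count) for a .docx file.

    Assumption: the main document part lives at word/document.xml.
    """
    with zipfile.ZipFile(docx_path) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))

    # Every section is described by one w:sectPr element
    sect_prs = list(root.iter(f"{{{W_NS}}}sectPr"))

    restarts = 0
    for sect_pr in sect_prs:
        pg_num = sect_pr.find(f"{{{W_NS}}}pgNumType")
        # A w:start attribute means the section restarts its page numbering
        if pg_num is not None and pg_num.get(f"{{{W_NS}}}start") is not None:
            restarts += 1
    return len(sect_prs), restarts
```

If the second number is greater than zero, at least one section restarts its numbering. Documents with tracked section changes may contain extra w:sectPr elements, so the counts are only a rough check.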

Requirements:

1. Page numbers must run continuously across all pages.

2. The page number format must be uniform on all pages (same font and font size).
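Requirement 2 comes down to every page number being rendered by one PAGE field whose run carries a single explicit font and size. The sketch below is not from the original article. It builds such a footer paragraph as a WordprocessingML string with the standard library only. The function name page_number_paragraph_xml, the default font, the default size and the centered alignment are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)


def w(tag):
    """Qualify a tag or attribute name with the w: namespace."""
    return f"{{{W_NS}}}{tag}"


def page_number_paragraph_xml(font_name="Times New Roman", size_pt=10.5):
    """Return the XML of a centered paragraph holding one PAGE field.

    font_name, size_pt and the centering are illustrative assumptions.
    """
    p = ET.Element(w("p"))
    p_pr = ET.SubElement(p, w("pPr"))
    ET.SubElement(p_pr, w("jc"), {w("val"): "center"})

    # A simple field; Word replaces the placeholder text with the page number
    fld = ET.SubElement(p, w("fldSimple"), {w("instr"): " PAGE "})
    r = ET.SubElement(fld, w("r"))
    r_pr = ET.SubElement(r, w("rPr"))
    ET.SubElement(
        r_pr,
        w("rFonts"),
        {w("ascii"): font_name, w("hAnsi"): font_name, w("eastAsia"): font_name},
    )
    # w:sz is expressed in half-points
    ET.SubElement(r_pr, w("sz"), {w("val"): str(round(size_pt * 2))})
    ET.SubElement(r, w("t")).text = "1"

    return ET.tostring(p, encoding="unicode")
```

With python-docx, a string like this could be passed to docx.oxml.parse_xml and appended to the first section's footer element. When the remaining sections link their footers to the previous section, they all inherit that one field and its formatting.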

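Doing this by hand is a lot of work, because the footers cannot all be selected at once. The following Python program processes the documents described above: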

from docx import Document
from docx.oxml.ns import qn  # qn is defined in docx.oxml.ns
from docx.oxml import parse_xml

def process_word_document(doc_path, output_path):
    # Open the Word document
    doc = Document(doc_path)
    
    # Get all sections in the document
    sections = doc.sections
    
    print(f"The document has {len(sections)} sections")
    
    # Handle the first section (special case: it has no previous section to link to)
    first_section = sections[0]
    first_footer = first_section.footer
    
    # Clear the content of the first section's footer
    for paragraph in list(first_footer.paragraphs):
        p = paragraph._element
        p.getparent().remove(p)
    
    for table in list(first_footer.tables):
        t = table._element
        t.getparent().remove(t)
    
    print("Processed section 1")
    
    # Handle the remaining sections
    for i, section in enumerate(sections[1:], 1):
        footer = section.footer
        
        # Clear the footer content
        for paragraph in list(footer.paragraphs):
            p = paragraph._element
            p.getparent().remove(p)
        
        for table in list(footer.tables):
            t = table._element
            t.getparent().remove(t)
        
        # Link the footer to the previous section
        footer.is_linked_to_previous = True
        
        # Make the page numbering continue from the previous section
        sectPr = section._sectPr
        pgNumType = sectPr.find(qn('w:pgNumType'))
        
        if pgNumType is None:
            pgNumType = parse_xml(r'<w:pgNumType xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"/>')
            sectPr.append(pgNumType)
        
        # Remove the start attribute so numbering continues from the previous section
        start_attr = qn('w:start')
        if pgNumType.get(start_attr) is not None:
            pgNumType.attrib.pop(start_attr, None)
        
        # Print progress every 100 sections
        if (i + 1) % 100 == 0:
            print(f"Processed {i + 1} sections")
    
    # Save the document
    doc.save(output_path)
    print(f"Done! Processed {len(sections)} sections in total")
    print("All section footers were cleared and linked to the previous section, and page numbering continues from the previous section")

# Usage example
if __name__ == "__main__":
    input_file = r"d:\wgx\ok\a619.docx"  # input file path
    output_file = r"d:\wgx\ok\a6190.docx"  # output file path
    process_word_document(input_file, output_file)
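The usage example handles a single file, while the task covers 24 documents. One possible way to cover a whole folder is a small driver like the sketch below. It is not part of the original program. The name batch_process, its parameters and the output folder in the usage comment are assumptions for illustration. The per-file function is passed in as an argument, so the driver itself needs only the standard library.

```python
from pathlib import Path


def batch_process(input_dir, output_dir, process, pattern="*.docx"):
    """Apply process(src_path, dst_path) to every matching file in input_dir.

    Results are written to output_dir under the same file names.
    Returns the list of output paths in processing order.
    """
    in_dir = Path(input_dir)
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    done = []
    for src in sorted(in_dir.glob(pattern)):
        # Skip the temporary lock files Word creates while a document is open
        if src.name.startswith("~$"):
            continue
        dst = out_dir / src.name
        process(str(src), str(dst))
        done.append(dst)
    return done


# Hypothetical usage with the function above (the output folder is made up):
# batch_process(r"d:\wgx\ok", r"d:\wgx\out", process_word_document)
```

Writing the results to a separate folder keeps the originals untouched and prevents already processed files from being picked up again by the same pattern.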

Because the program uses a third-party library (python-docx), python-docx must be installed before the code above can run.

Open a Windows command prompt and run the following command: