8 Practical Python Scripts for Automating the Reading of txt File Data

Updated: 2025-09-05 09:47:33 Author: Python資訊站

This article walks through 8 practical Python scripts for automating work with txt files, including reading, comparing, and converting formats. Follow along if these tasks come up in your own work.

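Preparation: install the required Python libraries

re (regular expressions, for complex text matching)
csv (reading and writing CSV files)
json (reading and writing JSON files)
collections (counting word frequencies)
difflib (comparing text, used in section 2.2)
matplotlib and wordcloud (generating word clouds)

Of these, re, csv, json, collections, and difflib ship with the Python standard library and need no installation. Only matplotlib and wordcloud have to be installed, for example with pip install matplotlib wordcloud.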

1. Reading txt content

1.1 Reading a txt file line by line

The first step in data processing is reading the txt file. The following example reads a txt file line by line:

def read_txt_file_by_line(filepath):
    with open(filepath, 'r', encoding='utf-8') as file:
        for line in file:
            print(line.strip())
# Example call
read_txt_file_by_line('example.txt')

1.2 Reading the whole txt file at once

To read the entire content of a txt file into a single string, use the following code:

def read_txt_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as file:
        content = file.read()
    return content
# Example call
content = read_txt_file('example.txt')
print(content)
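
In practice the file may be missing or may not be valid UTF-8. The following is a minimal sketch of a guarded variant. The function name read_txt_or_default and the choice to return a default value instead of raising are assumptions made for illustration, not part of the original scripts.

```python
from pathlib import Path

def read_txt_or_default(filepath, default='', encoding='utf-8'):
    """Return the file's text, or `default` if it is missing or cannot be decoded."""
    try:
        return Path(filepath).read_text(encoding=encoding)
    except (FileNotFoundError, UnicodeDecodeError):
        # Hypothetical policy: fall back silently; real scripts may prefer to log or re-raise
        return default
```

Returning a default keeps batch jobs running when one input is absent, at the cost of hiding the error, so it suits optional inputs better than required ones.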

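2. Comparing the content of two txt files

2.1 Basic text comparison

Sometimes we need to check whether two txt files have the same content. The following code compares them line by line and prints every pair of lines that differs: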

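from itertools import zip_longest

def compare_txt_files(file1, file2):
    with open(file1, 'r', encoding='utf-8') as f1, open(file2, 'r', encoding='utf-8') as f2:
        content1 = f1.readlines()
        content2 = f2.readlines()
    # zip_longest keeps going after the shorter file ends, so extra lines are
    # reported too (the missing side shows up as None)
    for lineno, (line1, line2) in enumerate(zip_longest(content1, content2), start=1):
        if line1 != line2:
            print(f'Difference found at line {lineno}:\nFile1: {line1!r}\nFile2: {line2!r}')
# Example call
compare_txt_files('file1.txt', 'file2.txt')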
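
When only a yes/no answer is needed, the standard library's filecmp module may be enough. The sketch below is an alternative to the line-by-line loop, not the article's own method, and the wrapper name files_are_identical is an assumption.

```python
import filecmp

def files_are_identical(file1, file2):
    """Return True if the two files have exactly the same bytes."""
    # shallow=False forces a content comparison instead of trusting
    # os.stat() metadata (size, modification time)
    return filecmp.cmp(file1, file2, shallow=False)
```

Because this compares raw bytes, two files that differ only in line endings or encoding count as different.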

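2.2 Highlighting differences

To make the differences between two txt files easier to see, we can print them as a unified diff, in which removed lines are prefixed with - and added lines with +. The difflib library does this: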

import difflib

def highlight_differences(file1, file2):
    with open(file1, 'r', encoding='utf-8') as f1, open(file2, 'r', encoding='utf-8') as f2:
        content1 = f1.readlines()
        content2 = f2.readlines()

    # fromfile/tofile are only labels shown in the diff header
    diff = difflib.unified_diff(content1, content2, fromfile='file1', tofile='file2')
    for line in diff:
        # Diff lines already end with '\n', so suppress print's own newline
        print(line, end='')
# Example call
highlight_differences('file1.txt', 'file2.txt')
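
If only the changed lines matter and the surrounding context does not, difflib.ndiff can be filtered by its two-character line prefixes. This is a small sketch on in-memory lists. The helper name changed_lines and the sample data are made up for illustration.

```python
import difflib

def changed_lines(old_lines, new_lines):
    """Return only the removed ('- ') and added ('+ ') lines from an ndiff."""
    diff = difflib.ndiff(old_lines, new_lines)
    # Lines starting with '  ' are unchanged and '? ' lines are hints; skip both
    return [line.rstrip('\n') for line in diff if line.startswith(('- ', '+ '))]

old = ['apple\n', 'banana\n', 'cherry\n']
new = ['apple\n', 'blueberry\n', 'cherry\n']
for line in changed_lines(old, new):
    print(line)
```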

3. Filtering txt file content

3.1 Filtering out lines that contain a keyword

When processing a txt file, you may need to drop the lines that contain a particular keyword. Here is an example:

def filter_lines_by_keyword(filepath, keyword):
    with open(filepath, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    
    filtered_lines = [line for line in lines if keyword not in line]
    return filtered_lines
# Example call
filtered = filter_lines_by_keyword('example.txt', 'filter_keyword')
for line in filtered:
    print(line.strip())
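
The same idea extends to several keywords and to case-insensitive matching. The sketch below also yields lines lazily instead of loading the whole file, which helps with large inputs. The function name and the ignore_case parameter are assumptions for illustration.

```python
def iter_lines_without_keywords(filepath, keywords, ignore_case=True):
    """Yield the lines of filepath that contain none of the given keywords."""
    needles = [k.lower() if ignore_case else k for k in keywords]
    with open(filepath, 'r', encoding='utf-8') as file:
        for line in file:
            haystack = line.lower() if ignore_case else line
            if not any(needle in haystack for needle in needles):
                yield line
```

A caller can loop over the generator directly, or wrap it in list(...) when the filtered lines are needed more than once.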

3.2 Filtering out blank lines and comment lines

Sometimes you need to drop blank lines and comment lines (for example, lines that start with #). The following code does this:

def filter_empty_and_comment_lines(filepath):
    with open(filepath, 'r', encoding='utf-8') as file:
        lines = file.readlines()
    
    filtered_lines = [line for line in lines if line.strip() and not line.strip().startswith('#')]
    return filtered_lines
# Example call
filtered = filter_empty_and_comment_lines('example.txt')
for line in filtered:
    print(line.strip())

4. Merging multiple txt files

4.1 Simple merge

To concatenate the content of several txt files into one file, use the following code:

def merge_txt_files(file_list, output_file):
    with open(output_file, 'w', encoding='utf-8') as outfile:
        for file in file_list:
            with open(file, 'r', encoding='utf-8') as infile:
                outfile.write(infile.read())
                outfile.write('\n')
# Example call
merge_txt_files(['file1.txt', 'file2.txt', 'file3.txt'], 'merged.txt')
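
Listing every file by hand gets tedious when a folder holds many of them. The following sketch merges every *.txt file found in one directory, in name order. The function name merge_txt_in_directory is an assumption, and it skips the output file itself in case it lives in the same folder.

```python
from pathlib import Path

def merge_txt_in_directory(directory, output_file):
    """Concatenate every *.txt file in `directory` (sorted by name) into output_file."""
    output_path = Path(output_file).resolve()
    # Collect the inputs before opening the output, and never read the output itself
    sources = sorted(p for p in Path(directory).glob('*.txt') if p.resolve() != output_path)
    with open(output_path, 'w', encoding='utf-8') as outfile:
        for path in sources:
            text = path.read_text(encoding='utf-8')
            outfile.write(text)
            # Add a newline only when the file does not already end with one
            if text and not text.endswith('\n'):
                outfile.write('\n')
    return [p.name for p in sources]
```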

4.2 Interleaving files line by line

To interleave several files, taking one line from each file in turn, use the following code:

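from contextlib import ExitStack

def merge_files_by_line(file_list, output_file):
    # ExitStack closes every opened file, even if an error occurs partway through
    with ExitStack() as stack:
        files = [stack.enter_context(open(file, 'r', encoding='utf-8')) for file in file_list]
        outfile = stack.enter_context(open(output_file, 'w', encoding='utf-8'))
        while True:
            lines = [file.readline() for file in files]
            if all(line == '' for line in lines):  # every file is exhausted
                break
            for line in lines:
                if line:
                    # Strip only the line ending so leading indentation is preserved
                    outfile.write(line.rstrip('\n') + '\n')
# Example call
merge_files_by_line(['file1.txt', 'file2.txt', 'file3.txt'], 'merged_by_line.txt')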
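
The round-robin order is easier to see on in-memory lists. This sketch uses itertools.zip_longest with a sentinel so that shorter inputs simply drop out. The helper name interleave and the sample data are illustrative only.

```python
from itertools import zip_longest

def interleave(*line_lists):
    """Round-robin merge: first item of each list, then the second of each, and so on."""
    missing = object()  # sentinel marking "this list has run out"
    merged = []
    for group in zip_longest(*line_lists, fillvalue=missing):
        merged.extend(line for line in group if line is not missing)
    return merged

print(interleave(['a1', 'a2', 'a3'], ['b1'], ['c1', 'c2']))
# prints ['a1', 'b1', 'c1', 'a2', 'c2', 'a3']
```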

5. Converting txt files to other formats

5.1 Converting to CSV

Sometimes we need to convert the content of a txt file to CSV for further processing or analysis. Here is an example:

import csv

def txt_to_csv(txt_file, csv_file):
    with open(txt_file, 'r', encoding='utf-8') as infile, \
         open(csv_file, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.writer(outfile)
        for line in infile:
            # split() with no arguments splits on any run of whitespace
            writer.writerow(line.strip().split())
# Example call
txt_to_csv('example.txt', 'output.csv')

This code reads the txt file line by line and splits each line on whitespace (spaces or tabs) to form the CSV fields.
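
Splitting on any whitespace breaks fields that themselves contain spaces, such as "New York". If the txt file uses one explicit delimiter, for instance a tab, a sketch like the following keeps such fields intact. The function name and the tab default are assumptions.

```python
import csv

def delimited_txt_to_csv(txt_file, csv_file, delimiter='\t'):
    """Convert a delimiter-separated txt file to CSV, keeping spaces inside fields."""
    with open(txt_file, 'r', newline='', encoding='utf-8') as infile, \
         open(csv_file, 'w', newline='', encoding='utf-8') as outfile:
        reader = csv.reader(infile, delimiter=delimiter)
        writer = csv.writer(outfile)
        for row in reader:
            if row:  # skip blank lines
                writer.writerow(row)
```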

5.2 Converting to JSON

Besides CSV, JSON is another common data storage format. The following example converts a txt file to JSON:

import json

def txt_to_json(txt_file, json_file):
    data = []
    with open(txt_file, 'r', encoding='utf-8') as infile:
        for line in infile:
            data.append(line.strip())

    with open(json_file, 'w', encoding='utf-8') as outfile:
        # ensure_ascii=False keeps non-ASCII text readable instead of \uXXXX escapes
        json.dump(data, outfile, indent=4, ensure_ascii=False)
# Example call
txt_to_json('example.txt', 'output.json')

This code stores each line of the txt file as one element of a JSON array.
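
When each line has the shape "key: value", a JSON object is often more useful than an array of strings. The sketch below assumes that line format. The function name and the separator parameter are illustrative, and lines without the separator are ignored.

```python
import json

def key_value_txt_to_json(txt_file, json_file, separator=':'):
    """Turn 'key: value' lines into a single JSON object and return it as a dict."""
    record = {}
    with open(txt_file, 'r', encoding='utf-8') as infile:
        for line in infile:
            # partition splits at the first separator only, so values may contain ':'
            key, sep, value = line.partition(separator)
            if sep:
                record[key.strip()] = value.strip()
    with open(json_file, 'w', encoding='utf-8') as outfile:
        json.dump(record, outfile, indent=4, ensure_ascii=False)
    return record
```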

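6. Extracting data from txt files

6.1 Extracting text that matches a pattern

Sometimes we need to pull out the text in a txt file that matches a particular pattern. Regular expressions (the re library) handle this. The following example extracts every match of a given pattern: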

import re
def extract_pattern_from_txt(pattern, txt_file):
    matches = []
    with open(txt_file, 'r', encoding='utf-8') as file:
        content = file.read()
        matches = re.findall(pattern, content)
    return matches
# Example call: extract all numbers
pattern = r'\d+'
matches = extract_pattern_from_txt(pattern, 'example.txt')
print("Match found:", matches)
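
For structured lines, named groups return each match as labeled fields instead of a flat list of strings. The sketch below assumes a hypothetical log format of "YYYY-MM-DD LEVEL message". The pattern and the function name are illustrative and not taken from the original article.

```python
import re

# Hypothetical log format: "2024-08-17 INFO something happened"
LOG_PATTERN = re.compile(r'(?P<date>\d{4}-\d{2}-\d{2}) (?P<level>[A-Z]+) (?P<message>.*)')

def parse_log_lines(lines):
    """Return one dict per line that matches LOG_PATTERN; other lines are skipped."""
    records = []
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match:
            records.append(match.groupdict())
    return records
```

Each returned dict has the keys date, level, and message, which makes the result easy to pass on to the csv or json scripts from section 5.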

6.2 Extracting email addresses or URLs

A similar approach extracts email addresses or URLs:

import re

def extract_emails_and_urls(txt_file):
    # Simple patterns that cover common cases; they do not validate every legal address or URL
    email_pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}'
    url_pattern = r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
    with open(txt_file, 'r', encoding='utf-8') as file:
        content = file.read()

    emails = re.findall(email_pattern, content)
    urls = re.findall(url_pattern, content)
    return emails, urls
# Example call
emails, urls = extract_emails_and_urls('example.txt')
print("Emails found:", emails)
print("URLs found:", urls)

7. Counting word frequencies in a txt file

7.1 Counting word occurrences

We can count how often each word appears in a txt file and sort the words by frequency. The following example shows how:

from collections import Counter

def count_word_frequency(txt_file):
    with open(txt_file, 'r', encoding='utf-8') as file:
        words = file.read().split()
        word_freq = Counter(words)
    return word_freq
# Example call
word_freq = count_word_frequency('example.txt')
for word, freq in word_freq.most_common():
    print(f'{word}: {freq}')
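
Splitting on whitespace alone treats "Cat", "cat" and "cat." as three different words. A minimal sketch of a normalized count follows. It lowercases the text and keeps only runs of ASCII letters, digits and apostrophes, so it suits English text only (Chinese text would need a word segmenter). The function name is an assumption.

```python
import re
from collections import Counter

def count_words_normalized(text):
    """Count words case-insensitively, ignoring surrounding punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(words)

print(count_words_normalized("The cat saw the Cat. The end!").most_common(2))
# prints [('the', 3), ('cat', 2)]
```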

7.2 Generating a word cloud

For a visual view of the word frequency distribution, generate a word cloud:

from wordcloud import WordCloud
import matplotlib.pyplot as plt

def generate_word_cloud(txt_file):
    with open(txt_file, 'r', encoding='utf-8') as file:
        text = file.read()        
    wordcloud = WordCloud(width=800, height=400, background_color='white').generate(text)
    plt.figure(figsize=(10, 5))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis('off')
    plt.show()
# Example call
generate_word_cloud('example.txt')

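8. Generating txt reports automatically

8.1 Generating a report from a template

A txt template can be turned into a report by filling in dynamic data. In the example below, placeholders in the template are written as {{ key }} and are replaced with the matching values: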

def generate_report_from_template(template_file, output_file, data):
    with open(template_file, 'r', encoding='utf-8') as infile, \
         open(output_file, 'w', encoding='utf-8') as outfile:
        content = infile.read()
        for key, value in data.items():
            # The f-string below produces the literal placeholder text, e.g. {{ name }}
            content = content.replace(f'{{{{ {key} }}}}', str(value))
        outfile.write(content)
# Example call
data = {
    'name': 'Alice',
    'date': '2024-08-17',
    'summary': 'This is a summary of the report.',
}
generate_report_from_template('template.txt', 'report.txt', data)
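
The standard library's string.Template offers a similar substitution mechanism with $name placeholders. This is an alternative to the {{ key }} convention used above, not the article's own method. The helper name render_report and the sample template are illustrative.

```python
from string import Template

def render_report(template_text, data):
    """Fill $placeholders in template_text from the data mapping."""
    # safe_substitute leaves unknown placeholders untouched instead of raising KeyError
    return Template(template_text).safe_substitute(data)

template_text = 'Report for $name on $date\n$summary\nReviewer: $reviewer\n'
print(render_report(template_text, {'name': 'Alice', 'date': '2024-08-17', 'summary': 'All good.'}))
```

Here $reviewer has no value in the data, so it stays in the output as is, which makes missing fields easy to spot in the finished report.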

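8.2 Generating report content dynamically

Sometimes the report content itself has to be built dynamically. The following example writes a report from a list of sections, each with a title and content: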

def generate_dynamic_report(output_file, sections):
    with open(output_file, 'w', encoding='utf-8') as outfile:
        for section in sections:
            outfile.write(f'# {section["title"]}\n\n')
            outfile.write(f'{section["content"]}\n\n')
# Example call
sections = [{"title": "Introduction",
        "content": "This is the introduction section of the report."},
    {"title": "Data Analysis",
        "content": "This section contains the analysis of the data."}]
generate_dynamic_report('dynamic_report.txt', sections)

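9. Conclusion

In this article you have seen several ways to automate office work on txt files with Python: reading, comparing, filtering, merging, converting formats, extracting data, counting word frequencies, and generating reports. These techniques improve efficiency and lay a solid foundation for data analysis work.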
