Python 3 regular expressions: how to use re.split, re.finditer and re.findall

Updated: 2018-06-11 14:57:08 Author: Citizen_Wang

This article introduces the Python 3 regular expression functions re.split, re.finditer and re.findall. It uses worked examples to cover what each function does, its parameters, how to call it, and the points to watch out for.

The examples below show how re.split, re.finditer and re.findall are used in Python 3:

re.split re.finditer re.findall


See the official documentation of the re module.

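The re.compile() function

Compiles a regular expression pattern and returns a pattern object. A frequently used regular expression can be compiled once into a pattern object, which makes later calls more convenient and more efficient.

re.compile is the function at the core of the re module. The module-level functions such as re.split and re.findall accept either a compiled pattern object or a pattern string; given a string, they compile it internally.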

re.compile(pattern, flags=0)

pattern the expression string to compile
flags compilation flags that modify how the pattern matches. Several flags can be combined with |, for example re.I | re.M
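
As a minimal sketch of the two calling styles, the snippet below runs the same pattern once through a compiled object and once through the module-level function. The sample text and the variable names are made up for illustration.

```python
import re

# Hypothetical sample text, used only for this illustration.
text = 'Regex and REGEX and regexp'

# Style 1: compile once, then call methods on the pattern object.
compiled = re.compile(r'regex\w*', re.I)
via_object = compiled.findall(text)

# Style 2: pass the pattern string and the flags to the module-level function.
via_module = re.findall(r'regex\w*', text, flags=re.I)

print(via_object)  # ['Regex', 'REGEX', 'regexp']
print(via_module)  # ['Regex', 'REGEX', 'regexp']
```

Both calls give the same list. The compiled object is mainly worth it when the same pattern is used many times.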

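The flags parameter

re.I (re.IGNORECASE)
Makes matching case-insensitive.

re.L (re.LOCALE)
Locale-aware matching. In Python 3 this flag can only be used with bytes patterns.

re.M (re.MULTILINE)
Multi-line matching; changes the behaviour of ^ and $ so that they also match at the start and end of each line.

re.S (re.DOTALL)
Makes . match every character, including the newline.

re.U (re.UNICODE)
Interprets characters according to the Unicode character set. It affects \w, \W, \b and \B. In Python 3 this is already the default for str patterns.

re.X (re.VERBOSE)
Ignores whitespace and allows comments inside the pattern, so that a regular expression can be laid out in a more readable way.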
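
The effect of the most common flags can be sketched as follows. The two-line sample text and all variable names are assumptions chosen for this illustration.

```python
import re

# Hypothetical two-line sample text.
text = 'first line\nSecond line'

# re.M: ^ matches at the start of every line, not only at the start of the string.
line_starts = re.findall(r'^\w+', text, flags=re.M)

# re.S: . also matches the newline character.
across_lines = re.findall(r'line.Second', text, flags=re.S)

# re.I | re.M: flags are combined with the bitwise OR operator.
ci_starts = re.findall(r'^s\w+', text, flags=re.I | re.M)

# re.X: whitespace and comments inside the pattern are ignored.
verbose = re.compile(r'''
    (\w+)   # a word
    \s      # one whitespace character
    line    # the literal text line
''', re.X)
words = verbose.findall(text)

print(line_starts)   # ['first', 'Second']
print(across_lines)  # ['line\nSecond']
print(ci_starts)     # ['Second']
print(words)         # ['first', 'Second']
```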

Example:

import re
content = 'Citizen wang , always fall in love with neighbour，WANG'
rr = re.compile(r'wan\w', re.I) # case-insensitive
print(type(rr))
a = rr.findall(content)
print(type(a))
print(a)

findall returns a list object. On Python 3.7 and later the first line of the output reads <class 're.Pattern'> instead.

<class '_sre.SRE_Pattern'>
<class 'list'>
['wang', 'WANG']

The re.split function

Splits string at every match of pattern and returns the pieces as a list.

re.split(pattern, string, maxsplit=0, flags=0)

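pattern a compiled pattern object from re.compile, or a pattern string
string the string to split
maxsplit the maximum number of splits; if omitted (0), the string is split at every match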

import re
text = 'say hello world! hello python'
text_nm = 'one1two2three3four4'
pattern = re.compile(r'(?P<space>\s)') # matches one whitespace character; the capturing group keeps it in the result
pattern_nm = re.compile(r'(?P<digits>\d+)') # matches one or more digits
match = re.split(pattern, text)
match_nm = re.split(pattern_nm, text_nm, maxsplit=1) # split at most once
print(match)
print(match_nm)

Result:

['say', ' ', 'hello', ' ', 'world!', ' ', 'hello', ' ', 'python']
['one', '1', 'two2three3four4']
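
The separators appear in the results above because the pattern contains a capturing group. The following sketch compares splitting with and without a group and shows the effect of maxsplit. The variable names are chosen for this illustration only.

```python
import re

text = 'one1two2three3four4'

# Capturing group: the digits that separate the words are kept in the result.
with_group = re.split(r'(\d+)', text)

# No group: only the pieces between the separators are returned.
without_group = re.split(r'\d+', text)

# maxsplit=2 stops after two splits; the rest of the string stays in one piece.
limited = re.split(r'\d+', text, maxsplit=2)

print(with_group)     # ['one', '1', 'two', '2', 'three', '3', 'four', '4', '']
print(without_group)  # ['one', 'two', 'three', 'four', '']
print(limited)        # ['one', 'two', 'three3four4']
```

The trailing empty string appears because the text ends with a separator.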

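The re.findall() method

Returns a list containing every match found in the string.

pattern the pattern to match, a compiled object from re.compile or a pattern string
string the string to search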

import re
text = 'say hello world! hello python'
pattern = re.compile(r'(?P<first>h\w)(?P<symbol>l+)(?P<last>o\s)') # three named groups: first = 'he', symbol = 'll', last = 'o '
match = re.findall(pattern, text)
print(match)

Result:

[('he', 'll', 'o '), ('he', 'll', 'o ')]

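re.finditer and re.findall

re.finditer(pattern, string[, flags=0])
re.findall(pattern, string[, flags=0])

pattern a compiled pattern object from re.compile, or a pattern string
string the string to search

findall returns a list of all matches. When the pattern contains more than one group, each list item is a tuple of the group matches.

finditer returns an iterator that yields one match object per match.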
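
Because finditer yields match objects rather than strings, each result carries more information than a findall item. A small sketch, reusing the pattern from the findall example above (the list names are chosen for this illustration):

```python
import re

text = 'say hello world! hello python'
pattern = re.compile(r'(?P<first>h\w)(?P<symbol>l+)(?P<last>o\s)')

for m in pattern.finditer(text):
    # group() is the whole match, span() its start and end offsets,
    # groupdict() maps each group name to the text it captured.
    print(m.group(), m.span(), m.groupdict())

spans = [m.span() for m in pattern.finditer(text)]
named = [m.groupdict() for m in pattern.finditer(text)]
print(spans)  # [(4, 10), (17, 23)]
```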

Example 1:

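import re
content = '''email:123456@163.com
email:234567@163.com
email:345678@163.com
'''
pattern = re.compile(r'\d+@\w+\.com') # get a pattern object from re.compile; \. matches a literal dot
result_finditer = re.finditer(pattern, content)
print(type(result_finditer))
print(result_finditer) # finditer returns an iterator
for i in result_finditer: # each i is a match object, so use i.group() to get the matched text
    print(i.group())
result_findall = re.findall(pattern, content)
print(type(result_findall)) # findall returns a list
print(result_findall)
for p in result_finditer: # prints nothing: the iterator was exhausted by the first loop
    print(p)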

Output:

<class 'callable_iterator'>
<callable_iterator object at 0x10545ec88>
123456@163.com
234567@163.com
345678@163.com
<class 'list'>
['123456@163.com', '234567@163.com', '345678@163.com']

As the output shows, finditer returns an iterator, while findall returns a list.
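
One practical consequence is that the iterator from finditer can be consumed only once. The sketch below illustrates this with a shortened, hypothetical sample string. Wrapping the iterator in list() is one way to keep the matches for repeated use.

```python
import re

# Shortened sample data, assumed for this illustration.
content = 'email:123456@163.com\nemail:234567@163.com\n'
pattern = re.compile(r'\d+@\w+\.com')

it = pattern.finditer(content)
first_pass = [m.group() for m in it]
second_pass = [m.group() for m in it]  # the iterator is already consumed

# Materialise the matches in a list if they must be visited more than once.
matches = list(pattern.finditer(content))

print(first_pass)    # ['123456@163.com', '234567@163.com']
print(second_pass)   # []
print(len(matches))  # 2
```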

Example 2:

import re
content = '''email:123456@163.com
email:234567@163.com
email:345678@163.com
'''
pattern = re.compile(r'(?P<number>\d+)@(?P<mail_type>\w+)\.com')
result_finditer = re.finditer(pattern, content)
print(type(result_finditer))
print(result_finditer)
iter_dict = {} # collects the final results
for i in result_finditer:
    print('Mailbox number:', i.group(1), 'Mail type:', i.group(2))
    number = i.group(1)
    mail_type = i.group(2)
    iter_dict.setdefault(number, mail_type) # dict.setdefault adds the key only if it is not already present
print(iter_dict)
print('+++++++++++++++++++++++++++++++')
result_findall = re.findall(pattern, content)
print(result_findall)
print(type(result_findall))

Output:

<class 'callable_iterator'>
<callable_iterator object at 0x104c5cbe0>
Mailbox number: 123456 Mail type: 163
Mailbox number: 234567 Mail type: 163
Mailbox number: 345678 Mail type: 163
{'123456': '163', '234567': '163', '345678': '163'}
+++++++++++++++++++++++++++++++
[('123456', '163'), ('234567', '163'), ('345678', '163')]
<class 'list'>

Each match object i yielded by finditer also provides the lastindex and lastgroup attributes.

print('lastgroup, the name of the last captured group:', i.lastgroup)
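
A self-contained sketch of these two attributes, using the pattern from Example 2 on a single address and re.search instead of a loop (an assumption made to keep the example short):

```python
import re

pattern = re.compile(r'(?P<number>\d+)@(?P<mail_type>\w+)\.com')
m = pattern.search('email:123456@163.com')

print(m.lastindex)  # 2, the index of the last group that took part in the match
print(m.lastgroup)  # mail_type, the name of that group (None if it is unnamed)
print(m.group('number'), m.group('mail_type'))  # 123456 163
```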

When the pattern has no groups, findall returns the full matches. With the content string from Example 2:

re.findall(r"\d+@\w+\.com", content)
['123456@163.com', '234567@163.com', '345678@163.com']

When the pattern has one group, findall returns only what that group matched:

re.findall(r"(\d+)@\w+\.com", content)
['123456', '234567', '345678']

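When the pattern has several groups, each match becomes a tuple of the group matches, and the tuples are collected in a list:

re.findall(r"(\d+)@(\w+)\.com", content)
[('123456', '163'), ('234567', '163'), ('345678', '163')]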
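
If a group is needed only for grouping or alternation, a non-capturing group (?:...) leaves the findall result unchanged. The sketch below uses made-up sample addresses to compare the two forms.

```python
import re

# Hypothetical sample data with two different mail domains.
content = 'email:123456@163.com\nemail:234567@qq.com\n'

# Non-capturing group: findall still returns the whole match.
whole = re.findall(r'\d+@(?:163|qq)\.com', content)

# The same alternation as a capturing group: findall returns only the group.
only_group = re.findall(r'\d+@(163|qq)\.com', content)

print(whole)       # ['123456@163.com', '234567@qq.com']
print(only_group)  # ['163', 'qq']
```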
