How to use Python's urllib module

Updated: 2025-08-11 15:12:45 Author: 高級測試工程師歐陽 (Senior Test Engineer Ouyang)

URL handling libraries

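Python offers several libraries for working with URLs. The most common are urllib, requests, and urlparse (a separate module in Python 2, now urllib.parse in Python 3). Their main features and usage are outlined below.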

The urllib module

urllib is part of Python's standard library and contains several submodules for URL-related operations:

from urllib.request import urlopen
from urllib.parse import urlparse, urljoin
# Open a URL and read its content (the with-statement closes the connection)
with urlopen('https://www.example.com') as response:
    content = response.read()  # raw bytes
# Parse a URL into its components
parsed_url = urlparse('https://www.example.com/path?query=123')
print(parsed_url.scheme)  # 'https'
print(parsed_url.netloc)  # 'www.example.com'
print(parsed_url.path)    # '/path'
print(parsed_url.query)   # 'query=123'
# Join a base URL and a relative URL
base_url = 'https://www.example.com/path'
relative_url = 'subpath'
full_url = urljoin(base_url, relative_url)
print(full_url)  # 'https://www.example.com/subpath' (the base has no trailing slash, so 'path' is replaced)
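
Real requests can fail, so it is often worth wrapping urlopen in a small helper. The following is a minimal sketch, not a definitive implementation: the function name fetch_bytes, the 10-second default timeout, and the User-Agent string are all assumptions made for illustration.

```python
from urllib.error import URLError
from urllib.request import Request, urlopen


def fetch_bytes(url, timeout=10.0):
    """Return the response body as bytes, or None if urllib raises URLError.

    fetch_bytes and the User-Agent value are hypothetical names for this sketch.
    """
    request = Request(url, headers={"User-Agent": "urllib-example/0.1"})
    try:
        # The with-statement closes the response even if read() raises.
        with urlopen(request, timeout=timeout) as response:
            return response.read()
    except URLError:
        # HTTPError (raised for 4xx/5xx responses) is a subclass of URLError,
        # so it is caught here as well.
        return None
```

Calling fetch_bytes('https://www.example.com') would return the page body as bytes when the request succeeds. Returning None keeps the sketch short; code that needs to tell a 404 from a DNS failure would catch HTTPError separately and inspect its code attribute.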

The requests library

requests is a third-party library (installed with pip install requests) that provides a more concise API for sending HTTP requests and handling URLs:

import requests
# Send a GET request
response = requests.get('https://www.example.com')
print(response.status_code)  # e.g. 200 on success
print(response.text)         # the HTML content as a string
# Send a POST request with form data
data = {'key': 'value'}
response = requests.post('https://www.example.com/post', data=data)
# Pass query parameters as a dict
params = {'query': 'python', 'page': 1}
response = requests.get('https://www.example.com/search', params=params)
print(response.url)  # 'https://www.example.com/search?query=python&page=1'
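
Where requests is not available, the effect of params on the final URL can be approximated with the standard library alone. The sketch below is an illustration under that assumption, not how requests is implemented internally; add_query_params is a hypothetical helper name.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_query_params(url, params):
    """Return url with params appended to any query string it already has.

    add_query_params is a hypothetical helper for this sketch.
    """
    parts = urlsplit(url)
    # Keep existing parameters (including blank ones) in their original order.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(params.items())
    # urlencode converts non-string values such as 1 with str().
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_query_params('https://www.example.com/search',
                       {'query': 'python', 'page': 1}))
# 'https://www.example.com/search?query=python&page=1'
```

Because the helper parses the existing query first, it also works on a URL that already carries parameters, which plain string concatenation with '?' would get wrong.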

urlparse and urljoin

urlparse and urljoin are functions in the urllib.parse module, dedicated to parsing and joining URLs:

from urllib.parse import urlparse, urljoin
# Parse a URL that includes a port, a query string, and a fragment
url = 'https://www.example.com:8080/path/to/page?query=python#section'
parsed = urlparse(url)
print(parsed.scheme)   # 'https'
print(parsed.netloc)   # 'www.example.com:8080'
print(parsed.path)     # '/path/to/page'
print(parsed.query)    # 'query=python'
print(parsed.fragment) # 'section'
# Join a base URL and a relative URL
base = 'https://www.example.com/path/'
relative = 'subpath'
full_url = urljoin(base, relative)
print(full_url)  # 'https://www.example.com/path/subpath' (the trailing slash on the base keeps 'path')
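
The result of urljoin depends on whether the base ends with a slash and on the form of the relative part, which is easy to overlook. The sketch below collects the common cases side by side and also shows how the netloc and query string can be broken down further; the variable names are arbitrary.

```python
from urllib.parse import parse_qs, urljoin, urlparse

# Without a trailing slash, the last path segment of the base is replaced.
no_slash = urljoin('https://www.example.com/path', 'subpath')
print(no_slash)    # 'https://www.example.com/subpath'

# With a trailing slash, the relative part is appended to the base path.
with_slash = urljoin('https://www.example.com/path/', 'subpath')
print(with_slash)  # 'https://www.example.com/path/subpath'

# A relative part that starts with '/' replaces the whole path.
rooted = urljoin('https://www.example.com/path/', '/subpath')
print(rooted)      # 'https://www.example.com/subpath'

# An absolute URL as the second argument replaces the base entirely.
absolute = urljoin('https://www.example.com/path/', 'https://other.example.org/x')
print(absolute)    # 'https://other.example.org/x'

# netloc can be split into host and port, and the query into a dict of lists.
parsed = urlparse('https://www.example.com:8080/path/to/page?query=python#section')
print(parsed.hostname)         # 'www.example.com'
print(parsed.port)             # 8080
print(parse_qs(parsed.query))  # {'query': ['python']}
```

parse_qs returns a list for every key because a query string may repeat the same parameter name.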

Encoding and decoding URLs

Special characters in a URL, such as spaces, must be percent-encoded before the URL is used and decoded when it is read back:

from urllib.parse import quote, unquote, urlencode
# Percent-encode a string
encoded = quote('python url example')
print(encoded)  # 'python%20url%20example'
# Decode a percent-encoded string
decoded = unquote('python%20url%20example')
print(decoded)  # 'python url example'
# Encode a dict as a query string (spaces become '+')
params = {'q': 'python url', 'page': 1}
encoded_params = urlencode(params)
print(encoded_params)  # 'q=python+url&page=1'
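
A few details of these functions tend to cause surprises: quote leaves '/' unescaped by default, query strings use '+' for spaces, and a key may carry several values. The following sketch illustrates those cases with urllib.parse functions; the sample strings are arbitrary.

```python
from urllib.parse import parse_qs, quote, quote_plus, unquote_plus, urlencode

# By default quote() treats '/' as safe, which suits path segments.
path_encoded = quote('a/b c')
print(path_encoded)     # 'a/b%20c'

# Pass safe='' to escape '/' as well, e.g. for a single path component.
strict_encoded = quote('a/b c', safe='')
print(strict_encoded)   # 'a%2Fb%20c'

# Non-ASCII text is encoded as UTF-8 bytes by default.
chinese_encoded = quote('中文')
print(chinese_encoded)  # '%E4%B8%AD%E6%96%87'

# quote_plus()/unquote_plus() use '+' for spaces, as in form-style query strings.
plus_encoded = quote_plus('python url')
print(plus_encoded)                # 'python+url'
print(unquote_plus(plus_encoded))  # 'python url'

# doseq=True expands a list value into repeated keys.
multi = urlencode({'tag': ['a', 'b']}, doseq=True)
print(multi)            # 'tag=a&tag=b'

# parse_qs() reverses urlencode(); every value comes back as a list of strings.
decoded_params = parse_qs('q=python+url&page=1')
print(decoded_params)   # {'q': ['python url'], 'page': ['1']}
```

Note that numbers do not survive the round trip as numbers: page comes back as the string '1' and has to be converted explicitly.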

Summary

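Python provides several tools for working with URLs, including the standard-library urllib and the third-party requests library. The urllib.parse module is suited to parsing and joining URLs, while requests is better suited to sending HTTP requests and handling responses. Choosing the tool that matches the task makes URL-related work more efficient.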
