Iterating over a pandas.DataFrame with for loops

Updated: 2023-02-22 10:23:44  Author: 餃子大人

This article shows how to iterate over a pandas.DataFrame with for loops, column by column and row by row, with example code and output for each approach.

Using a pandas.DataFrame directly in a for loop

When a pandas.DataFrame is used directly in a for loop, it yields the column names (column labels) in order.
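The article never shows how df is created. A minimal sketch of a DataFrame that reproduces the printed outputs follows; the constructor call is an assumption reconstructed from those outputs, not the author's original code.

```python
import pandas as pd

# Hypothetical reconstruction of `df`, inferred from the outputs printed in this article.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

print(df)
#        age state  point
# Alice   24    NY     64
# Bob     42    CA     92
```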

for column_name in df:
    print(type(column_name))
    print(column_name)
    print('======\n')
# <class 'str'>
# age
# ======
# 
# <class 'str'>
# state
# ======
# 
# <class 'str'>
# point
# ======
#

Under the hood, the for loop calls the __iter__() method.

for column_name in df.__iter__():
    print(type(column_name))
    print(column_name)
    print('======\n')
# <class 'str'>
# age
# ======
# 
# <class 'str'>
# state
# ======
# 
# <class 'str'>
# point
# ======
#
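Because iterating a DataFrame yields its column labels, passing it to list() is a compact way to collect them. A small sketch, assuming the same reconstructed df as in the rest of the examples (its definition is inferred from the printed outputs):

```python
import pandas as pd

# Assumed sample data, reconstructed from the article's outputs.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

column_names = list(df)  # consumes DataFrame.__iter__()
print(column_names)
# ['age', 'state', 'point']

# The same labels are available through the columns attribute.
print(column_names == list(df.columns))
# True
```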

Iterating column by column

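DataFrame.items()

With the items() method, you get (column name, Series) tuples one column at a time: the column name and that column's data as a pandas.Series. The older iteritems() method was deprecated in pandas 1.5 and removed in pandas 2.0; items() is the replacement.

A pandas.Series lets you retrieve a row's value by index label, by position, or by attribute access.

for column_name, item in df.items():
    print(type(column_name))
    print(column_name)
    print('~~~~~~')

    print(type(item))
    print(item)
    print('------')

    print(item['Alice'])  # by index label
    print(item.iloc[0])   # by position; item[0] is deprecated on a label-indexed Series
    print(item.Alice)     # by attribute
    print('======\n')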
# <class 'str'>
# age
# ~~~~~~
# <class 'pandas.core.series.Series'>
# Alice    24
# Bob      42
# Name: age, dtype: int64
# ------
# 24
# 24
# 24
# ======
#
# <class 'str'>
# state
# ~~~~~~
# <class 'pandas.core.series.Series'>
# Alice    NY
# Bob      CA
# Name: state, dtype: object
# ------
# NY
# NY
# NY
# ======
#
# <class 'str'>
# point
# ~~~~~~
# <class 'pandas.core.series.Series'>
# Alice    64
# Bob      92
# Name: point, dtype: int64
# ------
# 64
# 64
# 64
# ======
#
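Column-wise iteration is handy when the same operation should run on every column. A minimal sketch that collects each column's maximum with items(); the df definition is an assumption reconstructed from the article's outputs, and maxima is a name introduced only for illustration.

```python
import pandas as pd

# Assumed sample data, reconstructed from the article's outputs.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

# Each iteration yields a (column name, Series) pair.
maxima = {}
for column_name, column in df.items():
    maxima[column_name] = column.max()
    print(column_name, maxima[column_name])
# age 42
# state NY
# point 92
```

For strings, max() compares lexicographically, which is why the state column reports NY.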

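Iterating row by row

There are two methods for retrieving one row at a time: iterrows() and itertuples(). Of the two, itertuples() is faster.

If you only need the values of specific columns, it is faster still to select those columns and iterate over them directly in the for loop, as described later in this article.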
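The speed claims can be checked with a rough timing sketch like the one below. The frame big, its size, and the three helper functions are hypothetical and exist only to make the difference measurable; actual timings depend on the machine and the pandas version, so no expected output is shown.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger frame, used only to make timing differences visible.
n = 2_000
big = pd.DataFrame({'age': np.arange(n), 'point': np.arange(n) * 2})


def with_iterrows():
    # Builds one Series per row.
    return sum(row['age'] + row['point'] for _, row in big.iterrows())


def with_itertuples():
    # Builds one namedtuple per row.
    return sum(row.age + row.point for row in big.itertuples())


def with_zip():
    # Iterates only over the two columns that are needed.
    return sum(a + p for a, p in zip(big['age'], big['point']))


for fn in (with_iterrows, with_itertuples, with_zip):
    seconds = timeit.timeit(fn, number=3)
    print(f'{fn.__name__}: {seconds:.4f} s')
```

All three functions compute the same total, so only the elapsed time differs.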

DataFrame.iterrows()

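With the iterrows() method, you get (index, Series) tuples one row at a time: the row label and that row's data as a pandas.Series.

A pandas.Series lets you retrieve a column's value by column name, by position, or by attribute access.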

for index, row in df.iterrows():
    print(type(index))
    print(index)
    print('~~~~~~')

    print(type(row))
    print(row)
    print('------')

    print(row['point'])  # by column name
    print(row.iloc[2])   # by position; row[2] is deprecated on a label-indexed Series
    print(row.point)     # by attribute
    print('======\n')
# <class 'str'>
# Alice
# ~~~~~~
# <class 'pandas.core.series.Series'>
# age      24
# state    NY
# point    64
# Name: Alice, dtype: object
# ------
# 64
# 64
# 64
# ======
#
# <class 'str'>
# Bob
# ~~~~~~
# <class 'pandas.core.series.Series'>
# age      42
# state    CA
# point    92
# Name: Bob, dtype: object
# ------
# 92
# 92
# 92
# ======
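The dtype: object in the output above appears because iterrows() packs each row into a single Series, so the per-column dtypes are not preserved. A minimal sketch of the same effect with numeric data; the frame nums and its column names are hypothetical and not part of the article's data.

```python
import pandas as pd

# Hypothetical frame with one integer column and one float column.
nums = pd.DataFrame({'n': [1, 2], 'ratio': [1.5, 2.5]})

print(nums['n'].dtype)
# int64

# The row Series must hold both values, so the integer is upcast to float.
_, first_row = next(nums.iterrows())
print(first_row.dtype)
# float64
print(first_row['n'])
# 1.0
```

When the original dtypes matter, itertuples() or iterating over the columns directly avoids this conversion.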

DataFrame.itertuples()

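With the itertuples() method, you get one tuple per row containing the index label (row name) followed by that row's data. The first element of the tuple is the index label.

By default, it returns a namedtuple named Pandas. Because it is a namedtuple, each element can be accessed by field name as well as by position.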

for row in df.itertuples():
    print(type(row))
    print(row)
    print('------')

    print(row[3])
    print(row.point)
    print('======\n')
# <class 'pandas.core.frame.Pandas'>
# Pandas(Index='Alice', age=24, state='NY', point=64)
# ------
# 64
# 64
# ======
#
# <class 'pandas.core.frame.Pandas'>
# Pandas(Index='Bob', age=42, state='CA', point=92)
# ------
# 92
# 92
# ======
#

If the name argument is None, a plain tuple is returned instead.

for row in df.itertuples(name=None):
    print(type(row))
    print(row)
    print('------')

    print(row[3])
    print('======\n')
# <class 'tuple'>
# ('Alice', 24, 'NY', 64)
# ------
# 64
# ======
#
# <class 'tuple'>
# ('Bob', 42, 'CA', 92)
# ------
# 92
# ======
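Besides name, itertuples() also accepts an index argument. The sketch below, which assumes the same reconstructed df, drops the index from each tuple with index=False and gives the namedtuple a custom name; Person is an arbitrary name chosen for illustration.

```python
import pandas as pd

# Assumed sample data, reconstructed from the article's outputs.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

# index=False omits the index label; name sets the namedtuple's type name.
people = list(df.itertuples(index=False, name='Person'))

for person in people:
    print(type(person).__name__, person.age, person.state, person.point)
# Person 24 NY 64
# Person 42 CA 92
```

With index=False, positions shift by one: person[0] is the age rather than the index label.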

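Retrieving values from specific columns

The iterrows() and itertuples() methods above retrieve every column of each row. If you only need particular columns, the following approach can be used instead.

Each column of a pandas.DataFrame is a pandas.Series.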

print(df['age'])
# Alice    24
# Bob      42
# Name: age, dtype: int64

print(type(df['age']))
# <class 'pandas.core.series.Series'>

Iterating over a pandas.Series in a for loop yields its values in order, so selecting a DataFrame column and looping over it yields that column's values in order.

for age in df['age']:
    print(age)
# 24
# 42

With the built-in zip() function, you can iterate over the values of several columns at once.

for age, point in zip(df['age'], df['point']):
    print(age, point)
# 24 64
# 42 92

To get the index (row labels), use the index attribute. As in the example above, it can be combined with other columns through zip().

print(df.index)
# Index(['Alice', 'Bob'], dtype='object')

print(type(df.index))
# <class 'pandas.core.indexes.base.Index'>

for index in df.index:
    print(index)
# Alice
# Bob

for index, state in zip(df.index, df['state']):
    print(index, state)
# Alice NY
# Bob CA
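Since zip() yields plain pairs, the same pattern can feed other built-ins directly. A short sketch, assuming the reconstructed df, that builds a lookup from row label to state; state_by_name is a name introduced only for this illustration.

```python
import pandas as pd

# Assumed sample data, reconstructed from the article's outputs.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

# Pair each row label with the value of one column.
state_by_name = dict(zip(df.index, df['state']))
print(state_by_name)
# {'Alice': 'NY', 'Bob': 'CA'}
```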

Updating values in a loop

The iterrows() method retrieves values row by row and returns a copy, not a view, so modifying the pandas.Series it yields does not update the original data.

for index, row in df.iterrows():
    row['point'] += row['age']

print(df)
#        age state  point
# Alice   24    NY     64
# Bob     42    CA     92

Selecting the element in the original DataFrame with at[] and modifying it there does update the data.

for index, row in df.iterrows():
    df.at[index, 'point'] += row['age']

print(df)
#        age state  point
# Alice   24    NY     88
# Bob     42    CA    134

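For more on at[], see the following article.

Getting and setting values at any position in pandas (at, iat, loc, iloc)

Note that the at[] example above is only an illustration. In many cases there is no need for a for loop to update elements or to add a new column based on existing columns; writing the operation without a for loop is simpler and faster.

The following does the same processing as the loop above. Note that it is applied to the object already updated above, so the values are updated a second time.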

df['point'] += df['age']
print(df)
#        age state  point
# Alice   24    NY    112
# Bob     42    CA    176
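To confirm that the loop and the column-wise expression are equivalent, both can be run on fresh copies of the same data. A minimal sketch, assuming the reconstructed df; looped and vectorized are names introduced only for this comparison.

```python
import pandas as pd

# Assumed sample data, reconstructed from the article's outputs.
df = pd.DataFrame(
    {'age': [24, 42], 'state': ['NY', 'CA'], 'point': [64, 92]},
    index=['Alice', 'Bob'],
)

# Row-by-row update through at[].
looped = df.copy()
for index, row in looped.iterrows():
    looped.at[index, 'point'] += row['age']

# The same update as a single column-wise operation.
vectorized = df.copy()
vectorized['point'] += vectorized['age']

print(looped['point'].tolist())
# [88, 134]
print(vectorized['point'].tolist())
# [88, 134]
```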

New columns can be added in the same way.

df['new'] = df['point'] + df['age'] * 2
print(df)
#        age state  point  new
# Alice   24    NY    112  160
# Bob     42    CA    176  260

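In addition to simple arithmetic, NumPy functions can be applied to every element of a column. Below is a square-root example. NumPy is imported directly here; the pd.np alias was deprecated in pandas 1.0 and removed in pandas 2.0.

import numpy as np

df['age_sqrt'] = np.sqrt(df['age'])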
print(df)
#        age state  point  new  age_sqrt
# Alice   24    NY    112  160  4.898979
# Bob     42    CA    176  260  6.480741

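For strings, the str accessor provides string methods that operate directly on a column (Series). Below is an example that converts the values to lowercase and extracts the first character.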

df['state_0'] = df['state'].str.lower().str[0]
print(df)
#        age state  point  new  age_sqrt state_0
# Alice   24    NY    112  160  4.898979       n
# Bob     42    CA    176  260  6.480741       c
