Converting rows to columns with pandas.DataFrame pivot() and unstack()

Updated: 2019-07-06 11:43:23  Author: Leohahah

This article shows how to convert rows to columns with the pivot() and unstack() methods of pandas.DataFrame. It is shared here as a reference.

Example: the TEST table, which stores one score per user and subject, needs to be converted from rows to columns.

The code is as follows:

# -*- coding:utf-8 -*-
import sys
from warnings import filterwarnings

import MySQLdb
import pandas as pd
from sqlalchemy import create_engine

# "create table if not exists" always raises a warning, so suppress it with filterwarnings
filterwarnings('ignore', category=MySQLdb.Warning)

# This script works on both Python 2 and Python 3
if sys.version_info.major < 3:
    reload(sys)
    sys.setdefaultencoding("utf-8")

host, port, user, passwd, db, charset = "192.168.1.193", 3306, "leo", "mysql", "test", "utf8"

 

def get_df():
    # Read the source rows from the TEST table into a DataFrame
    conn_config = {"host": host, "port": port, "user": user, "passwd": passwd, "db": db, "charset": charset}
    conn = MySQLdb.connect(**conn_config)
    try:
        result_df = pd.read_sql('select UserName,Subject,Score from TEST', conn)
    finally:
        # Close the connection even if the query fails
        conn.close()
    return result_df

 

def pivot(result_df):
    # Returns two DataFrames: one indexed by user name (used by unpivot) and
    # one with a numeric index (used by save_to_mysql)
    df_pivoted_init = result_df.pivot(index='UserName', columns='Subject', values='Score')
    # Turn the row index into an ordinary column so that it is stored in the database too
    df_pivoted = df_pivoted_init.reset_index()
    return df_pivoted_init, df_pivoted

 

def unpivot(df_pivoted_init):
    # Unpivoting walks the row and column labels of the 2-D table and writes one
    # record per cell, so it goes through the MySQLdb interface instead of save_to_mysql
    insert_sql = "insert into test_unpivot(UserName,Subject,Score) values (%s,%s,%s)"
    rows = []
    for col in df_pivoted_init.columns:
        for index in df_pivoted_init.index:
            value = df_pivoted_init.at[index, col]
            # Skip NaN cells, i.e. user/subject pairs that had no row in the source table
            if pd.notnull(value):
                rows.append((index, col, float(value)))
    conn_config = {"host": host, "port": port, "user": user, "passwd": passwd, "db": db, "charset": charset}
    conn = MySQLdb.connect(**conn_config)
    try:
        cur = conn.cursor()
        cur.execute("create table if not exists test_unpivot like TEST")
        # Parameterized insert: the driver does the quoting; nothing is sent when rows is empty
        if rows:
            cur.executemany(insert_sql, rows)
        conn.commit()
    finally:
        conn.close()

 

def save_to_mysql(df_pivoted, tablename):
    """
    to_sql accepts a raw connection object as con only for SQLite. Other databases
    need an engine created with SQLAlchemy; options such as the character set can be
    appended to the engine URL after "?".
    """
    conn = "mysql://%s:%s@%s:%d/%s?charset=%s" % (user, passwd, host, port, db, charset)
    mysql_engine = create_engine(conn)
    df_pivoted.to_sql(name=tablename, con=mysql_engine, if_exists='replace', index=False)

 

# Read the source data from the TEST table into a DataFrame
result_df = get_df()
# Pivot the source data (rows to columns) into a 2-D table
df_pivoted_init, df_pivoted = pivot(result_df)
# Save the 2-D table to a new table named test
save_to_mysql(df_pivoted, 'test')
# Unpivot the pivoted data and store it in the test_unpivot table
unpivot(df_pivoted_init)
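The unpivot step can also be written without nested loops. The sketch below is an alternative, not the author's method: it uses reset_index() and melt() to return to the long format and drops the NaN cells explicitly. As an assumption made only to keep the example self-contained, an in-memory sqlite3 database stands in for MySQL; with MySQLdb the placeholder would be %s rather than ?. The sample values are taken from the data shown later in the article.

```python
import sqlite3

import pandas as pd

# A small pivoted table shaped like df_pivoted_init (李四 has no 生物 score).
df_pivoted_init = pd.DataFrame(
    {"數學": [90.0, 92.0], "生物": [85.0, float("nan")]},
    index=pd.Index(["張三", "李四"], name="UserName"),
)
df_pivoted_init.columns.name = "Subject"

# reset_index + melt turns the grid back into one row per (UserName, Subject);
# dropna discards cells that had no source row instead of using a 0 sentinel.
long_df = (
    df_pivoted_init.reset_index()
    .melt(id_vars="UserName", var_name="Subject", value_name="Score")
    .dropna(subset=["Score"])
)
rows = [
    (str(u), str(s), float(v))
    for u, s, v in long_df.itertuples(index=False, name=None)
]

# sqlite3 is a stand-in for MySQL here (assumption for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("create table test_unpivot (UserName text, Subject text, Score real)")
conn.executemany(
    "insert into test_unpivot (UserName, Subject, Score) values (?, ?, ?)", rows
)
conn.commit()
stored = sorted(
    conn.execute("select UserName, Subject, Score from test_unpivot").fetchall()
)
conn.close()
print(stored)
```

Passing the rows as parameters leaves the quoting to the driver, and a genuine score of 0 is kept because only NaN cells are dropped.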

The result is as follows (the original screenshot is not available):

About the pivot method built into the pandas DataFrame class:

DataFrame.pivot(index=None, columns=None, values=None):

Return reshaped DataFrame organized by given index / column values.

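There are only three parameters because the result of a pivot is always a two-dimensional table, which needs nothing more than the row labels, the column labels, and the corresponding values. For the same reason, the is_pass column would inevitably be lost after unpivoting, so I did not query it in the first place.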
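As a minimal sketch of the three parameters, the example below builds the DataFrame in memory instead of reading it from MySQL (an assumption made so that it runs anywhere); the rows are the ones listed later in the article. Keyword arguments are used because, as far as I know, pandas 2.0 and later accept index, columns and values only as keywords.

```python
import pandas as pd

# Rows copied from the TEST data shown in this article.
rows = [
    ("張三", "語文", 80.0), ("張三", "數學", 90.0),
    ("張三", "英語", 70.0), ("張三", "生物", 85.0),
    ("李四", "語文", 80.0), ("李四", "數學", 92.0), ("李四", "英語", 76.0),
    ("王五", "語文", 60.0), ("王五", "數學", 82.0),
    ("王五", "英語", 96.0), ("王五", "生物", 78.0),
]
result_df = pd.DataFrame(rows, columns=["UserName", "Subject", "Score"])

# index -> row labels, columns -> column labels, values -> cell contents.
df_pivoted_init = result_df.pivot(index="UserName", columns="Subject", values="Score")

# Move UserName from the index back into an ordinary column.
df_pivoted = df_pivoted_init.reset_index()
print(df_pivoted)
```

The pair (李四, 生物) has no source row, so that cell comes out as NaN in the pivoted table.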

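Supplementary note:

While studying hierarchical indexing in pandas, I came across two interesting functions, stack() and unstack(), that can also swap rows and columns. (Much later I realized that pivot is just a shortcut that wraps unstack: underneath, it still builds a hierarchical index with set_index and then reshapes with unstack, exactly as the example further below does.)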
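The relationship between pivot and unstack can be checked on a small in-memory DataFrame. This is a sketch of the equivalence as seen from the outside, not a copy of pandas internals; the sample rows are a subset of the article's data.

```python
import pandas as pd

result_df = pd.DataFrame(
    {
        "UserName": ["張三", "張三", "李四", "李四"],
        "Subject": ["語文", "數學", "語文", "數學"],
        "Score": [80.0, 90.0, 80.0, 92.0],
    }
)

via_pivot = result_df.pivot(index="UserName", columns="Subject", values="Score")

# Build the two-level index by hand, then move the inner level (Subject) into
# the columns. Selecting the Score column first yields a Series, so the result
# has plain Subject columns rather than a (Score, Subject) MultiIndex.
via_unstack = result_df.set_index(["UserName", "Subject"])["Score"].unstack()

print(via_pivot.equals(via_unstack))
```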


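Both functions rely on pandas hierarchical indexing; reshaping like this is in fact one of the main uses of a hierarchical index. For this example, the code can be changed as follows: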

result_df=pd.read_sql('select UserName,Subject,Score from TEST',conn)
# The data fetched from the database looks like this
# (subjects: 語文 = Chinese, 數學 = Math, 英語 = English, 生物 = Biology):
   UserName Subject  Score
0        張三      語文   80.0
1        張三      數學   90.0
2        張三      英語   70.0
3        張三      生物   85.0
4        李四      語文   80.0
5        李四      數學   92.0
6        李四      英語   76.0
7        王五      語文   60.0
8        王五      數學   82.0
9        王五      英語   96.0
10       王五      生物   78.0

# To use hierarchical indexing, make UserName and Subject the levels of a hierarchical index and keep Score as the corresponding value, using set_index():

df=result_df.set_index(['UserName','Subject'])

In [112]: df.unstack()
Out[112]:
          Score
Subject      數學    生物    英語    語文
UserName
張三         90.0  85.0  70.0  80.0
李四         92.0   NaN  76.0  80.0
王五         82.0  78.0  96.0  60.0

# stack() converts the unstacked result back, which gives row/column conversion in both directions; the remaining steps are essentially the same as before.
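The round trip can be sketched as follows, again on a small in-memory DataFrame rather than the MySQL table (an assumption for illustration). The NaN cell is removed with an explicit dropna(), so the result does not depend on how a given pandas version treats missing values in stack().

```python
import pandas as pd

result_df = pd.DataFrame(
    {
        "UserName": ["張三", "張三", "李四"],
        "Subject": ["數學", "生物", "數學"],
        "Score": [90.0, 85.0, 92.0],
    }
)
df = result_df.set_index(["UserName", "Subject"])

# Rows to columns: the missing (李四, 生物) pair becomes NaN.
wide = df["Score"].unstack()

# Columns back to rows: stack() returns a Series indexed by (UserName, Subject);
# dropna() removes the cell that never existed in the source data.
long_again = wide.stack().dropna().rename("Score").reset_index()
print(long_again)
```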

That is all for this article. I hope it helps with your studies, and I hope you will continue to support 腳本之家.
