A Detailed Python Workflow for Quantitative Factor Backtesting and Plotting

Updated: 2023-02-24 09:15:09 Author: 呆萌的代Ma

This article walks through factor backtesting and result plotting for quantitative trading in Python, with example code for each step. It may be a useful reference for study or work.

Factor backtesting framework

Here I share the workflow I usually follow when backtesting factors.

The full path from factor to return is: strategy (a combination of factors) -> buy/sell signals -> entry and exit points -> returns

So, for each individual stock, the backtest goes through the following steps:

1. Preprocess the stock data

First, the usual imports: the libraries used for the backtest and for plotting, including a fix for Chinese text rendering as blank boxes in figures.

# Backtesting
import numpy as np
import pandas as pd
from copy import deepcopy
from tqdm import tqdm
from datetime import datetime
import talib
# Plotting
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
# Jupyter/IPython magic; remove the next line when running as a plain script
%matplotlib inline
# Render Chinese text in plots
sns.set()
plt.rcParams["figure.figsize"] = (20, 10)
plt.rcParams['font.sans-serif'] = ['Arial Unicode MS']  # a font that supports Chinese glyphs
plt.rcParams['axes.unicode_minus'] = False  # keep the minus sign '-' from rendering as a box in saved figures
# Misc
import warnings
warnings.filterwarnings("ignore")

Next is the code that loops over the stock files and loads them:

import os
def readfile(path, limit=None):
    files = os.listdir(path)
    file_list = []
    for file in files:  # walk the directory
        full_path = os.path.join(path, file)
        if not os.path.isdir(full_path):  # skip subdirectories (check the full path, not the bare name)
            file_list.append(full_path)
    if limit:
        return file_list[:limit]
    return file_list
stock_dict = {}
for _file in tqdm(readfile("../data/stock_data")):
    if not _file.endswith(".pkl"):
        continue
    # TODO: filter here to decide whether this stock belongs in the backtest pool
    file_df = pd.read_pickle(_file)
    file_df.set_index(["日期"], inplace=True)  # "日期" is the date column
    file_df.index.name = ""
    file_df.index = pd.to_datetime(file_df.index)
    # Rename the Chinese OHLCV columns; the keys must match the column names in your data files
    file_df.rename(columns={'開盤':'open',"收盤":"close","最高":"high","最低":"low","成交量":"volume"},inplace=True)
    stock_code = os.path.basename(_file).replace(".pkl", '')
    # TODO: slice by date here to keep only part of the history
    stock_dict[stock_code] = file_df

The block above preprocesses the stock data. Everything it loads ends up in stock_dict, where each key is a stock code and each value is that stock's DataFrame.
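The loading loop leaves a TODO for trimming each stock to a date range. A minimal sketch of one way to do that, assuming the index is already a sorted DatetimeIndex as produced above; the toy data and the names start_date, end_date and window_df are illustrative and not from the original code:

```python
import pandas as pd

# Toy frame standing in for one value of stock_dict (the prices are made up).
dates = pd.date_range("2020-01-01", periods=10, freq="D")
file_df = pd.DataFrame({"close": range(10)}, index=dates)

# Hypothetical window bounds; choose whatever period the backtest should cover.
start_date, end_date = "2020-01-03", "2020-01-06"

# Label-based slicing on a sorted DatetimeIndex includes both endpoints.
window_df = file_df.loc[start_date:end_date]
print(window_df)
```

In the loading loop, the same slice could be applied to file_df just before it is stored in stock_dict.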

2. Run the backtest

We take a single stock as the example:

for _index,_stock_df in tqdm(stock_dict.items()):
    measure_df = deepcopy(_stock_df)

In this code:

measure_df is the DataFrame the backtest runs on
deepcopy is used so that the backtest cannot modify the original data

We can then iterate over every row of this stock (one row per trading day). The trading rules are:

Buy rule: a buy signal fires and there is no open position, so buy
Sell rule: a sell signal fires and there is an open position, so sell

# Run the backtest
# Assumption: measure_df already has boolean columns 'buy_signal' and 'sell_signal'.
# These names are placeholders for whatever factor conditions you define.
trade_record_list = []
this_trade = None  # dict describing the open position, or None when flat
for _mea_i, _mea_series in measure_df.iterrows():  # one row per trading day
    if _mea_series['buy_signal']:
        if this_trade is None:  # no open position, so buy
            this_trade = {
                "buy_date": _mea_i,
                "close_record": [_mea_series['close']],
            }
    elif _mea_series['sell_signal']:
        if this_trade is not None:  # holding a position, so sell
            this_trade['sell_date'] = _mea_i
            this_trade['close_record'].append(_mea_series['close'])
            trade_record_list.append(this_trade)
            this_trade = None
    else:
        if this_trade is not None:  # still holding
            this_trade['close_record'].append(_mea_series['close'])

In the code above, every complete trade (buy -> hold -> sell) is saved in trade_record_list. Each trade record looks like this:

{
    'buy_date': Timestamp('2015-08-31 00:00:00'),  # buy date
    'close_record': [41.1, 42.0, 40.15, 40.65, 36.6, 32.97],  # closing prices while the position was held
    'sell_date': Timestamp('2015-10-12 00:00:00'),  # sell date
    # TODO: add any custom fields you want to record
}
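The backtest loop leaves the buy and sell conditions open. Purely as an illustration, the sketch below derives them from a moving-average crossover computed with pandas. The helper add_ma_cross_signals, the column names buy_signal and sell_signal, and the window lengths are all assumptions made for this example, not part of the original workflow:

```python
import pandas as pd

def add_ma_cross_signals(df, fast=5, slow=20):
    """Return a copy of df with boolean buy_signal / sell_signal columns."""
    out = df.copy()
    fast_ma = out["close"].rolling(fast).mean()
    slow_ma = out["close"].rolling(slow).mean()
    above = fast_ma > slow_ma  # False while either average is still NaN
    prev_above = above.shift(1, fill_value=False)
    out["buy_signal"] = above & ~prev_above    # fast average moves above the slow one
    out["sell_signal"] = ~above & prev_above   # fast average drops back to or below it
    return out

# Made-up closing prices: a rise, a fall, then another rise.
prices = pd.DataFrame({"close": [1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4]})
signals = add_ma_cross_signals(prices, fast=2, slow=3)
print(signals.index[signals["buy_signal"]].tolist())
print(signals.index[signals["sell_signal"]].tolist())
```

Note that the first buy flag appears as soon as both averages are available and the fast one is already higher, so it is not necessarily a true crossover. Each row then carries its own flags, which a row-by-row loop can read directly.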

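3. Organize the backtest results

Calling pd.DataFrame(trade_record_list) directly gives an overview of all trades.

Post-processing is fairly simple and independent of the backtest loop: iterate over the trades and compute whatever metrics you need. For example, the annualized return of a single trade can be computed like this: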

trade_record_df = pd.DataFrame(trade_record_list)
for _i, _trade_series in trade_record_df.iterrows():
    first_close = _trade_series['close_record'][0]
    last_close = _trade_series['close_record'][-1]
    holding_days = (_trade_series['sell_date'] - _trade_series['buy_date']).days
    # Simple (non-compounded) annualized return; the column name means "annualized return"
    trade_record_df.loc[_i, '年化收益率'] = (last_close - first_close) / first_close / holding_days * 365
    # TODO: add further metrics here as needed
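Beyond per-trade annualized return, summary statistics such as win rate are often wanted. The following is a rough sketch computed without an explicit loop, using two made-up trade records in the same shape as trade_record_list. The column names trade_return and holding_days are assumptions introduced for this example:

```python
import pandas as pd

# Two made-up trades in the same shape as the records built by the backtest.
trade_record_list = [
    {"buy_date": pd.Timestamp("2020-01-02"), "sell_date": pd.Timestamp("2020-01-12"),
     "close_record": [10.0, 10.5, 11.0]},
    {"buy_date": pd.Timestamp("2020-02-03"), "sell_date": pd.Timestamp("2020-02-23"),
     "close_record": [20.0, 19.0, 18.0]},
]
trade_record_df = pd.DataFrame(trade_record_list)

# First and last close of each trade.
first_close = trade_record_df["close_record"].apply(lambda closes: closes[0])
last_close = trade_record_df["close_record"].apply(lambda closes: closes[-1])

# Per-trade simple return and holding period in calendar days.
trade_record_df["trade_return"] = (last_close - first_close) / first_close
trade_record_df["holding_days"] = (
    trade_record_df["sell_date"] - trade_record_df["buy_date"]
).dt.days

# Win rate: the share of trades that closed above their entry price.
win_rate = (trade_record_df["trade_return"] > 0).mean()
print(trade_record_df[["trade_return", "holding_days"]])
print(win_rate)
```

A win rate computed this way, collected per stock or per parameter setting, is the kind of value a win-rate heatmap would display.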

4. Plot the results

Plotting code tends to be fairly fixed. For example, a win-rate chart:

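# Clear any previous plot state
plt.cla()
plt.clf()
# Start plotting
plt.figure(figsize=(10, 14), dpi=100)
# Draw the win-rate heatmap with seaborn
# total_measure_record is not defined in this article; it is assumed to hold the win rates collected during the backtest
ax = sns.heatmap(pd.DataFrame(total_measure_record).T.round(2), annot=True, cmap="RdBu_r", center=0.5)
plt.title("勝率圖")  # "win-rate chart"
scatter_fig = ax.get_figure()
# Save to disk
scatter_fig.savefig("勝率圖")
plt.show()  # display last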
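The plotting code reads total_measure_record, which the article never shows being built. One shape that would work with pd.DataFrame(...).T is a dict of dicts. The sketch below assumes, hypothetically, that the outer keys are strategy names and the inner keys are holding horizons; all names and numbers are invented for illustration:

```python
import pandas as pd

# Hypothetical layout: {strategy name: {holding horizon: win rate}}.
total_measure_record = {
    "strategy_A": {"5d": 0.6234, "10d": 0.5512},
    "strategy_B": {"5d": 0.4478, "10d": 0.5021},
}

# Outer keys become columns, so .T puts one strategy on each heatmap row.
heatmap_df = pd.DataFrame(total_measure_record).T.round(2)
print(heatmap_df)
```

With center=0.5 in the heatmap call, cells above a 50% win rate and cells below it fall on opposite sides of the diverging colormap.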