Forecasting Stock Prices with Prophet and Trading with Multiple Strategies (with Code)

Posted by: sddy008 | Published: 2023-03-22 | Category: Stock Knowledge

▎Author: Mike

▎Translated and compiled by: the public account's translation team

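In this article, we use Prophet to forecast a time series. The data is the historical adjusted closing price of the S&P 500. We first build a 3-year forecast, then simulate historical monthly forecasts going back to 1980. Finally, we create a variety of trading strategies.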

https://facebook.github.io/prophet/

Importing Libraries

We import the usual Python data libraries, plus Prophet from fbprophet and reduce from functools.

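import warnings
from functools import reduce

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Releases from 1.0 onward are imported as: from prophet import Prophet
from fbprophet import Prophet

# Jupyter/IPython magic; omit this line when running as a plain script
%matplotlib inline

warnings.filterwarnings("ignore")
# This style is named "seaborn-v0_8-deep" in Matplotlib 3.6 and later
plt.style.use("seaborn-deep")
pd.options.display.float_format = "{:,.2f}".format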

Getting the Data

The data is the historical S&P 500 index data since 1980.
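The loading step is not shown in this copy of the article. Below is a minimal sketch, assuming the data has been downloaded as a CSV file with Date and Adj Close columns. The inline text is a made-up stand-in for that file (the values are not real quotes), and the file name mentioned in the comment is hypothetical.

```python
import io

import pandas as pd

# Made-up rows standing in for a downloaded CSV (values are NOT real quotes).
# With a real file this might be pd.read_csv("sp500.csv", parse_dates=["Date"]),
# where "sp500.csv" is a hypothetical file name.
csv_text = """Date,Open,Adj Close
2019-03-14,100.0,101.0
2019-03-15,101.0,102.5
2019-03-18,102.5,103.0
"""

# parse_dates turns the Date column into datetimes, which Prophet expects for ds
stock_price = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(stock_price.dtypes)
```

Parsing the dates at load time matters later, because the simulation code relies on the .dt accessor of the date column.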

stock_price.describe()

Preparing the Data

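To use Prophet, we need to rename the Date and Adj Close columns to ds and y. In most machine learning projects, y is the usual name for the target column (the quantity being predicted).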

stock_price = stock_price[["Date", "Adj Close"]]
stock_price.columns = ["ds", "y"]
stock_price.head(10)

Prophet

Before creating a forecast with Prophet, we visualize the data first to get an intuitive feel for it.

model = Prophet()
model.fit(stock_price)

<fbprophet.forecaster.Prophet at 0x21216301c18>

To instantiate a Prophet model, we simply call Prophet() and assign it to a variable named model. Next, we fit the model to the stock data by calling its fit method.

future = model.make_future_dataframe(1095, freq="D")
# Keep weekdays only: weekday() returns 0 (Monday) through 6 (Sunday)
future_boolean = future["ds"].map(lambda x: x.weekday() in range(0, 5))
future = future[future_boolean]
future.tail()

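We need to create some future dates. Prophet provides a function for this named make_future_dataframe, which takes the number of future periods and the frequency. Above, we request a horizon of 1,095 days, or 3 years.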

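Since stocks can only be traded on trading days, we need to remove weekends from the future dates. To do this, we create a boolean expression that returns False when the weekday number is not in the range 0 to 4 (0 = Monday, 6 = Sunday).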

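We then pass the boolean expression to the dataframe, which returns only the rows where it is True. We now have a future dataframe containing the weekdays of the next 3 years.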
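The weekday filter can be tried without fitting a model. The sketch below builds a stand-in for the frame that make_future_dataframe returns (a single ds column of daily dates; the start date is an arbitrary assumption) and applies the same kind of boolean mask using the vectorized .dt.weekday accessor.

```python
import pandas as pd

# Stand-in for model.make_future_dataframe(...): one "ds" column of daily dates.
# The start date is an arbitrary assumption chosen for illustration.
future = pd.DataFrame({"ds": pd.date_range("2019-03-18", periods=14, freq="D")})

# weekday is 0 for Monday through 6 for Sunday, so values below 5 are Mon-Fri
is_weekday = future["ds"].dt.weekday < 5
future_weekdays = future[is_weekday]

print(len(future), len(future_weekdays))
```

Note that this removes weekends only; exchange holidays that fall on weekdays stay in the frame.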

forecast = model.predict(future)
forecast.tail()

We call the model's predict method and pass in the future dataframe created earlier. The results are returned in a new dataframe named forecast.

When we look at the forecast data, we see a number of new columns. The one we care most about is yhat, the predicted value of y.

model.plot_components(forecast);

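Prophet provides two convenient visualization helpers, plot and plot_components. The plot function draws the actual vs. forecast chart, and plot_components draws the trend and seasonality charts.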

stock_price_forecast = forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
df = pd.merge(stock_price, stock_price_forecast, on="ds", how="right")
df.set_index("ds").plot(figsize=(16, 8), color=["royalblue", "#34495e", "#e74c3c", "#e74c3c"], grid=True);

The visualization helpers simply use the data in our forecast dataframe, so we can recreate the same chart ourselves.

Simulating Forecasts

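While the 3-year forecast we created above is pretty cool, we do not want to base any trading decisions on it without first backtesting its performance with a trading strategy.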

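In this section, we pretend that Prophet already existed in 1980 and use it to create monthly forecasts through 2019. In the next section, we use that data to simulate how various trading strategies would have performed compared with simply buying and holding the stock.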

stock_price["dayname"] = stock_price["ds"].dt.day_name()
stock_price["month"] = stock_price["ds"].dt.month
stock_price["year"] = stock_price["ds"].dt.year
stock_price["month/year"] = stock_price["month"].map(str) + "/" + stock_price["year"].map(str)

# Number each month/year label 0, 1, 2, ... in order of first appearance
stock_price = pd.merge(
    stock_price,
    stock_price["month/year"].drop_duplicates().reset_index(drop=True).reset_index(),
    on="month/year",
    how="left",
)
stock_price = stock_price.rename(columns={"index": "month/year_index"})
stock_price.tail()

Before simulating the monthly forecasts, we add a few columns to the stock_price dataframe created at the start of the project to make it easier to work with: month, year, month/year, and month/year_index.
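The month/year_index column simply numbers the calendar months in order of first appearance. As a sketch of an alternative to the merge-based numbering, pd.factorize assigns the same kind of sequential codes in one step. The dates below are made up for illustration and span three calendar months.

```python
import pandas as pd

# Made-up business-day dates spanning three calendar months (illustration only)
stock_price = pd.DataFrame({"ds": pd.bdate_range("1980-01-28", periods=30)})

# Label each row with its month and year in the same "M/YYYY" form as the article
stock_price["month/year"] = (
    stock_price["ds"].dt.month.map(str) + "/" + stock_price["ds"].dt.year.map(str)
)

# factorize numbers the labels 0, 1, 2, ... in order of first appearance
stock_price["month/year_index"] = pd.factorize(stock_price["month/year"])[0]

print(stock_price.groupby("month/year_index")["ds"].agg(["min", "max"]))
```

Either approach yields an integer that can be incremented to address "the following month", which is how the simulation loop uses it.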

loop_list = stock_price["month/year"].unique().tolist()
max_num = len(loop_list) - 1
forecast_frames = []

for num, item in enumerate(loop_list):
    if num == max_num:
        # The last month has no following month to forecast
        pass
    else:
        # Training data: from the first date through the end of the current month
        df = stock_price.set_index("ds")[
            stock_price[stock_price["month/year"] == loop_list[0]]["ds"].min():
            stock_price[stock_price["month/year"] == item]["ds"].max()]
        df = df.reset_index()[["ds", "y"]]

        model = Prophet()
        model.fit(df)

        # Forecast only the dates of the following month
        future = stock_price[stock_price["month/year_index"] == (num + 1)][["ds"]]
        forecast = model.predict(future)
        forecast_frames.append(forecast)

stock_price_forecast = reduce(lambda top, bottom: pd.concat([top, bottom], sort=False), forecast_frames)
stock_price_forecast = stock_price_forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]]
stock_price_forecast.to_csv("stock_price_forecast.csv", index=False)

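Essentially, we loop through each month/year in stock_price, fit the model to the stock data available up to that period, and then forecast one month ahead. We keep doing this until the last month/year. Finally, we combine the forecasts into a single dataframe named stock_price_forecast. The results are saved to a CSV file so that the models do not have to be rerun if the data needs to be reset.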
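The structure of this walk-forward loop can be checked quickly without Prophet. The sketch below uses a synthetic price series and a deliberately naive stand-in model (naive_forecast, a hypothetical helper that repeats the last observed value), so it only illustrates the train-then-forecast-next-month pattern, not Prophet's output.

```python
import numpy as np
import pandas as pd

# Synthetic price series on business days (illustration only, not S&P 500 data)
dates = pd.bdate_range("2018-01-01", "2018-06-29")
prices = pd.DataFrame({"ds": dates, "y": np.linspace(100.0, 120.0, len(dates))})
prices["period"] = prices["ds"].dt.to_period("M")
periods = prices["period"].unique()


def naive_forecast(train, future_dates):
    """Hypothetical stand-in for Prophet: repeat the last observed value."""
    return pd.DataFrame({"ds": future_dates.to_numpy(), "yhat": train["y"].iloc[-1]})


frames = []
for i in range(len(periods) - 1):
    # Train on everything up to and including period i ...
    train = prices[prices["period"] <= periods[i]]
    # ... and forecast only the dates of period i + 1
    future_dates = prices.loc[prices["period"] == periods[i + 1], "ds"]
    frames.append(naive_forecast(train, future_dates))

walk_forward = pd.concat(frames, ignore_index=True)
print(walk_forward.head())
```

Because each forecast is produced from data that ends before the forecast month begins, the first month never receives a forecast, which mirrors the skipped last iteration in the loop above.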

stock_price_forecast = pd.read_csv("stock_price_forecast.csv", parse_dates=["ds"])
df = pd.merge(stock_price[["ds", "y", "month/year_index"]], stock_price_forecast, on="ds")
df["Percent Change"] = df["y"].pct_change()
df.set_index("ds")[["y", "yhat", "yhat_lower", "yhat_upper"]].plot(figsize=(16, 8), color=["royalblue", "#34495e", "#e74c3c", "#e74c3c"], grid=True)

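Finally, we merge the forecast with the actual prices and create a Percent Change column, which is used in the trading algorithms below. We then plot the forecast against the actual prices to see how well it did. As you can see, there is a bit of a lag; the forecast behaves much like a moving average.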

Trading Algorithms

We create four initial trading algorithms:

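Hold: A buy-and-hold strategy. We buy the stock and hold it until the end of the period.

Prophet: Sell when our forecast indicates a downward trend, and buy back when our forecast indicates an upward trend.

Prophet Thresh: Sell only when the stock price falls below our yhat_lower bound.

Seasonality: Exit the market in August and re-enter in October. This is based on the seasonality chart above.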
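The code that builds these four columns is not shown in this copy of the article. The sketch below is one way to write them, following the signal-times-return pattern that the threshold code later in the article uses. The synthetic prices, the stand-in yhat and yhat_lower columns, and the equity_curve helper are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic prices and stand-in forecast columns (illustration only)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2018-01-01", "2018-12-31")
df = pd.DataFrame({"ds": dates})
df["y"] = 100 * np.cumprod(1 + rng.normal(0.0003, 0.01, len(dates)))
df["yhat"] = df["y"].rolling(20, min_periods=1).mean()  # lagging stand-in forecast
df["yhat_lower"] = df["yhat"] * 0.95                     # stand-in lower bound
df["Percent Change"] = df["y"].pct_change()


def equity_curve(in_market, returns):
    """Grow 1 unit of capital, earning each row's return only when the
    signal from the previous row said to be in the market."""
    return (in_market.astype(float).shift(1) * returns + 1).cumprod()


returns = df["Percent Change"]

# Hold: always in the market
df["Hold"] = (returns + 1).cumprod()

# Prophet: stay in while the next forecast value is above the current one
# (the month's forecasts are produced in advance, so the next value is known)
df["Prophet"] = equity_curve(df["yhat"].shift(-1) > df["yhat"], returns)

# Prophet Thresh: stay in while the price is above the lower forecast bound
df["Prophet Thresh"] = equity_curve(df["y"] > df["yhat_lower"], returns)

# Seasonality: out of the market in August and September, back in October
df["Seasonality"] = equity_curve(~df["ds"].dt.month.isin([8, 9]), returns)

print(df[["Hold", "Prophet", "Prophet Thresh", "Seasonality"]].tail(1))
```

The shift(1) is what keeps the simulation honest: a signal computed from one row's data only affects the return earned on the following row. Each curve is the value of 1 unit of starting capital, so multiplying by 1000 gives the value of a $1,000 stake.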

(df.dropna().set_index("ds")[["Hold", "Prophet", "Prophet Thresh", "Seasonality"]] * 1000).plot(figsize=(16, 8), grid=True)

print(f"Hold = {df['Hold'].iloc[-1]*1000:,.0f}")
print(f"Prophet = {df['Prophet'].iloc[-1]*1000:,.0f}")
print(f"Prophet Thresh = {df['Prophet Thresh'].iloc[-1]*1000:,.0f}")
print(f"Seasonality = {df['Seasonality'].iloc[-1]*1000:,.0f}")

Hold = 24,396
Prophet = 13,366
Prophet Thresh = 17,087
Seasonality = 30,861

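Above, we plot the results of the initial strategies simulated with a starting capital of $1,000. You can see that Seasonality did best, with Hold second. Neither of the Prophet-based strategies did very well. Let's see whether we can improve Prophet Thresh by optimizing the threshold.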

performance = {}

for x in np.linspace(.9, .99, 10):
    # Stay in the market while the price is above x times the lower bound
    y = ((df["y"] > df["yhat_lower"] * x).shift(1) * (df["Percent Change"]) + 1).cumprod()
    performance[x] = y

best_yhat = pd.DataFrame(performance).max().idxmax()
pd.DataFrame(performance).plot(figsize=(16, 8), grid=True);
f"Best Yhat = {best_yhat:,.2f}"

Above, we loop through different percentages of the threshold to find the optimal one. The best threshold is 92% of the current yhat_lower value.

df["Optimized Prophet Thresh"] = ((df["y"] > df["yhat_lower"] * best_yhat).shift(1) * (df["Percent Change"]) + 1).cumprod()

(df.dropna().set_index("ds")[["Hold", "Prophet", "Prophet Thresh", "Seasonality", "Optimized Prophet Thresh"]] * 1000).plot(figsize=(16, 8), grid=True)

print(f"Hold = {df['Hold'].iloc[-1]*1000:,.0f}")
print(f"Prophet = {df['Prophet'].iloc[-1]*1000:,.0f}")
print(f"Prophet Thresh = {df['Prophet Thresh'].iloc[-1]*1000:,.0f}")
print(f"Seasonality = {df['Seasonality'].iloc[-1]*1000:,.0f}")
print(f"Optimized Prophet Thresh = {df['Optimized Prophet Thresh'].iloc[-1]*1000:,.0f}")

Hold = 24,396
Prophet = 13,366
Prophet Thresh = 17,087
Seasonality = 30,861
Optimized Prophet Thresh = 36,375

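Above, we see that the new Optimized Prophet Thresh is the best trading strategy. Unfortunately, both Seasonality and Optimized Prophet Thresh are flawed: they rely on look-ahead information, that is, data that would not have been available at the time of trading. We need to create an optimized threshold for each point in time in our forecast, using only the data available up to that point.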

fcst_thresh = {}

for num, index in enumerate(df["month/year_index"].unique()):
    # All data from the first month through the current month
    temp_df = df.set_index("ds")[
        df[df["month/year_index"] == df["month/year_index"].unique()[0]]["ds"].min():
        df[df["month/year_index"] == index]["ds"].max()]

    performance = {}
    for thresh in np.linspace(0, .99, 100):
        percent = ((temp_df["y"] > temp_df["yhat_lower"] * thresh).shift(1) * (temp_df["Percent Change"]) + 1).cumprod()
        performance[thresh] = percent

    best_thresh = pd.DataFrame(performance).max().idxmax()

    if num == len(df["month/year_index"].unique()) - 1:
        pass
    else:
        # The best threshold so far is applied to the following month
        fcst_thresh[df["month/year_index"].unique()[num + 1]] = best_thresh

fcst_thresh = pd.DataFrame([fcst_thresh]).T.reset_index().rename(columns={"index": "month/year_index", 0: "Fcst Thresh"})
fcst_thresh["Fcst Thresh"].plot(figsize=(16, 8), grid=True);

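We loop through the data and, for each point in time, find the best threshold percentage up to that point. As you can see, the threshold percentage jumps around over time (January 1, 1980 to March 18, 2019).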
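The idea of choosing a threshold from past data only and applying it to the following month can be sketched more compactly. The example below uses synthetic prices, a stand-in yhat_lower column, and a hypothetical helper final_value. It is a simplified variant rather than the article's exact rule: it ranks thresholds by the ending value of each equity curve and searches a smaller grid.

```python
import numpy as np
import pandas as pd

# Synthetic prices with a stand-in lower bound (illustration only)
rng = np.random.default_rng(1)
dates = pd.bdate_range("2018-01-01", "2018-06-29")
df = pd.DataFrame({"ds": dates})
df["y"] = 100 * np.cumprod(1 + rng.normal(0.0, 0.01, len(dates)))
df["yhat_lower"] = df["y"].rolling(10, min_periods=1).mean()
df["Percent Change"] = df["y"].pct_change()
df["period"] = df["ds"].dt.to_period("M")


def final_value(frame, thresh):
    """Ending value of 1 unit traded with the rule y > yhat_lower * thresh."""
    in_market = (frame["y"] > frame["yhat_lower"] * thresh).astype(float).shift(1)
    curve = (in_market * frame["Percent Change"] + 1).cumprod()
    return curve.iloc[-1]


grid = np.linspace(0.90, 0.99, 10)
periods = df["period"].unique()
chosen = {}

for i in range(len(periods) - 1):
    # Only data available at the end of period i is used for the search
    history = df[df["period"] <= periods[i]]
    scores = {t: final_value(history, t) for t in grid}
    # The winning threshold is assigned to the NEXT period, never the current one
    chosen[periods[i + 1]] = max(scores, key=scores.get)

print(pd.Series(chosen))
```

Keying each result by the following period is what removes the look-ahead: a month is always traded with a threshold that was selected before that month began.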

df["yhat_optimized"] = pd.merge(df, fcst_thresh, on="month/year_index", how="left")["Fcst Thresh"].shift(1) * df["yhat_lower"]
df["Prophet Fcst Thresh"] = ((df["y"] > df["yhat_optimized"]).shift(1) * (df["Percent Change"]) + 1).cumprod()

(df.dropna().set_index("ds")[["Hold", "Prophet", "Prophet Thresh", "Prophet Fcst Thresh"]] * 1000).plot(figsize=(16, 8), grid=True)

print(f"Hold = {df['Hold'].iloc[-1]*1000:,.0f}")
print(f"Prophet = {df['Prophet'].iloc[-1]*1000:,.0f}")
print(f"Prophet Thresh = {df['Prophet Thresh'].iloc[-1]*1000:,.0f}")
# print(f"Seasonality = {df['Seasonality'].iloc[-1]*1000:,.0f}")
print(f"Prophet Fcst Thresh = {df['Prophet Fcst Thresh'].iloc[-1]*1000:,.0f}")

Hold = 24,396
Prophet = 13,366
Prophet Thresh = 17,087
Prophet Fcst Thresh = 20,620

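As before, we create the new trading strategy and plot it. Unfortunately, the result is worse than the optimized version, although it still beats our initial Prophet Thresh. Next, instead of using the entire period so far to calculate the threshold, we try rolling windows of various lengths, much as moving averages do (30, 60, 90, and so on).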

rolling_thresh = {}

for num, index in enumerate(df["month/year_index"].unique()):
    rolling_performance = {}

    for roll in range(10, 400, 10):
        # Window: the current month plus the preceding `roll` months
        temp_df = df.set_index("ds")[
            df[df["month/year_index"] == index]["ds"].min() - pd.DateOffset(months=roll):
            df[df["month/year_index"] == index]["ds"].max()]

        performance = {}
        for thresh in np.linspace(.0, .99, 100):
            percent = ((temp_df["y"] > temp_df["yhat_lower"] * thresh).shift(1) * (temp_df["Percent Change"]) + 1).cumprod()
            performance[thresh] = percent

        # Best threshold for this window, judged by the final value
        per_df = pd.DataFrame(performance)
        best_thresh = per_df.iloc[[-1]].max().idxmax()
        percents = per_df[best_thresh]
        rolling_performance[best_thresh] = percents

    # Best threshold across all window lengths
    per_df = pd.DataFrame(rolling_performance)
    best_rolling_thresh = per_df.iloc[[-1]].max().idxmax()

    if num == len(df["month/year_index"].unique()) - 1:
        pass
    else:
        rolling_thresh[df["month/year_index"].unique()[num + 1]] = best_rolling_thresh

rolling_thresh = pd.DataFrame([rolling_thresh]).T.reset_index().rename(columns={"index": "month/year_index", 0: "Fcst Thresh"})
rolling_thresh["Fcst Thresh"].plot(figsize=(16, 8), grid=True);

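This is similar to before, but now we try different rolling windows as well as different threshold percentages. It gets rather complicated. As the chart shows, the threshold percentage changes over time. Now let's see how we did.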

df["yhat_optimized"] = pd.merge(df, rolling_thresh, on="month/year_index", how="left")["Fcst Thresh"].fillna(1).shift(1) * df["yhat_lower"]
df["Prophet Rolling Thresh"] = ((df["y"] > df["yhat_optimized"]).shift(1) * (df["Percent Change"]) + 1).cumprod()

(df.dropna().set_index("ds")[["Hold", "Prophet", "Prophet Thresh", "Prophet Fcst Thresh", "Prophet Rolling Thresh"]] * 1000).plot(figsize=(16, 8), grid=True)

print(f"Hold = {df['Hold'].iloc[-1]*1000:,.0f}")
print(f"Prophet = {df['Prophet'].iloc[-1]*1000:,.0f}")
print(f"Prophet Thresh = {df['Prophet Thresh'].iloc[-1]*1000:,.0f}")
# print(f"Seasonality = {df['Seasonality'].iloc[-1]*1000:,.0f}")
print(f"Prophet Fcst Thresh = {df['Prophet Fcst Thresh'].iloc[-1]*1000:,.0f}")
print(f"Prophet Rolling Thresh = {df['Prophet Rolling Thresh'].iloc[-1]*1000:,.0f}")

Hold = 24,396
Prophet = 13,366
Prophet Thresh = 17,087
Prophet Fcst Thresh = 20,620
Prophet Rolling Thresh = 23,621

As you can see, we still did not beat the simple buy-and-hold strategy. Perhaps there is some truth to the saying "Time in the Market is better than Timing the Market."

df["Time Traveler"] = ((df["y"].shift(-1) > df["yhat"]).shift(1) * (df["Percent Change"]) + 1).cumprod()

(df.dropna().set_index("ds")[["Hold", "Prophet", "Prophet Thresh", "Prophet Fcst Thresh", "Prophet Rolling Thresh", "Time Traveler"]] * 1000).plot(figsize=(16, 8), grid=True)

print(f"Hold = {df['Hold'].iloc[-1]*1000:,.0f}")
print(f"Prophet = {df['Prophet'].iloc[-1]*1000:,.0f}")
print(f"Prophet Thresh = {df['Prophet Thresh'].iloc[-1]*1000:,.0f}")
# print(f"Seasonality = {df['Seasonality'].iloc[-1]*1000:,.0f}")
print(f"Prophet Fcst Thresh = {df['Prophet Fcst Thresh'].iloc[-1]*1000:,.0f}")
print(f"Prophet Rolling Thresh = {df['Prophet Rolling Thresh'].iloc[-1]*1000:,.0f}")
print(f"Time Traveler = {df['Time Traveler'].iloc[-1]*1000:,.0f}")

Hold = 24,396
Prophet = 13,366
Prophet Thresh = 17,087
Prophet Fcst Thresh = 20,620
Prophet Rolling Thresh = 23,621
Time Traveler = 288,513

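Above is the Time Traveler strategy. It is of course a perfect trading strategy, because it knows in advance when the market will move up or down. In this simulation, the most you could have turned $1,000 into is $288,513.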

Summary

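Time series forecasting is very complex, but Prophet makes it easy to create robust forecasts with very little effort. Although it did not make us rich by predicting the stock market, it is still very useful and can be applied quickly to many problems in other domains.

Original article: https://www.gardnmi.com/post/forecasting-stock-perfomance-with-prophet#Simulating-Forecasts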

—End—
