The basic workflow of a machine learning strategy is as follows:
Let's see how to use this workflow with MoonshotML.
A simple MoonshotML strategy is provided in demo-ml.py.
Machine learning strategies inherit from the MoonshotML class:
from moonshot import MoonshotML

class DemoMLStrategy(MoonshotML):

    CODE = "demo-ml"
    DB = "usstock-free-1d"
    DB_FIELDS = ["Open", "Close"]
    UNIVERSES = "usstock-free"
    ...
MoonshotML is a subclass of the Moonshot class and shares much of its functionality. However, instead of defining a prices_to_signals method as with a standard Moonshot strategy, a machine learning strategy should define two methods for generating signals: prices_to_features and predictions_to_signals.
The prices_to_features method takes a DataFrame of prices and should return a tuple of features and targets that will be used to train the machine learning model. In our demo strategy, we calculate each security's percent return over a variety of lookback windows as our set of features. For targets, we calculate whether the security's 1-month forward return is above or below the median 1-month forward return for all securities.
class DemoMLStrategy(MoonshotML):

    ...
    LOOKBACK_WINDOWS = [1,2,3,4,5,6,7,8,9,10,12,14,16,18,20,30,40,50,60,80,100,125,150,175,200]
    FORWARD_RETURNS_WINDOW = 22

    def prices_to_features(self, prices):
        """
        Creates features and targets for training and backtesting the model.
        """
        closes = prices.loc["Close"]
        opens = prices.loc["Open"]

        # FEATURES
        features = {}
        for n in self.LOOKBACK_WINDOWS:
            features[f'return_{n}'] = closes.pct_change(n)

        # TARGET
        returns = opens.pct_change(self.FORWARD_RETURNS_WINDOW)
        # Calculate median cross-sectional returns (a Series)...
        median_returns = returns.median(axis=1)
        # ...and broadcast back to shape of original DataFrame
        median_returns = closes.apply(lambda x: median_returns)
        # Find stocks which will outperform in the future
        outperformers = returns > median_returns
        targets = outperformers.shift(-self.FORWARD_RETURNS_WINDOW).fillna(False).astype(int)
        ...
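To see what the feature and target calculations produce, the following is a minimal sketch that runs the same steps on synthetic prices outside of MoonshotML. The tickers, the random price series, the stand-in open prices, and the shortened windows are all assumptions made for illustration. The sketch also uses `returns.gt(median_returns, axis=0)` and `shift(..., fill_value=False)` as compact equivalents of the broadcast-and-compare and shift-and-fillna steps in the demo strategy; it is not the strategy's own code.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes for 4 hypothetical tickers (not real market data).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=60)
tickers = ["AAA", "BBB", "CCC", "DDD"]
daily_returns = rng.normal(0, 0.01, (len(dates), len(tickers)))
closes = pd.DataFrame(
    100 * np.cumprod(1 + daily_returns, axis=0), index=dates, columns=tickers
)
# Stand-in for open prices: the previous day's close.
opens = closes.shift(1).bfill()

# Shortened windows so the example fits in 60 rows.
LOOKBACK_WINDOWS = [1, 5, 10]
FORWARD_RETURNS_WINDOW = 5

# FEATURES: one DataFrame of trailing returns per lookback window.
features = {f"return_{n}": closes.pct_change(n) for n in LOOKBACK_WINDOWS}

# TARGET: compare each stock's return to the cross-sectional median.
returns = opens.pct_change(FORWARD_RETURNS_WINDOW)
median_returns = returns.median(axis=1)
# axis=0 aligns the median Series on the date index, row by row.
outperformers = returns.gt(median_returns, axis=0)

# Shift backward so each date is labeled with what happens over the
# NEXT window; the final rows have no future data and are set to 0.
targets = outperformers.shift(
    -FORWARD_RETURNS_WINDOW, fill_value=False
).astype(int)

print(targets.tail(8))
```

The backward shift is what makes the target a forward return: the label on a given date describes the period that follows it, so the last FORWARD_RETURNS_WINDOW rows cannot be labeled from the data available.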
The other method that is unique to MoonshotML is predictions_to_signals. After the model is trained on the features and targets from prices_to_features in the training period of the walk-forward optimization, the model is used to make predictions on new data in the subsequent test period. The model's predictions during the test period are fed to the predictions_to_signals method, where we use them to generate signals. In our demo strategy, we select the 3 stocks with the highest probability of outperforming the cross-sectional median and rebalance monthly:
    TOP_N = 3
    REBALANCE_INTERVAL = "M" # M = monthly
    ...

    def predictions_to_signals(self, predictions, prices):
        """
        Turn a DataFrame of prediction probabilities into a DataFrame of signals.
        """
        # Rank by probability of outperforming
        winner_ranks = predictions.rank(axis=1, ascending=False)

        signals = winner_ranks <= self.TOP_N
        signals = signals.astype(int)

        # Resample using the rebalancing interval.
        # Keep only the last signal of the month, then fill it forward
        signals = signals.resample(self.REBALANCE_INTERVAL).last()
        signals = signals.reindex(predictions.index, method="ffill")

        return signals
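The rank, resample, and forward-fill steps can be tried in isolation. The sketch below applies them to random "probabilities" for five hypothetical tickers; the data and names are assumptions for illustration only. It passes a `pd.offsets.MonthEnd()` object to `resample` in place of the "M" string used in the demo strategy, since recent pandas versions deprecate that alias in favor of "ME", while the offset object should behave the same across versions.

```python
import numpy as np
import pandas as pd

# Random "prediction probabilities" for 5 hypothetical tickers.
rng = np.random.default_rng(1)
dates = pd.bdate_range("2021-01-01", "2021-03-31")
tickers = ["AAA", "BBB", "CCC", "DDD", "EEE"]
predictions = pd.DataFrame(
    rng.random((len(dates), len(tickers))), index=dates, columns=tickers
)

TOP_N = 3
# Offset-object equivalent of the monthly rebalancing interval.
REBALANCE_INTERVAL = pd.offsets.MonthEnd()

# Rank 1 = highest predicted probability on each date.
winner_ranks = predictions.rank(axis=1, ascending=False)
signals = (winner_ranks <= TOP_N).astype(int)

# Keep the last signal of each month (labeled with the month-end date)...
monthly = signals.resample(REBALANCE_INTERVAL).last()
# ...then carry each month-end signal forward onto the daily index.
signals = monthly.reindex(predictions.index, method="ffill")

print(monthly)
print(signals.loc["2021-01-28":"2021-02-02"])
```

In this sketch, the dates before the first month-end label come out as NaN because there is nothing yet to fill forward, and the selection made at the end of January is the one held throughout February.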
Once we've generated signals, the rest of a MoonshotML strategy is identical to a Moonshot strategy.
See the usage guide for more detail about how a MoonshotML backtest works.
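As a rough illustration of the train-then-test pattern behind a walk-forward optimization, the sketch below splits a date index into consecutive training and test windows. The helper `walk_forward_windows` and its `train_size` and `test_size` parameters are hypothetical names invented for this example, and the fixed-length rolling window is an assumption; this is not MoonshotML's API and may not match how it segments a backtest.

```python
import pandas as pd

def walk_forward_windows(index, train_size, test_size):
    """Yield (train_dates, test_dates) pairs that roll forward through index.

    Hypothetical helper for illustration; not part of MoonshotML.
    """
    start = 0
    while start + train_size + test_size <= len(index):
        train_end = start + train_size
        train = index[start:train_end]
        test = index[train_end:train_end + test_size]
        yield train, test
        # Advance by one test window so test periods never overlap.
        start += test_size

dates = pd.bdate_range("2020-01-01", periods=100)
windows = list(walk_forward_windows(dates, train_size=60, test_size=20))

for train, test in windows:
    print(
        f"train {train[0].date()} to {train[-1].date()}, "
        f"test {test[0].date()} to {test[-1].date()}"
    )
```

Each test window starts immediately after its training window ends, so the model is only ever evaluated on dates it was not trained on.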
The ! syntax used here lets us execute terminal commands from inside the notebook. To "install" the strategy, execute the following cell to move the strategy file to the /codeload/moonshot directory, where MoonshotML looks for it:
!mv demo-ml.py /codeload/moonshot/