When analyzing price data, it is common to compute statistics over a span of time, such as a 52-week high or a 50-day moving average. We call these "windowed computations" because they involve looking at a window of data rather than a single price observation. In Pipeline, factors that accept a window_length parameter are used to perform windowed computations. For example, SimpleMovingAverage(inputs=EquityPricing.close, window_length=10) computes the average of the last 10 days of closing prices.
Just as with price data, it is often useful to compute changes in fundamental values over time. For example, you might want to compute 5-year dividend growth or screen for companies that have consistently grown their earnings over a certain number of quarters. Typical window-based Pipeline factors like SimpleMovingAverage aren't suitable for fundamental data because fundamental data changes quarterly, not daily. We don't want to compute the average dividend of the last N days but of the last N quarters.
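To see why a daily window is the wrong tool here, consider a minimal sketch in plain Python (not the Pipeline API). It assumes a hypothetical dividend series in which each quarterly value stays visible for 63 trading days, roughly one quarter. The values and the 63-day figure are illustrative assumptions only.

```python
# Hypothetical quarterly dividends, oldest first (illustrative values only).
quarterly_dividends = [0.50, 0.52, 0.55, 0.60]

# Assumption: each quarterly value is visible for about 63 trading days,
# so the daily series simply repeats it.
TRADING_DAYS_PER_QUARTER = 63
daily_series = [
    value
    for value in quarterly_dividends
    for _ in range(TRADING_DAYS_PER_QUARTER)
]

# A 10-day window over the daily series sees the same value 10 times,
# so the "average" is just the latest quarterly value.
last_10_days = daily_series[-10:]
daily_window_average = sum(last_10_days) / len(last_10_days)

# A window over fiscal periods averages distinct quarterly values.
last_4_quarters = quarterly_dividends[-4:]
periodic_average = sum(last_4_quarters) / len(last_4_quarters)

print(daily_window_average)
print(periodic_average)
```

The daily window collapses to the most recent quarter's value, while the window over fiscal periods actually blends four different quarters.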
Pipeline makes it easy to perform computations on multiple quarters or years of fundamental data. These are referred to as "periodic computations" because they use fiscal periods rather than the daily values that are used in typical windowed computations like SimpleMovingAverage. There are ready-made factors to compute the average, high, low, percent change, or CAGR of a fundamental metric over time, to screen for companies with metrics that are consistently above or below a certain value (such as consistently positive earnings or dividends), and to screen for consistently increasing or decreasing metrics (such as consistently increasing revenue).
Before we look at some of the ready-made periodic factors and filters, let's look at the period_offset parameter, which forms the basis of all periodic computations.
As we saw in the previous lesson, you must specify a dimension when taking a slice of a fundamental dataset:
from zipline.pipeline import sharadar
# ARQ = As-Reported Quarterly fundamentals
fundamentals = sharadar.Fundamentals.slice('ARQ')
The slice() method also accepts an optional second parameter, period_offset. If omitted, as in the above example, period_offset defaults to 0, which means that Pipeline will return data for the most recent fiscal period (as of the pipeline simulation date). In contrast, a negative period_offset means to return data for a previous fiscal period: -1 means the immediately preceding fiscal period, -2 means two fiscal periods ago, etc. For quarterly and trailing-twelve-month dimensions, previous period means previous quarter, while for annual dimensions, previous period means previous year.
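The offset convention can be pictured as indexing backward from the most recent reported period. The following is a minimal sketch in plain Python rather than Pipeline code; value_at_offset and the EPS values are hypothetical, and rejecting positive offsets is a choice made for this sketch, not a statement about how slice() behaves.

```python
def value_at_offset(values_by_period, period_offset=0):
    """Return the value for the fiscal period selected by period_offset.

    values_by_period is ordered oldest first and is assumed to contain only
    the periods already reported as of the simulation date.
    """
    if period_offset > 0:
        # Choice made for this sketch: only look backward in time.
        raise ValueError("period_offset must be 0 or negative")
    # 0 -> index -1 (most recent), -1 -> index -2 (previous period), etc.
    return values_by_period[period_offset - 1]


# Hypothetical quarterly EPS, oldest first.
reported_eps = [1.10, 1.25, 1.40, 1.32]

current = value_at_offset(reported_eps)       # most recent period
previous = value_at_offset(reported_eps, -1)  # immediately preceding period
two_ago = value_at_offset(reported_eps, -2)   # two periods ago

print(current, previous, two_ago)
```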
To illustrate the use of period_offset, let's look at Microsoft's current and previous EPS. First, we take two slices of Fundamentals, one representing the latest period and one representing the previous period, and from these slices create factors for the current and previous EPS:
from zipline.pipeline import sharadar
current_fundamentals = sharadar.Fundamentals.slice('ART', period_offset=0)
previous_fundamentals = sharadar.Fundamentals.slice('ART', period_offset=-1)
eps = current_fundamentals.EPS.latest
previous_eps = previous_fundamentals.EPS.latest
Then, we include the factors as pipeline columns and limit the initial universe to MSFT only. We also include a column with the fiscal period end date for reference:
from zipline.pipeline import Pipeline
from zipline.pipeline.filters import StaticAssets
from zipline.research import symbol
MSFT = symbol("MSFT")
pipeline = Pipeline(
    columns={
        'fiscal_period_end_date': current_fundamentals.CALENDARDATE.latest,
        'eps': eps,
        'previous_eps': previous_eps,
    },
    initial_universe=StaticAssets([MSFT])
)
Finally, we run the pipeline. To see what's going on, we can use drop_duplicates() to limit the output to rows where the values changed from the previous row:
from zipline.research import run_pipeline
results = run_pipeline(pipeline, '2022-01-01', '2022-12-31')
results.drop_duplicates()
date | asset | fiscal_period_end_date | eps | previous_eps |
---|---|---|---|---|
2022-01-03 | Equity(FIBBG000BPH459 [MSFT]) | 2021-09-30 | 9.02 | 8.12 |
2022-01-26 | Equity(FIBBG000BPH459 [MSFT]) | 2021-12-31 | 9.47 | 9.02 |
2022-04-27 | Equity(FIBBG000BPH459 [MSFT]) | 2022-03-31 | 9.65 | 9.47 |
2022-07-29 | Equity(FIBBG000BPH459 [MSFT]) | 2022-06-30 | 9.70 | 9.65 |
2022-10-26 | Equity(FIBBG000BPH459 [MSFT]) | 2022-09-30 | 9.32 | 9.70 |
You can see that the previous_eps column contains the eps value from the previous period: each eps value reappears one row lower in the previous_eps column.
Using period_offset, we can do things like compare the current and previous EPS to create a new Filter that is True if EPS increased from the previous period:
eps_increased = eps > previous_eps
You can go back an arbitrary number of periods with period_offset, and you can combine the different periods into arbitrarily complex expressions. Under the hood, this is what Pipeline's built-in periodic factors and filters do.
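As a rough illustration of combining several offsets, the sketch below works in plain Python on the Microsoft EPS values from the output above. The at_offset helper is hypothetical and stands in for taking a slice at a given period_offset. The real periodic factors operate on Pipeline terms, not on Python lists.

```python
# Trailing-twelve-month EPS for MSFT, oldest first, taken from the
# pipeline output above (8.12 is the earliest previous_eps shown).
eps_by_period = [8.12, 9.02, 9.47, 9.65, 9.70, 9.32]


def at_offset(period_offset):
    """Hypothetical stand-in for slicing at period_offset (0, -1, -2, ...)."""
    return eps_by_period[period_offset - 1]


# Same comparison as eps > previous_eps, evaluated on the last row.
eps_increased = at_offset(0) > at_offset(-1)

# A 3-period window is just offsets 0, -1, and -2 gathered together.
window = [at_offset(-i) for i in range(3)]
three_period_high = max(window)
three_period_average = sum(window) / len(window)

print(eps_increased, window, three_period_high, three_period_average)
```

On the final row of the output EPS fell from 9.70 to 9.32, so the comparison is False there, while the 3-period high is still the 9.70 reported one period earlier.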
The Pipeline API includes a variety of built-in factors and filters for performing periodic computations. These live in the zipline.pipeline.periodic module. To see the full list of available factors, click on periodic in the following import statement in JupyterLab and press Ctrl to see the module docstring:
from zipline.pipeline import periodic
Let's create some real-world examples.
To smooth out variation in quarterly earnings, we can compute the average EBITDA over the last 4 quarters:
from zipline.pipeline.periodic import PeriodicAverage
fundamentals = sharadar.Fundamentals.slice('ARQ')
avg_earnings = PeriodicAverage(fundamentals.EBITDA, window_length=4)
Note that the first argument we pass to PeriodicAverage() is the column itself (fundamentals.EBITDA), not the latest factor of the column (fundamentals.EBITDA.latest). This is true of all built-in periodic factors and filters.
We can use PeriodicCAGR() to compute the compound annual growth rate of revenue over the last 5 years:
from zipline.pipeline.periodic import PeriodicCAGR
fundamentals = sharadar.Fundamentals.slice('ARY')
revenue_growth = PeriodicCAGR(fundamentals.REVENUE, window_length=5)
A similar factor is PeriodicPercentChange(), which differs only in that it calculates the total percent change over the window length rather than the annual growth rate.
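The difference between the two measures is easiest to see in the underlying arithmetic. The functions below are the textbook formulas written as a plain-Python sketch. They are not the library's implementation, and how window_length maps to the number of elapsed years is not covered here, so years is passed explicitly as an assumption. The sketch also ignores zero or negative starting values.

```python
def percent_change(begin_value, end_value):
    """Total change over the whole span, as a fraction of the start value."""
    return (end_value - begin_value) / begin_value


def cagr(begin_value, end_value, years):
    """Constant annual growth rate that compounds begin_value into end_value."""
    return (end_value / begin_value) ** (1 / years) - 1


# Hypothetical revenue that doubles over 5 years.
begin_revenue, end_revenue, years = 100.0, 200.0, 5

total_change = percent_change(begin_revenue, end_revenue)
annual_growth = cagr(begin_revenue, end_revenue, years)

# Compounding the annual rate for the full span recovers the end value.
recovered = begin_revenue * (1 + annual_growth) ** years

print(total_change, annual_growth, recovered)
```

The total change for a doubling is 100%, whereas the annualized rate is considerably lower because growth compounds from year to year.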
In this example, we use AllPeriodsAbove() to screen for companies that have paid dividends in each of the last 8 years:
from zipline.pipeline.periodic import AllPeriodsAbove
fundamentals = sharadar.Fundamentals.slice('ARY')
consistently_pay_dividends = AllPeriodsAbove(fundamentals.DPS, 0, window_length=8)
This example builds on the previous one by using AllPeriodsIncreasing() to further limit the screen to companies that have never cut their dividends over the 8-year period. We use allow_equal=True to allow for equal or increasing dividends, and we provide the previous screen as a mask to limit the computation to dividend payers:
from zipline.pipeline.periodic import AllPeriodsIncreasing
have_never_cut_dividends = AllPeriodsIncreasing(fundamentals.DPS, allow_equal=True, window_length=8, mask=consistently_pay_dividends)
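The logic of these two screens can be sketched on a single company's dividend history in plain Python. The functions and dividend values below are hypothetical and only mirror the behavior described above (strictly above a threshold in every period, and never decreasing when allow_equal=True). They are not the library's implementation.

```python
def all_periods_above(values, threshold):
    """True if every period's value is strictly above the threshold."""
    return all(value > threshold for value in values)


def all_periods_increasing(values, allow_equal=False):
    """True if each period's value rose (or held, if allow_equal) vs. the prior one."""
    pairs = list(zip(values, values[1:]))
    if allow_equal:
        return all(later >= earlier for earlier, later in pairs)
    return all(later > earlier for earlier, later in pairs)


# Hypothetical annual dividends per share over 8 years, oldest first.
steady_grower = [1.00, 1.00, 1.10, 1.20, 1.20, 1.30, 1.40, 1.50]
cut_once = [1.00, 1.10, 1.20, 0.90, 1.00, 1.10, 1.20, 1.30]
skipped_a_year = [1.00, 1.10, 0.00, 1.20, 1.30, 1.40, 1.50, 1.60]

for name, dps in [
    ("steady_grower", steady_grower),
    ("cut_once", cut_once),
    ("skipped_a_year", skipped_a_year),
]:
    pays = all_periods_above(dps, 0)
    # Mirror the mask: only evaluate the second screen for dividend payers.
    never_cut = pays and all_periods_increasing(dps, allow_equal=True)
    print(name, pays, never_cut)
```

Note that steady_grower passes only because allow_equal=True tolerates the years in which the dividend was held flat.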
Suppose we'd like to know how the current EPS compares to the 4-year high of EPS. We can use PeriodicHigh() to compute the 4-year high (16 quarters using trailing-twelve-month fundamentals), then compare it to EPS to get a ratio. We use where() to limit the output to companies with positive EPS:
from zipline.pipeline.periodic import PeriodicHigh
fundamentals = sharadar.Fundamentals.slice('ART')
eps = fundamentals.EPS.latest
high_eps = PeriodicHigh(fundamentals.EPS, window_length=16)
eps_vs_high = (eps / high_eps).where(eps > 0)
Let's look at a variation of the previous example. Suppose we want to find companies whose current EPS is higher than in any of the previous 16 quarters. To do this, we need to compute the 16-quarter high of EPS as of the previous quarter, then see if the current EPS is higher than that. We can calculate the highest EPS as of the previous quarter by using period_offset to pass the previous quarter's EPS to PeriodicHigh():
current_fundamentals = sharadar.Fundamentals.slice('ART', period_offset=0)
previous_fundamentals = sharadar.Fundamentals.slice('ART', period_offset=-1)
eps = current_fundamentals.EPS.latest
previous_high_eps = PeriodicHigh(previous_fundamentals.EPS, window_length=16)
is_new_high_eps = eps > previous_high_eps
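The reason for shifting the window back by one period is that a high computed over a window that includes the current period can never be strictly exceeded by the current value. A small plain-Python sketch with hypothetical EPS values and a shorter 3-period window shows the difference:

```python
# Hypothetical EPS by period, oldest first; the last entry is the current period.
eps_by_period = [2.0, 2.4, 2.1, 2.6, 3.0]
window_length = 3

current = eps_by_period[-1]

# Window that includes the current period (like period_offset=0).
high_including_current = max(eps_by_period[-window_length:])
naive_new_high = current > high_including_current

# Window that ends at the previous period (like period_offset=-1).
high_before_current = max(eps_by_period[-1 - window_length:-1])
is_new_high = current > high_before_current

print(naive_new_high, is_new_high)
```

With the current period inside the window, the comparison is False even though EPS is at a record level. Ending the window one period earlier gives the intended result.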
So far, we have passed fundamental columns (such as REVENUE or EPS) directly to the built-in periodic factors. What if we want to perform periodic computations using derived factors, such as operating margin? As we saw in a previous notebook, operating margin can be derived as follows:
operating_margin = fundamentals.OPINC.latest / fundamentals.REVENUE.latest
To use a derived factor with any of the built-in periodic factors or filters, we must create a function that returns the derived factor, then pass the function to the periodic factor or filter.
The function we create must accept two parameters: period_offset and mask. The function should use the period_offset parameter to derive the factor corresponding to that period_offset. The function should use the mask parameter (if provided) to mask the derived factor it returns. Here is a function that computes operating margin:
def OPMARGIN(period_offset=0, mask=None):
    fundamentals = sharadar.Fundamentals.slice("ART", period_offset)
    operating_margin = fundamentals.OPINC.latest / fundamentals.REVENUE.latest
    if mask is not None:
        operating_margin = operating_margin.where(mask)
    return operating_margin
We can now pass the OPMARGIN function to any of the built-in periodic factors and filters, just as we would pass a data column. Here, we compute the lowest and highest operating margin over the last 4 quarters:
from zipline.pipeline.periodic import PeriodicLow, PeriodicHigh
high_opmargin = PeriodicHigh(OPMARGIN, window_length=4)
low_opmargin = PeriodicLow(OPMARGIN, window_length=4)
Make sure to pass the function itself to the periodic factor or filter, not the result of calling the function (OPMARGIN, not OPMARGIN()).
If you were to pass a mask to PeriodicHigh() or PeriodicLow(), that mask would be passed in turn to your OPMARGIN function. If you don't pass a mask to PeriodicHigh() or PeriodicLow(), no mask will be passed to your OPMARGIN function. Regardless of whether you intend to pass a mask or not, your OPMARGIN function must accept a mask parameter.
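To make the calling convention concrete, here is a toy model in plain Python of how a periodic factor might use such a function. Everything in it is hypothetical: toy_periodic_high, toy_opmargin, and the margin values are stand-ins that return plain numbers, whereas the real PeriodicHigh() works with Pipeline factors and its internals are not shown in this lesson. The sketch only illustrates that the function is called once per period with a period_offset and the mask.

```python
def toy_periodic_high(factory, window_length, mask=None):
    """Toy stand-in for a periodic factor: call the factory once per period."""
    values = [
        factory(period_offset=-i, mask=mask)
        for i in range(window_length)
    ]
    return max(values)


# Hypothetical operating margins by period, oldest first.
MARGINS = [0.31, 0.35, 0.33, 0.38, 0.36]
calls = []  # records how the factory was invoked


def toy_opmargin(period_offset=0, mask=None):
    """Toy factory with the required (period_offset, mask) signature."""
    calls.append((period_offset, mask))
    return MARGINS[period_offset - 1]


# Pass the function itself, not the result of calling it.
high = toy_periodic_high(toy_opmargin, window_length=4)
print(high)
print(calls)

# Passing toy_opmargin() would hand over a number instead of a callable.
try:
    toy_periodic_high(toy_opmargin(), window_length=4)
except TypeError as error:
    print("TypeError:", error)
```

In this toy model the factory is always called with a mask keyword, even when the mask is None, which mirrors the requirement that your function accept a mask parameter regardless of whether you intend to pass one.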