Before researching specific factors, we will define a base universe. We don't want to include certain securities such as ETFs and ADRs in any of our subsequent analysis, and by defining a base universe in a separate file, we can import and use the definition in our notebooks without having to re-define the universe rules in each notebook.
The base universe will still be quite broad, for two reasons. First, we can always add more rules to the base rules in any given notebook to narrow the universe. Second, using a broad universe will help us see how factors behave across the US equities market, even if we subsequently wish to narrow the universe for trading or further analysis.
Different types of securities are categorized in the sharadar_Category field of the securities master database. Let's query all Sharadar records in the securities master database and group by sharadar_Category to see a breakdown of security types. (You can also obtain this information by browsing the sharadar-1d bundle in the Data Browser and looking at the Universe tab.)
from quantrocket.master import get_securities
securities = get_securities(vendors="sharadar", fields=["Symbol", "sharadar_Category"])
securities.groupby("sharadar_Category").Symbol.count()
sharadar_Category
ADR                                           2
ADR Common Stock                           2118
ADR Common Stock Primary Class              141
ADR Common Stock Secondary Class            118
ADR Common Stock Warrant                    179
ADR Preferred                                 6
ADR Preferred Stock                          96
ADR Stock Warrant                             1
CEF                                        1067
CEF Preferred                                63
CEF Warrant                                  41
Canadian                                      1
Canadian Common Stock                       373
Canadian Common Stock Primary Class          12
Canadian Common Stock Secondary Class         3
Canadian Common Stock Warrant                 8
Canadian Preferred Stock                      3
Canadian Stock Warrant                        3
Domestic                                     76
Domestic Common Stock                     13861
Domestic Common Stock Primary Class        1121
Domestic Common Stock Secondary Class      1098
Domestic Common Stock Warrant              1455
Domestic Preferred                           45
Domestic Preferred Stock                   1169
Domestic Primary                              1
Domestic Stock Warrant                      208
Domestic Warrant                             16
ETD                                         498
ETF                                        4992
ETMF                                         18
ETN                                         403
IDX                                           5
UNIT                                         25
Name: Symbol, dtype: int64
We will focus on domestic common stocks. Since some companies have multiple share classes, we will exclude "Domestic Common Stock Secondary Class". The following Pipeline expression will satisfy these requirements:
from zipline.pipeline import master
category = master.SecuritiesMaster.sharadar_Category.latest
common_stocks = (
    # domestic common stocks
    category.has_substring("Domestic Common")
    # no secondary shares
    & ~category.has_substring("Secondary")
)
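To see which categories these substring rules select, the logic can be mimicked in plain Python against the counts from the breakdown above. This is only a sketch: is_common_stock is a hypothetical helper standing in for the Pipeline expression, it assumes has_substring performs a plain substring match, and the dictionary copies only some of the "Domestic" rows of the output.

```python
# Counts copied from the "Domestic ..." rows of the category breakdown above.
domestic_counts = {
    "Domestic": 76,
    "Domestic Common Stock": 13861,
    "Domestic Common Stock Primary Class": 1121,
    "Domestic Common Stock Secondary Class": 1098,
    "Domestic Common Stock Warrant": 1455,
    "Domestic Preferred Stock": 1169,
}


def is_common_stock(category):
    """Hypothetical plain-Python stand-in for the Pipeline filter above."""
    return "Domestic Common" in category and "Secondary" not in category


# Keep only the categories that satisfy both substring rules.
selected = {c: n for c, n in domestic_counts.items() if is_common_stock(c)}
for name, count in selected.items():
    print(f"{name}: {count}")
print("total:", sum(selected.values()))
```

Under these assumptions the rule keeps "Domestic Common Stock" and "Domestic Common Stock Primary Class", and it also keeps "Domestic Common Stock Warrant", because that category contains "Domestic Common" and does not contain "Secondary". If warrants are not wanted in a given notebook, one option would be to add a further clause such as & ~category.has_substring("Warrant").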
Since sharadar_Category is a field from the SecuritiesMaster Dataset, this filter can be applied as the initial_universe argument of our Pipelines in the following notebooks. Applying the filter as initial_universe will completely exclude from the pipeline workspace any assets that aren't primary-share common stocks and will provide a speed boost compared to including these rules in the screen along with our other rules. For more information on the initial_universe parameter and how it relates to screen, see the Usage Guide or the Pipeline Tutorial.
The additional filters below cannot be used with initial_universe and must be applied separately as the screen parameter of the Pipeline (or as a mask to other terms).
Even though we want our base universe to be broad and include companies of all sizes, it is still important to add a basic liquidity filter. We will limit the universe to stocks that have had at least some trading volume on each trading day of the past month (approximately 21 trading days). Stocks that have zero trading volume are not only untradable but are also more likely to have suspect prices that can cause unexpected results in Alphalens tear sheets and other analyses.
from zipline.pipeline import EquityPricing
base_universe = (EquityPricing.volume.latest > 0).all(21)
In addition to the liquidity filter, we will also add a rule to filter out penny stocks by requiring that the closing price must be above $1.00 for 21 consecutive days. Penny stocks often undergo dramatic price jumps and price drops that, if included in the analysis, can bias the results and make it harder to interpret overall factor performance.
base_universe = (EquityPricing.close.latest > 1.00).all(21, mask=base_universe)
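The two rules above require a condition to hold on every day of a trailing window, with the price rule evaluated only for stocks that already passed the volume rule. The sketch below mimics that logic in plain Python, using a 3-day window and made-up data to keep it short. The passes_all helper and the sample tickers are hypothetical, and this is not meant to reflect how Zipline computes the filters internally.

```python
def passes_all(values, window, predicate):
    """True if predicate holds for each of the last `window` values."""
    recent = values[-window:]
    return len(recent) == window and all(predicate(v) for v in recent)


# Made-up daily data, oldest first (a 3-day window keeps the example short).
volumes = {
    "AAA": [500, 0, 300, 200, 100],  # zero volume falls outside the last 3 days
    "BBB": [500, 400, 0, 200, 100],  # zero volume inside the last 3 days
    "CCC": [500, 400, 300, 200, 100],
}
closes = {
    "AAA": [5.0, 5.1, 5.2, 5.3, 5.4],
    "BBB": [9.0, 9.1, 9.2, 9.3, 9.4],
    "CCC": [1.2, 1.1, 0.9, 1.1, 1.2],  # a close below $1.00 inside the window
}
WINDOW = 3

# Liquidity rule: positive volume on every day of the window.
liquid = {s for s, v in volumes.items() if passes_all(v, WINDOW, lambda x: x > 0)}

# Price rule, evaluated only for stocks in `liquid` (mirrors mask=base_universe).
base = {s for s in liquid if passes_all(closes[s], WINDOW, lambda x: x > 1.00)}

print(sorted(liquid), sorted(base))
```

In this toy data, a single zero-volume day or a single close at or below $1.00 inside the window is enough to exclude a stock, which is the behavior the 21-day rules are intended to have.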
To be able to reuse our universe filters, we put them in a separate file, universe.py. The universes can be imported as follows.
from codeload.fundamental_factors.universe import CommonStocks, BaseUniverse
initial_universe = CommonStocks()
base_universe = BaseUniverse()
The CommonStocks() filter will be used as the initial_universe of the Pipeline in the following notebooks, while the BaseUniverse() filter will be used as the screen parameter of the Pipeline and as a mask to various factors, and will sometimes be combined with additional filtering rules. (As a reminder from other tutorials, screen supports more kinds of filters than initial_universe, but using initial_universe reduces the size of the computational universe and thus provides a speed boost compared to using screen.)
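The difference between the two parameters can be illustrated with a toy model in plain Python. Everything here is hypothetical (run_toy_pipeline, toy_factor, and the sample assets are invented for illustration) and it assumes a simplified picture in which initial_universe is applied before any factor is computed and screen is applied afterward. It is not a description of Zipline's actual implementation.

```python
def run_toy_pipeline(assets, factor, initial_universe=None, screen=None):
    """Toy model only: returns (results, number of factor evaluations)."""
    # initial_universe shrinks the workspace before anything is computed.
    workspace = [a for a in assets if initial_universe is None or initial_universe(a)]
    computed = {a["symbol"]: factor(a) for a in workspace}
    # screen drops rows only after the factor has been computed.
    kept = {a["symbol"] for a in workspace if screen is None or screen(a)}
    results = {s: v for s, v in computed.items() if s in kept}
    return results, len(computed)


assets = [
    {"symbol": "AAA", "category": "Domestic Common Stock", "close": 25.0},
    {"symbol": "BBB", "category": "ETF", "close": 40.0},
    {"symbol": "CCC", "category": "ADR Common Stock", "close": 12.0},
    {"symbol": "DDD", "category": "Domestic Common Stock Primary Class", "close": 8.0},
]


def is_common(asset):
    category = asset["category"]
    return "Domestic Common" in category and "Secondary" not in category


def toy_factor(asset):
    return 1.0 / asset["close"]


via_initial, n_initial = run_toy_pipeline(assets, toy_factor, initial_universe=is_common)
via_screen, n_screen = run_toy_pipeline(assets, toy_factor, screen=is_common)
print(via_initial == via_screen, n_initial, n_screen)
```

In this toy model both calls return the same rows, but the factor is evaluated for fewer assets when the rule is passed as initial_universe, which is the intuition behind the speed boost described above.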