Define a Base Universe

Before researching specific factors, we will define a base universe. We want to exclude certain securities, such as ETFs and ADRs, from all of our subsequent analysis. By defining the base universe in a separate file, we can import the definition into each notebook instead of re-defining the universe rules every time.

The base universe will still be quite broad, for two reasons. First, we can always add more rules to the base rules in any given notebook to narrow the universe. Second, using a broad universe will help us see how factors behave across the US equities market, even if we subsequently wish to narrow the universe for trading or further analysis.

Explore Sharadar Categories

Different types of securities are categorized in the sharadar_Category field of the securities master database. Let's query all Sharadar records in the securities master database and group by sharadar_Category to see a breakdown of security types. (You can also obtain this information by browsing the sharadar-1d bundle in the Data Browser and looking at the Universe tab.)

In [1]:
from quantrocket.master import get_securities
securities = get_securities(vendors="sharadar", fields=["Symbol", "sharadar_Category"])

securities.groupby("sharadar_Category").Symbol.count()
Out[1]:
sharadar_Category
ADR                                          2
ADR Common Stock                          2118
ADR Common Stock Primary Class             141
ADR Common Stock Secondary Class           118
ADR Common Stock Warrant                   179
ADR Preferred                                6
ADR Preferred Stock                         96
ADR Stock Warrant                            1
CEF                                       1067
CEF Preferred                               63
CEF Warrant                                 41
Canadian                                     1
Canadian Common Stock                      373
Canadian Common Stock Primary Class         12
Canadian Common Stock Secondary Class        3
Canadian Common Stock Warrant                8
Canadian Preferred Stock                     3
Canadian Stock Warrant                       3
Domestic                                    76
Domestic Common Stock                    13861
Domestic Common Stock Primary Class       1121
Domestic Common Stock Secondary Class     1098
Domestic Common Stock Warrant             1455
Domestic Preferred                          45
Domestic Preferred Stock                  1169
Domestic Primary                             1
Domestic Stock Warrant                     208
Domestic Warrant                            16
ETD                                        498
ETF                                       4992
ETMF                                        18
ETN                                        403
IDX                                          5
UNIT                                        25
Name: Symbol, dtype: int64

We will focus on domestic common stocks. Since some companies have multiple share classes, we will exclude "Domestic Common Stock Secondary Class". The following Pipeline expression will satisfy these requirements:

In [2]:
from zipline.pipeline import master

category = master.SecuritiesMaster.sharadar_Category.latest
common_stocks = (
    # domestic common stocks
    category.has_substring("Domestic Common")
    # no secondary shares
    & ~category.has_substring("Secondary")
)
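
To see which of the categories listed above this expression lets through, the same two substring conditions can be sketched in plain Python. This is only an illustration: it assumes that has_substring behaves like a case-sensitive Python `in` check, and `is_common_stock` is a hypothetical helper name, not part of Zipline.

```python
# A subset of the category names from the Out[1] breakdown above.
categories = [
    "ADR Common Stock",
    "Canadian Common Stock",
    "Domestic",
    "Domestic Common Stock",
    "Domestic Common Stock Primary Class",
    "Domestic Common Stock Secondary Class",
    "Domestic Common Stock Warrant",
    "Domestic Preferred Stock",
    "ETF",
]


def is_common_stock(category):
    """Plain-Python stand-in for the two has_substring() conditions.

    Assumes a case-sensitive substring match (an assumption, not verified
    against the Pipeline implementation).
    """
    return "Domestic Common" in category and "Secondary" not in category


# Keep only the categories that satisfy both conditions.
selected = [c for c in categories if is_common_stock(c)]
print(selected)
```

Under this plain substring logic, "Domestic Common Stock Warrant" also matches, because it contains "Domestic Common" and does not contain "Secondary". If warrants should be kept out at this stage, a further condition on the category would be needed.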

Since sharadar_Category is a field from the SecuritiesMaster Dataset, this filter can be applied as the initial_universe argument of our Pipelines in the following notebooks. Applying the filter as initial_universe completely excludes from the pipeline workspace any assets that aren't primary-share common stocks, and it provides a speed boost compared to including these rules in the screen along with our other rules. For more information on the initial_universe parameter and how it relates to screen, see the Usage Guide or the Pipeline Tutorial.

The additional filters below cannot be used with initial_universe and must be applied separately as the screen parameter of the Pipeline (or as a mask to other terms).

Liquidity Filter

Even though we want our base universe to be broad and include companies of all sizes, it is still important to add a basic liquidity filter. We will limit the universe to stocks that have had at least some trading volume on each trading day of the past month (approximately 21 trading days). Stocks that have zero trading volume are not only untradable but are also more likely to have suspect prices that can cause unexpected results in Alphalens tear sheets and other analyses.

In [3]:
from zipline.pipeline import EquityPricing

# require nonzero volume on each of the last 21 trading days
base_universe = (EquityPricing.volume.latest > 0).all(21)
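
The idea behind a trailing-window "all" check can be sketched in plain Python with made-up data. This is a minimal illustration, not the Pipeline implementation: `all_over_window` is a hypothetical helper, the volumes are invented, and treating days without a full window as failing is an assumption.

```python
def all_over_window(flags, window):
    """For each day, return True only if the condition held on that day
    and on the preceding window - 1 days.

    Days that do not yet have a full window are marked False (an
    assumption made for this sketch).
    """
    result = []
    for i in range(len(flags)):
        if i + 1 < window:
            result.append(False)
        else:
            result.append(all(flags[i + 1 - window : i + 1]))
    return result


# Hypothetical daily volumes for one stock; the day at index 3 has no trades.
volumes = [500, 1200, 800, 0, 300, 900, 1100, 700]
has_volume = [v > 0 for v in volumes]

# A 3-day window keeps the example short; the lesson uses 21.
liquid = all_over_window(has_volume, window=3)
print(liquid)
```

A single zero-volume day keeps the stock out of the universe until that day has rolled out of the window, which is why the stock only passes again at index 6.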

Penny Stock Filter

In addition to the liquidity filter, we will add a rule to filter out penny stocks by requiring that the closing price be above $1.00 on each of the past 21 trading days. Penny stocks often undergo dramatic price jumps and drops that, if included in the analysis, can bias the results and make it harder to interpret overall factor performance.

In [4]:
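# require a close above $1.00 on each of the last 21 trading days,
# evaluated only for stocks that already pass the liquidity filter
base_universe = (EquityPricing.close.latest > 1.00).all(21, mask=base_universe)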

Helper file

To reuse our universe filters across notebooks, we put them in a separate file, universe.py. The filters can then be imported as follows.

In [5]:
from codeload.fundamental_factors.universe import CommonStocks, BaseUniverse

initial_universe = CommonStocks()
base_universe = BaseUniverse()

The CommonStocks() filter will be used as the initial_universe of the Pipeline in the following notebooks, while the BaseUniverse() filter will be used as the screen parameter of the Pipeline and as a mask to various factors, and will sometimes be combined with additional filtering rules. (As a reminder from other tutorials, screen supports more kinds of filters than initial_universe, but using initial_universe reduces the size of the computational universe and thus provides a speed boost compared to using screen.)


Next Up

Lesson 3: Basic Usage