Free data for US stocks is available through two distinct, complementary datasets. The free minute bundle provides minute and daily price data from 2007 to the present for a small selection of stocks. The "learning bundle" provides daily price history from 2007 to 2011 for all US stocks (including stocks that were later delisted). In other words, the free minute bundle covers a long date range for a small number of stocks, while the learning bundle covers a shorter date range for a large number of stocks. Both datasets are used in the lecture series.
The datasets are stored as Zipline bundles, a type of local database. (Zipline is a backtesting library that also provides a database technology for storing and querying price data. See the usage guide for more information.)
Start by creating an empty bundle called 'usstock-free-1min', specifying free=True:
from quantrocket.zipline import create_usstock_bundle
create_usstock_bundle("usstock-free-1min", free=True)
{'status': 'success', 'msg': 'successfully created usstock-free-1min bundle'}
Then collect the data, which in Zipline bundle terminology is referred to as ingestion:
from quantrocket.zipline import ingest_bundle
ingest_bundle("usstock-free-1min")
{'status': 'the data will be ingested asynchronously'}
This runs in the background. Open flightlog to monitor the progress. You can open flightlog from the JupyterLab launcher or by executing the following IPython magic command:
%flightlog
The following messages indicate completion:
quantrocket.zipline: INFO [usstock-free-1min] Ingesting minute bars for 9 securities in usstock-free-1min bundle
quantrocket.zipline: INFO [usstock-free-1min] Ingesting daily bars for usstock-free-1min bundle
quantrocket.zipline: INFO [usstock-free-1min] Ingesting adjustments for usstock-free-1min bundle
quantrocket.zipline: INFO [usstock-free-1min] Ingesting assets for usstock-free-1min bundle
quantrocket.zipline: INFO [usstock-free-1min] Completed ingesting data for 9 securities in usstock-free-1min bundle
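If you have copied flightlog output into a list of strings, a small check like the one below can confirm that the completion message has appeared for a given bundle. This is only a sketch: `ingestion_completed` is a hypothetical helper, not part of the quantrocket library, and the pattern is inferred from the sample messages above, so the real log format may differ.

```python
import re

# Assumed line layout, inferred from the sample messages:
#   quantrocket.zipline: LEVEL [bundle-code] message text
LOG_PATTERN = re.compile(
    r"^quantrocket\.zipline: (?P<level>[A-Z]+) \[(?P<bundle>[^\]]+)\] (?P<msg>.*)$"
)


def ingestion_completed(log_lines, bundle):
    """Return True if any line reports completed ingestion for the bundle."""
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        if match["bundle"] == bundle and match["msg"].startswith("Completed ingesting"):
            return True
    return False


sample = [
    "quantrocket.zipline: INFO [usstock-free-1min] Ingesting assets for usstock-free-1min bundle",
    "quantrocket.zipline: INFO [usstock-free-1min] Completed ingesting data for 9 securities in usstock-free-1min bundle",
]
print(ingestion_completed(sample, "usstock-free-1min"))  # True
```

The check keys on the bundle code in square brackets, so messages for other bundles in the same log are ignored.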
Collecting the learning dataset is similar. Create a bundle called 'usstock-learn-1d', specifying learn=True:
from quantrocket.zipline import create_usstock_bundle
create_usstock_bundle("usstock-learn-1d", learn=True)
{'status': 'success', 'msg': 'successfully created usstock-learn-1d bundle'}
Then ingest the data:
from quantrocket.zipline import ingest_bundle
ingest_bundle("usstock-learn-1d")
{'status': 'the data will be ingested asynchronously'}
This dataset only takes a few seconds to ingest:
quantrocket.zipline: INFO [usstock-learn-1d] Completed ingesting daily history for all stocks from 2007-2011
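Both datasets follow the same two-step pattern: create the bundle, then ingest it. The sketch below wraps that pattern in one function. `create_and_ingest` is a hypothetical helper, not part of quantrocket, and the create and ingest functions are passed in as arguments so that the example can run without a live deployment. The stand-in functions here only record calls and return placeholder strings; they do not reproduce real API responses.

```python
def create_and_ingest(code, create_fn, ingest_fn, **create_kwargs):
    """Create a bundle, then queue its ingestion, and return both responses."""
    created = create_fn(code, **create_kwargs)
    queued = ingest_fn(code)
    return {"code": code, "created": created, "queued": queued}


# Stand-ins for demonstration only: they record calls instead of
# contacting a server.
calls = []


def fake_create(code, **kwargs):
    calls.append(("create", code, kwargs))
    return "created (stand-in)"


def fake_ingest(code):
    calls.append(("ingest", code))
    return "queued (stand-in)"


# Bundle codes and keyword arguments taken from the steps above.
BUNDLES = [
    ("usstock-free-1min", {"free": True}),
    ("usstock-learn-1d", {"learn": True}),
]

results = [
    create_and_ingest(code, fake_create, fake_ingest, **kwargs)
    for code, kwargs in BUNDLES
]
for call in calls:
    print(call)
```

To run this against a real deployment, pass create_usstock_bundle and ingest_bundle from quantrocket.zipline in place of the stand-ins. Ingestion is still asynchronous, so the returned responses only confirm that the work was queued.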