IEX Trading

datashader, panel, holoviews
Published: November 20, 2019 · Updated: November 1, 2023


IEX, the Investors Exchange, is a transparent stock exchange that discourages high-frequency trading and makes historical trading data publicly available. The data is offered in the form of daily pcap files where each single packet corresponds to a stock trade.

Even with this specialized pcap file format, the record for a single day can exceed a gigabyte in size. In this notebook, we will develop a dashboard that lets us explore every trade that happened in a day, including the associated metadata. To visualize all this data at once, both rapidly and interactively, we will use datashader via the HoloViews API.

Loading the data

The IEX stock data is saved in two pcap-based formats called TOPS and DEEP. These formats are complex enough that parsing the trades with standard packet-loading tools is nontrivial. For this reason, the trades for Monday, October 21, 2019 are supplied as a CSV file that was generated from the original pcap file using the IEXTools library.

import warnings
warnings.simplefilter('ignore')
import datetime
import pandas as pd
df = pd.read_csv('./data/IEX_2019-10-21.csv')
print('Dataframe loaded containing %d events' % len(df))
Dataframe loaded containing 1222412 events

We can now look at the head of this DataFrame to see its structure:

df.head()
symbol size price timestamp
0 ZVZZT 50 10.015 1571659221573444414
1 RSX 300 23.210 1571659313752906463
2 BABA 100 171.400 1571659356868902969
3 BABA 3 171.400 1571659357585239782
4 KMT 83 25.000 1571659403813905391

Each row above corresponds to a stock trade: symbol specifies which stock was traded, size gives the size of the trade, and price gives the price at which it was traded. Every trade also has a timestamp specified in nanoseconds.

Note that multiple trades can occur on the same timestamp.
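One way to check for shared timestamps is to count the rows that have the same timestamp value. The sketch below is only an illustration: it uses a small hand-made frame with the same column names as the dataframe above, and the symbols and values in it are made up rather than taken from the IEX file.

```python
import pandas as pd

# Hypothetical sample with the same columns as the IEX dataframe;
# the symbols and values are invented for illustration.
sample = pd.DataFrame({
    'symbol': ['AAA', 'BBB', 'AAA', 'CCC'],
    'size': [100, 50, 25, 10],
    'price': [10.0, 20.5, 10.0, 5.25],
    'timestamp': [1000, 1000, 2000, 3000],
})

# keep=False marks every row whose timestamp occurs more than once
shared = sample[sample.duplicated('timestamp', keep=False)]
n_shared = len(shared)

# Number of trades recorded at each distinct timestamp
trades_per_timestamp = sample.groupby('timestamp').size()

print(n_shared)
print(trades_per_timestamp)
```

Applied to the full dataframe, the same two expressions should show how common simultaneous trades are in the day's data.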

Visualizing trades with Spikes

We can now load HoloViews with the Bokeh plotting extension to start visualizing some of this data:

import holoviews as hv
from bokeh.models import HoverTool
from holoviews.operation.datashader import spikes_aggregate
hv.config.image_rtol = 10e-3 # Fixes datetime issue at high zoom level
hv.extension('bokeh')

One way to visualize events that occur over time is to use the Spikes element. Here we look at the first hundred spikes in this dataframe:

hv.Spikes(df.head(100), ['timestamp'],
          ['symbol', 'size', 'price']).opts(xrotation=90,  tools=['hover'],
                                            spike_length=1, position=0)

As in the dataframe table shown above, the timestamps are expressed as integers counting the nanoseconds since the Unix epoch (UTC). While many domains use integers as their time axis (e.g. CPU cycles for processor events), in this case we would like to recover the timestamp as a date.

We will do this in two steps: (1) we map the integers to datetime64[ns] to get datetime objects, and (2) we subtract 4 hours to go from UTC to the local time at the exchange (located in New Jersey, which was on Eastern Daylight Time, UTC-4, on this date):

df.timestamp = df.timestamp.astype('datetime64[ns]')
df.timestamp -= datetime.timedelta(hours=4)
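The fixed four-hour offset is only valid while daylight saving time is in effect; in winter the offset from UTC is five hours. As an alternative sketch, which is not what this notebook does, pandas can apply the time zone rules itself. The example below converts the first timestamp from the table above and assumes that `America/New_York` is the appropriate zone for the exchange.

```python
import pandas as pd

# First timestamp from the table above (nanoseconds since the Unix epoch)
raw = pd.Series([1571659221573444414])

# Interpret the integers as UTC instants
utc = pd.to_datetime(raw, unit='ns', utc=True)

# Convert to exchange-local time; the zone name is an assumption
local = utc.dt.tz_convert('America/New_York')

# Drop the time zone information but keep the local wall-clock time,
# which gives the same kind of naive datetime64[ns] column as the fixed offset
naive_local = local.dt.tz_localize(None)

# Fixed-offset version used in this notebook, for comparison
fixed_offset = raw.astype('datetime64[ns]') - pd.Timedelta(hours=4)

print(naive_local.iloc[0])
print(fixed_offset.iloc[0])
```

For this date the two approaches give the same result; they would differ for data recorded outside daylight saving time.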

In the Spikes plot above, every line corresponds to a trade, and its position along the x-axis indicates the time at which that trade occurred (the timestamp in nanoseconds). If you hover over the spikes, you can view the timestamp values for all the trades underneath the cursor as well as their corresponding stock symbols.

With Bokeh alone we can only visualize a small number of trades effectively, but with datashader we can visualize all 1.2 million trades available:

spikes = hv.Spikes(df, ['timestamp'], ['symbol', 'size', 'price'])
rasterized = spikes_aggregate(spikes,
                              aggregator='count', spike_length=1).opts(
                                  width=600, colorbar=True, cmap='blues',
                                  yaxis=None, xrotation=90,
                                  default_tools=['xwheel_zoom', 'xpan', 'xbox_zoom'])
rasterized
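Conceptually, a count aggregation along the time axis resembles a histogram with one bin per pixel column: each bin records how many trades fall within that slice of time, and the colormap encodes the count. The NumPy sketch below illustrates the idea only. It is not how datashader or `spikes_aggregate` is implemented, and the synthetic timestamps, the 6.5 hour session length, and the 600 bins (chosen to mirror `width=600`) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic trade times in seconds since the start of an assumed
# 6.5 hour session; these are not real IEX timestamps.
n_trades = 100_000
session_seconds = 6.5 * 3600
timestamps = rng.uniform(0, session_seconds, size=n_trades)

# One bin per pixel column of the plot
n_columns = 600
counts, edges = np.histogram(timestamps, bins=n_columns,
                             range=(0, session_seconds))

# Each entry of counts is the number of trades in that time slice;
# a colormap such as 'blues' would map larger counts to darker colors.
print(counts.shape)
print(counts.sum())
```

Because the output size depends on the number of bins rather than the number of trades, an aggregate like this stays cheap to display however many trades are in the input.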