Penguin Crossfilter#

Cross-filtering Palmer Penguins#

import numpy as np
import pandas as pd
import panel as pn

import holoviews as hv
import hvplot.pandas # noqa


In this introduction to building interactive dashboards we will primarily use four libraries:

  1. Pandas: To load and manipulate the data

  2. hvPlot: To quickly generate plots using a simple and familiar API

  3. HoloViews: To link selections between plots easily

  4. Panel: To build a dashboard we can deploy

Building some plots#

Let us first load the Palmer penguins dataset (Gorman et al.), which contains body measurements for three penguin species:

penguins = pd.read_csv('data/penguins.csv')
  | studyName | Sample Number | Species | Region | Island | Stage | Individual ID | Clutch Completion | Date Egg | Culmen Length (mm) | Culmen Depth (mm) | Flipper Length (mm) | Body Mass (g) | Sex | Delta 15 N (o/oo) | Delta 13 C (o/oo) | Comments
0 | PAL0708 | 1 | Adelie Penguin | Anvers | Torgersen | Adult, 1 Egg Stage | N1A1 | Yes | 11/11/07 | 39.1 | 18.7 | 181.0 | 3750.0 | MALE | NaN | NaN | Not enough blood for isotopes.
1 | PAL0708 | 2 | Adelie Penguin | Anvers | Torgersen | Adult, 1 Egg Stage | N1A2 | Yes | 11/11/07 | 39.5 | 17.4 | 186.0 | 3800.0 | FEMALE | 8.94956 | -24.69454 | NaN
2 | PAL0708 | 3 | Adelie Penguin | Anvers | Torgersen | Adult, 1 Egg Stage | N2A1 | Yes | 11/16/07 | 40.3 | 18.0 | 195.0 | 3250.0 | FEMALE | 8.36821 | -25.33302 | NaN
3 | PAL0708 | 5 | Adelie Penguin | Anvers | Torgersen | Adult, 1 Egg Stage | N3A1 | Yes | 11/16/07 | 36.7 | 19.3 | 193.0 | 3450.0 | FEMALE | 8.76651 | -25.32426 | NaN
4 | PAL0708 | 6 | Adelie Penguin | Anvers | Torgersen | Adult, 1 Egg Stage | N3A2 | Yes | 11/16/07 | 39.3 | 20.6 | 190.0 | 3650.0 | MALE | 8.66496 | -25.29805 | NaN
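
The preview already shows NaN entries, so it can help to check which columns have missing values before plotting. The following is a minimal sketch on a stand-in frame built from three rows and four columns of the preview; the name `sample` is introduced here for illustration only, and the same calls would apply to the full `penguins` frame.

```python
import pandas as pd

# Stand-in frame: three rows and four columns copied from the preview.
sample = pd.DataFrame({
    'Species': ['Adelie Penguin'] * 3,
    'Culmen Length (mm)': [39.1, 39.5, 40.3],
    'Sex': ['MALE', 'FEMALE', 'FEMALE'],
    'Delta 15 N (o/oo)': [float('nan'), 8.94956, 8.36821],
})

# Number of missing values in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Keep only the rows that have an isotope measurement.
complete = sample.dropna(subset=['Delta 15 N (o/oo)'])
print(len(complete), 'of', len(sample), 'rows have an isotope value')
```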

This diagram provides some background about what these measurements mean:


Next we define an explicit color mapping for each species:

colors = {
    'Adelie Penguin': '#1f77b4',
    'Gentoo penguin': '#ff7f0e',
    'Chinstrap penguin': '#2ca02c'
}
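
Because the mapping is keyed by the exact species strings, including their capitalization, a quick check that every species has a color can catch mismatches early. This is a small sketch in plain Python; the list `species_in_data` is a stand-in for something like `penguins['Species'].unique()`, and `unmapped` is a hypothetical helper name.

```python
colors = {
    'Adelie Penguin': '#1f77b4',
    'Gentoo penguin': '#ff7f0e',
    'Chinstrap penguin': '#2ca02c',
}

def unmapped(species, mapping):
    """Return the species names that have no entry in the color mapping."""
    return sorted(set(species) - set(mapping))

# Stand-in for the distinct values of the Species column.
species_in_data = ['Adelie Penguin', 'Gentoo penguin', 'Chinstrap penguin']
missing = unmapped(species_in_data, colors)

# Keys are case-sensitive: a differently capitalized name is not matched.
missing_with_typo = unmapped(['Gentoo Penguin'], colors)
print(missing, missing_with_typo)
```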

Now we can start plotting the data with hvPlot, which provides a familiar API to pandas .plot users but generates interactive plots.

We start with a simple scatter plot of the culmen (think bill) length and depth for each species:

scatter = penguins.hvplot.points(
    'Culmen Length (mm)', 'Culmen Depth (mm)', c='Species',
    cmap=colors, responsive=True, min_height=300
)


Next we generate a histogram of the body mass colored by species:

histogram = penguins.hvplot.hist(
    'Body Mass (g)', by='Species', color=hv.dim('Species').categorize(colors),
    legend=False, alpha=0.5, responsive=True, min_height=300
)

Next we count the number of individuals of each species and generate a bar plot:

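# Count the rows per species by aggregating the non-zero sample numbers
bars = penguins.hvplot.bar(
    'Species', 'Sample Number', c='Species', cmap=colors,
    responsive=True, min_height=300
).aggregate(function=np.count_nonzero)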
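
To see which numbers the bar plot is meant to display, the per-species count can be computed directly in pandas. This is a minimal sketch on a made-up stand-in frame (the rows are illustrative, not taken from the dataset); only the column names follow the preview above.

```python
import pandas as pd

# Made-up stand-in rows; only the column names match the real dataset.
sample = pd.DataFrame({
    'Species': ['Adelie Penguin', 'Adelie Penguin', 'Gentoo penguin',
                'Chinstrap penguin', 'Gentoo penguin'],
    'Sample Number': [1, 2, 1, 1, 2],
})

# Number of individuals per species, one value per bar.
counts = sample.groupby('Species')['Sample Number'].count()
print(counts)
```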


Finally we generate violin plots of the flipper length of each species, split by sex:

violin = penguins.hvplot.violin(
    'Flipper Length (mm)', by=['Species', 'Sex'], cmap='Category20',
    responsive=True, min_height=300, legend='bottom_right'
)


Linking the plots#

hvPlot lets us build interactive plots very quickly, but what if we want deeper insight into the data by selecting along one dimension and seeing that selection reflected in the other plots? Using HoloViews we can easily compose and link these plots:

ls = hv.link_selections.instance()

ls(scatter.opts(show_legend=False) + bars + histogram + violin).cols(2)
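
Conceptually, cross-filtering means that a selection made in one view becomes a filter on the shared data, and every other view is recomputed from the filtered rows. The sketch below illustrates that idea in plain pandas; it is not how HoloViews implements `link_selections`, the helper `box_select` is hypothetical, and the rows are illustrative stand-in values.

```python
import pandas as pd

# Small stand-in table with illustrative values.
sample = pd.DataFrame({
    'Species': ['Adelie Penguin', 'Adelie Penguin', 'Gentoo penguin',
                'Gentoo penguin', 'Chinstrap penguin'],
    'Culmen Length (mm)': [39.1, 39.5, 46.1, 50.0, 49.0],
    'Body Mass (g)': [3750.0, 3800.0, 4500.0, 5700.0, 3800.0],
})

def box_select(df, column, low, high):
    """Boolean mask for rows whose value in `column` lies in [low, high]."""
    return df[column].between(low, high)

# A "box selection" along the culmen length axis of one plot...
mask = box_select(sample, 'Culmen Length (mm)', 45.0, 51.0)
selected = sample[mask]

# ...drives every other view, which is recomputed from the same subset.
species_counts = selected['Species'].value_counts()
mean_mass = selected['Body Mass (g)'].mean()
print(species_counts)
print(mean_mass)
```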

Building a dashboard#

As a final step we will compose these plots into a dashboard using Panel. We begin by loading the Palmer penguins logo:

logo = pn.panel('data/logo.png', height=60)

Next we use some functionality on the link_selections object to display the count of currently selected penguins:

def count(selected):
    return f"## {len(selected)}/{len(penguins)} penguins selected"

selected = pn.pane.Markdown(
    pn.bind(count, ls.selection_param(penguins)),
    align='center', width=400, margin=(0, 100, 0, 0)
)
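
The `count` function only relies on `len`, so it can be tried out without a running dashboard. This sketch assumes the bound value behaves like a subset of the data frame; the `penguins` frame here is a stand-in holding the five sample numbers from the preview, and `subset` plays the role of the current selection.

```python
import pandas as pd

# Stand-in for the full dataset: the five sample numbers from the preview.
penguins = pd.DataFrame({'Sample Number': [1, 2, 3, 5, 6]})

def count(selected):
    return f"## {len(selected)}/{len(penguins)} penguins selected"

# Pretend the user has selected the first three samples.
subset = penguins[penguins['Sample Number'] <= 3]
message = count(subset)
print(message)
```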


Now we will compose these two items into a Row which will serve as the header of our dashboard:

welcome = "## Welcome and meet the Palmer penguins!"

penguins_art = pn.pane.PNG('./data/lter_penguins.png', height=160)

credit = "### Artwork by @allison_horst"

instructions = """
Use the box-select and lasso-select tools to select a subset of penguins
and reveal more information about the selected subgroup through the power
of cross-filtering.
"""

license = """
### License

Data are available by CC-0 license in accordance with the Palmer Station LTER Data Policy and the LTER Data Access Policy for Type I data.
"""

art = pn.Column(
    welcome, penguins_art, credit, instructions, license,