Skip to main content

Conduct Cohort Analysis

Track a cohort of projects across a set of metrics over time. New to OSO? Check out our Getting Started guide to set up your API access.

Getting Started

Before running any analysis, you'll need to set up your environment:

Start your Python notebook with the following:

import os
import pandas as pd
from pyoso import Client

OSO_API_KEY = os.environ['OSO_API_KEY']
client = Client(api_key=OSO_API_KEY)

For more details on setting up Python notebooks, see our guide on writing Python notebooks.

Explore Ecosystem Projects

Discover Projects Deployed in an Ecosystem from the OSS Directory

The OSS Directory is a curated registry of open-source software projects. The query below filters this directory to highlight projects with verifiable onchain deployments on Arbitrum One.

query = """
SELECT
project_id,
artifact_name,
artifact_type
FROM int_artifacts_by_project_in_ossd
WHERE
artifact_source = 'ARBITRUM_ONE'
AND artifact_type IN ('CONTRACT', 'FACTORY', 'DEPLOYER')
LIMIT 10
"""
df = client.to_pandas(query)

Track Developer Activity for Arbitrum-Deployed Projects

By joining with timeseries metrics, we can observe GitHub activity trends for each project across 2024 and 2025.

query = """
WITH arb_projects AS (
SELECT
distinct project_id
FROM int_artifacts_by_project_in_ossd
WHERE
artifact_source = 'ARBITRUM_ONE'
AND artifact_type IN ('CONTRACT', 'FACTORY', 'DEPLOYER')
)
SELECT
distinct p.display_name as Name,
m.metric_name as Metric,
ts.sample_date as Date,
ts.amount as Value
FROM metrics_v0 m
JOIN timeseries_metrics_by_project_v0 ts
on m.metric_id = ts.metric_id
JOIN projects_v1 p
on p.project_id = ts.project_id
JOIN arb_projects a
on p.project_id = a.project_id
WHERE
metric_name = 'GITHUB_active_developers_monthly'
AND YEAR(ts.sample_date) in (2024, 2025)
"""
df_arb = client.to_pandas(query)

Explore a Project Cohort

Analyze GitHub Activity of Arbitrum Stylus Grant Program Projects

In contrast to the previous query which filtered projects based on onchain deployment on Arbitrum One, this query focuses on all Github-related metrics for projects that are part of the ‘arb-stylus’ collection—a set representing participants in the Arbitrum Stylus grant program.

query = """
SELECT
distinct p.display_name as Name,
m.metric_name as Metric,
ts.sample_date as Date,
ts.amount as Value
FROM metrics_v0 m
JOIN timeseries_metrics_by_project_v0 ts
on m.metric_id = ts.metric_id
JOIN projects_v1 p
on p.project_id = ts.project_id
JOIN projects_by_collection_v1 pc
on p.project_id = pc.project_id
WHERE
metric_name like 'GITHUB_%'
AND YEAR(ts.sample_date) in (2024, 2025)
AND pc.collection_name = 'arb-stylus'
"""
df_stylus = client.to_pandas(query)

Visualize Results

Heatmap of monthly developer activity

This heatmap provides a visual overview of monthly active developer activity for each project in the collection.

# Create a heatmap of active developers by project and month
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime
import numpy as np

# Filter for active developers metric
df_active_devs = df_stylus[df_stylus['Metric'] == 'GITHUB_active_developers_monthly'].copy()

# Convert date to month-year format
df_active_devs['Month'] = pd.to_datetime(df_active_devs['Date']).dt.strftime('%Y-%m')

# Pivot the data for heatmap
heatmap_data = df_active_devs.pivot(index='Name', columns='Month', values='Value')

# Fill NaN values with 0 for visualization
heatmap_data = heatmap_data.fillna(0)

# Create the heatmap
fig = go.Figure(data=go.Heatmap(
z=heatmap_data.values,
x=heatmap_data.columns,
y=heatmap_data.index,
colorscale='Viridis',
colorbar=dict(title='Active Developers'),
hoverongaps=False,
text=heatmap_data.values, # Show values on hover
texttemplate='%{text:.0f}', # Format as integers
textfont={"size": 10},
hovertemplate='Project: %{y}<br>Month: %{x}<br>Active Developers: %{z}<extra></extra>'
))

# Update layout
fig.update_layout(
title='Monthly Active Developers by Project',
xaxis_title='Month',
yaxis_title='Project',
height=800, # Adjust height based on number of projects
width=1200, # Adjust width based on number of months
xaxis=dict(tickangle=45),
margin=dict(l=200) # Increase left margin for project names
)

# Show the plot
fig.show()

Developer Activity Heatma

You can replace the metric with another measure—such as GITHUB_commits_monthly or GITHUB_merged_pull_requests_monthly—to explore different dimensions of developer engagement.