Analyze Funding Data
Analyze the funding history of your favorite projects. New to OSO? Check out our Getting Started guide to set up your API access.
This tutorial shows how to quickly lookup the funding history of your favorite projects, and then explore funding from specific sources (e.g. Gitcoin Grants, Open Collective, etc.) You can find the code for this tutorial here.
Getting Started
Before running any analysis, you'll need to set up your environment:
from dotenv import load_dotenv
import os
import pandas as pd
from pyoso import Client
load_dotenv()
OSO_API_KEY = os.environ['OSO_API_KEY']
client = Client(api_key=OSO_API_KEY)
For more details on setting up Python notebooks, see our guide on writing Python notebooks.
Available Funding Metrics
First, let's see what funding metrics are available in the OSO data lake:
client.to_pandas("""
SELECT metric_name
FROM metrics_v0
WHERE metric_name LIKE '%_funding_%'
ORDER BY 1
""")
This query shows us all the funding-related metrics, including:
- Gitcoin donations and matching funds (see Gitcoin dataset)
- Open Collective funding (see Open Collective dataset)
- Overall OSS funding metrics (see OSS Funding guide)
Each metric is available at different time intervals (daily, weekly, monthly, quarterly, biannually, yearly, and over all time).
Aggregated Analysis
Total Funding by Source
Let's look at the total funding amounts across different sources:
client.to_pandas("""
SELECT
m.metric_name,
SUM(km.amount) AS total_amount_in_usd
FROM key_metrics_by_project_v0 AS km
JOIN metrics_v0 AS m ON km.metric_id = m.metric_id
WHERE m.metric_name LIKE '%_funding_%'
GROUP BY 1
ORDER BY 2 DESC
""")
This shows us the relative distribution of funding across different sources, with:
- Overall OSS funding typically being the largest category
- Gitcoin matching funds often being the second largest
- Gitcoin donations usually being the third largest
Top Funded Projects
Let's identify the projects that have received the most funding:
client.to_pandas("""
SELECT
p.display_name AS project_display_name,
SUM(km.amount) AS total_amount_in_usd
FROM key_metrics_by_project_v0 AS km
JOIN metrics_v0 AS m ON km.metric_id = m.metric_id
JOIN projects_v1 AS p ON km.project_id = p.project_id
WHERE m.metric_name LIKE '%_funding_awarded_over_all_time'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10
""")
This query will show you the top 10 projects by total funding received, helping identify which projects have been most successful in securing funding.
Gitcoin-Specific Funding
Let's look at the top projects funded through Gitcoin donations:
client.to_pandas("""
SELECT
p.display_name AS project_display_name,
SUM(km.amount) AS total_amount_in_usd
FROM key_metrics_by_project_v0 AS km
JOIN metrics_v0 AS m ON km.metric_id = m.metric_id
JOIN projects_v1 AS p ON km.project_id = p.project_id
WHERE m.metric_name = 'GITCOIN_DONATIONS_funding_awarded_over_all_time'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10
""")
This analysis reveals the projects that have received the most funding through Gitcoin's donation platform.
Historical Analysis
Let's look at funding trends before 2022:
client.to_pandas("""
SELECT
p.display_name AS project_display_name,
SUM(tm.amount) AS total_amount_in_usd
FROM timeseries_metrics_by_project_v0 AS tm
JOIN metrics_v0 AS m ON tm.metric_id = m.metric_id
JOIN projects_v1 AS p ON tm.project_id = p.project_id
WHERE m.metric_name = 'GITCOIN_DONATIONS_funding_awarded_yearly'
AND tm.sample_date < DATE '2022-01-01'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 10
""")
This historical view shows the distribution of funding across projects in earlier years, helping understand how funding patterns have evolved over time.
Detailed Analysis of Funding Events
Individual Gitcoin Donors to Projects
Let's look at detailed funding events for a specific project (e.g., Revoke):
client.to_pandas("""
SELECT
time,
round_number,
round_name,
event_source,
donor_address,
amount_in_usd
FROM int_events__gitcoin_funding
WHERE gitcoin_group_project_name = 'revokecash'
ORDER BY amount_in_usd DESC
LIMIT 10
""")
This shows individual funding events, including:
- The timing and size of donations
- Matching fund allocations from different rounds
- Various donations from different donors
Funders and Grant Pools
Let's analyze the funding sources and their impact:
client.to_pandas("""
SELECT
from_funder_name,
COUNT(DISTINCT grant_pool_name) AS grant_pools,
SUM(amount) AS amount_in_usd
FROM stg_ossd__current_funding
GROUP BY 1
ORDER BY 3 DESC
""")
This analysis helps identify:
- The most active funders in the ecosystem
- The number of grant pools managed by each funder
- The total funding distributed by each source
Project-to-Project Funding
Finally, let's look at funding flows between projects:
query = """
SELECT
fp.project_id AS from_project_id,
tp.project_id AS to_project_id,
fp.display_name AS funder,
tp.display_name AS project,
SUM(e.amount) AS amount
FROM int_events_daily__funding AS e
JOIN artifacts_by_project_v1 AS fa
ON e.from_artifact_id = fa.artifact_id
JOIN artifacts_by_project_v1 AS ta
ON e.to_artifact_id = ta.artifact_id
JOIN projects_v1 AS fp
ON fa.project_id = fp.project_id
JOIN projects_v1 AS tp
ON ta.project_id = tp.project_id
GROUP BY 1,2,3,4
"""
df = client.to_pandas(query)
This shows the network of funding relationships between projects, helping identify:
- Which projects are funding others
- The magnitude of funding flows
- The diversity of funding sources for each project
Next Steps
- Time Series Analysis: Create visualizations of funding trends over time
- Funding Impact: Correlate funding with development activity
- Network Analysis: Map the funding network between projects
- Comparative Analysis: Compare funding patterns across different ecosystems
For more examples of funding analysis, check out our Insights Repo.