Skip to main content

Deep Dive into a Project

Analyze detailed metrics for a specific project. New to OSO? Check out our Getting Started guide to set up your BigQuery or API access.

tip

All projects are defined as YAML files in OSS Directory. View our current projects here.

BigQuery

If you haven't already, then the first step is to subscribe to OSO public datasets in BigQuery. You can do this by clicking the "Subscribe" button on our Datasets page.

The following queries should work if you copy-paste them into your BigQuery console.

Find a project

Search for projects by name:

select
project_id,
project_name,
display_name
from `oso_production.projects_v1`
where lower(display_name) like lower('%merkle%')

Find a project by artifact

Find projects associated with specific artifacts (e.g., GitHub repositories, contracts):

select
project_id,
project_name,
artifact_namespace as github_owner,
artifact_name as github_repo
from `oso_production.artifacts_by_project_v1`
where
artifact_source = 'GITHUB'
and artifact_namespace like '%uniswap%'

Code metrics

Get code metrics for a specific project:

select
project_name,
display_name,
star_count,
fork_count,
commit_count_6_months,
contributor_count_6_months
from `oso_production.code_metrics_by_project_v1`
where project_name = 'opensource-observer'

Timeseries metrics

Get historical metrics for a project:

select
tm.sample_date,
m.metric_name,
tm.amount
from `oso_production.timeseries_metrics_by_project_v0` as tm
join `oso_production.metrics_v0` as m
on tm.metric_id = m.metric_id
join `oso_production.projects_v1` as p
on tm.project_id = p.project_id
where p.project_name = 'wevm'
order by sample_date desc

Code contributors

Get all contributors to a project's GitHub repositories:

select
te.time,
a.artifact_name as code_contributor,
abp.artifact_name as github_repo,
te.event_type,
te.amount
from `oso_production.timeseries_events_by_artifact_v0` as te
join `oso_production.artifacts_by_project_v1` as abp
on te.to_artifact_id = abp.artifact_id
join `oso_production.artifacts_v1` a
on te.from_artifact_id = a.artifact_id
where
abp.project_name = 'ipfs'
and te.event_type = 'COMMIT_CODE'
order by te.time desc

GraphQL

The following queries should work if you copy-paste them into our GraphQL sandbox. For more information on how to use the GraphQL API, check out our GraphQL guide.

Find a project

Search for projects by name:

query FindProject {
oso_projectsV1(where: { display_name: { _ilike: "%ethereum%" } }) {
projectId
projectName
displayName
}
}

Find a project by artifact

Find projects associated with specific artifacts:

query FindProjectByArtifact {
oso_artifactsByProjectV1(
where: {
artifactNamespace: { _ilike: "%uniswap%" }
artifactSource: { _eq: "GITHUB" }
}
) {
projectId
projectName
artifactNamespace
artifactName
}
}

Code metrics

Get code metrics for a specific project:

query CodeMetricsForProject {
oso_codeMetricsByProjectV1(
where: { projectName: { _eq: "opensource-observer" } }
) {
projectName
displayName
starCount
forkCount
commitCount6Months
contributorCount6Months
}
}

Timeseries metrics

Get historical metrics for a project. This query returns two tables, timeseriesMetrics and metrics, which can be joined client-side on metricId.

query TimeseriesMetrics {
timeseriesMetrics: oso_timeseriesMetricsByProjectV0(
where: {
projectId: { _eq: "Erx9J64anc8oSeN-wDKm0sojJf8ONrFVYbQ7GFnqSyc=" }
}
) {
sampleDate
metricId
amount
}
metrics: oso_metricsV0 {
metricId
metricName
}
}

Python

See our guide on writing Python notebooks for more information on how to connect to BigQuery and query data. Our Insights Repo is full of examples too.

Connect to BigQuery

You can use the following to connect to BigQuery:

from google.cloud import bigquery
import pandas as pd
import os

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = # PATH TO YOUR CREDENTIALS JSON
GCP_PROJECT = # YOUR GCP PROJECT NAME

client = bigquery.Client(GCP_PROJECT)

Find a project

Search for projects by name:

query = """
select
project_id,
project_name,
display_name
from `oso_production.projects_v1`
where lower(display_name) like lower('%merkle%')
"""
df = client.query(query).to_dataframe()

Find a project by artifact

Find projects associated with specific artifacts:

query = """
select
project_id,
project_name,
artifact_namespace as github_owner,
artifact_name as github_repo
from `oso_production.artifacts_by_project_v1`
where
artifact_source = 'GITHUB'
and artifact_namespace like '%uniswap%'
"""
df = client.query(query).to_dataframe()

Code metrics

Get code metrics for a specific project:

query = """
select
project_name,
display_name,
star_count,
fork_count,
commit_count_6_months,
contributor_count_6_months
from `oso_production.code_metrics_by_project_v1`
where project_name = 'opensource-observer'
"""
df = client.query(query).to_dataframe()

Timeseries metrics

Get historical metrics for a project:

query = """
select
tm.sample_date,
m.metric_name,
tm.amount
from `oso_production.timeseries_metrics_by_project_v0` as tm
join `oso_production.metrics_v0` as m
on tm.metric_id = m.metric_id
join `oso_production.projects_v1` as p
on tm.project_id = p.project_id
where p.project_name = 'wevm'
order by sample_date desc
"""
df = client.query(query).to_dataframe()

Code contributors

Get all contributors to a project's GitHub repositories:

query = """
select
te.time,
a.artifact_name as code_contributor,
abp.artifact_name as github_repo,
te.event_type,
te.amount
from `oso_production.timeseries_events_by_artifact_v0` as te
join `oso_production.artifacts_by_project_v1` as abp
on te.to_artifact_id = abp.artifact_id
join `oso_production.artifacts_v1` a
on te.from_artifact_id = a.artifact_id
where
abp.project_name = 'ipfs'
and te.event_type = 'COMMIT_CODE'
order by te.time desc
"""
df = client.query(query).to_dataframe()

Adding projects

Projects are defined as YAML files in our OSS Directory repo. You can add or update your own projects or project artifacts by submitting a pull request.

For more information on how projects work, see our guide here.