
OSO Data Portal: free live datasets open to the public

Raymond Cheng
Co-Founder

At Open Source Observer, we have been committed to building everything in the open from the very beginning. Today, we take that openness to the next level by launching the OSO Data Exchange on Google BigQuery. Here, we will publish every dataset we have as a live, up-to-date, and free-to-use dataset. In addition to sharing every model in the OSO production data pipeline, we are sharing source data for blocks, transactions, and traces across 7 chains in the OP Superchain (Optimism, Base, Frax, Metal, Mode, PGN, and Zora), as well as Gitcoin data and OpenRank. This builds on the existing BigQuery public data ecosystem, which includes GitHub, Ethereum, Farcaster, and Lens data. To learn more, check out the data portal here:

opensource.observer/data

A huge thank you to our data partners from Goldsky, Gitcoin, and Karma3 Labs for helping make this happen!

Why are we doing this?

We are living in a world where open ecosystems are out-innovating closed ecosystems. As firm believers in the power of open source, we are growing a network of impact data scientists to help demonstrate and visualize the immense impact that open source software is having on the world, starting with crypto ecosystems. We hope that a better understanding of the impact of open source will lead to better appreciation, higher engagement, and ultimately deeper investment in open source technologies. For example, OSO data is used in the latest Optimism retrofunding round to distribute more than $10M to builders.

What can I do with this data?

We have written several guides on how to use this data, from making queries directly in BigQuery, to exploring the data in a Python notebook, to integrating with a third-party data tool like Hex or Tableau. If you end up using this data, we only ask that you share what you’ve learned and tag us @OSObserver.

You should start by subscribing to a dataset from our Data Overview. Once you’ve subscribed, you can run queries in your favorite data tool.

For example, to get a summary of code activity for Uniswap:

select *
from `YOUR_PROJECT_NAME.oso_production.code_metrics_by_project_v1`
where project_name = 'uniswap'
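
If you prefer to run the same query from a Python notebook, a minimal sketch might look like the following. It assumes the `google-cloud-bigquery` package (plus pandas, for `to_dataframe`) is installed and that your environment is already authenticated. The helper names `code_metrics_sql` and `fetch_code_metrics` and the simplified project ID check are illustrative only and not part of any OSO library.

```python
import re

# Simplified check for a Google Cloud project ID (illustrative, not exhaustive).
_PROJECT_ID = re.compile(r"[a-z][a-z0-9-]{4,28}[a-z0-9]")


def code_metrics_sql(project_id: str) -> str:
    """Build the code-metrics query for the project where you subscribed."""
    if not _PROJECT_ID.fullmatch(project_id):
        raise ValueError(f"unexpected project ID: {project_id!r}")
    return (
        "select *\n"
        f"from `{project_id}.oso_production.code_metrics_by_project_v1`\n"
        "where project_name = @project_name"
    )


def fetch_code_metrics(project_id: str, project_name: str = "uniswap"):
    """Run the query and return a pandas DataFrame (needs credentials)."""
    # Imported lazily so the query builder above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    # Pass the project name as a query parameter instead of pasting it into the SQL.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("project_name", "STRING", project_name)
        ]
    )
    job = client.query(code_metrics_sql(project_id), job_config=job_config)
    return job.to_dataframe()
```

Calling `fetch_code_metrics("your-project-id")` would then return the Uniswap rows as a DataFrame, assuming the dataset is subscribed under that project.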

Or look up deployed contracts from a particular address on Base:

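select
  traces.block_timestamp,
  traces.transaction_hash,
  txs.from_address as originating_address,
  txs.to_address as originating_contract,
  traces.from_address as factory_address,
  traces.to_address as contract_address
from `YOUR_PROJECT_NAME.superchain.base_traces` as traces
-- the transactions table name below is assumed to mirror base_traces
inner join `YOUR_PROJECT_NAME.superchain.base_transactions` as txs
  on txs.hash = traces.transaction_hash
where
  -- as written, this filter excludes traces from the listed address;
  -- use = instead of != to keep only that address's deployments
  LOWER(traces.from_address) != "0x3fab184622dc19b6109349b94811493bf2a45362"
  and LOWER(traces.trace_type) in ("create", "create2")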

Or look up vitalik.eth’s Gitcoin Passport score:

select
  passport_address,
  last_score_timestamp,
  evidence_rawScore,
  evidence_threshold
from `YOUR_PROJECT_NAME.gitcoin.passport_scores`
where passport_address = '0xd8da6bf26964af9d7eed9e03e53415d37aa96045'
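
The query above matches on an all-lowercase address. If the column is stored in lowercase (an assumption based on that literal), a mixed-case address copied from a wallet or block explorer would not match. A small hypothetical helper, `normalize_address`, can lowercase and sanity-check an address in Python before it goes into a query:

```python
import re

# "0x" followed by exactly 40 lowercase hex digits (20 bytes).
_ADDRESS = re.compile(r"0x[0-9a-f]{40}")


def normalize_address(address: str) -> str:
    """Lowercase an Ethereum address and check it is 0x plus 40 hex digits."""
    candidate = address.strip().lower()
    if not _ADDRESS.fullmatch(candidate):
        raise ValueError(f"not a 20-byte hex address: {address!r}")
    return candidate


# Mixed-case form of the vitalik.eth address used in the query.
address = normalize_address("0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045")
```

The resulting string can then be passed to your query tool as a parameter rather than pasted into the SQL text.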

When you’ve developed a novel impact metric or data model, we encourage you to contribute it back to the OSO data pipeline, which is continuously deployed from our open source repository.

How can I help?

If you love supporting open source like we do, reach out to us over email or Discord. We are hiring founding engineers, onboarding new data sources, and working with our partners to evolve our understanding of impact.