Get Started

info

There are two easy ways of accessing OSO datasets: through our GraphQL API and through our data warehouse on BigQuery. For live integrations, you'll want API access. For exploratory analysis and impact data science, it's best to go direct to the data warehouse.

OSO's data warehouse is currently hosted in BigQuery on Google Cloud (GCP). Every data model is publicly available as a BigQuery dataset.

See our data exchange on Google Analytics Hub for a full list of public datasets.

Sign up for Google Cloud

Navigate to Google Cloud and log in. If this is your first time here, you can sign up for a free cloud account using your existing Google account. If you already have a GCP account, skip ahead to "Make your first query".

First, select a country and agree to the terms of service. Then, you need to enter your payment information for verification and answer a few marketing questions.

tip

GCP offers a free tier that includes $300 in credits. After that, you can stay in the free tier as long as you remain under the 1 TB per month limit for BigQuery data processed (more on that later).
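To make the monthly limit concrete, the arithmetic can be sketched in Python. This is only an illustration: the helper names `free_tier_share` and `queries_per_month` are hypothetical, and the sketch assumes the 1 TB figure means 2**40 bytes. Adjust the constant if your billing page states the allowance differently.

```python
# Assumption: the monthly free allowance is 1 TB of data processed,
# interpreted here as 2**40 bytes. Check your billing page for the exact figure.
FREE_BYTES_PER_MONTH = 2**40


def free_tier_share(bytes_processed: int, limit: int = FREE_BYTES_PER_MONTH) -> float:
    """Return the fraction of the monthly allowance used by `bytes_processed`."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / limit


def queries_per_month(bytes_per_query: int, limit: int = FREE_BYTES_PER_MONTH) -> int:
    """Return how many queries of a given size fit within the monthly allowance."""
    if bytes_per_query <= 0:
        raise ValueError("bytes_per_query must be positive")
    return limit // bytes_per_query


if __name__ == "__main__":
    ten_gib = 10 * 2**30  # a hypothetical query that processes 10 GiB
    print(f"{free_tier_share(ten_gib):.2%} of the monthly allowance")
    print(f"{queries_per_month(ten_gib)} queries of that size fit per month")
```

Under these assumptions, a query that processes 10 GiB uses a little under 1% of the monthly allowance.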

Finally, you will be brought to the admin console where you can create a new project. Feel free to name this GCP project anything you'd like. (Or you can simply leave the default project name 'My First Project'.)

Make your first query

Go to the BigQuery Console. Navigate to BigQuery from the left-hand menu and then click on BigQuery Studio from the hover menu.

The console features an Explorer frame on the left-hand side, which lists all the datasets available to you, and a Studio Console which has tabs for organizing your work. This will be your workspace for querying the OSO dataset.

Open a new tab by clicking on the + icon on the top right of the console to Create SQL Query.

From here you will be able to write any SQL you'd like against any OSO dataset. For example, you can query the oso_playground dataset for a sample of collections like this:

SELECT *
FROM `opensource-observer.oso_playground.collections_v1`

Click Run to execute your query. The results will appear in a table at the bottom of the console.

The console will help you complete your query as you type, and will also provide you with a preview of the results and computation time. You can save your queries, download the results, and even make simple visualizations directly from the console.
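If you would rather work from a script than the console, the same query can be estimated and run with the `google-cloud-bigquery` Python client. The following is a minimal sketch, not an official OSO integration: it assumes the client library is installed, that you have authenticated with Google Cloud (for example via application default credentials), and that you pass the ID of your own GCP project. The function names `estimate_bytes` and `run_query` are hypothetical.

```python
# The sample query from this guide.
SAMPLE_QUERY = """
SELECT *
FROM `opensource-observer.oso_playground.collections_v1`
"""


def estimate_bytes(sql: str, project: str) -> int:
    """Dry-run `sql` and return the bytes it would process, without running it."""
    # Imported here so this module loads even where the client library is absent.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    # A dry run validates the query and reports its size without executing it.
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


def run_query(sql: str, project: str) -> list:
    """Run `sql` and return the rows as a list of plain dictionaries."""
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    rows = client.query(sql).result()  # blocks until the query finishes
    return [dict(row.items()) for row in rows]
```

For example, `estimate_bytes(SAMPLE_QUERY, project="my-first-project")` would return a byte count you can compare against your monthly allowance before calling `run_query`. Here `my-first-project` is a placeholder for your own project ID.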

tip

To explore all the OSO datasets available, see here.

  • oso contains all production data. Tables here can be quite large, depending on the dataset.
  • oso_playground contains only the last 2 weeks of data for every dataset. We recommend using this for development and testing.
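One way to follow this recommendation is to keep the dataset name in a single place, so a query developed against oso_playground can later be pointed at oso. The sketch below is illustrative only: the helpers `table_ref` and `sample_query` are hypothetical, and it assumes that a table such as `collections_v1` exists under the same name in both datasets.

```python
PROJECT = "opensource-observer"  # project ID taken from the sample query above
DATASETS = ("oso", "oso_playground")


def table_ref(table: str, dataset: str = "oso_playground") -> str:
    """Return a backtick-quoted, fully qualified BigQuery table reference."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}; expected one of {DATASETS}")
    return f"`{PROJECT}.{dataset}.{table}`"


def sample_query(table: str, dataset: str = "oso_playground") -> str:
    """Build the guide's sample query against the chosen dataset."""
    return f"SELECT *\nFROM {table_ref(table, dataset)}"


if __name__ == "__main__":
    # Develop against the playground first...
    print(sample_query("collections_v1"))
    # ...then switch to production by changing one argument.
    # Assumption: the same table name exists in the oso dataset.
    print(sample_query("collections_v1", dataset="oso"))
```

Defaulting to oso_playground means a forgotten argument results in the smaller dataset being queried, not the production one.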

Next steps

Now that you're set up, there are many ways to contribute to OSO and integrate the data with your application:

If you think you'll be an ongoing contributor to OSO, please apply to join the Kariba Data Collective.

Membership is free but we want to keep the community close-knit and mission-aligned. As the community grows, we want to reward the most useful contributions and in so doing create a new job category for impact data science.

Join the Data Collective