Get Started

info

There are two easy ways to access OSO datasets: through our GraphQL API and through our data warehouse on BigQuery. For integrations, you'll need API access. For exploratory analysis and impact data science, it's best to go directly to the data warehouse.

Generate an API key


The OSO GraphQL API serves impact metrics for OSS projects, collections, and artifacts. Access to the OSO GraphQL API is necessary for any integration with OSO datasets.

First, navigate to www.opensource.observer and create a new account.

If you already have an account, log in. Then create a new personal API key:

  1. Go to Account settings
  2. In the "API Keys" section, click "+ New"
  3. Give your key a label - this is just for you, usually to describe a key's purpose.
  4. You should see your brand new key. Immediately save this value, as you'll never see it again after refreshing the page.
  5. Click "Create" to save the key.

You can create as many keys as you like.
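
Once you have a key, an integration would typically send it with every GraphQL request. The following is a minimal Python sketch rather than an official OSO client. The endpoint URL is a placeholder, and the Bearer-token header scheme, the `OSO_API_KEY` environment variable name, and the example query are all assumptions made for illustration, so check the API documentation for the actual values.

```python
import json
import os
import urllib.request

# Assumption: placeholder endpoint; replace it with the URL from the OSO API docs.
GRAPHQL_ENDPOINT = "https://example.invalid/graphql"


def build_graphql_request(api_key, query, endpoint=GRAPHQL_ENDPOINT):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is passed as a Bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Read the key from the environment instead of hard-coding it.
    key = os.environ.get("OSO_API_KEY", "")  # hypothetical variable name
    # `__typename` is valid against any GraphQL schema, so this query
    # assumes nothing about OSO's types.
    request = build_graphql_request(key, "{ __typename }")
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Keeping the key in an environment variable means it never lands in source control, which matters because the key cannot be viewed again after it is created.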

Log in to BigQuery


OSO's data warehouse is currently located on BigQuery and is available publicly by referencing it as opensource-observer.oso. If you're looking to explore the data, or to contribute to our public set of models, you will need to have an account with GCP (Google Cloud Platform).

Sign up free

If this is your first time using GCP, start from the Google Cloud sign-up page.

From there, click "Start free" or "Get started for free". You will then be prompted to log in with your Google account.

Once you're logged in, you can set up your account. First, select a country and agree to the terms of service. Then enter your payment information for verification.

tip

GCP offers new accounts $300 in free credits. After that, it is easy to stay within the free tier provided you process less than 1 TB of BigQuery data per month (more on that later).

After you've created your account, Google will ask you a few marketing questions. Fill these out as appropriate.

Finally, you will be brought to the admin console where you can create a new project. Feel free to name this GCP project anything you'd like. (Or you can simply leave the default project name 'My First Project'.)

Navigate to BigQuery from the left-hand menu and then click on BigQuery Studio from the hover menu. This will take you to the BigQuery Console.

The console features an Explorer frame on the left-hand side, which lists all the datasets available to you, and a Studio Console which has tabs for organizing your work. This will be your workspace for querying the OSO dataset. If this is your first time, you will likely see a welcome message on the first tab in the Studio Console. Now you're ready to start exploring OSO datasets!

Query the oso_playground dataset

If you just created your GCP account by following the steps above, then you'll already be in the BigQuery Console. If you're joining us here because you already have an account, go directly to the BigQuery Console.

Close the first tab on the console or simply navigate to the second tab, which should display a blank query editor. Alternatively, you can open a new tab by clicking on the + icon on the top right of the console to Create SQL Query.

From here you will be able to write any SQL you'd like against the OSO dataset. For example, you can query the oso_playground dataset for all the collections in that dataset like this:

SELECT *
FROM `opensource-observer.oso_playground.collections`

Click Run to execute your query. The results will appear in a table at the bottom of the console.

The console will help you complete your query as you type, and will also provide you with a preview of the results and computation time. You can save your queries, download the results, and even make simple visualizations directly from the console.
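
The same query can also be run outside the console. Below is a minimal Python sketch that assumes the `google-cloud-bigquery` package is installed and that your environment is already authenticated with GCP. The helper names and the `project_id` argument (your own GCP project, which is billed for the query) are hypothetical.

```python
# Table used in the console example above.
PLAYGROUND_COLLECTIONS = "opensource-observer.oso_playground.collections"


def collections_query(limit=10):
    """Return SQL that selects up to `limit` rows from the playground collections table."""
    return f"SELECT *\nFROM `{PLAYGROUND_COLLECTIONS}`\nLIMIT {int(limit)}"


def run_query(sql, project_id):
    """Run `sql` in BigQuery, billing `project_id`, and return the rows as dicts."""
    # Imported lazily so the query builder works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    return [dict(row.items()) for row in client.query(sql).result()]


if __name__ == "__main__":
    # Prints the SQL only; call run_query(sql, "your-project-id") to execute it.
    print(collections_query())
```

Note that `LIMIT` caps the number of rows returned but does not reduce the amount of data BigQuery scans for a `SELECT *`.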

tip

To explore all the OSO datasets available, see here.

  • oso contains all production data. This can be quite large depending on the dataset.
  • oso_playground contains only the last 2 weeks of data for every dataset. We recommend using this for development and testing.
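
One way to see how much larger a production query would be is a BigQuery dry run, which reports the bytes a query would process without running it. The sketch below assumes the `google-cloud-bigquery` package and an authenticated environment. It also assumes that both datasets expose the same table names and that the 1 TB monthly limit can be read conservatively as 10**12 bytes. All helper names are hypothetical.

```python
# Assumption: the 1 TB monthly limit, read conservatively as 10**12 bytes.
FREE_TIER_BYTES = 10**12


def table_ref(table, dataset="oso_playground"):
    """Return a backtick-quoted table reference in the opensource-observer project."""
    return f"`opensource-observer.{dataset}.{table}`"


def share_of_free_tier(bytes_processed, limit=FREE_TIER_BYTES):
    """Return the fraction of the monthly limit that a query would consume."""
    return bytes_processed / limit


def estimate_bytes(sql, project_id):
    """Dry-run `sql` and return the number of bytes it would process."""
    # Imported lazily so the helpers above work without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)
    return job.total_bytes_processed


if __name__ == "__main__":
    # Build the same query against each dataset; pass either string to
    # estimate_bytes(sql, "your-project-id") to compare their sizes.
    for dataset in ("oso_playground", "oso"):
        print(f"SELECT * FROM {table_ref('collections', dataset)}")
```

Developing against oso_playground and switching the `dataset` argument to `oso` only once the query is final keeps most of the iteration on the smaller dataset.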

Join the Kariba Data Collective


Now that you're set up, there are many ways to contribute to OSO and integrate the data with your application.

If you think you'll be an ongoing contributor to OSO, please apply to join the Kariba Data Collective.

Membership is free but we want to keep the community close-knit and mission-aligned. As the community grows, we want to reward the most useful contributions and in so doing create a new job category for impact data science.
