Get Started
There are two easy ways to access OSO datasets: through our GraphQL API and through our data warehouse on BigQuery. For integrations, you'll need API access. For exploratory analysis and impact data science, it's best to go directly to the data warehouse.
Generate an API key
The OSO GraphQL API serves impact metrics for OSS projects, collections, and artifacts. Access to the OSO GraphQL API is necessary for any integration with OSO datasets.
First, navigate to www.opensource.observer and create a new account.
If you already have an account, log in. Then create a new personal API key:
- Go to Account settings
- In the "API Keys" section, click "+ New"
- Give your key a label - this is just for you, usually to describe a key's purpose.
- You should see your brand new key. Immediately save this value, as you'll never see it again after refreshing the page.
- Click "Create" to save the key.
You can create as many keys as you like.
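Once you have a key, an integration would typically attach it to each GraphQL request. The following is a minimal sketch using only the Python standard library. It assumes the key is sent as a bearer token in the `Authorization` header; the endpoint URL and the `build_graphql_request` helper are placeholders for illustration, so check the API documentation for the real endpoint and schema.

```python
import json
import urllib.request

# Placeholder values: use the endpoint from the OSO API docs and your own key.
GRAPHQL_ENDPOINT = "https://example.com/graphql"  # hypothetical, not the real OSO URL
API_KEY = "YOUR_API_KEY"


def build_graphql_request(endpoint: str, api_key: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API key is passed as a bearer token.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# `__typename` is valid against any GraphQL schema, so it works as a connectivity check.
request = build_graphql_request(GRAPHQL_ENDPOINT, API_KEY, "query { __typename }")

# To actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Keeping the key out of source control (for example, by reading it from an environment variable) is usually a good idea, since the key cannot be viewed again after it is created.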
Log in to BigQuery
OSO's data warehouse is currently located on BigQuery and is publicly available by referencing it as opensource-observer.oso. If you're looking to explore the data, or to contribute to our public set of models, you will need an account with GCP (Google Cloud Platform).
Sign up free
If this is your first time getting into GCP, you can do so by going here.
From there, click on "Start free" or "Get started for free". You will then be prompted to log in with your Google account.
Once you're logged in, you can then proceed to setting up your account. First, select a country and agree to the terms of service. Then, you need to enter your payment information for verification.
GCP offers a free tier that includes $300 in credits. After that, it is easy to stay in the free tier provided you remain under the 1TB per month limit for BigQuery data processed (more on that later).
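To get a feel for that monthly limit, here is a small arithmetic sketch. It assumes the allowance is the 1 TB mentioned above, counted as 10^12 bytes; the function names are made up for illustration, and the exact allowance should be confirmed against current BigQuery pricing.

```python
# Assumption: the monthly allowance is the 1 TB mentioned above, counted as 10**12 bytes.
FREE_TIER_BYTES_PER_MONTH = 10**12


def free_tier_fraction(bytes_processed: int, allowance: int = FREE_TIER_BYTES_PER_MONTH) -> float:
    """Return the share of the monthly allowance consumed by `bytes_processed`."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed must be non-negative")
    return bytes_processed / allowance


def queries_per_month(bytes_per_query: int, allowance: int = FREE_TIER_BYTES_PER_MONTH) -> int:
    """Return how many queries of this size fit in the monthly allowance."""
    if bytes_per_query <= 0:
        raise ValueError("bytes_per_query must be positive")
    return allowance // bytes_per_query


# Example: a query that scans 5 GB (5 * 10**9 bytes).
print(free_tier_fraction(5 * 10**9))  # 0.005
print(queries_per_month(5 * 10**9))   # 200
```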
After you've created your account, you will then be asked a few marketing questions from Google. Fill these out as appropriate.
Finally, you will be brought to the admin console where you can create a new project. Feel free to name this GCP project anything you'd like. (Or you can simply leave the default project name 'My First Project'.)
Navigate to BigQuery from the left-hand menu and then click on BigQuery Studio from the hover menu. This will take you to the BigQuery Console.
The console features an Explorer frame on the left-hand side, which lists all the datasets available to you, and a Studio Console which has tabs for organizing your work. This will be your workspace for querying the OSO dataset. If this is your first time, you will likely see a welcome message on the first tab in the Studio Console. Now you're ready to start exploring OSO datasets!
Query the oso_playground dataset
If you just created your GCP account by following the steps above, then you'll already be in the BigQuery Console. However, if you're just joining us because you already have an account, then go directly to the BigQuery Console by clicking here.
Close the first tab on the console or simply navigate to the second tab, which should display a blank query editor. Alternatively, you can open a new tab by clicking on the + icon on the top right of the console to Create SQL Query.
From here you will be able to write any SQL you'd like against the OSO dataset. For example, you can query the oso_playground dataset for all the collections in that dataset like this:
SELECT *
FROM `opensource-observer.oso_playground.collections`
Click Run to execute your query. The results will appear in a table at the bottom of the console.
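The same query can also be run outside the console. The sketch below assumes the `google-cloud-bigquery` Python package is installed and that application default credentials are configured; `my-gcp-project` is a placeholder for your own project ID, and both helper functions are illustrative rather than part of any OSO tooling. A `LIMIT` clause is added to keep the result small.

```python
def build_collections_query(limit: int = 10) -> str:
    """Return the collections query shown above, capped at `limit` rows."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return (
        "SELECT *\n"
        "FROM `opensource-observer.oso_playground.collections`\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str, project_id: str) -> list:
    """Run `sql` in BigQuery, billed to `project_id`, and return rows as dicts."""
    # Imported here so build_collections_query works without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    return [dict(row.items()) for row in client.query(sql).result()]


print(build_collections_query(5))

# Requires credentials and a real project ID:
# rows = run_query(build_collections_query(5), project_id="my-gcp-project")
```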
The console will help you complete your query as you type, and will also provide you with a preview of the results and computation time. You can save your queries, download the results, and even make simple visualizations directly from the console.
To explore all the OSO datasets available, see here.
- oso contains all production data. This can be quite large depending on the dataset.
- oso_playground contains only the last 2 weeks of data for every dataset. We recommend using this for development and testing.
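One way to follow that recommendation is to build table references through a small helper, so that switching from the playground to production is a single flag. This is only a sketch: `table_id` is a hypothetical helper, and it assumes a table has the same name in both datasets.

```python
PROJECT = "opensource-observer"


def table_id(table: str, production: bool = False) -> str:
    """Return a backtick-quoted table ID in `oso` (production) or `oso_playground`."""
    dataset = "oso" if production else "oso_playground"
    return f"`{PROJECT}.{dataset}.{table}`"


# Default to the smaller playground dataset during development.
print(table_id("collections"))
# Opt in to production data explicitly.
print(table_id("collections", production=True))
```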
Join the Kariba Data Collective
Now that you're set up, there are many ways to contribute to OSO and integrate the data with your application:
- Do data science over OSO datasets
- Propose an impact model to run in our data pipeline
- Query the OSO API for metrics and impact vectors from your web app
If you think you'll be an ongoing contributor to OSO, please apply to join the Kariba Data Collective.
Membership is free, but we want to keep the community close-knit and mission-aligned. As the community grows, we want to reward the most useful contributions and, in so doing, create a new job category for impact data science.
Join the Data Collective