ADS-B API

The Contrails API enables authorized users to access a common ADS-B dataset for contrails research.

The underlying ADS-B data is provided by Spire Aviation.

E-mail api@contrails.org with the subject "Common ADS-B Access" to learn more about how your organization can participate in this program.

[1]:
import os
[2]:
# Load API key
# (contact api@contrails.org if you need an API key)
URL = "https://api.contrails.org"
API_KEY = os.environ["CONTRAILS_API_KEY"]
HEADERS = {"x-api-key": API_KEY}

Telemetry

GET /v1/adsb/telemetry

Note: this endpoint can take up to 30 seconds to return, depending on bandwidth.

This endpoint returns one hour of global ADS-B telemetry data as an Apache Parquet file.

The input date must be an ISO 8601 datetime string (UTC) with hourly resolution, e.g. "2025-01-06T00". Any minute or second resolution is ignored.
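
For example, a full timestamp can be truncated to the required hourly resolution before building the request. A minimal sketch using pandas (the input timestamp here is arbitrary):

import pandas as pd

# Truncate an arbitrary UTC timestamp to hourly resolution for the "date" parameter
ts = pd.Timestamp("2025-01-06T00:37:15", tz="UTC")
date = ts.floor("h").strftime("%Y-%m-%dT%H")
print(date)  # 2025-01-06T00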

See the ADS-B schema for the description of each data key in the Parquet file.

[3]:
import requests  # pip install requests
import matplotlib.pyplot as plt  # pip install matplotlib
import pandas as pd  # pip install pandas

Get data for a single hour

[4]:
params = {
    "date": "2025-01-24T02"  # ISO 8601 (UTC)
}

r = requests.get(f"{URL}/v1/adsb/telemetry", params=params, headers=HEADERS)
print(f"HTTP Response Code: {r.status_code} {r.reason}\n")

# write out response content as parquet file
with open(f"{params['date']}.pq", "wb") as f:
    f.write(r.content)
HTTP Response Code: 200 OK
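
If the request fails (for example, an invalid date or a missing API key), the response body will not be valid Parquet data, so it can be worth failing fast before writing the file. A minimal sketch using requests' built-in check:

r = requests.get(f"{URL}/v1/adsb/telemetry", params=params, headers=HEADERS)
r.raise_for_status()  # raises requests.HTTPError on any 4xx/5xx response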

[5]:
# read parquet file with pandas
df = pd.read_parquet(f"{params['date']}.pq")

print("Number of unique flights:", df["flight_id"].nunique())
print("Number of unique waypoints:", len(df["flight_id"]))

df.head()
Number of unique flights: 17103
Number of unique waypoints: 1093322
[5]:
timestamp latitude longitude collection_type altitude_baro icao_address flight_id callsign tail_number flight_number aircraft_type_icao airline_iata departure_airport_icao departure_scheduled_time arrival_airport_icao arrival_scheduled_time
0 2025-01-24 02:59:59 37.882629 -80.429001 terrestrial 31000 A91986 93a5cd24-1e7b-4dca-a07d-ad391a2e8237 PDT5701 N686AE AA5701 E145 PT KCLT 2025-01-24 01:53:00 KERI 2025-01-24 03:48:00
1 2025-01-24 02:59:59 36.193588 -112.395912 terrestrial 35000 AB415E d1dfe570-9e39-4323-a6cf-f4cf602b4149 SCX618 N824SY SY618 B738 SY KPSP 2025-01-24 02:24:00 KMSP 2025-01-24 05:41:00
2 2025-01-24 02:59:59 -44.230362 171.841019 terrestrial 33950 C81D8E 12d6d993-c01e-4553-80e1-944a34119f69 ANZ689 ZK-OAB NZ689 A320 NZ NZWN 2025-01-24 02:05:00 NZDN 2025-01-24 03:25:00
3 2025-01-24 02:59:59 43.008354 26.135494 terrestrial 38000 4B187F 0e7d48c3-a4e2-4489-aaa3-4c9b9bea05c2 SWR155 HB-JHF LX155 A333 LX VABB 2025-01-23 19:50:00 LSZH 2025-01-24 05:10:00
4 2025-01-24 02:59:59 28.975525 -109.411362 terrestrial 37000 0D09D5 b4050af1-1fc8-4997-ac5a-1d46b690c869 VOI1743 XA-VLU Y41743 A321 Y4 KLAS 2025-01-24 01:31:00 MMGL 2025-01-24 04:43:00
[6]:
# select single flight and plot
flight_id = df.iloc[0]["flight_id"]
flight = df.loc[df["flight_id"] == flight_id]
flight.plot.scatter(x="longitude", y="latitude", c="altitude_baro", cmap="bwr", s=2);
[Figure: scatter plot of the selected flight's waypoints (longitude vs. latitude), colored by barometric altitude]

Aggregate data over multiple hours

[7]:
start = "2025-01-15T02"
end = "2025-01-15T03"
times = pd.date_range(start=start, end=end, freq="h")
times_str = [t.strftime("%Y-%m-%dT%H") for t in times]
[8]:
for t in times_str:
    print(f"Downloading hour: {t}")

    r = requests.get(f"{URL}/v1/adsb/telemetry", params={"date": t}, headers=HEADERS)
    print(f"HTTP Response Code: {r.status_code} {r.reason}\n")

    # write out response content as parquet file
    with open(f"{t}.pq", "wb") as f:
        f.write(r.content)
Downloading hour: 2025-01-15T02
HTTP Response Code: 200 OK

Downloading hour: 2025-01-15T03
HTTP Response Code: 200 OK

[9]:
dfs = []
for t in times_str:
    dfs.append(pd.read_parquet(f"{t}.pq"))

df = pd.concat(dfs)

print("Number of unique flights:", df["flight_id"].nunique())
print("Number of unique waypoints:", len(df["flight_id"]))
Number of unique flights: 21240
Number of unique waypoints: 1944787
[10]:
# select single flight and plot
flight_id = df.iloc[0]["flight_id"]
flight = df.loc[df["flight_id"] == flight_id]
flight.plot.scatter(x="longitude", y="latitude", c="altitude_baro", cmap="bwr", s=2);
[Figure: scatter plot of the selected flight's waypoints (longitude vs. latitude), colored by barometric altitude]

Bulk load ADS-B data into an external datastore

This section requires a fresh notebook kernel. Restart the kernel if you have already run the section above.

This section will provide a tutorial that covers:

  • Fetching a range of ADS-B data from the Contrails API

  • Loading those data into an external database/datastore

This tutorial focuses on loading data into a Google BigQuery table. The same approach can be adapted to load these data into other databases or datastores.

This process is useful if you want to perform advanced queries on the dataset.

Prerequisites

You must have a Google Cloud account, and the Google Cloud CLI (gcloud) installed on your machine.

You must also have set up a BigQuery table and given your account the required permissions to load data into this table.
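
The BigQuery client library used below authenticates with Application Default Credentials. If you have not set these up yet, one way to do so (assuming the gcloud CLI is installed) is:

gcloud auth application-default login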

[1]:
import json
import os
from pathlib import Path

# NOTE: grequests *must* be imported before requests, or you will see a MonkeyPatchWarning
import grequests  # pip install grequests (for parallel REST requests)
import pandas as pd  # pip install pandas

from google.cloud import bigquery  # pip install google-cloud-bigquery
from google.cloud.bigquery import LoadJobConfig
[2]:
# Load API key
URL = "https://api.contrails.org"
API_KEY = os.environ["CONTRAILS_API_KEY"]
HEADERS = {"x-api-key": API_KEY}

Download ADS-B data files to your machine

Set target hours for ADS-B data, then fetch ADS-B data from the Contrails API in parallel, saving Parquet files to the local machine.

[3]:
# 6 hours of data
start = "2025-01-16T00"
end = "2025-01-16T06"
times = pd.date_range(start=start, end=end, freq="h")
times_str = [t.strftime("%Y-%m-%dT%H") for t in times]
[4]:
# Use `grequests` to send out parallel API requests
# (this cell can take minutes to evaluate depending on bandwidth)
req = (
    grequests.get(f"{URL}/v1/adsb/telemetry", params={"date": t}, headers=HEADERS)
    for t in times_str
)
responses = grequests.map(req, size=25)

# create local directory to store local parquet files
os.makedirs("adsb", exist_ok=True)

# Write out each hour as a parquet file in subdirectory `adsb`
for t, r in zip(times_str, responses):
    print(f"{t}: {r.status_code} {r.reason}")

    # write out response content as parquet file
    path = Path(f"adsb/{t}.pq")
    with open(path, "wb") as f:
        f.write(r.content)
2025-01-16T00: 200 OK
2025-01-16T01: 200 OK
2025-01-16T02: 200 OK
2025-01-16T03: 200 OK
2025-01-16T04: 200 OK
2025-01-16T05: 200 OK
2025-01-16T06: 200 OK
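
Note that grequests.map returns None in place of a response when a request raises an exception, so a more defensive version of the loop above might skip (or retry) those hours. A minimal sketch:

for t, r in zip(times_str, responses):
    if r is None or r.status_code != 200:
        print(f"{t}: request failed, skipping")
        continue
    with open(Path(f"adsb/{t}.pq"), "wb") as f:
        f.write(r.content)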

(Optional) Create the target BigQuery table

If a target BigQuery table does not already exist, create one before loading the data.

The table must have a schema compatible with the fields present in the parquet ADS-B data.

You can create a table using the bq mk command (bq comes bundled with the gcloud CLI).

bq mk --table project_id:dataset_id.table_id adsb-schema.json
  • project_id is the GCP project ID for your account.

  • dataset_id is the BigQuery dataset where you want to create a new table.

If the dataset does not already exist, you will have to create it first with the bq mk --dataset command (https://cloud.google.com/bigquery/docs/datasets#bq) or via the web Console.

  • table_id is the table name for the new table you are creating.

  • adsb-schema.json is the filepath to a local JSON file with the schema definition for the new table. Download the ADS-B schema provided in the documentation; this schema is compatible with the BigQuery API:

curl -X GET https://apidocs.contrails.org/_static/adsb-schema.json > adsb-schema.json
[5]:
# !bq mk --table project_id:dataset_id.table_id adsb-schema.json
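
If you prefer to stay in Python, the table can also be created with the BigQuery client library. A minimal sketch, assuming adsb-schema.json has been downloaded as above and using placeholder project, dataset, and table IDs:

from google.cloud import bigquery

client = bigquery.Client()

# Parse the downloaded JSON schema file into SchemaField objects
schema = client.schema_from_json("adsb-schema.json")

# Replace the placeholders with your own GCP project, BQ dataset, and table name
table = bigquery.Table("<project_id>.<dataset_id>.<table_id>", schema=schema)
table = client.create_table(table)  # raises google.api_core.exceptions.Conflict if the table already exists
print(f"Created table {table.full_table_id}")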

Load data into a BigQuery table

Assuming you have an empty BigQuery table created, the following loads the local data into the BigQuery table one file at a time.

PRO TIP

To maximize BigQuery load speed, consider moving the dataset into a Google Cloud Storage Bucket.

See client.load_table_from_uri(..) or the bq load command (https://cloud.google.com/bigquery/docs/batch-loading-data#permissions-load-data-from-cloud-storage).

Uploading from a GCS bucket is faster both because the bucket sits inside the Google network (high uplink speed) and because the commands above support wildcards in GCS URI paths. A sketch of this approach appears at the end of this section.

[6]:
# Initialize BigQuery client
client = bigquery.Client()  # Uses your default GCP "project" - see `gcloud config list`

# Create table reference
project_id = "<project_id>"  # REPLACE WITH YOUR GCP PROJECT
dataset_id = "<dataset_id>"  # REPLACE WITH YOUR BQ DATASET
table_id = "<table_id>"  # REPLACE WITH YOUR BQ TABLE
bigquery_id = f"{project_id}.{dataset_id}.{table_id}"

# Load schema
with open("adsb-schema.json", "r") as f:
    schema = json.load(f)

# Configure the loading job
job_config = LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET, schema=schema)

for t in times_str:
    # read in parquet file
    path = Path(f"adsb/{t}.pq")
    print(f"Loading {t}")

    # Open the local parquet file
    with open(path, "rb") as f:
        # Start the load job
        load_job = client.load_table_from_file(f, bigquery_id, job_config=job_config)

    # Wait for job completion
    load_job.result()

    print(f"Loaded {load_job.output_rows} rows into {bigquery_id}")
Loading 2025-01-16T00
Loaded 1158825 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T01
Loaded 1197240 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T02
Loaded 1111672 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T03
Loaded 966330 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T04
Loaded 895878 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T05
Loaded 736327 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T06
Loaded 601999 rows into contrails-301217.sandbox.adsb3
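
As noted in the pro tip above, loading from Cloud Storage is typically faster. A minimal sketch reusing the client, schema, job_config, and bigquery_id defined above, and assuming the Parquet files have been copied to a hypothetical gs://<your-bucket>/adsb/ prefix (for example with gsutil cp adsb/*.pq gs://<your-bucket>/adsb/):

# Wildcard URI covering every hourly parquet file under the (hypothetical) prefix
uri = "gs://<your-bucket>/adsb/*.pq"

load_job = client.load_table_from_uri(uri, bigquery_id, job_config=job_config)
load_job.result()  # wait for the load job to finish

print(f"Loaded {load_job.output_rows} rows into {bigquery_id}")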