ADS-B API¶
The Contrails API enables authorized users to access a common ADS-B dataset for contrails research.
The underlying ADS-B data is provided by Spire Aviation.
E-mail api@contrails.org with the subject "Common ADS-B Access" to learn more about how your organization can participate in this program.
[1]:
import os
[2]:
# Load API key
# (contact api@contrails.org if you need an API key)
URL = "https://api.contrails.org"
API_KEY = os.environ["CONTRAILS_API_KEY"]
HEADERS = {"x-api-key": API_KEY}
Telemetry¶
Note: this endpoint can take up to 30 seconds to return, depending on bandwidth.
This endpoint returns a 1-hour range of all global ADS-B telemetry data as an Apache Parquet file.
The input date must be an ISO 8601 datetime string (UTC) with hourly resolution, e.g. "2025-01-06T00". Any minute or second resolution is ignored.
See the ADS-B schema for the description of each data key in the Parquet file.
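Because a single hour can take tens of seconds to download, it can help to set an explicit request timeout and raise on HTTP errors rather than silently writing an error body to disk. A minimal sketch, reusing the URL and HEADERS defined above; the 120-second timeout is an arbitrary choice, not a documented API limit:

import requests

def fetch_hour(date, timeout=120):
    """Return the Parquet payload for one UTC hour (date like '2025-01-06T00')."""
    r = requests.get(
        f"{URL}/v1/adsb/telemetry",
        params={"date": date},
        headers=HEADERS,
        timeout=timeout,  # seconds; generous because responses can be slow
    )
    r.raise_for_status()  # fail loudly on 4xx/5xx responses
    return r.content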
[3]:
import requests # pip install requests
import matplotlib.pyplot as plt # pip install matplotlib
import pandas as pd # pip install pandas
Get data for a single hour¶
[4]:
params = {
    "date": "2025-01-24T02"  # ISO 8601 (UTC)
}

r = requests.get(f"{URL}/v1/adsb/telemetry", params=params, headers=HEADERS)
print(f"HTTP Response Code: {r.status_code} {r.reason}\n")

# write out response content as parquet file
with open(f"{params['date']}.pq", "wb") as f:
    f.write(r.content)
HTTP Response Code: 200 OK
[5]:
# read parquet file with pandas
df = pd.read_parquet(f"{params['date']}.pq")
print("Number of unique flights:", df["flight_id"].nunique())
print("Number of unique waypoints:", len(df["flight_id"]))
df.head()
Number of unique flights: 17103
Number of waypoints: 1093322
[5]:
 | timestamp | latitude | longitude | collection_type | altitude_baro | icao_address | flight_id | callsign | tail_number | flight_number | aircraft_type_icao | airline_iata | departure_airport_icao | departure_scheduled_time | arrival_airport_icao | arrival_scheduled_time
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2025-01-24 02:59:59 | 37.882629 | -80.429001 | terrestrial | 31000 | A91986 | 93a5cd24-1e7b-4dca-a07d-ad391a2e8237 | PDT5701 | N686AE | AA5701 | E145 | PT | KCLT | 2025-01-24 01:53:00 | KERI | 2025-01-24 03:48:00 |
1 | 2025-01-24 02:59:59 | 36.193588 | -112.395912 | terrestrial | 35000 | AB415E | d1dfe570-9e39-4323-a6cf-f4cf602b4149 | SCX618 | N824SY | SY618 | B738 | SY | KPSP | 2025-01-24 02:24:00 | KMSP | 2025-01-24 05:41:00 |
2 | 2025-01-24 02:59:59 | -44.230362 | 171.841019 | terrestrial | 33950 | C81D8E | 12d6d993-c01e-4553-80e1-944a34119f69 | ANZ689 | ZK-OAB | NZ689 | A320 | NZ | NZWN | 2025-01-24 02:05:00 | NZDN | 2025-01-24 03:25:00 |
3 | 2025-01-24 02:59:59 | 43.008354 | 26.135494 | terrestrial | 38000 | 4B187F | 0e7d48c3-a4e2-4489-aaa3-4c9b9bea05c2 | SWR155 | HB-JHF | LX155 | A333 | LX | VABB | 2025-01-23 19:50:00 | LSZH | 2025-01-24 05:10:00 |
4 | 2025-01-24 02:59:59 | 28.975525 | -109.411362 | terrestrial | 37000 | 0D09D5 | b4050af1-1fc8-4997-ac5a-1d46b690c869 | VOI1743 | XA-VLU | Y41743 | A321 | Y4 | KLAS | 2025-01-24 01:31:00 | MMGL | 2025-01-24 04:43:00 |
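The returned waypoints cover the whole globe, so regional analyses usually start by filtering the dataframe. A small sketch; the bounding box and the 28,000 ft altitude floor below are illustrative values, not part of the API:

# keep only waypoints over the contiguous US at typical cruise altitudes (example values)
cruise = df[
    (df["altitude_baro"] >= 28000)
    & df["latitude"].between(24, 50)
    & df["longitude"].between(-125, -66)
]
print("Flights after filtering:", cruise["flight_id"].nunique())
print("Waypoints after filtering:", len(cruise))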
[6]:
# select single flight and plot
flight_id = df.iloc[0]["flight_id"]
flight = df.loc[df["flight_id"] == flight_id]
flight.plot.scatter(x="longitude", y="latitude", c="altitude_baro", cmap="bwr", s=2);

Aggregate data over multiple hours¶
[7]:
start = "2025-01-15T02"
end = "2025-01-15T03"
times = pd.date_range(start=start, end=end, freq="h")
times_str = [t.strftime("%Y-%m-%dT%H") for t in times]
[8]:
for t in times_str:
    print(f"Downloading hour: {t}")
    r = requests.get(f"{URL}/v1/adsb/telemetry", params={"date": t}, headers=HEADERS)
    print(f"HTTP Response Code: {r.status_code} {r.reason}\n")

    # write out response content as parquet file
    with open(f"{t}.pq", "wb") as f:
        f.write(r.content)
Downloading hour: 2025-01-15T02
HTTP Response Code: 200 OK
Downloading hour: 2025-01-15T03
HTTP Response Code: 200 OK
[9]:
dfs = []
for t in times_str:
    dfs.append(pd.read_parquet(f"{t}.pq"))
df = pd.concat(dfs)
print("Number of unique flights:", df["flight_id"].nunique())
print("Number of unique waypoints:", len(df["flight_id"]))
Number of unique flights: 21240
Number of waypoints: 1944787
[10]:
# select single flight and plot
flight_id = df.iloc[0]["flight_id"]
flight = df.loc[df["flight_id"] == flight_id]
flight.plot.scatter(x="longitude", y="latitude", c="altitude_baro", cmap="bwr", s=2);

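When multiple hours are concatenated, a flight's waypoints come from separate files and are not guaranteed to be in time order. If you want to treat the selection as an ordered trajectory (for example to plot altitude against time), sorting by timestamp first is a reasonable step, reusing the flight dataframe from the cell above:

# order the selected flight by time before plotting altitude against time
flight = flight.sort_values("timestamp")
flight.plot(x="timestamp", y="altitude_baro", legend=False);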
Bulk load ADS-B data into an external datastore¶
This section requires a fresh notebook kernel. Restart the kernel if you have already run the section above.
This section will provide a tutorial that covers:

- Fetching a range of ADS-B data from the Contrails API
- Loading those data into an external database/datastore

This tutorial will focus on loading data into a Google BigQuery table. The same approach can be adapted to load these data into other databases or datastores.
This process is useful if you want to perform advanced queries on the dataset (an example query is sketched at the end of this section).
Prerequisites¶
You must have a Google Cloud account and the Google Cloud CLI (gcloud) installed on your machine.
You must also have set up a BigQuery table and given your account the required permissions to load data into this table.
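Before running the cells below, it can be worth a quick smoke test that the BigQuery client picks up your credentials and default project. This check is optional and not part of the tutorial itself:

# optional smoke test: confirm the BigQuery client can authenticate with your default project
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()  # uses your Application Default Credentials
print("Project:", client.project)
print("Datasets:", [d.dataset_id for d in client.list_datasets()])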
[1]:
import json
import os
from pathlib import Path
# NOTE: grequests *must* be imported before requests, or you will see a MonkeyPatchWarning
import grequests # pip install grequests (for parallel REST requests)
import pandas as pd # pip install pandas
from google.cloud import bigquery # pip install google-cloud-bigquery
from google.cloud.bigquery import LoadJobConfig
[2]:
# Load API key
URL = "https://api.contrails.org"
API_KEY = os.environ["CONTRAILS_API_KEY"]
HEADERS = {"x-api-key": API_KEY}
Download ADS-B data files to your machine¶
Set target hours for ADS-B data, then fetch ADS-B data from the Contrails API in parallel, saving parquet files to the local machine.
[3]:
# 6 hours of data
start = "2025-01-16T00"
end = "2025-01-16T06"
times = pd.date_range(start=start, end=end, freq="h")
times_str = [t.strftime("%Y-%m-%dT%H") for t in times]
[4]:
# Use `grequests` to send out parallel API requests
# (this cell can take minutes to evaluate depending on bandwidth)
req = (
    grequests.get(f"{URL}/v1/adsb/telemetry", params={"date": t}, headers=HEADERS)
    for t in times_str
)
responses = grequests.map(req, size=25)

# create local directory to store local parquet files
os.makedirs("adsb", exist_ok=True)

# Write out each hour as a parquet file in subdirectory `adsb`
for t, r in zip(times_str, responses):
    print(f"{t}: {r.status_code} {r.reason}")

    # write out response content as parquet file
    path = Path(f"adsb/{t}.pq")
    with open(path, "wb") as f:
        f.write(r.content)
2025-01-16T00: 200 OK
2025-01-16T01: 200 OK
2025-01-16T02: 200 OK
2025-01-16T03: 200 OK
2025-01-16T04: 200 OK
2025-01-16T05: 200 OK
2025-01-16T06: 200 OK
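By default, grequests.map returns None for any request that raised an exception, so a quick sanity check on the responses before loading the files into BigQuery can save a confusing failure later. A small sketch:

# verify that every hour downloaded successfully before proceeding
bad = [t for t, r in zip(times_str, responses) if r is None or r.status_code != 200]
if bad:
    raise RuntimeError(f"Failed to download hours: {bad}")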
(Optional) Create the target BigQuery table¶
If a target BigQuery table does not exist, then create one prior to inserting the target data.
The table must have a schema compatible with the fields present in the parquet ADS-B data.
You can create a table using the bq mk command (bq comes bundled with the gcloud CLI).

bq mk --table project_id:dataset_id.table_id adsb-schema.json

- project_id is the GCP project ID for your account.
- dataset_id is the BigQuery dataset where you want to create a new table. If the dataset does not already exist, you will have to create it first with the bq mk --dataset command (https://cloud.google.com/bigquery/docs/datasets#bq) or via the web Console.
- table_id is the table name for the new table you are creating.
- adsb-schema.json is the filepath to a local JSON file with the schema definition for the new table. Download the ADS-B schema provided in the documentation - this schema is compatible with the BigQuery API:
curl -X GET https://apidocs.contrails.org/_static/adsb-schema.json > adsb-schema.json
[5]:
# !bq mk --table project_id:dataset_id.table_id adsb-schema.json
Load data into a BigQuery table¶
Assuming you have an empty BigQuery table created, the following loads local data into the BigQuery table one file at a time.
PRO TIP
To maximize BigQuery load speed, consider moving the dataset into a Google Cloud Storage bucket.
See client.load_table_from_uri(...) or the bq load command (https://cloud.google.com/bigquery/docs/batch-loading-data#permissions-load-data-from-cloud-storage). Uploading from a GCS bucket increases load speed both because the bucket is already inside the Google network (high uplink speed) and because these commands support wildcards for GCS URI paths.
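For reference, a minimal sketch of the GCS route, assuming the hourly parquet files have already been copied to a bucket and reusing the client, bigquery_id, and job_config defined in the next cell; the gs://your-bucket/adsb/*.pq URI is a placeholder, not a real bucket:

# hypothetical example: load every parquet file under a GCS prefix in a single job
uri = "gs://your-bucket/adsb/*.pq"  # placeholder URI - replace with your own bucket path
load_job = client.load_table_from_uri(uri, bigquery_id, job_config=job_config)
load_job.result()  # wait for completion
print(f"Loaded {load_job.output_rows} rows into {bigquery_id}")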
[6]:
# Initialize BigQuery client
client = bigquery.Client() # Uses your default GCP "project" - see `gcloud config list`
# Create table reference
project_id = "<project_id>" # REPLACE WITH YOUR GCP PROJECT
dataset_id = "<dataset_id>" # REPLACE WITH YOUR BQ DATASET
table_id = "<table_id>" # REPLACE WITH YOUR BQ TABLE
bigquery_id = f"{project_id}.{dataset_id}.{table_id}"
# Load schema
with open("adsb-schema.json", "r") as f:
schema = json.load(f)
# Configure the loading job
job_config = LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET, schema=schema)
for t in times_str:
# read in parquet file
path = Path(f"adsb/{t}.pq")
print(f"Loading {t}")
# Open the local parquet file
with open(path, "rb") as f:
# Start the load job
load_job = client.load_table_from_file(f, bigquery_id, job_config=job_config)
# Wait for job completion
load_job.result()
print(f"Loaded {load_job.output_rows} rows into {bigquery_id}")
Loading 2025-01-16T00
Loaded 1158825 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T01
Loaded 1197240 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T02
Loaded 1111672 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T03
Loaded 966330 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T04
Loaded 895878 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T05
Loaded 736327 rows into contrails-301217.sandbox.adsb3
Loading 2025-01-16T06
Loaded 601999 rows into contrails-301217.sandbox.adsb3
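With the data loaded, the table can be queried directly, which is the kind of advanced query mentioned at the start of this section. A small sketch, reusing the client and bigquery_id defined above; the column names come from the ADS-B schema:

# example query: waypoint counts per aircraft type across the loaded hours
sql = f"""
    SELECT aircraft_type_icao, COUNT(*) AS n_waypoints
    FROM `{bigquery_id}`
    GROUP BY aircraft_type_icao
    ORDER BY n_waypoints DESC
    LIMIT 10
"""
for row in client.query(sql).result():
    print(row["aircraft_type_icao"], row["n_waypoints"])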