Opening Data with Xarray
This guide shows how to use obspec-utils readers to open cloud-hosted datasets with xarray.
Overview
obspec-utils readers provide a file-like interface (read, seek, tell) that xarray can use directly with engines like h5netcdf. Combined with obstore for cloud storage access, this enables efficient reading of HDF5 and NetCDF files from S3, GCS, or Azure.
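To make "file-like interface" concrete, the sketch below shows what a consumer such as an HDF5 library expects from the object it is handed. `InMemoryReader` and `looks_like_hdf5` are hypothetical names invented for this illustration. They are not part of obspec-utils, and the real readers may be implemented quite differently. The only outside fact assumed is the 8-byte HDF5 format signature, checked here at offset 0 (the usual location when the file has no user block).

```python
import io

# The 8-byte HDF5 format signature found at the start of most HDF5/NetCDF4 files.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


class InMemoryReader:
    """Hypothetical stand-in for an eager reader (not the obspec-utils class).

    Holds the whole object in memory and exposes read, seek, and tell.
    """

    def __init__(self, data: bytes) -> None:
        self._data = data
        self._pos = 0

    def read(self, size: int = -1) -> bytes:
        # A missing or negative size means "read everything up to the end".
        if size is None or size < 0:
            end = len(self._data)
        else:
            end = self._pos + size
        chunk = self._data[self._pos:end]
        self._pos += len(chunk)
        return chunk

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        # Resolve the new position relative to start, current position, or end.
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = len(self._data) + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return self._pos

    def tell(self) -> int:
        return self._pos


def looks_like_hdf5(f) -> bool:
    """Check for the HDF5 signature at offset 0 using only seek and read."""
    f.seek(0)
    return f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


reader = InMemoryReader(HDF5_SIGNATURE + b"payload")
print(looks_like_hdf5(reader))  # the consumer never needs to know where the bytes came from
```

Because the consumer only calls `read`, `seek`, and `tell`, the bytes can come from local disk, memory, or an object store without the consumer changing.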
Quick Start
This example opens a NetCDF file from the NASA Earth Exchange (NEX) Data Collection on AWS Open Data:
import xarray as xr
from obstore.store import S3Store
from obspec_utils.readers import EagerStoreReader

# Access public AWS Open Data (no credentials needed)
store = S3Store(
    bucket="nasanex",
    aws_region="us-west-2",
    skip_signature=True,  # Anonymous access
)

with EagerStoreReader(store, "NEX-GDDP/BCSD/rcp85/day/atmos/tasmax/r1i1p1/v1.0/tasmax_day_BCSD_rcp85_r1i1p1_inmcm4_2100.nc") as reader:
    ds = xr.open_dataset(reader, engine="h5netcdf")
    print(ds)
<xarray.Dataset> Size: 2GB
Dimensions:  (time: 365, lat: 720, lon: 1440)
Coordinates:
  * time     (time) object 3kB 2100-01-01 12:00:00 ... 2100-12-31 12:00:00
  * lat      (lat) float32 3kB -89.88 -89.62 -89.38 -89.12 ... 89.38 89.62 89.88
  * lon      (lon) float32 6kB 0.125 0.375 0.625 0.875 ... 359.4 359.6 359.9
Data variables:
    tasmax   (time, lat, lon) float32 2GB ...
Attributes: (12/34)
    parent_experiment:      historical
    parent_experiment_id:   historical
    parent_experiment_rip:  r1i1p1
    Conventions:            CF-1.4
    institution:            NASA Earth Exchange, NASA Ames Research C...
    institute_id:           NASA-Ames
    ...                     ...
    project_id:             NEXGDDP
    table_id:               Table day (12 November 2010)
    source:                 BCSD 2014
    creation_date:          2015-01-07T20:33:31Z
    forcing:                N/A
    product:                output