Glob

obspec_utils.glob.glob

glob(store: List, pattern: str) -> Iterator[str]

Match paths against a glob pattern using the obspec List primitive.

Parameters:

  • store (List) –

    Any store implementing the obspec.List protocol.

  • pattern (str) –

    Glob pattern to match. Supports:

    • * : matches any characters within a single path segment
    • ** : matches any number of path segments (recursive)
    • ? : matches exactly one character
    • [abc] : matches any single character in the set
    • [a-z] : matches any single character in the range
    • [!abc] : matches any single character NOT in the set

Yields:

  • str

    Paths of matching objects.
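
The wildcard rules listed above can be made concrete by translating a pattern into a regular expression. What follows is a minimal sketch, not the obspec_utils implementation: `glob_to_regex` is a hypothetical helper, and it assumes that paths use `/` as the separator and that `**/` may match zero segments.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Hypothetical helper: compile a glob pattern into a regex (use fullmatch)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            # Assumption: "**/" spans zero or more complete path segments.
            parts.append(r"(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            # A "**" not followed by "/" matches the rest of the path.
            parts.append(r".*")
            i += 2
        elif ch == "*":
            # "*" stays inside one segment, so it must not cross "/".
            parts.append(r"[^/]*")
            i += 1
        elif ch == "?":
            # "?" is exactly one character within a segment.
            parts.append(r"[^/]")
            i += 1
        elif ch == "[":
            negated = pattern.startswith("!", i + 1)
            start = i + 2 if negated else i + 1
            end = pattern.find("]", start + 1)  # require at least one member
            if end == -1:
                parts.append(re.escape(ch))  # unterminated "[" is a literal
                i += 1
            else:
                members = pattern[start:end].replace("\\", r"\\")
                if members.startswith("^"):
                    members = "\\" + members  # keep a leading "^" literal
                parts.append("[" + ("^" if negated else "") + members + "]")
                i = end + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


rx = glob_to_regex("data/**/*.nc")
print(bool(rx.fullmatch("data/2024/01/a.nc")))  # True
print(bool(rx.fullmatch("data/2024/a.txt")))    # False
```

Matching with `fullmatch` anchors the expression at both ends, so a pattern has to describe the whole path rather than a substring of it.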

Examples:

Find all NetCDF files in a directory:

paths = list(glob(store, "data/2024/*.nc"))

Find all NetCDF files recursively:

paths = list(glob(store, "data/**/*.nc"))

Find files with single-character suffix:

paths = list(glob(store, "data/file?.nc"))
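
A glob built on a listing primitive can narrow the listing by passing only the wildcard-free leading part of the pattern as a prefix and filtering the rest afterwards. The sketch below illustrates that idea under stated assumptions: `literal_prefix` and `InMemoryStore` are hypothetical stand-ins, the chunked `list(prefix)` shape is assumed rather than taken from obspec, and obspec_utils may not work this way internally.

```python
from typing import Iterator, Optional, Sequence


def literal_prefix(pattern: str) -> str:
    """Hypothetical helper: the wildcard-free directory part of a glob pattern."""
    hits = [i for i in (pattern.find(ch) for ch in "*?[") if i != -1]
    first_wildcard = min(hits, default=len(pattern))
    # Cut back to the last "/" so the prefix ends on a segment boundary.
    return pattern[: pattern.rfind("/", 0, first_wildcard) + 1]


class InMemoryStore:
    """Stand-in store (not an obspec class) that lists paths by prefix, in chunks."""

    def __init__(self, paths: Sequence[str]) -> None:
        self._paths = sorted(paths)

    def list(self, prefix: Optional[str] = None) -> Iterator[Sequence[dict]]:
        matches = [{"path": p} for p in self._paths if p.startswith(prefix or "")]
        for start in range(0, len(matches), 2):  # chunk size 2, chosen arbitrarily
            yield matches[start:start + 2]


store = InMemoryStore(
    ["data/2023/b.nc", "data/2024/a.nc", "data/2024/c.txt", "logs/run.txt"]
)
prefix = literal_prefix("data/2024/*.nc")
candidates = [meta["path"] for chunk in store.list(prefix) for meta in chunk]
print(prefix)      # data/2024/
print(candidates)  # ['data/2024/a.nc', 'data/2024/c.txt']
```

The candidates still have to be checked against the full pattern (here `c.txt` would be rejected); the prefix only limits how much of the store is listed.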

See Also

  • glob_objects : Returns full ObjectMeta instead of just paths.
  • glob_async : Async version of this function.

obspec_utils.glob.glob_objects

glob_objects(store: List, pattern: str) -> Iterator[ObjectMeta]

Match paths against a glob pattern, returning full object metadata.

Same as glob, but yields ObjectMeta dicts containing:

  • path: str - The full path to the object
  • last_modified: datetime - The last modified time
  • size: int - The size in bytes
  • e_tag: str | None - The unique identifier (ETag)
  • version: str | None - A version indicator
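
For readers who want to type-check code that consumes these dicts, the fields above can be mirrored in a `TypedDict`. This is an illustrative sketch only: `ObjectMetaSketch` and `describe` are hypothetical names, not the obspec definitions, and the sample values are invented.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class ObjectMetaSketch(TypedDict):
    """Illustrative mirror of the fields listed above; not the obspec class."""

    path: str
    last_modified: datetime
    size: int
    e_tag: Optional[str]
    version: Optional[str]


def describe(meta: ObjectMetaSketch) -> str:
    # e_tag may be None, so fall back to a placeholder when it is missing.
    tag = meta["e_tag"] or "no-etag"
    return f'{meta["path"]} ({meta["size"]} bytes, {tag})'


example: ObjectMetaSketch = {
    "path": "data/a.nc",
    "last_modified": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "size": 1024,
    "e_tag": None,
    "version": None,
}
print(describe(example))  # data/a.nc (1024 bytes, no-etag)
```

Because `e_tag` and `version` may be `None`, code that reads them should handle the missing case explicitly, as `describe` does.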

Parameters:

  • store (List) –

    Any store implementing the obspec.List protocol.

  • pattern (str) –

    Glob pattern to match. See glob for supported patterns.

Yields:

  • ObjectMeta

    Metadata for each matching object.

Examples:

Get file sizes for matching objects:

total_size = sum(obj["size"] for obj in glob_objects(store, "data/**/*.nc"))

Find recently modified files:

from datetime import datetime, timedelta, timezone

cutoff = datetime.now(timezone.utc) - timedelta(days=7)
recent = [
    obj for obj in glob_objects(store, "data/**/*.nc")
    if obj["last_modified"] > cutoff
]
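
The same metadata supports other selections, such as the largest or the most recently modified objects. The sketch below uses invented stand-in records shaped like the dicts described above so that it runs without a store; `largest` and `newest` are hypothetical helpers, and in real use the `objects` list would be replaced by a `glob_objects(store, ...)` call.

```python
import heapq
from datetime import datetime, timezone

# Stand-in records with invented values; only the fields used here are present.
objects = [
    {"path": "data/2023/a.nc", "size": 300,
     "last_modified": datetime(2023, 5, 1, tzinfo=timezone.utc)},
    {"path": "data/2024/b.nc", "size": 900,
     "last_modified": datetime(2024, 2, 1, tzinfo=timezone.utc)},
    {"path": "data/2024/c.nc", "size": 100,
     "last_modified": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]


def largest(objs, n):
    """Return the n biggest objects; works on any iterable, including a generator."""
    return heapq.nlargest(n, objs, key=lambda o: o["size"])


def newest(objs):
    """Return the object with the latest last_modified timestamp."""
    return max(objs, key=lambda o: o["last_modified"])


print([o["path"] for o in largest(objects, 2)])  # ['data/2024/b.nc', 'data/2023/a.nc']
print(newest(objects)["path"])                   # data/2024/c.nc
```

`heapq.nlargest` keeps only `n` records in memory at a time, which suits a lazily yielded listing better than sorting the full result.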

See Also

  • glob : Returns just paths instead of full metadata.
  • glob_objects_async : Async version of this function.

obspec_utils.glob.glob_async async

glob_async(store: ListAsync, pattern: str) -> AsyncIterator[str]

Async version of glob.

Match paths against a glob pattern using the obspec ListAsync primitive.

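Parameters:

  • store (ListAsync) –

    Any store implementing the obspec.ListAsync protocol.

  • pattern (str) –

    Glob pattern to match. See glob for supported patterns.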

Yields:

  • str

    Paths of matching objects.

Examples:

async def process_files(store):
    async for path in glob_async(store, "data/**/*.nc"):
        await process(path)
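
An async iterator has to be consumed inside a running event loop. The sketch below shows one way to gather the results into a list and drive the coroutine with `asyncio.run`; `fake_glob_async` is a hypothetical stand-in that yields fixed paths so the example runs without a store, and `collect` is an illustrative helper name.

```python
import asyncio
from typing import AsyncIterator, List


async def fake_glob_async(paths: List[str]) -> AsyncIterator[str]:
    """Stand-in for glob_async: yields pre-baked paths instead of listing a store."""
    for path in paths:
        await asyncio.sleep(0)  # give control back to the event loop, as real I/O would
        yield path


async def collect(paths_iter: AsyncIterator[str]) -> List[str]:
    # An async comprehension drains the iterator into a list.
    return [path async for path in paths_iter]


result = asyncio.run(collect(fake_glob_async(["data/a.nc", "data/b.nc"])))
print(result)  # ['data/a.nc', 'data/b.nc']
```

With the real function, `collect(glob_async(store, "data/**/*.nc"))` would take the place of the stand-in call.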

See Also

  • glob : Sync version of this function.
  • glob_objects_async : Returns full ObjectMeta instead of just paths.

obspec_utils.glob.glob_objects_async async

glob_objects_async(store: ListAsync, pattern: str) -> AsyncIterator[ObjectMeta]

Async version of glob_objects.

Match paths against a glob pattern, returning full object metadata.

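Parameters:

  • store (ListAsync) –

    Any store implementing the obspec.ListAsync protocol.

  • pattern (str) –

    Glob pattern to match. See glob for supported patterns.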

Yields:

  • ObjectMeta

    Metadata for each matching object.

Examples:

async def get_total_size(store):
    total = 0
    async for obj in glob_objects_async(store, "data/**/*.nc"):
        total += obj["size"]
    return total
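
One reason to prefer the async variant is that several patterns can be scanned concurrently. The following sketch totals sizes for two patterns with `asyncio.gather`; `fake_glob_objects_async` and `FAKE_LISTING` are hypothetical stand-ins with invented sizes, used so the example runs without a store.

```python
import asyncio
from typing import AsyncIterator, Dict, List

# Invented listing results keyed by pattern; a real store would be queried instead.
FAKE_LISTING = {
    "data/**/*.nc": [{"path": "data/a.nc", "size": 10},
                     {"path": "data/x/b.nc", "size": 20}],
    "data/**/*.zarr": [{"path": "data/c.zarr", "size": 5}],
}


async def fake_glob_objects_async(pattern: str) -> AsyncIterator[dict]:
    """Stand-in for glob_objects_async: yields canned metadata for a pattern."""
    for obj in FAKE_LISTING.get(pattern, []):
        await asyncio.sleep(0)  # simulate an I/O suspension point
        yield obj


async def total_size(pattern: str) -> int:
    total = 0
    async for obj in fake_glob_objects_async(pattern):
        total += obj["size"]
    return total


async def totals_by_pattern(patterns: List[str]) -> Dict[str, int]:
    # gather runs one scan per pattern concurrently and preserves input order.
    sizes = await asyncio.gather(*(total_size(p) for p in patterns))
    return dict(zip(patterns, sizes))


totals = asyncio.run(totals_by_pattern(["data/**/*.nc", "data/**/*.zarr"]))
print(totals)  # {'data/**/*.nc': 30, 'data/**/*.zarr': 5}
```

`asyncio.gather` returns results in the order the awaitables were passed, which is what makes the `zip` with `patterns` safe.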

See Also

  • glob_objects : Sync version of this function.
  • glob_async : Returns just paths instead of full metadata.