Glob
obspec_utils.glob.glob
glob(store: List, pattern: str) -> Iterator[str]
Match paths against a glob pattern using the obspec List primitive.
Parameters:
- store (List) – Any store implementing the obspec.List protocol.
- pattern (str) – Glob pattern to match. Supports:
    - *: matches any characters within a single path segment
    - **: matches any number of path segments (recursive)
    - ?: matches exactly one character
    - [abc]: matches characters in set
    - [a-z]: matches characters in range
    - [!abc]: matches characters NOT in set
Yields:
- str – Paths of matching objects.
Examples:
Find all NetCDF files in a directory:
paths = list(glob(store, "data/2024/*.nc"))
Find all NetCDF files recursively:
paths = list(glob(store, "data/**/*.nc"))
Find files with single-character suffix:
paths = list(glob(store, "data/file?.nc"))
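The pattern rules listed under Parameters can be illustrated with a small, self-contained sketch that translates a glob pattern into a regular expression. This is not how obspec_utils implements matching; `glob_to_regex`, `match_paths`, and `PATHS` are hypothetical names introduced here, and the sketch assumes that `**` may also match zero path segments.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob pattern into a regex (illustrative sketch only)."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if pattern.startswith("**/", i):
            # '**/' spans zero or more whole path segments (assumption).
            parts.append("(?:[^/]+/)*")
            i += 3
        elif pattern.startswith("**", i):
            # A bare '**' matches everything below this point.
            parts.append(".*")
            i += 2
        elif ch == "*":
            # '*' never crosses a '/' boundary.
            parts.append("[^/]*")
            i += 1
        elif ch == "?":
            # '?' is exactly one character within a segment.
            parts.append("[^/]")
            i += 1
        elif ch == "[":
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end] if end != -1 else ""
            if body in ("", "!"):
                # No usable set: treat '[' as a literal character.
                parts.append(re.escape(ch))
                i += 1
            else:
                negated = body.startswith("!")
                chars = body[1:] if negated else body
                chars = chars.replace("\\", "\\\\").replace("^", "\\^")
                parts.append("[" + ("^" if negated else "") + chars + "]")
                i = end + 1
        else:
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def match_paths(pattern, paths):
    """Return the paths that match the whole pattern, in input order."""
    regex = glob_to_regex(pattern)
    return [p for p in paths if regex.fullmatch(p)]


# Hypothetical object paths used only for illustration.
PATHS = [
    "data/2024/a.nc",
    "data/2024/sub/b.nc",
    "data/file1.nc",
    "data/file10.nc",
    "data/a.txt",
]

single_level = match_paths("data/2024/*.nc", PATHS)
recursive = match_paths("data/**/*.nc", PATHS)
one_char = match_paths("data/file?.nc", PATHS)
```

With these sample paths, `single_level` excludes `data/2024/sub/b.nc` because `*` stops at `/`, while `recursive` includes it.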
See Also
glob_objects : Returns full ObjectMeta instead of just paths.
glob_async : Async version of this function.
obspec_utils.glob.glob_objects
glob_objects(store: List, pattern: str) -> Iterator[ObjectMeta]
Match paths against a glob pattern, returning full object metadata.
Same as glob, but yields ObjectMeta dicts containing:
- path: str - The full path to the object
- last_modified: datetime - The last modified time
- size: int - The size in bytes
- e_tag: str | None - The unique identifier (ETag)
- version: str | None - A version indicator
Parameters:
- store (List) – Any store implementing the obspec.List protocol.
- pattern (str) – Glob pattern to match. See glob for supported patterns.
Yields:
- ObjectMeta – Metadata for each matching object.
Examples:
Get file sizes for matching objects:
total_size = sum(obj["size"] for obj in glob_objects(store, "data/**/*.nc"))
Find recently modified files:
from datetime import datetime, timedelta, timezone
cutoff = datetime.now(timezone.utc) - timedelta(days=7)
recent = [
    obj for obj in glob_objects(store, "data/**/*.nc")
    if obj["last_modified"] > cutoff
]
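Because the results are plain dicts, they work directly with ordinary Python tooling. The sketch below uses a hypothetical `ObjectMetaSketch` TypedDict that mirrors the fields listed above, plus made-up sample records, to pick the largest objects. It does not call the library, and the helper name `largest` is an assumption for illustration.

```python
from datetime import datetime, timezone
from typing import Optional, TypedDict


class ObjectMetaSketch(TypedDict):
    """Hypothetical stand-in mirroring the documented ObjectMeta fields."""

    path: str
    last_modified: datetime
    size: int
    e_tag: Optional[str]
    version: Optional[str]


def largest(objects, n=1):
    """Return the n largest objects by size, biggest first."""
    return sorted(objects, key=lambda obj: obj["size"], reverse=True)[:n]


# Made-up records standing in for what glob_objects would yield.
stamp = datetime(2024, 1, 1, tzinfo=timezone.utc)
sample = [
    ObjectMetaSketch(path="data/a.nc", last_modified=stamp, size=100, e_tag=None, version=None),
    ObjectMetaSketch(path="data/b.nc", last_modified=stamp, size=2048, e_tag="abc", version=None),
    ObjectMetaSketch(path="data/c.nc", last_modified=stamp, size=512, e_tag=None, version="1"),
]

top_two = [obj["path"] for obj in largest(sample, 2)]
```

Note that the "recently modified" example compares against a timezone-aware cutoff, which only works if `last_modified` is itself timezone-aware; Python raises `TypeError` when naive and aware datetimes are compared.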
See Also
glob : Returns just paths instead of full metadata.
glob_objects_async : Async version of this function.
obspec_utils.glob.glob_async (async)
glob_async(store: ListAsync, pattern: str) -> AsyncIterator[str]
Async version of glob.
Match paths against a glob pattern using the obspec ListAsync primitive.
Parameters:
- store (ListAsync) – Any store implementing the obspec.ListAsync protocol.
- pattern (str) – Glob pattern to match. See glob for supported patterns.
Yields:
- str – Paths of matching objects.
Examples:
async def process_files(store):
    async for path in glob_async(store, "data/**/*.nc"):
        await process(path)
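An async iterator has to be consumed inside a running event loop. As a minimal sketch of that pattern, the code below drains an async iterator into a list and drives it with `asyncio.run`. `fake_glob_async` is a hypothetical stand-in that filters by suffix so the example runs without a store; it is not part of obspec_utils.

```python
import asyncio
from typing import AsyncIterator


async def fake_glob_async(paths, suffix) -> AsyncIterator[str]:
    """Hypothetical stand-in for glob_async: yields paths ending in suffix."""
    for path in paths:
        await asyncio.sleep(0)  # hand control back to the event loop
        if path.endswith(suffix):
            yield path


async def collect(aiter):
    """Drain an async iterator into a list."""
    return [item async for item in aiter]


matches = asyncio.run(
    collect(fake_glob_async(["data/a.nc", "data/b.txt", "data/c.nc"], ".nc"))
)
```

The same `collect` helper should work with any async iterator of paths, since it relies only on `async for`.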
See Also
glob : Sync version of this function.
glob_objects_async : Returns full ObjectMeta instead of just paths.
obspec_utils.glob.glob_objects_async (async)
glob_objects_async(store: ListAsync, pattern: str) -> AsyncIterator[ObjectMeta]
Async version of glob_objects.
Match paths against a glob pattern, returning full object metadata.
Parameters:
- store (ListAsync) – Any store implementing the obspec.ListAsync protocol.
- pattern (str) – Glob pattern to match. See glob for supported patterns.
Yields:
- ObjectMeta – Metadata for each matching object.
Examples:
async def get_total_size(store):
    total = 0
    async for obj in glob_objects_async(store, "data/**/*.nc"):
        total += obj["size"]
    return total
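One possible reason to prefer the async variant is that several listings can run concurrently. The sketch below totals sizes for multiple prefixes with `asyncio.gather`. `fake_glob_objects_async` is a hypothetical stand-in that filters made-up records by prefix rather than by glob pattern, so treat it as an outline of the shape, not of the library's behavior.

```python
import asyncio


async def fake_glob_objects_async(objects, prefix):
    """Hypothetical stand-in: yields records whose path starts with prefix."""
    for obj in objects:
        await asyncio.sleep(0)  # hand control back to the event loop
        if obj["path"].startswith(prefix):
            yield obj


async def total_size(objects, prefix):
    """Sum the sizes of all records under one prefix."""
    total = 0
    async for obj in fake_glob_objects_async(objects, prefix):
        total += obj["size"]
    return total


async def totals_by_prefix(objects, prefixes):
    """Run one listing per prefix concurrently and map prefix -> total size."""
    sizes = await asyncio.gather(*(total_size(objects, p) for p in prefixes))
    return dict(zip(prefixes, sizes))


# Made-up records carrying only the fields this sketch needs.
objects = [
    {"path": "data/2023/a.nc", "size": 10},
    {"path": "data/2024/b.nc", "size": 20},
    {"path": "data/2024/c.nc", "size": 5},
]

result = asyncio.run(totals_by_prefix(objects, ["data/2023/", "data/2024/"]))
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why zipping them back onto `prefixes` is safe.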
See Also
glob_objects : Sync version of this function.
glob_async : Returns just paths instead of full metadata.