Input File Generation

In addition to post-processing model output, gcmprocpy bundles two utilities that build the geophysical forcing / boundary-condition NetCDF files that drive a TIE-GCM run:

gpigen — Geophysical Indices (GPI): daily 10.7 cm solar flux (f107d), its running average (f107a), and the 3-hourly Kp index, from GFZ Potsdam.
imfgen — Interplanetary Magnetic Field / solar-wind boundary conditions: bx/by/bz, solar-wind density and velocity, from OMNI 1-minute data or a BCWIND HDF5 file.

Both are available as console commands (gpigen / imfgen) and as a Python API under gcmprocpy.gpigen / gcmprocpy.imfgen. Each generate_* function returns an xarray.Dataset, so the data can be inspected or post-processed before being written to NetCDF.

Note

These tools require network access to fetch the source data (the GFZ API for gpigen; the OMNI FTP server for imfgen), or a local copy of the source files. gpigen additionally depends on requests and imfgen on h5py; both are installed automatically with gcmprocpy.

GPI (gpigen)

Each output file holds, on a daily grid (ndays):

year_day — YYYYDDD integer (4-digit year + 3-digit day of year)
f107d — daily 10.7 cm solar flux
f107a — running-average 10.7 cm solar flux (default 81-day centered)
kp — 3-hourly Kp index, shaped (ndays, 8)

Writes <prefix>_<begYYYYDDD>-<endYYYYDDD>.nc into --output-dir.
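The YYYYDDD encoding and the file-naming pattern above can be illustrated with a short standard-library sketch. The helper names to_year_day and gpi_name are hypothetical and are not part of gcmprocpy; the package's own gpi_filename takes its bounds from the dataset rather than from explicit dates.

```python
from datetime import date


def to_year_day(d: date) -> int:
    """Encode a date as a YYYYDDD integer (4-digit year + 3-digit day of year)."""
    return d.year * 1000 + d.timetuple().tm_yday


def gpi_name(beg: date, end: date, prefix: str = "gpi") -> str:
    """Hypothetical helper mirroring the <prefix>_<begYYYYDDD>-<endYYYYDDD>.nc pattern."""
    return f"{prefix}_{to_year_day(beg)}-{to_year_day(end)}.nc"


# 2024 is a leap year, so 1 March is day 61 of the year.
print(to_year_day(date(2024, 3, 1)))                   # 2024061
print(gpi_name(date(2024, 1, 1), date(2024, 6, 1)))    # gpi_2024001-2024153.nc
```

The bounds passed to gpi_name here are illustrative only; with a centered average the last day actually present in a gpigen output may be earlier than the requested end date.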

Mode: CLI

# Full series 1960-01-01 -> yesterday, 81-day centered avg, JSON API (defaults)
gpigen

# Arbitrary date range
gpigen --start 2024-01-01 --end 2024-06-01

# 27-day trailing average
gpigen --window 27 --trailing --prefix gpi_27avg

# Parse the raw 1932-onward text file instead of the JSON API, and write plots
gpigen --source textfile --plots

gpigen

Build TIEGCM GPI NetCDF files from GFZ Potsdam data.

usage: gpigen [-h] [--version] [--start START] [--end END]
              [--source {json,textfile}] [--window WINDOW] [--trailing]
              [--status STATUS] [--output-dir OUTPUT_DIR] [--output OUTPUT]
              [--prefix PREFIX] [--no-write] [--plots] [--plots-dir PLOTS_DIR]
              [--cache-dir CACHE_DIR] [-q]

-h, --help: show this help message and exit

--version: show program’s version number and exit

--start <start>: Start date (YYYY-MM-DD, YYYYDDD, or ISO). Default: 1960-01-01.

--end <end>: End date (inclusive). Default: yesterday.

--source {json,textfile}: Data source. Default: json (GFZ JSON API).

--window <window>: Averaging window in days for f107a. Default: 81.

--trailing: Use a trailing average instead of centered.

--status <status>: GFZ ‘status’ query param (json source). Default: def.

--output-dir <output_dir>: Directory for the output .nc file. Default: cwd.

--output <output>: Explicit output path (overrides --output-dir/--prefix).

--prefix <prefix>: Filename prefix: <prefix>_<beg>-<end>.nc. Default: gpi.

--no-write: Build the dataset but do not write a file.

--plots: Also write per-year f107d/f107a PNGs (needs the ‘plot’ extra).

--plots-dir <plots_dir>: Directory for plots. Default: ./plots.

--cache-dir <cache_dir>: Where to store the downloaded text file (textfile source).

-q, --quiet: Suppress progress.

Mode: API

from gcmprocpy import gpigen

ds = gpigen.generate_gpi(
    start="2024-01-01",   # YYYY-MM-DD, YYYYDDD, ISO, or datetime
    end=None,             # default: yesterday
    source="json",        # "json" (GFZ API) or "textfile"
    window=81,            # averaging window in days
    centered=True,        # centered vs trailing
)

path = gpigen.save_gpi(ds, output_dir=".")     # write NetCDF
gpigen.make_plots(ds, output_dir="plots")      # optional per-year PNGs

The top-level entry points generate_gpi, save_gpi and make_plots are also re-exported directly on the gcmprocpy namespace.

gcmprocpy.gpigen.core.generate_gpi(start='1960-01-01', end=None, source='json', window=81, centered=True, status='def', cache_dir=None, verbose=False)[source]

Generate a GPI xarray.Dataset for [start, end].

Parameters:

start, end (date-like or None) – Inclusive bounds (YYYY-MM-DD, YYYYDDD, ISO, or datetime). end=None defaults to yesterday. The output begins at start but, for a centered window, ends window // 2 days before end (a centered average needs future data that does not yet exist).
source ({"json", "textfile"}) – GFZ JSON API (default) or the locally-parsed 1932-onward text file.
window (int) – Averaging window in days (default 81).
centered (bool) – Centered (default) vs trailing average for f107a.
status (str) – GFZ status query param for the JSON API (default "def").
cache_dir (str or None) – Where to drop the downloaded text file (textfile source only).
verbose (bool) – Print progress.

Returns:

The GPI dataset. Use gpigen.save_gpi() to write it to NetCDF.

Return type:

xarray.Dataset
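To see why a centered window cannot reach the last window // 2 days, the following NumPy sketch compares a centered and a trailing running mean. It is an illustration only, not the gcmprocpy implementation: the function name running_mean is hypothetical, an odd window is assumed, and both edges are simply left as NaN (how gpigen handles the start of the series is not modeled here).

```python
import numpy as np


def running_mean(x, window=81, centered=True):
    """Mean over `window` samples; NaN wherever the full window is unavailable."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if x.size < window:
        return out
    # "valid" keeps only positions where the whole window fits: n - window + 1 values.
    full = np.convolve(x, np.ones(window) / window, mode="valid")
    if centered:
        half = window // 2
        out[half:half + full.size] = full   # needs `half` samples on each side
    else:
        out[window - 1:] = full             # needs only past samples
    return out


x = np.arange(10.0)
print(running_mean(x, window=5, centered=True))   # last 2 entries (5 // 2) are NaN
print(running_mean(x, window=5, centered=False))  # defined through the final sample
```

With the default 81-day window the same reasoning gives a 40-day gap at the end of a centered series, while a trailing average stays defined up to the final day.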

gcmprocpy.gpigen.dataset.save_gpi(ds, output_dir='.', prefix='gpi', path=None)[source]

Write ds to NetCDF and return the path written.

path overrides the auto-generated name; otherwise the file is <output_dir>/<prefix>_<beg>-<end>.nc.

gcmprocpy.gpigen.dataset.build_dataset(year_day, f107d, f107a, kp, window, centered, missing_dates)[source]: Build the GPI xarray.Dataset with the TIEGCM global attributes.

gcmprocpy.gpigen.dataset.gpi_filename(ds, prefix='gpi')[source]: <prefix>_<begYYYYDDD>-<endYYYYDDD>.nc from the dataset’s bounds.

gcmprocpy.gpigen.plotting.make_plots(ds, output_dir='plots')[source]

Write f107d_<year>.png and f107a_<year>.png for each year.

Returns the list of files written. matplotlib is imported lazily, so it remains an optional dependency of the core pipeline.

IMF / Solar-Wind Boundary Conditions (imfgen)

Each output file holds, on a per-minute grid (ndata):

bx, by, bz — IMF components (nT)
swden — solar-wind proton density (cm^-3)
swvel — solar-wind flow speed (km/s)
a 0/1 *Mask quality flag for each channel (bxMask, byMask, bzMask, denMask, velMask; 0 = linearly interpolated)
date — YYYYDDD.frac (year, day-of-year, fractional day)
timestamp — ISO string YYYY-MM-DDTHH:MM:SS
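The two time encodings listed above can be reproduced with the standard library. This is a minimal sketch; to_date_frac is a hypothetical helper, and it assumes the fractional part is the fraction of the day elapsed since 00:00.

```python
from datetime import datetime


def to_date_frac(t: datetime) -> float:
    """Encode a datetime as YYYYDDD.frac (year, day of year, fractional day)."""
    doy = t.timetuple().tm_yday
    frac = (t.hour * 3600 + t.minute * 60 + t.second) / 86400.0
    return t.year * 1000 + doy + frac


t = datetime(2020, 1, 1, 12, 0, 0)
print(to_date_frac(t))                      # 2020001.5
print(t.strftime("%Y-%m-%dT%H:%M:%S"))      # 2020-01-01T12:00:00
```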

Mode: CLI

# Full OMNI series 1982-01-01 -> yesterday, 10-min trailing average (defaults).
# Fetches missing omni_min<year>.asc files over FTP into --cache-dir.
imfgen --cache-dir ./omni_asc

# A specific range, one continuous file
imfgen --start 2020-01-01 --end 2020-12-31 --cache-dir ./omni_asc

# Reproduce the legacy per-year files (imf_OMNI_YYYY001-YYYYddd.nc)
imfgen --split-years --cache-dir ./omni_asc --output-dir .

# Convert a BCWIND HDF5 file
imfgen --source bcwind --bcwind-path bcwind.h5

imfgen

Build TIEGCM IMF NetCDF files from OMNI or BCWIND data.

usage: imfgen [-h] [--version] [--source {omni,bcwind}] [--start START]
              [--end END] [--window WINDOW] [--cache-dir CACHE_DIR]
              [--bcwind-path BCWIND_PATH] [--no-download]
              [--output-dir OUTPUT_DIR] [--output OUTPUT] [--prefix PREFIX]
              [--split-years] [--no-write] [-q]

-h, --help: show this help message and exit

--version: show program’s version number and exit

--source {omni,bcwind}: Data source. Default: omni (OMNI 1-minute ASCII).

--start <start>: Start date (YYYY-MM-DD, YYYYDDD, or ISO). omni default: 1982-01-01.

--end <end>: End date (inclusive). omni default: yesterday.

--window <window>: Trailing-average window in minutes (omni). Default: 10.

--cache-dir <cache_dir>: Directory for omni_min<year>.asc files (omni). Default: cwd.

--bcwind-path <bcwind_path>: Path to the BCWIND HDF5 file (required for --source bcwind).

--no-download: Do not fetch missing OMNI files over FTP; use local files only.

--output-dir <output_dir>: Directory for the output .nc file(s). Default: cwd.

--output <output>: Explicit output path (single-file mode; overrides --output-dir/--prefix).

--prefix <prefix>: Filename prefix: <prefix>_<beg>-<end>.nc. Default: imf_OMNI / imf_bcwind.

--split-years: Write one file per calendar year instead of a single range file.

--no-write: Build the dataset but do not write a file.

-q, --quiet: Suppress progress.

Mode: API

from gcmprocpy import imfgen

# OMNI -> one continuous Dataset for the range
ds = imfgen.generate_imf(
    start="2020-01-01",   # YYYY-MM-DD, YYYYDDD, ISO, or datetime
    end=None,             # default: yesterday
    source="omni",        # "omni" (default) or "bcwind"
    window=10,            # trailing-average window, minutes
    cache_dir="./omni_asc",
)
path = imfgen.save_imf(ds, output_dir=".")          # write NetCDF

# Per-year files (each interpolated within its own year), like the originals
for ds_year in imfgen.generate_imf_years(start="1982-01-01", cache_dir="./omni_asc"):
    imfgen.save_imf(ds_year, output_dir=".")

# BCWIND HDF5 -> Dataset
ds = imfgen.generate_imf(source="bcwind", bcwind_path="bcwind.h5")

The top-level entry points generate_imf, generate_imf_years and save_imf are also re-exported directly on the gcmprocpy namespace.

gcmprocpy.imfgen.core.generate_imf(start=None, end=None, source='omni', window=10, cache_dir=None, bcwind_path=None, download=True, verbose=False)[source]

Generate an IMF xarray.Dataset.

Parameters:

start, end (date-like or None) – Inclusive bounds (YYYY-MM-DD, YYYYDDD, ISO, or datetime). For omni they default to 1982-01-01 and yesterday. For bcwind they optionally filter the file (default: the file’s full span).
source ({"omni", "bcwind"}) – OMNI 1-minute ASCII (default) or a BCWIND HDF5 file.
window (int) – Trailing-average window in minutes for the OMNI pipeline (default 10). Ignored for bcwind (raw pass-through).
cache_dir (str or None) – Directory holding / receiving omni_min<year>.asc files (OMNI only).
bcwind_path (str or None) – Path to the BCWIND HDF5 file (required when source="bcwind").
download (bool) – Fetch missing OMNI year files over FTP (default True).
verbose (bool) – Print progress.

Returns:

The IMF dataset. Use imfgen.save_imf() to write it to NetCDF.

Return type:

xarray.Dataset
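The OMNI pipeline is described in terms of linear interpolation (flagged by the *Mask variables) and a trailing average. The sketch below shows one way those two steps could look for a single channel. It is not the gcmprocpy code: the function names are hypothetical, missing samples are assumed to be NaN already, and running the interpolation before the average is an assumption.

```python
import numpy as np


def fill_gaps(values):
    """Linearly interpolate NaN gaps; mask is 1 for original samples, 0 for filled ones."""
    v = np.asarray(values, dtype=float)
    good = ~np.isnan(v)
    if not good.any():
        raise ValueError("no valid samples to interpolate from")
    idx = np.arange(v.size)
    # np.interp holds the nearest valid value at the edges rather than extrapolating.
    filled = np.interp(idx, idx[good], v[good])
    return filled, good.astype(np.int8)


def trailing_mean(values, window=10):
    """Mean of the current sample and up to window - 1 preceding samples."""
    v = np.asarray(values, dtype=float)
    csum = np.cumsum(np.insert(v, 0, 0.0))
    out = np.empty_like(v)
    for i in range(v.size):
        lo = max(0, i - window + 1)   # shorter window at the start of the series
        out[i] = (csum[i + 1] - csum[lo]) / (i + 1 - lo)
    return out


bz = [1.0, np.nan, 3.0, np.nan, np.nan, 6.0]
filled, mask = fill_gaps(bz)
print(filled)                      # gaps replaced by values on the straight line between neighbours
print(mask)                        # 0 marks the interpolated samples
print(trailing_mean(filled, 2))
```

Keeping the mask alongside the values lets a downstream user tell measured samples from interpolated ones, which is the role the *Mask variables play in the output file.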

gcmprocpy.imfgen.core.generate_imf_years(start=None, end=None, window=10, cache_dir=None, download=True, verbose=False)[source]

Yield one OMNI Dataset per calendar year in [start, end].

Each year is generated independently (its own within-year interpolation), so the per-year files reproduce the legacy imf_OMNI_YYYY001-YYYYddd.nc files exactly. This is what imfgen --split-years writes. (BCWIND files are a single span and are not split.)
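Splitting a date range into calendar years can be sketched as follows. The helper year_spans is hypothetical, and clipping the first and last year to the requested bounds is an assumption; the source does not say how generate_imf_years treats partial years.

```python
from datetime import date


def year_spans(start: date, end: date):
    """Yield (first_day, last_day) for each calendar year touched by [start, end]."""
    for year in range(start.year, end.year + 1):
        yield max(start, date(year, 1, 1)), min(end, date(year, 12, 31))


for beg, stop in year_spans(date(2019, 11, 15), date(2021, 2, 1)):
    print(beg.isoformat(), stop.isoformat())
# 2019-11-15 2019-12-31
# 2020-01-01 2020-12-31
# 2021-01-01 2021-02-01
```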

gcmprocpy.imfgen.dataset.save_imf(ds, output_dir='.', prefix=None, path=None)[source]

Write ds to NetCDF and return the path written.

path overrides the auto-generated <prefix>_<beg>-<end>.nc name. (For per-year output, generate each year with imfgen.generate_imf_years() and call this once per dataset; this is what imfgen --split-years does.)

gcmprocpy.imfgen.dataset.build_dataset(processed, dates, timestamps, source='omni', source_path=None)[source]

Build the IMF xarray.Dataset.

processed maps each channel in CHANNELS to (values, mask). dates is the YYYYDDD.frac float array; timestamps the ISO strings.

gcmprocpy.imfgen.dataset.imf_filename(ds, prefix=None)[source]: <prefix>_<begYYYYDDD>-<endYYYYDDD>.nc from the dataset’s bounds.