# Commands

Install: `pip install ghost-hsi`
| Command | Description |
|---|---|
| `ghost train_spt` | Full GHOST pipeline — Spectral Partition Tree + ensemble models |
| `ghost train` | Flat model training, no SPT tree (baseline) |
| `ghost predict` | Inference on the test split; computes metrics |
| `ghost visualize` | Generate the 3-panel segmentation figure |
| `ghost convert_to_mat` | Convert ENVI, TIFF, GeoTIFF, or HDF5/NetCDF to `.mat` — requires `pip install ghost-hsi[convert]` |
| `ghost demo` | Print bundled Indian Pines file paths and an example command |
| `ghost version` | Print the installed version |
| `ghost flower` | Easter egg |
## ghost train_spt

Primary training command. Builds the Spectral Partition Tree and trains per-node ensemble models.

```bash
ghost train_spt \
  --data /path/to/data.mat \
  --gt /path/to/labels.mat \
  --loss dice --routing forest \
  --base_filters 32 --num_filters 8 \
  --ensembles 5 --leaf_ensembles 3 \
  --epochs 400 --patience 50 --min_epochs 40 \
  --out-dir runs/my_experiment
```
### Required

| Flag | Description |
|---|---|
| `--data` | Path to the hyperspectral `.mat` file. Shape: `(H, W, Bands)` |
| `--gt` | Path to the ground-truth `.mat` file. Shape: `(H, W)`, integer class IDs, `0` = background |
### Data

| Flag | Default | Description |
|---|---|---|
| `--train_ratio` | `0.2` | Fraction of labeled pixels per class used for training |
| `--val_ratio` | `0.1` | Fraction of labeled pixels per class used for validation |
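For intuition, the per-class split these two flags describe can be sketched as follows. This is an illustrative reimplementation, not GHOST's actual sampler, whose shuffling and rounding may differ:

```python
import numpy as np

def split_per_class(gt, train_ratio=0.2, val_ratio=0.1, seed=42):
    """Per-class split of labeled pixels (class id > 0) into boolean
    train/val/test masks; background (0) is excluded from all three."""
    rng = np.random.default_rng(seed)
    train = np.zeros(gt.shape, dtype=bool)
    val = np.zeros(gt.shape, dtype=bool)
    for cls in np.unique(gt[gt > 0]):
        idx = np.flatnonzero(gt == cls)   # flat indices of this class
        rng.shuffle(idx)
        n_tr = int(round(train_ratio * idx.size))
        n_va = int(round(val_ratio * idx.size))
        train.flat[idx[:n_tr]] = True
        val.flat[idx[n_tr:n_tr + n_va]] = True
    test = (gt > 0) & ~train & ~val       # remaining labeled pixels
    return train, val, test
```

With the defaults, each class contributes roughly 20% of its labeled pixels to training, 10% to validation, and the rest to testing.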
### Model

| Flag | Default | Description |
|---|---|---|
| `--base_filters` | `32` | U-Net base filter count. Channel progression: `f → 2f → 4f → 8f → 16f` |
| `--num_filters` | `8` | Spectral 3D conv filters per layer |
| `--num_blocks` | `3` | Number of 3D conv blocks |
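The channel progression noted for `--base_filters` is plain doubling per encoder level; a one-line sketch (the `levels` parameter is assumed here for generality):

```python
def unet_channels(base_filters=32, levels=5):
    """Encoder widths implied by --base_filters: f, 2f, 4f, 8f, 16f."""
    return [base_filters * 2 ** i for i in range(levels)]

# unet_channels(32) -> [32, 64, 128, 256, 512]
```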
### Training

| Flag | Default | Description |
|---|---|---|
| `--epochs` | `400` | Base epoch budget for the root node. Child nodes receive scaled budgets |
| `--lr` | `1e-4` | Learning rate (AdamW) |
| `--loss` | `ce` | Loss function: `ce`, `dice`, `focal`, or `squared_ce` |
| `--focal_gamma` | `2.0` | Gamma for focal loss (used only with `--loss focal`) |
| `--patience` | `50` | Early-stop after N epochs without improvement |
| `--min_epochs` | `40` | Never early-stop before this epoch |
| `--warmup_epochs` | `0` | Linear LR warmup epochs |
| `--val_interval` | `20` | Validate every N epochs |
| `--seed` | `42` | Random seed |
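How `--patience` and `--min_epochs` interact can be sketched as below. This is an assumption about the semantics implied by the table, not GHOST's exact training loop:

```python
def should_stop(epoch, epochs_since_best, patience=50, min_epochs=40):
    """Early-stop rule implied by --patience / --min_epochs: stop once
    `patience` epochs pass without a validation improvement, but never
    before epoch `min_epochs`."""
    if epoch < min_epochs:
        return False
    return epochs_since_best >= patience
```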
### SPT (Spectral Partition Tree)

| Flag | Default | Description |
|---|---|---|
| `--depth` | `auto` | Tree depth. `auto`: stop at depth 3 or when SAM < 0.05. `full`: always recurse. An integer sets a fixed maximum depth |
| `--ensembles` | `5` | Ensemble size per internal node |
| `--leaf_ensembles` | `3` | Ensemble size per leaf node (≤ 2 classes) |
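The `--depth auto` rule stops recursing at depth 3 or once spectral similarity drops below the SAM threshold. The Spectral Angle Mapper itself is a standard quantity; a sketch in radians (which spectra GHOST compares at each node is not documented here):

```python
import numpy as np

def spectral_angle(a, b):
    """Spectral Angle Mapper (SAM) between two spectra, in radians."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def auto_depth_stop(depth, sam, max_depth=3, sam_thresh=0.05):
    """--depth auto stopping rule: depth limit hit, or spectra too similar."""
    return depth >= max_depth or sam < sam_thresh
```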
### Routing

| Flag | Default | Description |
|---|---|---|
| `--routing` | `forest` | Routing mode: `forest` (recommended), `hybrid`, or `soft`. Note: `hybrid` and `soft` require the SSSR module, which is currently broken — use `forest` |
| `--d_model` | `64` | SSM fingerprint dimensionality (SSSR only) |
| `--d_state` | `16` | SSM filters per branch (SSSR only) |
| `--ssm_epochs` | `300` | SSM pretraining epochs. Set to `1` with `--routing forest` to effectively skip pretraining |
| `--ssm_lr` | `1e-3` | SSM pretraining learning rate |
| `--ssm_save` | `ssm_pretrained.pt` | SSM weights save path (inside `--out-dir`) |
| `--ssm_load` | `None` | Load pre-existing SSM weights and skip pretraining |
### Output

| Flag | Default | Description |
|---|---|---|
| `--out-dir` | `.` | Output directory (created if needed) |
| `--save` | `spt_models.pkl` | Model bundle filename |
### Output Files

| File | Description |
|---|---|
| `spt_models.pkl` | Complete model bundle: SPT tree structure, all ensemble model weights, and SSM state |
| `ssm_pretrained.pt` | Standalone SSM encoder weights |
| `training_history.csv` | Epoch-by-epoch metrics for all nodes |
## ghost train

Flat model training with no SPT tree. Useful as a baseline against the full pipeline, or for datasets with very few classes, where the tree structure adds no benefit.

Accepts the same flags as `ghost train_spt` for `--data`, `--gt`, `--train_ratio`, `--val_ratio`, `--base_filters`, `--num_filters`, `--num_blocks`, `--epochs`, `--lr`, `--seed`, `--out-dir`, and `--save`.
### Additional Flags

| Flag | Default | Description |
|---|---|---|
| `--fp16` | `False` | Mixed-precision training. Reduces VRAM use by roughly 40% |
| `--log` | `training_log.csv` | Per-epoch log filename |
### Output Files

| File | Description |
|---|---|
| `best_model.pth` | Best model weights (by validation mIoU) |
| `training_log.csv` | `epoch`, `train_loss`, `val_loss`, `val_oa`, `val_miou`, … |
| `test_results.csv` | Final test metrics |
## ghost predict

Run inference on the test split using a trained model. Writes one metrics CSV per routing mode.

```bash
ghost predict \
  --data /path/to/data.mat \
  --gt /path/to/labels.mat \
  --model runs/my_experiment/spt_models.pkl \
  --routing forest --out-dir runs/my_experiment
```
### Required

| Flag | Description |
|---|---|
| `--data` | Hyperspectral data `.mat` file |
| `--gt` | Ground-truth `.mat` file |
| `--model` | Path to `spt_models.pkl` from `ghost train_spt` |
### Optional

| Flag | Default | Description |
|---|---|---|
| `--routing` | `all` | `forest`, `hybrid`, `soft`, or `all` (runs all three). Use `forest` — the others require the SSSR module, which is currently broken |
| `--ssm_load` | `None` | Standalone SSM weights. Falls back to the state embedded in the `.pkl` |
| `--train_ratio` | `0.2` | Must match the training value |
| `--val_ratio` | `0.1` | Must match the training value |
| `--seed` | `42` | Must match the training value |
| `--out-dir` | `.` | Output directory |
### Output Files

| File | Description |
|---|---|
| `test_results_forest.csv` | OA, mIoU, Dice, Precision, Recall — forest routing |
| `test_results_hybrid.csv` | Same metrics for hybrid routing |
| `test_results_soft.csv` | Same metrics for soft routing |
## ghost visualize

Generate a 3-panel PNG: false-colour composite | ground truth | GHOST prediction.

```bash
ghost visualize \
  --data /path/to/data.mat \
  --gt /path/to/labels.mat \
  --model runs/my_experiment/spt_models.pkl \
  --dataset indian_pines --routing forest \
  --title "GHOST — Indian Pines" \
  --out-dir runs/my_experiment
```
### Required

| Flag | Description |
|---|---|
| `--data` | Hyperspectral data `.mat` file |
| `--gt` | Ground-truth `.mat` file |
| `--model` | Path to `spt_models.pkl` |
### Optional

| Flag | Default | Description |
|---|---|---|
| `--routing` | `forest` | Routing mode for the prediction panel |
| `--dataset` | `None` | Dataset name for class labels: `indian_pines`, `pavia`, or `salinas` |
| `--r_band` | `Bands × 0.75` | Band index for the red channel of the false-colour composite |
| `--g_band` | `Bands × 0.50` | Band index for the green channel |
| `--b_band` | `Bands × 0.25` | Band index for the blue channel |
| `--title` | `GHOST Segmentation` | Figure title |
| `--ssm_load` | `None` | Standalone SSM weights |
| `--train_ratio` | `0.2` | Must match the training value |
| `--val_ratio` | `0.1` | Must match the training value |
| `--seed` | `42` | Must match the training value |
| `--out-dir` | `.` | Output directory |
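The band defaults above are fractions of the band count; a sketch of the implied index computation (truncation via `int()` is an assumption here — the CLI's exact rounding may differ):

```python
def default_rgb_bands(n_bands):
    """Default false-colour band indices at 75% / 50% / 25% of the range."""
    return int(n_bands * 0.75), int(n_bands * 0.50), int(n_bands * 0.25)

# default_rgb_bands(200) -> (150, 100, 50)
```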
### Output Files

| File | Description |
|---|---|
| `segmentation_forest.png` | 3-panel figure, 180 DPI, dark background |
## ghost convert_to_mat

Convert ENVI, TIFF, GeoTIFF, and HDF5/NetCDF hyperspectral images to `.mat` format with zero data loss. Requires the optional convert extras.

```bash
pip install ghost-hsi[convert]

ghost convert_to_mat \
  --img image.hdr \
  --gt labels.tif \
  --out-dir converted/
```
### Required

| Flag | Description |
|---|---|
| `--img` | Path to the hyperspectral image. Supported: ENVI (`.hdr`), TIFF/GeoTIFF (`.tif`), HDF5/NetCDF (`.h5`, `.nc`). The format is auto-detected from the extension |
| `--out-dir` | Output directory (created if needed). The converted `.mat` and JSON sidecar are written here |
### Optional

| Flag | Default | Description |
|---|---|---|
| `--gt` | `None` | Ground-truth label file. Accepts `.mat`, `.png`, `.tif`, `.hdr` |
| `--crop` | `None` | Spatial crop applied before conversion: `Y X H W` (top-left origin, then height and width) |
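In NumPy terms, `--crop Y X H W` corresponds to a plain slice over the two spatial axes. A sketch of the semantics, not the converter's code:

```python
import numpy as np

def crop_cube(cube, y, x, h, w):
    """--crop Y X H W: keep an h-by-w window with top-left corner (y, x);
    all spectral bands are retained."""
    return cube[y:y + h, x:x + w, :]
```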
### HDF5 dataset path syntax

For HDF5 files containing multiple datasets, specify the dataset path with a colon separator:

```bash
ghost convert_to_mat \
  --img file.h5:/dataset/path \
  --out-dir converted/
```
### Output Files

| File | Description |
|---|---|
| `<name>.mat` | Converted hyperspectral cube, shape `(H, W, Bands)` |
| `<name>_meta.json` | Metadata sidecar: CRS, wavelengths, spatial transforms, band names |
| `<name>_gt.mat` | Converted ground truth (only if `--gt` is provided) |
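The converted pair can be read back with `scipy.io.loadmat` plus the JSON sidecar. A minimal sketch — the variable key inside the `.mat` is assumed to be the first non-metadata entry, since the converter's actual key name is not documented here:

```python
import json

import numpy as np
from scipy.io import loadmat

def load_converted(mat_path, meta_path):
    """Load a converted cube (H, W, Bands) and its metadata sidecar."""
    mat = loadmat(mat_path)
    key = next(k for k in mat if not k.startswith("__"))  # skip loadmat metadata
    with open(meta_path) as f:
        meta = json.load(f)
    return np.asarray(mat[key]), meta
```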
### Dependencies

| Package | Used for |
|---|---|
| `spectral` | ENVI (`.hdr`) reading |
| `rasterio` | TIFF/GeoTIFF reading; CRS and transform extraction |
| `h5py` | HDF5/NetCDF reading |

None of these are installed by `pip install ghost-hsi` alone; all three are pulled in by `pip install ghost-hsi[convert]`.