What Is HPO?
Hyperparameter Optimization (HPO) is the process of automatically searching for the best training setup for a model.
Instead of manually trying a few values, HPO explores combinations of:
- model parameters (for example hidden width, tau, dt),
- optimization parameters (learning rate, optimizer choice),
- regularization settings (activity penalties, hardware-aware penalties),
- training strategy options (scheduler, early stopping policy).
The output of HPO is not just one "good run", but a ranked set of trials with metrics, so you can choose the best tradeoff for your objective.
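As a minimal sketch of what "a ranked set of trials" means, the snippet below samples configurations from a small search space and sorts the results by loss. The search space, its ranges, and `toy_objective` are illustrative assumptions standing in for a real training run; none of them come from the NWAVE SDK.

```python
import math
import random

# Hypothetical search space: each entry maps a name to a sampler.
SEARCH_SPACE = {
    "hidden_width": lambda rng: rng.choice([32, 64, 128]),
    "tau": lambda rng: rng.uniform(5.0, 50.0),        # illustrative range
    "lr": lambda rng: 10 ** rng.uniform(-4, -2),      # log-uniform in [1e-4, 1e-2]
    "activity_penalty": lambda rng: rng.choice([0.0, 1e-4, 1e-3]),
}


def sample_config(rng):
    """Draw one value per hyperparameter."""
    return {name: sampler(rng) for name, sampler in SEARCH_SPACE.items()}


def toy_objective(config):
    """Stand-in for a training run; returns a validation loss (lower is better)."""
    return (
        (math.log10(config["lr"]) + 3.0) ** 2
        + ((config["tau"] - 20.0) / 20.0) ** 2
        + 0.1 * 64.0 / config["hidden_width"]
        + 100.0 * config["activity_penalty"]
    )


def run_random_search(num_trials, seed=0):
    """Run independent trials and return them ranked by loss, best first."""
    rng = random.Random(seed)
    trials = []
    for trial_id in range(num_trials):
        config = sample_config(rng)
        trials.append(
            {"trial_id": trial_id, "config": config, "loss": toy_objective(config)}
        )
    return sorted(trials, key=lambda t: t["loss"])


ranked = run_random_search(num_trials=20)
for trial in ranked[:3]:
    print(trial["trial_id"], round(trial["loss"], 4), trial["config"])
```

Keeping every trial, not only the winner, is what lets you later pick a different tradeoff, for example a slightly worse loss with a smaller hidden width.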
Why HPO Is Important for SNNs
Spiking neural networks (SNNs) are usually more sensitive to training hyperparameters than standard artificial neural network (ANN) pipelines. Small changes in temporal dynamics can strongly affect convergence and final metrics.
Typical SNN-sensitive parameters include:
- membrane/leak time constants (tau),
- timestep (dt) and simulation horizon,
- surrogate-gradient settings,
- spike activity regularization strengths,
- hardware-aware constraints (weight limits, sign topology terms).
Because these parameters interact with each other, tuning them one-by-one is often misleading. HPO lets you optimize them jointly.
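The effect of parameter interaction can be illustrated with a toy objective. This is a sketch under stated assumptions: `toy_loss` is not a real SNN loss, and the candidate values are arbitrary. It is built so that the best `tau` depends on `dt`, which is enough to make one-at-a-time tuning settle on a worse point than a joint search over the same values.

```python
import itertools

TAUS = [5.0, 10.0, 20.0, 40.0]
DTS = [0.5, 1.0, 2.0]


def toy_loss(tau, dt):
    # Toy objective, not a real SNN loss: it prefers tau / dt == 20 and dt == 0.5.
    return (tau / dt - 20.0) ** 2 + (dt - 0.5) ** 2


def tune_one_at_a_time(start_dt):
    """Tune tau with dt fixed, then tune dt with the chosen tau fixed."""
    best_tau = min(TAUS, key=lambda tau: toy_loss(tau, start_dt))
    best_dt = min(DTS, key=lambda dt: toy_loss(best_tau, dt))
    return best_tau, best_dt, toy_loss(best_tau, best_dt)


def tune_jointly():
    """Evaluate every (tau, dt) pair and keep the best one."""
    best_tau, best_dt = min(
        itertools.product(TAUS, DTS), key=lambda pair: toy_loss(*pair)
    )
    return best_tau, best_dt, toy_loss(best_tau, best_dt)


print(tune_one_at_a_time(start_dt=2.0))  # (40.0, 2.0, 2.25)
print(tune_jointly())                    # (10.0, 0.5, 0.0)
```

Starting from `dt = 2.0`, the sequential procedure picks `tau = 40.0` and then finds no better `dt` for that `tau`, so it never reaches the pair `(10.0, 0.5)` that the joint search finds.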
Why HPO Beats Manual Grid Search
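Manual grid search is simple, but its cost grows exponentially with the number of tuned parameters: 5 values for each of 4 parameters already means 5^4 = 625 training runs.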
1) Better optimization efficiency
- Grid search wastes many trials on unpromising regions.
- HPO can use smarter sampling strategies (for example random/adaptive search).
- Trial schedulers (for example ASHA) stop poor trials early and reallocate compute to better candidates.
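The early-stopping idea can be sketched with synchronous successive halving, the simpler scheme that ASHA extends to asynchronous execution. This is not ASHA itself and not Ray Tune code; `toy_loss_after`, the `floor` and `speed` fields, and the budget values are all hypothetical stand-ins for a real learning curve.

```python
import random


def toy_loss_after(config, epochs):
    """Stand-in for the validation loss after training for `epochs` epochs."""
    # The loss decays toward a config-dependent floor; a lower floor is better.
    return config["floor"] + 1.0 / (1.0 + config["speed"] * epochs)


def successive_halving(configs, min_epochs=1, reduction_factor=2, max_epochs=8):
    """Repeatedly keep the best 1/reduction_factor of trials and grow their budget."""
    survivors = list(configs)
    epochs = min_epochs
    history = []  # (epochs per trial, number of trials) for each rung
    while True:
        scored = sorted(survivors, key=lambda c: toy_loss_after(c, epochs))
        history.append((epochs, len(scored)))
        if len(scored) == 1 or epochs >= max_epochs:
            return scored[0], history
        keep = max(1, len(scored) // reduction_factor)
        survivors = scored[:keep]  # poor trials stop here
        epochs = min(epochs * reduction_factor, max_epochs)


rng = random.Random(0)
configs = [
    {"floor": rng.uniform(0.1, 1.0), "speed": rng.uniform(0.1, 2.0)}
    for _ in range(8)
]
best, history = successive_halving(configs)
print(history)  # [(1, 8), (2, 4), (4, 2), (8, 1)]
print(best)
```

Training all 8 configurations for 8 epochs would cost 64 epochs. The rungs above cost at most 8 + 8 + 8 + 8 = 32 epochs even if each rung restarts from scratch, and less when surviving trials resume from checkpoints.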
2) Better parallelization
- Grid search is often run serially or with ad-hoc scripts.
- HPO frameworks schedule many trials in parallel with controlled CPU/GPU resources.
- This reduces wall-clock time and improves reproducibility of large tuning campaigns.
Why This Matters Even More for Small Networks
Networks that can be deployed on Neuronova's chips are relatively small. A single training job may not saturate a modern GPU by itself, and parts of the pipeline remain CPU-bound (data loading, orchestration, metric/logging steps). Moreover, the fewer parameters a network has, the more each one matters: initialization and hyperparameter choices become crucial factors in reaching high metrics.
Running multiple HPO trials concurrently helps maximize total machine utilization:
- better GPU occupancy through parallel trials,
- better CPU utilization for data and trial management,
- less idle hardware time between experiments.
So even if each model is small, HPO is still a practical way to convert available compute into faster iteration and better final models.
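A minimal sketch of bounded concurrency, using only the Python standard library: several short trials run at once, capped by a `max_concurrent_trials` limit. Threads and the sleeping `toy_trial` are simplifying assumptions. Real training trials would normally run in separate processes with assigned CPU/GPU shares, which is the part an HPO framework manages for you.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def toy_trial(config):
    """Stand-in for a short training run that leaves the machine mostly idle."""
    time.sleep(0.05)
    return {"config": config, "loss": (config["lr"] - 0.01) ** 2}


def run_trials_concurrently(configs, max_concurrent_trials):
    """Run all trials with at most `max_concurrent_trials` in flight at once."""
    results = []
    with ThreadPoolExecutor(max_workers=max_concurrent_trials) as pool:
        futures = [pool.submit(toy_trial, config) for config in configs]
        for future in as_completed(futures):
            results.append(future.result())
    return sorted(results, key=lambda r: r["loss"])


configs = [{"lr": lr} for lr in (0.001, 0.003, 0.01, 0.03)]
ranked = run_trials_concurrently(configs, max_concurrent_trials=4)
print(ranked[0])
```

With four workers the four trials overlap instead of running back to back, which is the same effect concurrent HPO trials have on an underused GPU.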
HPO in NWAVE: HPOWrapper (Ray Tune-Based)
The provided HPOWrapper in nwavesdk.optim.hpo is built on top of Ray Tune.
This gives you:
- a structured API to bind tunable parameters across model/loss/metric/optimizer/scheduler namespaces,
- parallel trial execution with explicit resource control (cpu, gpu, max concurrent trials),
- trial scheduling and early-stopping policies (for example ASHA),
- consistent metric reporting and checkpointing across runs.
In short, HPOWrapper keeps your SNN training code customizable while delegating the heavy orchestration work to Ray Tune.
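To give a sense of the orchestration being delegated, here is a sketch written directly against Ray Tune. It is not the HPOWrapper API, whose signature is not shown in this section. The trainable, the parameter ranges, and the resource numbers are hypothetical, and the GPU share assumes a GPU is available.

```python
# Hypothetical per-trial resources: four trials share one GPU.
# Set "gpu" to 0 on a machine without a GPU.
RESOURCES_PER_TRIAL = {"cpu": 2, "gpu": 0.25}
MAX_CONCURRENT_TRIALS = 4


def toy_trainable(config):
    """Stand-in for an SNN training run; returns its final metrics as a dict."""
    loss = (config["lr"] - 1e-3) ** 2 + ((config["tau"] - 20.0) / 20.0) ** 2
    return {"loss": loss}


def build_tuner(num_samples=16):
    """Assemble a Ray Tune Tuner with a search space, resources, and ASHA."""
    # Imported here so the rest of the sketch loads without Ray installed.
    from ray import tune
    from ray.tune.schedulers import ASHAScheduler

    param_space = {
        "lr": tune.loguniform(1e-4, 1e-2),
        "tau": tune.uniform(5.0, 50.0),
    }
    return tune.Tuner(
        tune.with_resources(toy_trainable, RESOURCES_PER_TRIAL),
        param_space=param_space,
        tune_config=tune.TuneConfig(
            metric="loss",
            mode="min",
            num_samples=num_samples,
            max_concurrent_trials=MAX_CONCURRENT_TRIALS,
            scheduler=ASHAScheduler(),
        ),
    )


# Usage, with Ray installed:
#   results = build_tuner().fit()
#   best = results.get_best_result()
#   print(best.config, best.metrics["loss"])
```

Because this toy trainable only returns a final result, ASHA has no intermediate metrics to stop trials on. A real training loop would report a metric every epoch so that the scheduler can act early.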
For full setup and usage details, continue with the training guide.