BNGsim backend hook spawns a fresh helper process per atomic job — parameter scans pay N× Python import overhead

## Summary

The BNGsim backend-hook route delegates each *atomic* simulation job to the
JSON helper by having BNG2.pl `system()`-spawn a **fresh Python process**
(`python -m bionetgen.core.tools.bngsim_backend_helper JOB.json`). Each spawn
pays the full cost of starting a Python interpreter and `import bngsim`
(~0.45 s) plus importing the `bionetgen` machinery.

For a single large simulation this is amortized and fine. For a
**`parameter_scan`** — where BNG2.pl calls `simulate` (and thus the hook)
once per scan point — the per-job startup is paid N times and dominates the
runtime. The integration ends up *slower* than the subprocess path it is
meant to accelerate.

## Measurements

`harmonicOscillator.bngl` and
`ATG_update_mTORC1_assembly_more_complete_scheme.bngl` are both 200-point
`parameter_scan`s (`n_scan_pts=>200`).

| Run | Time |
|---|---|
| `import bngsim` (warm, already built) | 0.45 s |
| bngsim route, `n_scan_pts=2` | 5.42 s |
| bngsim route, `n_scan_pts=20` | 14.98 s |
| bngsim route, `n_scan_pts=200` | ~138 s |
| **subprocess** route, `n_scan_pts=200` (full run) | **~10 s** |

Linear fit over the scan-point count: **~0.53 s per scan point**, of which
~0.45 s is interpreter + `import` startup and only ~0.08 s is actual BNGsim
work. So ~85 % of a parameter scan's bngsim-route runtime is process/import
overhead. At 200 points that is ~100 s of pure overhead.
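
The arithmetic behind these figures can be reproduced from the table. The sketch below is only a back-of-the-envelope check: it takes the slope between the 2- and 20-point runs (which is where the ~0.53 s figure comes from) and treats the 0.45 s `import bngsim` time as the entire per-job startup cost, which is an assumption. The variable names are illustrative.

```python
# Measured wall-clock times from the table: scan points -> seconds.
# The 200-point entry is the approximate "~138 s" figure.
runs = {2: 5.42, 20: 14.98, 200: 138.0}


def slope(n0, n1):
    """Seconds per additional scan point between two measured runs."""
    return (runs[n1] - runs[n0]) / (n1 - n0)


per_point = slope(2, 20)                      # marginal cost of one scan point
import_cost = 0.45                            # measured `import bngsim` time, seconds
work = per_point - import_cost                # remainder: actual BNGsim work per point
overhead_fraction = import_cost / per_point   # share of per-point cost that is startup
fixed_cost = runs[2] - 2 * per_point          # part of the run that does not scale with N

print(f"per point:         {per_point:.2f} s")
print(f"BNGsim work:       {work:.2f} s")
print(f"overhead fraction: {overhead_fraction:.0%}")
print(f"fixed cost:        {fixed_cost:.2f} s")
print(f"slope 20 -> 200:   {slope(20, 200):.2f} s")
```

The slope between the 20- and 200-point runs comes out somewhat higher than the 2-to-20 slope, so the per-point estimate is, if anything, conservative at larger scan sizes.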

The subprocess route is fast here because BNG2.pl calls `run_network`, a
precompiled C binary, once per point, and spawning that binary costs ~1 ms.

This surfaced in the parity sweep: both models registered as spurious
`ERROR` (timeout) at ~185 s against the 180 s budget. (Worked around in the
sweep harness with a per-model `TIMEOUT_OVERRIDES` entry — that is a harness
band-aid, not a fix.)

## Root cause

`scripts/apply_bngsim_backend_hook.py` — both hook bodies (`_NETWORK_BODY`,
`_NF_BODY`) end with:

```perl
my $rc = system(@helper_command, $job_file);
```

i.e. one cold Python process per atomic job. `BNGCLI` advertises the helper
as `BIONETGEN_BNGSIM_BACKEND_HELPER_PYTHON` + `_MODULE`; the helper
(`bngsim_backend_helper.py`) processes exactly one job per invocation and
exits.
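
To see the per-spawn cost in isolation, one option is to time a cold interpreter start plus a single import, which is roughly what each `system()` call from the hook pays. This is a minimal sketch, not part of the repository: `mean_spawn_seconds` is a hypothetical name, and `import json` is used as a stand-in so the snippet runs anywhere. Swap in `import bngsim` on a machine where it is installed.

```python
import subprocess
import sys
import time


def mean_spawn_seconds(import_stmt="import json", repeats=3):
    """Average wall-clock cost of starting a fresh interpreter and running one import."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Each iteration is a brand-new process, like one system() call from the hook.
        subprocess.run([sys.executable, "-c", import_stmt], check=True)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(f"{mean_spawn_seconds():.3f} s per spawn")
```

The result depends on the machine and on filesystem caching, so it is best read as an order-of-magnitude check against the 0.45 s figure rather than a replacement for it.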

## Proposed fix — a persistent helper

Run **one** long-lived helper process for the whole BNG2.pl run instead of
one per job, so `import bngsim` is paid once.

1. `bngsim_backend_helper.py`: add a `serve` mode — bind a Unix-domain
   socket, accept one connection per job, read a job-file path, run the
   existing `execute_backend_payload`, write back the one-line JSON result.
2. `cli.py` (`BNGCLI`): before running BNG2.pl, spawn the helper in `serve`
   mode, wait for the socket to be ready, advertise its path in a new env
   var (e.g. `BIONETGEN_BNGSIM_BACKEND_HELPER_SOCKET`); tear it down after.
3. The two Perl hook bodies: if the socket env var is set, send the job path
   over the socket and read the reply; otherwise fall back to the current
   `system()` spawn (preserving correctness if the persistent helper is
   unavailable).
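
The `serve` mode from step 1 could look roughly like the sketch below. It is written under several assumptions: `open_server`, `serve_jobs`, and `request` are hypothetical names, `run_job` stands in for `execute_backend_payload` (whose signature is not shown here), and the wire format is taken to be a newline-terminated job-file path answered by one JSON line. Binding is split from serving so the caller knows the socket is ready before BNG2.pl starts.

```python
import json
import os
import socket
import tempfile
import threading


def open_server(sock_path):
    """Bind and listen; once this returns, clients can connect."""
    if os.path.exists(sock_path):
        os.unlink(sock_path)  # clear a stale socket left by an earlier run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(sock_path)
    server.listen()
    return server


def serve_jobs(server, run_job, max_jobs=None):
    """Handle one connection per job until max_jobs is reached (None = forever)."""
    handled = 0
    with server:
        while max_jobs is None or handled < max_jobs:
            conn, _ = server.accept()
            with conn:
                with conn.makefile("r", encoding="utf-8") as reader:
                    job_path = reader.readline().strip()
                try:
                    result = run_job(job_path)
                except Exception as exc:  # report the failure instead of dying
                    result = {"ok": False, "error": str(exc)}
                conn.sendall((json.dumps(result) + "\n").encode("utf-8"))
            handled += 1


def request(sock_path, job_path):
    """Client side: the round trip the Perl hook would make instead of system()."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall((job_path + "\n").encode("utf-8"))
        with client.makefile("r", encoding="utf-8") as reader:
            return json.loads(reader.readline())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "helper.sock")
    server = open_server(path)
    # Dummy job runner for the demo; the real helper would import bngsim once here.
    worker = threading.Thread(
        target=serve_jobs,
        args=(server, lambda p: {"ok": True, "job": p}, 2),
    )
    worker.start()
    print(request(path, "JOB1.json"))
    print(request(path, "JOB2.json"))
    worker.join()
```

Jobs are handled sequentially, which matches the one-job-at-a-time pattern of the current hook. Catching exceptions per job keeps one bad job file from taking down the helper for the rest of the scan.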

Estimated effect for these models: per-point cost ~0.53 s → ~0.08 s, i.e.
~138 s → ~20 s (200 × 0.08 s ≈ 16 s of per-point work plus the ~4 s fixed
cost implied by the 2-point run). The same saving applies to *every*
scan-heavy model. Scope: ~3 files + re-vendoring the Perl hook + tests; the
per-job `system()` fallback keeps it safe.

## Notes

- BNGsim's own numerics are not at fault — the per-point ODE integration is
  a few ms. This is purely the backend-hook process boundary.
- The "8–30× speedup" rationale for the integration holds for single large
  simulations; this issue is specifically the many-small-jobs (scan) regime.

