Indexer hangs (deadlock) and blows up RAM on a mid-size MQL5 codebase #775

Description

@xzbnm

Version

v0.8.1

Platform

Windows (x64)

Install channel

GitHub release archive / install.sh / install.ps1

Binary variant

standard

What happened, and what did you expect?

Title

Indexer hangs (deadlock) and blows up RAM on a mid-size MQL5 codebase (v0.8.1)

Summary

Indexing a MetaTrader 5 MQL5 project never finishes. Two failure modes, both fatal:

  1. RAM blowup during parallel extraction: up to ~62 GB on a 64 GB machine. On one run it filled RAM and crashed the whole PC.
  2. Deadlock: the run freezes partway (~22–26 of 112 files) with no more log output and no progress, and the graph phase is never reached.

Version

  • codebase-memory-mcp 0.8.1 (confirmed the latest release on GitHub).

Environment

  • Windows 10, 64 GB RAM, 24 logical CPUs.
  • Run via CLI: codebase-memory-mcp cli index_repository '{"repo_path":"...","mode":"fast"}'.

What the project is (language)

  • MQL5 = MetaQuotes Language 5, the language for MetaTrader 5 trading robots. It is syntactically very close to C++ (classes, OOP, #include, etc.).
  • Files: .mq5 (main programs) and .mqh (headers).
  • Since there is no native MQL5 parser, they are mapped to the C++ extractor in .codebase-memory.json:
    {"extra_extensions": {".mq5": "cpp", ".mqh": "cpp"}}
  • Size: ~112 files, ~48,000 lines total. Largest file ~5,200 lines. Heavy cross-#include structure (many files include a big central header).
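
To make the "heavy cross-#include structure" concrete for anyone reproducing this, the include graph can be mapped with a small script. This is a rough sketch, not part of codebase-memory-mcp: the names build_include_graph and find_cycles are made up for illustration, it only follows quoted #include "..." paths relative to the including file, and it assumes the sources decode as UTF-8 (files saved as UTF-16 would need a different encoding argument).

```python
import re
from pathlib import Path

# Only quoted includes are followed; <...> includes are ignored in this sketch.
INCLUDE_RE = re.compile(r'^\s*#include\s+"([^"]+)"', re.MULTILINE)


def build_include_graph(root):
    """Map each .mq5/.mqh file under root to the local files it #includes."""
    root = Path(root).resolve()
    graph = {}
    for path in root.rglob("*"):
        if path.suffix.lower() not in (".mq5", ".mqh") or not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        targets = set()
        for raw in INCLUDE_RE.findall(text):
            # Include paths in these sources use backslashes; normalise them.
            target = (path.parent / raw.replace("\\", "/")).resolve()
            if target.is_file():
                targets.add(target)
        graph[path] = targets
    return graph


def find_cycles(graph):
    """Return include cycles found by depth-first search (one per back edge)."""
    cycles, state, stack = [], {}, []

    def visit(node):
        state[node] = "active"
        stack.append(node)
        for nxt in sorted(graph.get(node, ())):
            if state.get(nxt) == "active":
                # nxt is already on the current path: the slice is the cycle.
                cycles.append(stack[stack.index(nxt):] + [nxt])
            elif nxt not in state:
                visit(nxt)
        stack.pop()
        state[node] = "done"

    for node in sorted(graph):
        if node not in state:
            visit(node)
    return cycles
```

Calling find_cycles(build_include_graph(repo_root)) on the project root lists each cycle as a chain of paths, and the size of each set in the graph shows how many headers a file pulls in directly.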

Steps to reproduce

  1. Point the indexer at a large MQL5 (or generally C++-heavy, deeply cross-included) repo with the .mq5/.mqh → cpp mapping.
  2. index_repository with mode: fast.
  3. Watch RAM and the log.

Observations

  • Log shows pipeline.mode mode=parallel workers=24 (= number of logical CPUs) and parallel.mem.budget total_mb=32458 — but actual RAM goes to 42–62 GB, so the budget is not enforced.
  • With the cpp mapping, a few specific files take 10–12 minutes each to "extract". Example: a 392-line file (Gold_SSI_TSI_SIGNAL.mqh) took 726,345 ms (~12 min); its only #include pulls in a 5,229-line header. DrawOrderFunc.mqh took 597,887 ms (~10 min). This looks like an O(n²)-style symbol-resolution blowup, made much worse by duplicate class definitions across backup/old copies of files: removing those duplicates dropped extraction time for that same header to 103 ms.
  • After cleaning duplicates and fixing 2 circular includes, RAM stayed lower (~35–45 GB) but the run still deadlocked around file 22–26.
  • Switching the mapping to {".mq5":"c",".mqh":"c"} made per-file extraction fast again (2.5 s max, RAM ~22 GB) — but it STILL deadlocked at the same point (22/112), frozen for 16+ minutes on the same 4 files:
    Defines_generalFunc.mqh, DrawOrderFunc.mqh, DrawRectangleSR.mqh, PositionsManagment.mqh.
    (These files show parallel.extract.file.start with no matching parallel.extract.file.done, memory flat, no new log lines.)
  • Conclusion: the deadlock is independent of the language mapping (cpp vs c), of memory, and of the file cleanup. It looks like a worker-pool / parallel-extraction deadlock in the tool itself.
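
Since duplicate class definitions in backup/old copies made such a difference to extraction time, here is a minimal sketch of how such duplicates could be located before indexing. It is a regex heuristic, not a parser: find_duplicate_classes is a hypothetical helper, it only sees class/struct definitions that start at the beginning of a line, and it assumes UTF-8 sources.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches "class Name {" or "class Name : public Base {" (brace may be on the
# next line). Forward declarations such as "class Name;" are not matched.
CLASS_RE = re.compile(
    r'^\s*(?:class|struct)\s+([A-Za-z_]\w*)\s*(?::[^;{]*)?\{',
    re.MULTILINE,
)


def find_duplicate_classes(root, exts=(".mq5", ".mqh")):
    """Return {class_name: [files]} for names defined in more than one file."""
    seen = defaultdict(set)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in exts:
            text = path.read_text(encoding="utf-8", errors="ignore")
            for name in CLASS_RE.findall(text):
                seen[name].add(path)
    return {name: sorted(paths) for name, paths in seen.items() if len(paths) > 1}
```

Any name reported with paths in a backup or old directory is a candidate for removal or for moving out of the indexed tree.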

Things I tried that did NOT help

  • GOMAXPROCS=2 → ignored, still workers=24.
  • Limiting the process CPU affinity to 4 cores → ignored, still workers=24.
  • mode: fast, delete_project + fresh reindex, killing all other codebase-memory processes first.
  • There is no config option for worker count / concurrency, and no exclude/ignore option (config list only exposes auto_index and auto_index_limit); .gitignore is not honored for source dirs.

Also worth noting

  • auto_index = true re-triggered the RAM blowup automatically every time the MCP server reconnected. Setting auto_index = false was needed to stop it.

Feature requests / suggested fixes

  1. A --workers / concurrency cap (config or flag) so peak RAM is bounded on large files.
  2. Actually enforce mem_budget during parallel extraction.
  3. An exclude/ignore option (or honor .gitignore) to skip specific files/dirs.
  4. Investigate the deadlock in parallel extraction — it reproduces on the 4 files above regardless of cpp/c mapping.
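
To illustrate requests 1 and 2 together, this is a minimal sketch of a capped worker pool with an enforced memory budget. It is not how codebase-memory-mcp is implemented: MemoryBudget, run_bounded, and the per-job cost estimate (for example, something proportional to file size) are all assumptions made for the example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class MemoryBudget:
    """Admission control: a job reserves its estimated cost before it starts."""

    def __init__(self, total_mb):
        self.total_mb = total_mb
        self.used_mb = 0
        self.peak_mb = 0
        self._cond = threading.Condition()

    def acquire(self, cost_mb):
        with self._cond:
            # Wait while other jobs hold budget and this one would not fit.
            # A job larger than the whole budget is admitted once nothing
            # else is running, so it can never wait forever.
            while self.used_mb > 0 and self.used_mb + cost_mb > self.total_mb:
                self._cond.wait()
            self.used_mb += cost_mb
            self.peak_mb = max(self.peak_mb, self.used_mb)

    def release(self, cost_mb):
        with self._cond:
            self.used_mb -= cost_mb
            self._cond.notify_all()


def run_bounded(jobs, extract, workers, budget):
    """Run extract(job) for each (job, cost_mb) pair with capped concurrency."""

    def guarded(job, cost_mb):
        budget.acquire(cost_mb)
        try:
            return extract(job)
        finally:
            # Always give the reservation back, even if extraction fails.
            budget.release(cost_mb)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(guarded, job, cost) for job, cost in jobs]
        return [f.result() for f in futures]
```

The design point is that a waiting job only blocks while some other job is actually running and will release its reservation, so the budget itself cannot introduce a new deadlock.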

Reproduction

  1. Code being indexed: a private MQL5 (MetaTrader 5) trading codebase — can't share publicly.
    Structure that triggers it: ~112 .mq5/.mqh files mapped to cpp, with heavy cross-#includes
    and one central ~5,200-line header (ManagmentClass.mqh) that many files pull in.
    Minimal pattern that seems to matter:

// ManagmentClass.mqh — ~5,200 lines, many classes
// Gold_SSI_TSI_SIGNAL.mqh (392 lines): #include "..\trading\ManagmentClass.mqh"
// DrawOrderFunc.mqh: includes 6 other module headers (Defines, sync, DB, history, ...)

.codebase-memory.json: {"extra_extensions": {".mq5": "cpp", ".mqh": "cpp"}}

  2. Command:
    codebase-memory-mcp cli index_repository '{"repo_path":"C:/.../MQL5/Experts/BM","mode":"fast"}'

  3. Result vs Expected:

  • RESULT: extraction runs to ~22 of 112 files, then freezes forever — no more
    parallel.extract.file.done, never reaches a graph/persist phase, RAM flat (~22 GB with the
    c mapping) or climbing to 42–62 GB (with the cpp mapping), no new log lines for 15–40 min.
    The same 4 files are stuck (started, never finished): Defines_generalFunc.mqh,
    DrawOrderFunc.mqh, DrawRectangleSR.mqh, PositionsManagment.mqh.
  • EXPECTED: indexing completes and writes the graph — as it does in seconds for my 3 other repos
    (NestJS backend, two Next.js frontends).

Note: I can privately share those 4 .mqh files if that helps you reproduce — a public minimal
repro isn't available since the code is proprietary.
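
The stuck files were identified by pairing parallel.extract.file.start events with parallel.extract.file.done events in the log. A rough sketch of that check follows. The two event names come from the log, but the line layout is an assumption: the script expects each event line to carry the file name in a key=value field, and the field name (path by default) is a guess that may need changing to match the real output.

```python
import re


def find_stuck_files(log_lines, key="path"):
    """Return files with a file.start event and no later file.done event."""
    # Accepts key=value and key="quoted value"; the key name is an assumption.
    field = re.compile(r'\b' + re.escape(key) + r'=(?:"([^"]*)"|(\S+))')
    open_files = {}
    for lineno, line in enumerate(log_lines, 1):
        match = field.search(line)
        if not match:
            continue
        name = match.group(1) if match.group(1) is not None else match.group(2)
        if "parallel.extract.file.start" in line:
            open_files[name] = lineno
        elif "parallel.extract.file.done" in line:
            open_files.pop(name, None)
    # Oldest unfinished file first.
    return sorted(open_files, key=open_files.get)
```

Passing an open log file (or any iterable of lines) returns the files that started extraction and never reported completion, in the order they started.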

Logs


Diagnostics trajectory (memory / performance / leak issues)


Project scale (if relevant)

~112 files, ~40,000 lines (MQL5). Indexing never completes (deadlocks ~22/112), so no final node/edge count.

Confirmations

  • I searched existing issues and this is not a duplicate.
  • My reproduction uses shareable code (a dummy snippet or a public OSS repository), not proprietary code.

Labels

bug (Something isn't working)