Known Issue Archive

Harrier keeps durable issue knowledge in Git so recurring EMR/Spark failures can move from ad hoc investigation to reviewed classifier coverage.

The archive is intentionally split into two layers:

Human-readable issue records in knowledgebase/issues/KB-####-slug.md.
A machine-readable index in knowledgebase/issues/index.json.

This split keeps fixes reviewable in pull requests while giving future MCP tools a stable retrieval surface.
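To make the two layers concrete, here is a minimal sketch of how a record file name and its index entry might relate. The field names (`id`, `title`, `path`, `status`, `tags`, `signals`), the four-digit reading of `####`, and the helpers `slugify` and `make_entry` are assumptions for illustration. The real schema is whatever scripts/archive_known_issue.py writes.

```python
import json
import re
from dataclasses import asdict, dataclass, field

# Matches the documented file-name convention KB-####-slug.md.
# Assumption: "####" means exactly four digits and the slug is
# lowercase alphanumeric words joined by hyphens.
KB_FILENAME = re.compile(r"^KB-(\d{4})-([a-z0-9]+(?:-[a-z0-9]+)*)\.md$")


@dataclass
class IndexEntry:
    """Hypothetical shape of one index.json entry (field names are assumed)."""
    id: str
    title: str
    path: str
    status: str = "draft"
    tags: list[str] = field(default_factory=list)
    signals: list[str] = field(default_factory=list)


def slugify(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def make_entry(number: int, title: str, tags=(), signals=()) -> IndexEntry:
    """Build an entry whose path follows the KB-####-slug.md convention."""
    if not 0 <= number <= 9999:
        raise ValueError("KB number must fit in four digits")
    kb_id = f"KB-{number:04d}"
    path = f"knowledgebase/issues/{kb_id}-{slugify(title)}.md"
    return IndexEntry(kb_id, title, path, tags=list(tags), signals=list(signals))


if __name__ == "__main__":
    entry = make_entry(
        1,
        "Executor OOM after skewed join",
        tags=["spark", "memory"],
        signals=["Container killed by YARN for exceeding memory limits"],
    )
    # Print the entry as it might appear inside index.json.
    print(json.dumps(asdict(entry), indent=2))
```

Keeping the ID in both the file name and the index entry means a reviewer can match the two layers by eye in a pull request.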

When To Archive

Create a known issue record when:

Harrier returns UNKNOWN but the report contains useful evidence.
A production or demo issue recurs with the same log, metric, DB, IAM, or deployment signal.
A new classifier rule is added and needs a durable explanation, validation history, and rollback guidance.

Do not archive secrets, full log files, customer data, tokens, passwords, or unredacted SQL values. Store short signatures and links to redacted reports instead.
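One way to keep a signature short and free of sensitive values is to pass it through a small redaction step before archiving. The sketch below is illustrative only: the patterns, the 200-character cap, and the function name `redact_signature` are assumptions, not part of Harrier. A real redaction pass would need patterns reviewed against your own logs.

```python
import re

MAX_SIGNATURE_LENGTH = 200  # assumed cap; pick whatever suits the archive

# (pattern, replacement) pairs applied in order. All are illustrative.
_REDACTIONS = [
    # key=value or key: value pairs whose key looks secret-like.
    (
        re.compile(
            r"\b(password|passwd|token|secret|api[_-]?key)\b(\s*[=:]\s*)\S+",
            re.IGNORECASE,
        ),
        r"\1\2<REDACTED>",
    ),
    # AWS access key IDs: AKIA or ASIA followed by 16 uppercase alphanumerics.
    (re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"), "<REDACTED_AWS_KEY>"),
    # Single-quoted SQL string literals ('' is an escaped quote).
    (re.compile(r"'(?:[^']|'')*'"), "'<REDACTED>'"),
]


def redact_signature(line: str) -> str:
    """Replace secret-like values, then trim and cap the length."""
    for pattern, replacement in _REDACTIONS:
        line = pattern.sub(replacement, line)
    line = line.strip()
    if len(line) > MAX_SIGNATURE_LENGTH:
        line = line[:MAX_SIGNATURE_LENGTH].rstrip() + "..."
    return line
```

The quoted-literal pattern deliberately over-redacts: anything between two single quotes is replaced, which can remove harmless text but avoids leaking SQL values.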

Workflow

Run Harrier and save the validation or investigation report.
Create a draft KB record:

scripts/archive_known_issue.py create \
  --title "Executor OOM after skewed join" \
  --source-report ../harrier-emr-demo-lab/.harrier-demo/validation/executor_oom-20260529T004326Z.json \
  --signal "Container killed by YARN for exceeding memory limits" \
  --tag spark \
  --tag memory

Fill in the markdown sections that require human judgment.
If the issue has a stable signal, promote it into the deterministic rule system:
  Add or reuse a FindingCategory.
  Add PatternRule / ClassificationRule.
  Add a recommendation factory.
  Add MCP unit tests.
  Add a demo scenario and expected findings in harrier-emr-demo-lab.
Run the full validation suite.
Mark the KB status as promoted once the classifier and demo coverage are merged.
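The idea behind promoting a stable signal can be sketched as a tiny deterministic matcher. This is not Harrier's PatternRule or ClassificationRule API, which is not shown in this document. `SketchRule`, `classify`, the `EXECUTOR_OOM` label, and the `KB-0001` ID are stand-ins chosen for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRule:
    """Stand-in for a deterministic rule; NOT Harrier's PatternRule API."""
    kb_id: str            # illustrative link back to the KB record
    category: str         # hypothetical category label
    pattern: re.Pattern   # compiled log signature


RULES = [
    SketchRule(
        kb_id="KB-0001",  # illustrative ID only
        category="EXECUTOR_OOM",  # hypothetical label
        pattern=re.compile(
            r"Container killed by YARN for exceeding memory limits"
        ),
    ),
]


def classify(log_line: str) -> str:
    """Return the first matching category, or UNKNOWN when no rule matches."""
    for rule in RULES:
        if rule.pattern.search(log_line):
            return rule.category
    return "UNKNOWN"
```

Because the rule is a plain pattern match, the same log line always yields the same category, which is what makes the behavior easy to cover with unit tests and demo scenarios.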

Check The Archive

scripts/archive_known_issue.py check

The check verifies that index entries are well-formed, IDs are unique, and indexed markdown files exist.
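The three properties named above can be pictured with the following sketch. It assumes index.json is a JSON array of objects with string fields `id`, `title`, and `path`. Those field names and the helpers `check_entries` and `check_index` are hypothetical, and the real script may be organized differently.

```python
import json
from pathlib import Path

REQUIRED_FIELDS = ("id", "title", "path")  # assumed field names


def check_entries(entries, file_exists) -> list[str]:
    """Return a list of problems; an empty list means the index passes.

    file_exists is a callable taking a relative path and returning a bool,
    which keeps the logic testable without touching the filesystem.
    """
    if not isinstance(entries, list):
        return ["index must be a JSON array of entries"]
    problems = []
    seen_ids = set()
    for position, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {position}: not an object")
            continue
        # Well-formed: every required field is a non-empty string.
        missing = [
            name for name in REQUIRED_FIELDS
            if not isinstance(entry.get(name), str) or not entry[name]
        ]
        if missing:
            problems.append(
                f"entry {position}: missing or empty {', '.join(missing)}"
            )
            continue
        # Unique IDs.
        if entry["id"] in seen_ids:
            problems.append(f"entry {position}: duplicate id {entry['id']}")
        seen_ids.add(entry["id"])
        # Indexed markdown file exists.
        if not file_exists(entry["path"]):
            problems.append(
                f"entry {position}: file not found: {entry['path']}"
            )
    return problems


def check_index(index_path: Path, repo_root: Path) -> list[str]:
    """Load the index from disk and check it against files under repo_root."""
    entries = json.loads(index_path.read_text(encoding="utf-8"))
    return check_entries(entries, lambda rel: (repo_root / rel).is_file())
```

Returning a list of problems rather than stopping at the first one lets a single run report every broken entry.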

Future MCP Retrieval

The index is designed so a future MCP tool can retrieve records by:

finding category
log signature
metric signal
deploy mode
source scenario
recommendation type
KB id

Until then, the archive is still useful as a reviewable, searchable runbook history.
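As a rough picture of what retrieval over the index could look like, the sketch below filters already-loaded entries by field. The function `find_records` and field names such as `finding_category`, `deploy_mode`, and `log_signatures` are assumptions. No MCP tool or index schema is defined in this document.

```python
def find_records(entries, **criteria):
    """Return the entries that satisfy every criterion.

    A scalar field must equal the wanted value; a list field must
    contain it. Entries that lack a requested field never match.
    """
    matches = []
    for entry in entries:
        for key, wanted in criteria.items():
            actual = entry.get(key)
            if isinstance(actual, list):
                if wanted not in actual:
                    break
            elif actual != wanted:
                break
        else:
            # The inner loop finished without a break: all criteria held.
            matches.append(entry)
    return matches


if __name__ == "__main__":
    # Hypothetical entries; real ones would be loaded from index.json.
    sample = [
        {
            "id": "KB-0001",
            "finding_category": "EXECUTOR_OOM",
            "deploy_mode": "cluster",
            "log_signatures": [
                "Container killed by YARN for exceeding memory limits"
            ],
        },
    ]
    for record in find_records(sample, deploy_mode="cluster"):
        print(record["id"])
```

Exact matching keeps the lookup deterministic. Fuzzy or substring matching on log signatures would be a separate design decision.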