
# runtime-profile.json

`runtime-profile.json` captures runtime evidence from the legacy system. While static analysis reveals structure, runtime profiling reveals behavior: which code paths are actually used, which database queries are slow, and which functions are never called in production.

```json
{
  "collection": {
    "collector": "opentelemetry",
    "environment": "production",
    "period": {
      "start": "2026-01-01",
      "end": "2026-01-31"
    },
    "evidenceTier": "production-apm"
  },
  "hotPaths": [
    {
      "endpoint": "/api/invoices",
      "method": "GET",
      "callCount": 48200,
      "latency": {
        "p50": 45,
        "p95": 320,
        "p99": 890
      },
      "context": "Invoicing"
    },
    {
      "endpoint": "/api/accounts/balance",
      "method": "GET",
      "callCount": 31500,
      "latency": {
        "p50": 12,
        "p95": 85,
        "p99": 210
      },
      "context": "Accounts"
    },
    {
      "endpoint": "/api/tax/calculate",
      "method": "POST",
      "callCount": 15800,
      "latency": {
        "p50": 120,
        "p95": 780,
        "p99": 2100
      },
      "context": "Tax"
    }
  ],
  "databasePatterns": {
    "totalQueries": 2450000,
    "nPlusOneDetected": [
      {
        "endpoint": "/api/invoices",
        "pattern": "SELECT invoice; then N x SELECT line_items WHERE invoice_id = ?",
        "occurrences": 48200,
        "context": "Invoicing"
      }
    ],
    "slowQueries": [
      {
        "query": "SELECT * FROM gl_entries WHERE posting_date BETWEEN ? AND ? ORDER BY posting_date",
        "avgDuration": 1200,
        "callCount": 3200,
        "context": "Accounts"
      }
    ],
    "queriesPerContext": {
      "Invoicing": 890000,
      "Accounts": 720000,
      "Tax": 340000,
      "Stock": 280000,
      "HR": 120000,
      "Assets": 100000
    }
  },
  "deadCode": {
    "unreachedEndpoints": [
      {
        "endpoint": "/api/legacy/import-csv",
        "method": "POST",
        "lastCalled": null,
        "context": "Invoicing"
      },
      {
        "endpoint": "/api/reports/quarterly-old",
        "method": "GET",
        "lastCalled": "2024-03-15",
        "context": "Accounts"
      }
    ],
    "unreachedFunctions": [
      {
        "function": "recalculate_all_balances",
        "file": "accounts/utils.py",
        "lastCalled": null
      },
      {
        "function": "export_to_tally",
        "file": "integrations/tally.py",
        "lastCalled": "2023-11-01"
      }
    ],
    "totalDeadLoc": 4200
  },
  "resourceUsage": {
    "memory": {
      "avgMb": 512,
      "peakMb": 1840,
      "peakEndpoint": "/api/reports/annual-summary"
    },
    "cpu": {
      "avgPercent": 35,
      "peakPercent": 92,
      "peakEndpoint": "/api/tax/year-end-reconciliation"
    }
  }
}
```

## collection

Metadata about how and when the runtime data was collected.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `collector` | string | Yes | Tool used: `"opentelemetry"`, `"datadog"`, `"newrelic"`, `"custom"`, `"log-analysis"` |
| `environment` | string | Yes | Where data was collected: `"production"`, `"staging"`, `"development"` |
| `period` | object | Yes | Collection time window |
| `period.start` | string | Yes | ISO 8601 date. Start of collection period. |
| `period.end` | string | Yes | ISO 8601 date. End of collection period. |
| `evidenceTier` | string | Yes | Confidence level of the data. See Evidence Tiers. |

## hotPaths

Endpoints ranked by usage. Each entry represents a frequently called code path.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `endpoint` | string | Yes | URL path of the endpoint |
| `method` | string | Yes | HTTP method: `"GET"`, `"POST"`, `"PUT"`, `"DELETE"`, `"PATCH"` |
| `callCount` | number | Yes | Total invocations during the collection period |
| `latency` | object | Yes | Response time percentiles in milliseconds |
| `latency.p50` | number | Yes | Median response time (ms) |
| `latency.p95` | number | Yes | 95th percentile response time (ms) |
| `latency.p99` | number | Yes | 99th percentile response time (ms) |
| `context` | string | No | Bounded context this endpoint belongs to (matches domains.json) |
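The combination of call volume and tail latency is what makes a hot path worth prioritizing. A minimal sketch of one possible ranking heuristic follows; the weighting formula is an assumption for illustration, since the spec does not define a canonical score:

```python
import json

def hot_path_score(path: dict) -> float:
    """Rank a hot path by traffic volume weighted by tail latency.

    The formula is illustrative; consumers choose their own weighting.
    """
    return path["callCount"] * (path["latency"]["p95"] / 1000.0)

# Assumes the file sits in the working directory.
with open("runtime-profile.json") as f:
    profile = json.load(f)

ranked = sorted(profile["hotPaths"], key=hot_path_score, reverse=True)
for p in ranked:
    print(f'{p["method"]} {p["endpoint"]}: score={hot_path_score(p):,.0f}')
```

With the example data above, this puts `/api/invoices` first: its moderate p95 is outweighed by its 48,200 calls.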

## databasePatterns

Database query analysis from the collection period.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `totalQueries` | number | Yes | Total database queries during the collection period |
| `nPlusOneDetected` | array | No | Detected N+1 query patterns |
| `slowQueries` | array | No | Queries exceeding a threshold (default: 500 ms) |
| `queriesPerContext` | object | No | Query count grouped by bounded context |

Each entry in `nPlusOneDetected`:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `endpoint` | string | Yes | Endpoint triggering the N+1 pattern |
| `pattern` | string | Yes | Human-readable description of the query pattern |
| `occurrences` | number | Yes | How many times this pattern was triggered |
| `context` | string | No | Bounded context |

Each entry in `slowQueries`:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | string | Yes | SQL query (parameterized, no actual values) |
| `avgDuration` | number | Yes | Average execution time in milliseconds |
| `callCount` | number | Yes | Total executions during the collection period |
| `context` | string | No | Bounded context |
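Profilers typically surface N+1 patterns by normalizing query text and counting repeats within a single trace. A rough sketch of that idea, assuming a flat list of `(trace_id, normalized_sql)` pairs rather than any particular APM export format:

```python
from collections import Counter

def detect_n_plus_one(queries: list[tuple[str, str]], threshold: int = 10) -> dict:
    """Flag parameterized queries repeated many times within one trace.

    The input shape is an assumption for illustration, not a defined interface.
    Returns {sql: number_of_traces_exhibiting_the_pattern}.
    """
    per_trace: dict[str, Counter] = {}
    for trace_id, sql in queries:
        per_trace.setdefault(trace_id, Counter())[sql] += 1

    suspects: Counter = Counter()
    for counts in per_trace.values():
        for sql, n in counts.items():
            if n >= threshold:  # same query fired N times in one request
                suspects[sql] += 1
    return dict(suspects)
```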

## deadCode

Code paths with zero or negligible usage in the collection period.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `unreachedEndpoints` | array | No | Endpoints with no calls during the collection period |
| `unreachedFunctions` | array | No | Functions with no invocations |
| `totalDeadLoc` | number | No | Estimated lines of code in dead paths |

Each entry in `unreachedEndpoints`:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `endpoint` | string | Yes | URL path |
| `method` | string | Yes | HTTP method |
| `lastCalled` | string \| null | No | ISO 8601 date of last known call, or null if never called |
| `context` | string | No | Bounded context |

Each entry in `unreachedFunctions`:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `function` | string | Yes | Function name |
| `file` | string | Yes | File path relative to project root |
| `lastCalled` | string \| null | No | ISO 8601 date of last known call, or null if never called |
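A consumer deciding what to skip might treat "never called" and "not called in over a year" differently. A small sketch of that filtering; the one-year cutoff is an assumption, not part of the spec:

```python
from datetime import date, timedelta

STALE_AFTER = timedelta(days=365)  # illustrative cutoff, not defined by the spec

def skippable(entry: dict, today: date) -> bool:
    """True if an unreached endpoint or function looks safe to deprioritize."""
    if entry["lastCalled"] is None:
        return True  # never observed in any collection period
    last = date.fromisoformat(entry["lastCalled"])
    return today - last > STALE_AFTER
```

With the example data above, `/api/legacy/import-csv` qualifies immediately (`lastCalled` is null), while `/api/reports/quarterly-old` qualifies only once its last call is more than a year in the past.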

## resourceUsage

System resource consumption during the collection period.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `memory` | object | No | Memory usage metrics |
| `memory.avgMb` | number | Yes | Average memory usage in MB |
| `memory.peakMb` | number | Yes | Peak memory usage in MB |
| `memory.peakEndpoint` | string | No | Endpoint that triggered peak memory |
| `cpu` | object | No | CPU usage metrics |
| `cpu.avgPercent` | number | Yes | Average CPU utilization (0-100) |
| `cpu.peakPercent` | number | Yes | Peak CPU utilization (0-100) |
| `cpu.peakEndpoint` | string | No | Endpoint that triggered peak CPU |
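One way consumers use this data is to flag endpoints whose peaks are far above the system average. A minimal sketch; the 3x ratio threshold is an assumption:

```python
def flag_resource_heavy(usage: dict, ratio: float = 3.0) -> list[str]:
    """Return endpoints whose peak usage dwarfs the system average."""
    flagged = []
    mem, cpu = usage.get("memory"), usage.get("cpu")
    if mem and mem["peakMb"] > ratio * mem["avgMb"]:
        flagged.append(mem.get("peakEndpoint", "<unknown>"))
    if cpu and cpu["peakPercent"] > ratio * cpu["avgPercent"]:
        flagged.append(cpu.get("peakEndpoint", "<unknown>"))
    return flagged
```

With the example above, the memory peak (1840 MB against a 512 MB average) trips the 3x ratio while the CPU peak (92% against 35%) does not.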

## Evidence Tiers

Not all runtime data carries the same confidence. Data from production APM is more reliable than estimates from static analysis. The `evidenceTier` field communicates this.

| Tier | Confidence | Source | Use Case |
| --- | --- | --- | --- |
| `production-apm` | Highest | APM running in production for 30+ days | Gold standard. Reflects actual user behavior. |
| `staging-load-test` | High | Load testing in staging environment | Simulated traffic. Good for latency and resource usage. |
| `dev-profiling` | Medium | Developer profiling sessions | Limited scenarios. Useful for identifying slow queries. |
| `static-analysis` | Lowest | Inferred from code structure (no runtime data) | Fallback when no runtime access. Call counts are estimates. |

When consuming runtime data, agents and tools should weight decisions based on the tier. A `production-apm` hot path ranking carries more authority than a `static-analysis` estimate.
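For example, a consumer might discount scores by tier before comparing signals. The numeric weights below are an assumption; the spec only orders the tiers:

```python
# Illustrative tier weights; the spec does not assign numeric confidence values.
TIER_WEIGHTS = {
    "production-apm": 1.0,
    "staging-load-test": 0.8,
    "dev-profiling": 0.5,
    "static-analysis": 0.2,
}

def weighted_score(raw_score: float, evidence_tier: str) -> float:
    """Scale a metric by the confidence of the data that produced it."""
    return raw_score * TIER_WEIGHTS.get(evidence_tier, 0.2)
```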

## Cross-References

Runtime profile data influences other spec files:

| Source Data | Feeds Into | How |
| --- | --- | --- |
| Hot paths | `complexity.json` | Hotspot rankings incorporate actual usage, not just LOC or cyclomatic complexity |
| Dead code | `extraction-plan.json` | Dead endpoints deprioritized or skipped entirely |
| N+1 queries | `complexity.json` | Database anti-patterns increase complexity scores |
| Call counts | `parity-tests.json` | High-traffic paths get higher parity test coverage |
| Resource usage | `extraction-plan.json` | Resource-heavy endpoints flagged for performance improvement during extraction |

## Collecting Runtime Data

The recommended approach is to add the OpenTelemetry SDK to the legacy system and export traces to a collector.

```sh
# Example: Python legacy app with OTEL auto-instrumentation
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-instrument --traces_exporter otlp python app.py
```
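Auto-instrumentation only covers frameworks it recognizes; for hand-rolled code paths you can add spans manually. A minimal sketch using the standard OpenTelemetry Python API, where the span name, attribute, and function are placeholders:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

def calculate_tax(invoice_id: int) -> float:
    # Wrap the legacy code path in a span so it shows up in the trace data.
    with tracer.start_as_current_span("tax.calculate") as span:
        span.set_attribute("invoice.id", invoice_id)
        ...  # existing legacy logic unchanged
```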

Export traces as JSON files to the Modernizer's `profiling/traces/` directory, then process:

```sh
npx modernizespec profile --traces-dir ./profiling/traces/ --period 30d
```

## Agent Workflow

When an AI agent discovers `runtime-profile.json`, it should:

1. Read the `evidenceTier` to understand data confidence
2. Use `hotPaths` to prioritize which bounded contexts to work on first
3. Check `deadCode` before extracting; dead endpoints can be skipped
4. Review `databasePatterns` for N+1 queries that should be fixed during extraction
5. Cross-reference with `parity-tests.json` to ensure hot paths have adequate test coverage
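A minimal sketch of steps 1-4, assuming the file sits at the project root; the variable names and the final summary line are illustrative, not a defined API:

```python
import json

with open("runtime-profile.json") as f:
    profile = json.load(f)

tier = profile["collection"]["evidenceTier"]            # step 1: data confidence
hot = sorted(profile["hotPaths"],                       # step 2: busiest paths first
             key=lambda p: p["callCount"], reverse=True)
dead = {e["endpoint"] for e in
        profile["deadCode"]["unreachedEndpoints"]}      # step 3: skippable endpoints
n_plus_one = profile["databasePatterns"].get(
    "nPlusOneDetected", [])                             # step 4: anti-patterns to fix

work_queue = [p for p in hot if p["endpoint"] not in dead]
print(f"tier={tier}, {len(work_queue)} paths to extract, "
      f"{len(n_plus_one)} N+1 patterns to fix")
```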