Strangler Fig Pattern

The strangler fig pattern, first described by Martin Fowler in 2004, is a widely used migration strategy for legacy modernization. It is named after strangler fig trees, which grow around a host tree and eventually replace it while the host keeps functioning. In the same way, the pattern lets you replace system components incrementally while the old and new systems run in parallel.

ModernizeSpec’s extraction-plan.json and migration-state.json are designed around this pattern: they track what has been extracted, what is in progress, and what remains.

The Pattern

Core Mechanism

┌─────────────────────────────────────────────────────┐
│                      Router                          │
│   (feature flag, URL path, API version, etc.)        │
└───────────────┬─────────────────┬───────────────────┘
                │                 │
        ┌───────▼───────┐ ┌──────▼───────┐
        │  New System   │ │ Legacy System │
        │  (extracted)  │ │ (remaining)   │
        │               │ │               │
        │  Module A ✓   │ │  Module A ✗   │
        │  Module B ✓   │ │  Module B ✗   │
        │               │ │  Module C     │
        │               │ │  Module D     │
        └───────────────┘ └──────────────┘

Traffic for extracted modules routes to the new system. Everything else continues through the legacy system. Over time, more modules move to the new side until the legacy system handles nothing.
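The routing mechanism above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation: the `StranglerRouter` class, the handler functions, and the module names are hypothetical, and a real router would usually key on URL path, API version, or a feature flag rather than a module string.

```python
from typing import Callable, Set

Handler = Callable[[dict], str]


class StranglerRouter:
    """Sends a request to the new system only if its module has been extracted."""

    def __init__(self, legacy: Handler, new: Handler) -> None:
        self._legacy = legacy
        self._new = new
        self._extracted: Set[str] = set()

    def mark_extracted(self, module: str) -> None:
        self._extracted.add(module)

    def roll_back(self, module: str) -> None:
        # Reversibility: route the module's traffic back to legacy.
        self._extracted.discard(module)

    def handle(self, module: str, request: dict) -> str:
        target = self._new if module in self._extracted else self._legacy
        return target(request)


def legacy_handler(request: dict) -> str:
    return "legacy"


def new_handler(request: dict) -> str:
    return "new"


router = StranglerRouter(legacy_handler, new_handler)
router.mark_extracted("module_a")
```

With this setup, `router.handle("module_a", {})` is served by the new system, while any module that has not been marked as extracted still goes to legacy.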

Why It Works

Benefit	Explanation
Continuous delivery	Each extraction is independently deployable
Reversible	If the new module fails, route traffic back to legacy
Incremental confidence	Each module builds on proven infrastructure
Business continuity	The system never goes down for migration
Team parallelism	Multiple teams can extract different modules simultaneously

State Tracking

Every component in the legacy system progresses through a defined lifecycle. ModernizeSpec’s migration-state.json tracks this progression:

State	Description	Entry Criteria	Exit Criteria
Not Started	Legacy code untouched	Default state	Analysis begins
In Analysis	Understanding behavior, mapping dependencies	Team assigned	Parity tests written
Extracting	Building the new implementation	Parity tests exist	New code compiles and passes unit tests
Testing	Running parity tests against new implementation	New code exists	Parity confidence > threshold
Shadowing	New system processes real traffic; results are compared with legacy and never returned to users	Parity tests pass	Shadow results match legacy for N days
Live	New system handles production traffic	Shadow validation passes	Monitoring confirms stability
Legacy Removed	Old code deleted, migration complete	New system stable for N weeks	Old code removed from codebase

State Transitions

Not Started ──▶ In Analysis ──▶ Extracting ──▶ Testing
                                                  │
                                                  ▼
Legacy Removed ◀── Live ◀── Shadowing ◀──────────┘

Every transition should be recorded with a timestamp, the actor (human or agent), and any notes. This creates an audit trail of the migration.
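As a rough sketch of how such an audit trail could be kept, the following models the forward-only lifecycle from the diagram and appends one record per transition. The `Component` and `Transition` classes and their field names are illustrative assumptions, not the migration-state.json schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Lifecycle order taken from the state table; a component may only advance one step.
STATES = [
    "Not Started", "In Analysis", "Extracting", "Testing",
    "Shadowing", "Live", "Legacy Removed",
]


@dataclass
class Transition:
    from_state: str
    to_state: str
    actor: str        # human or agent
    timestamp: str    # ISO 8601, UTC
    notes: str = ""


@dataclass
class Component:
    name: str
    state: str = "Not Started"
    history: List[Transition] = field(default_factory=list)

    def advance(self, actor: str, notes: str = "") -> None:
        index = STATES.index(self.state)
        if index == len(STATES) - 1:
            raise ValueError(f"{self.name} is already in the final state")
        next_state = STATES[index + 1]
        self.history.append(Transition(
            from_state=self.state,
            to_state=next_state,
            actor=actor,
            timestamp=datetime.now(timezone.utc).isoformat(),
            notes=notes,
        ))
        self.state = next_state
```

Calling `Component("Tax Calculator").advance("alice", "team assigned")` moves the component to In Analysis and leaves one history entry behind. Rolling traffic back to legacy is handled at the router or flag level in this sketch, not as a backward state transition.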

Anti-Corruption Layer

The anti-corruption layer (ACL) is the boundary between old and new systems. It translates data formats, protocols, and domain concepts so that legacy abstractions do not leak into the new system.

The ACL exists for one reason: the new system should be designed as if the legacy system does not exist. Domain models, naming conventions, and data structures in the new system should reflect the ideal design, not the legacy layout.

The ACL handles the translation between ideal and legacy at the boundary.

Legacy Concept	New Concept	ACL Responsibility
`doctype_name` (string reference)	Typed entity ID	Parse string, validate, convert to typed ID
Framework hook chain	Domain event	Map `on_submit` → `InvoiceSubmitted` event
Implicit multi-tenancy	Explicit tenant context	Extract company from session, pass as parameter
String-based permissions	Type-safe RBAC	Convert `has_permission("Sales Invoice", "write")` to typed check
Flat dictionary response	Typed DTO	Parse dict, validate fields, construct typed response

Place the ACL at the integration boundary — typically in adapter classes or gateway services:

┌──────────────────┐     ┌─────────────────┐     ┌──────────────┐
│  New System       │────▶│  ACL Adapter     │────▶│ Legacy API   │
│  (typed, clean)   │     │  (translates)    │     │ (stringly)   │
└──────────────────┘     └─────────────────┘     └──────────────┘

The new system never imports legacy code directly. All interaction goes through the ACL.
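A thin adapter of this kind might look like the sketch below, which covers two rows of the table: the string reference becomes a typed ID, and the flat dictionary becomes a typed DTO. The `Invoice` and `InvoiceId` types, the `fetch_legacy` callable, and the legacy field names (`name`, `company`, `grand_total`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class InvoiceId:
    value: str


@dataclass(frozen=True)
class Invoice:
    id: InvoiceId
    tenant: str
    total: Decimal


class LegacyInvoiceAdapter:
    """ACL adapter: the only place that knows the legacy dictionary layout."""

    # Hypothetical legacy keys; adjust to the real legacy response.
    REQUIRED_FIELDS = ("name", "company", "grand_total")

    def __init__(self, fetch_legacy: Callable[[str], Dict[str, Any]]) -> None:
        self._fetch_legacy = fetch_legacy

    def get_invoice(self, invoice_id: InvoiceId) -> Invoice:
        raw = self._fetch_legacy(invoice_id.value)
        missing = [key for key in self.REQUIRED_FIELDS if key not in raw]
        if missing:
            raise ValueError(f"legacy response is missing fields: {missing}")
        return Invoice(
            id=InvoiceId(str(raw["name"])),
            tenant=str(raw["company"]),
            # Go through str() so float values do not carry binary noise into Decimal.
            total=Decimal(str(raw["grand_total"])),
        )
```

Because the adapter takes the legacy call as a plain function, deleting it later only requires replacing that one dependency, which keeps the layer easy to remove.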

ACL Lifecycle

The ACL is temporary. As modules move from legacy to new:

Phase 1: ACL translates between new module and legacy system
Phase 2: ACL shrinks as more modules migrate (fewer translations needed)
Phase 3: ACL removed entirely when legacy system is decommissioned

Design the ACL for easy removal — thin adapter layers, not deep abstractions.

Dual-Write Infrastructure

During the transition period, both systems may need access to the same data. Three strategies handle this:

Write New, Sync to Legacy

New system is the primary writer. A sync mechanism propagates changes to the legacy database so legacy UI and reports continue to work.

Best when: New system launches first for a specific capability.

Write Legacy, Event to New

Legacy system continues as primary writer. Events (database triggers, CDC, application events) feed changes to the new system’s data store.

Best when: Legacy system cannot be modified to call new APIs.

Shadow Mode

Both systems process the same request. The legacy result is returned to the user; the new system’s result is logged for comparison and never shown to users.

Best when: Building confidence before cutover.
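Shadow mode can be sketched as a wrapper that always answers with the legacy result and records any disagreement. The function name `shadow_call` and the `mismatches` list are hypothetical; a production version would more likely run the shadow call asynchronously and write to a metrics or logging pipeline.

```python
import logging
from typing import Any, Callable, List, Tuple

logger = logging.getLogger("shadow")


def shadow_call(
    legacy: Callable[[Any], Any],
    new: Callable[[Any], Any],
    request: Any,
    mismatches: List[Tuple[Any, Any, Any]],
) -> Any:
    """Return the legacy result; run the new system only for comparison."""
    legacy_result = legacy(request)
    try:
        new_result = new(request)
    except Exception as exc:
        # A failing shadow must never affect the user-facing response.
        logger.warning("shadow call failed: %r", exc)
        mismatches.append((request, legacy_result, repr(exc)))
        return legacy_result
    if new_result != legacy_result:
        logger.warning("shadow mismatch for request %r", request)
        mismatches.append((request, legacy_result, new_result))
    return legacy_result
```

This version is synchronous, so the new system's processing time is added to every request. The sketch also assumes the new system has no user-visible side effects while shadowing.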

Dual-Write Risks

Risk	Mitigation
Data divergence	Reconciliation jobs that detect and alert on mismatches
Ordering problems	Sequence numbers or timestamps on all writes
Partial failures	Outbox pattern: persist the sync intent in the same transaction as the primary write, then deliver it asynchronously
Performance overhead	Shadow mode adds latency; budget for it or make it async
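The outbox mitigation can be illustrated with SQLite from the Python standard library. This is a sketch under assumed table and column names (`invoices`, `outbox`, `seq`, `synced`); the `send` callable stands in for whatever mechanism delivers changes to the other system.

```python
import json
import sqlite3
from typing import Callable


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE invoices (id TEXT PRIMARY KEY, total TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "  # doubles as a sequence number
        "payload TEXT NOT NULL, "
        "synced INTEGER NOT NULL DEFAULT 0)"
    )


def save_invoice(conn: sqlite3.Connection, invoice_id: str, total: str) -> None:
    # One transaction: the business row and the sync intent commit together or not at all.
    with conn:
        conn.execute("INSERT INTO invoices (id, total) VALUES (?, ?)", (invoice_id, total))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"id": invoice_id, "total": total}),),
        )


def drain_outbox(conn: sqlite3.Connection, send: Callable[[dict], None]) -> int:
    """Deliver pending entries in sequence order; stop at the first failure."""
    rows = conn.execute(
        "SELECT seq, payload FROM outbox WHERE synced = 0 ORDER BY seq"
    ).fetchall()
    delivered = 0
    for seq, payload in rows:
        try:
            send(json.loads(payload))
        except Exception:
            break  # leave this entry and all later ones for the next run
        with conn:
            conn.execute("UPDATE outbox SET synced = 1 WHERE seq = ?", (seq,))
        delivered += 1
    return delivered
```

Delivery here is at-least-once: if the process stops after `send` succeeds but before the row is marked as synced, the entry is sent again on the next run, so the receiving side should handle duplicates idempotently.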

Seam Identification

Michael Feathers introduced the concept of seams in Working Effectively with Legacy Code (Chapter 4) — points in a system where behavior can be redirected without modifying the code at that location. In the context of the strangler fig pattern, seams are the insertion points where you intercept requests and route them to either the legacy or new system.

Types of Seams

Seam Type	Where to Find It	Example
API endpoint	HTTP routes, RPC definitions	Route `/api/invoice` to new service
Message queue	Event bus, pub/sub topics	Subscribe new consumer to `invoice.created` topic
Database view	Views that abstract table access	Replace view to read from new tables
Configuration switch	Feature flags, environment variables	`USE_NEW_TAX_ENGINE=true`
Interface/protocol	Dependency injection points	Swap `LegacyTaxCalculator` for `NewTaxCalculator`
Preprocessing	Middleware, interceptors, filters	Insert translation layer before request reaches legacy handler
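Two rows of the table, the configuration switch and the interface seam, combine naturally. The sketch below reuses the `LegacyTaxCalculator`, `NewTaxCalculator`, and `USE_NEW_TAX_ENGINE` names from the table; the `TaxCalculator` protocol, the factory function, and the flat 18% rate are placeholders invented for the example.

```python
from decimal import Decimal
from typing import Mapping, Protocol


class TaxCalculator(Protocol):
    def tax_for(self, net_amount: Decimal) -> Decimal: ...


class LegacyTaxCalculator:
    def tax_for(self, net_amount: Decimal) -> Decimal:
        return net_amount * Decimal("0.18")  # placeholder rate


class NewTaxCalculator:
    def tax_for(self, net_amount: Decimal) -> Decimal:
        return (net_amount * Decimal("0.18")).quantize(Decimal("0.01"))


def build_tax_calculator(env: Mapping[str, str]) -> TaxCalculator:
    """The seam: callers depend on the protocol; this factory picks the implementation."""
    if env.get("USE_NEW_TAX_ENGINE", "false").lower() == "true":
        return NewTaxCalculator()
    return LegacyTaxCalculator()


def invoice_total(net_amount: Decimal, calculator: TaxCalculator) -> Decimal:
    # Calling code is unchanged regardless of which implementation is injected.
    return net_amount + calculator.tax_for(net_amount)
```

In an application, `build_tax_calculator(os.environ)` would be called once at startup. Passing the environment as a parameter keeps the seam itself testable.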

Finding Seams in Legacy Code

Map all entry points — HTTP routes, CLI commands, scheduled jobs, event handlers
Identify the narrowest point where a request crosses a module boundary
Verify the seam is clean — no side effects from the interception itself
Test the seam — route one request through and verify both paths work

Feature Flags

Feature flags control which implementation handles each request. They enable gradual rollout and instant rollback.

Rollout Strategy

Phase	Flag State	Traffic
Development	`off`	0% to new system
Internal testing	`internal-only`	Employees only
Canary	`percentage:5`	5% of production traffic
Gradual	`percentage:25` → `50` → `75`	Increasing production traffic
Full rollout	`on`	100% to new system
Cleanup	Flag removed	Legacy code removed
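One way to evaluate the flag states in the table is to hash each user into a stable bucket, so that a user admitted at 5% stays admitted at 25% and beyond. The function names and the choice of SHA-256 are assumptions for this sketch; a feature-flag service would normally provide this logic.

```python
import hashlib


def bucket(flag_name: str, user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def use_new_system(
    flag_state: str, flag_name: str, user_id: str, is_employee: bool = False
) -> bool:
    if flag_state == "on":
        return True
    if flag_state == "off":
        return False
    if flag_state == "internal-only":
        return is_employee
    if flag_state.startswith("percentage:"):
        percent = int(flag_state.split(":", 1)[1])
        # Buckets below the threshold go to the new system.
        return bucket(flag_name, user_id) < percent
    raise ValueError(f"unknown flag state: {flag_state!r}")
```

Including the flag name in the hash means different flags select different subsets of users, so the same accounts are not always the first to receive every new module. Setting the state back to `off` is the instant rollback.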

Flag Hygiene

Flags are temporary. Every flag should have:

A creation date
An expected removal date
An owner responsible for cleanup
A maximum lifetime (typically 30-90 days after full rollout)

Stale flags accumulate as technical debt. Track them in migration-state.json alongside component states.
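A stale-flag check based on the four attributes above could be sketched as follows. The `FlagRecord` fields are illustrative and do not reflect the actual migration-state.json layout; the 90-day default is the upper end of the range mentioned above.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class FlagRecord:
    name: str
    owner: str
    created: date
    expected_removal: date
    full_rollout: Optional[date] = None  # None until the flag reaches `on`


def stale_flags(
    flags: List[FlagRecord], today: date, max_days_after_rollout: int = 90
) -> List[str]:
    """Return names of flags past their removal date or maximum lifetime."""
    stale = []
    for flag in flags:
        past_removal_date = today > flag.expected_removal
        past_lifetime = (
            flag.full_rollout is not None
            and today > flag.full_rollout + timedelta(days=max_days_after_rollout)
        )
        if past_removal_date or past_lifetime:
            stale.append(flag.name)
    return stale
```

Running a check like this on a schedule and notifying each flag's owner is one way to keep cleanup from being forgotten.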

Extraction Sequencing

The order in which you extract modules matters. Dependencies between modules create constraints.

Sequencing Rules

Foundation first — shared utilities, core data models, configuration
Leaf modules second — modules with no dependents (nothing else imports them)
High-value modules early — maximize business value delivered
High-coupling modules last — they require the most ACL work

Dependency-Aware Ordering

Extract in this order:

Phase 1: Foundation
  ├── Currency (0 dependencies)
  ├── Mode of Payment (0 dependencies)
  ├── UOM (Unit of Measure, 0 dependencies)
  └── Tax Rule (required by Tax Calculator in Phase 2)

Phase 2: Core Calculations
  ├── Tax Calculator (depends on: Tax Rule ✓ from Phase 1)
  └── Pricing Rule Engine (depends on: Currency ✓ from Phase 1)

Phase 3: Transaction Engine
  ├── GL Entry Engine (depends on: Tax Calculator ✓, Currency ✓)
  └── Payment Allocation (depends on: GL Entry ✓)

Phase 4: Business Documents
  └── Sales Invoice (depends on: GL Entry ✓, Tax ✓, Payment ✓)

Each component depends only on components extracted in a prior phase or earlier in the same phase (Payment Allocation follows GL Entry Engine within Phase 3). This maps directly to extraction-plan.json’s phases[] with dependencies[].
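An ordering like this can be derived mechanically from the dependency graph with a topological sort. The sketch below uses `graphlib` from the Python standard library (3.9+) and the dependencies listed in the tree above. Treating Tax Rule as a zero-dependency node, and reading "Tax" and "Payment" as Tax Calculator and Payment Allocation, are assumptions.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Component -> components it depends on, as listed in the phase tree.
DEPENDENCIES: Dict[str, Set[str]] = {
    "Currency": set(),
    "Mode of Payment": set(),
    "UOM": set(),
    "Tax Rule": set(),  # assumption: no dependencies of its own
    "Tax Calculator": {"Tax Rule"},
    "Pricing Rule Engine": {"Currency"},
    "GL Entry Engine": {"Tax Calculator", "Currency"},
    "Payment Allocation": {"GL Entry Engine"},
    "Sales Invoice": {"GL Entry Engine", "Tax Calculator", "Payment Allocation"},
}


def extraction_layers(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Group components into layers; each layer depends only on earlier layers."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    layers = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        layers.append(ready)
        sorter.done(*ready)
    return layers


for number, layer in enumerate(extraction_layers(DEPENDENCIES), start=1):
    print(f"Layer {number}: {', '.join(layer)}")
```

Strict layering yields five layers rather than four, because Payment Allocation depends on GL Entry Engine and therefore cannot share a layer with it. Grouping the two into one phase, as above, is a planning decision; the sort only guarantees a valid order.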

ERPNext Example

The reference implementation validated this sequence: Mode of Payment (19 tests, zero dependencies) → Tax Calculator (24 tests, depends on tax rules) → GL Entry Engine (32 tests, depends on both). Each extraction was independently testable and deployable.

Feeding the Spec

The strangler fig pattern maps directly to two ModernizeSpec files:

Pattern Concept	Spec File	Spec Field
Extraction phases	`extraction-plan.json`	`phases[]`
Phase dependencies	`extraction-plan.json`	`phases[].dependencies[]`
Risk scoring	`extraction-plan.json`	`phases[].risk`
Component states	`migration-state.json`	`components[].state`
State transitions	`migration-state.json`	`components[].history[]`
Feature flag status	`migration-state.json`	`components[].featureFlag`
Overall progress	`migration-state.json`	`summary.percentComplete`