Splitting by Layer
Separating “all controllers” from “all services” from “all repositories” creates distributed monoliths. Split by business capability, not technical layer.
Domain decomposition is the process of dividing a monolithic system into bounded contexts — self-contained areas with distinct data ownership, vocabulary, and lifecycle. The output feeds directly into ModernizeSpec’s domains.json, which AI agents use to understand where business boundaries fall in the legacy code.
A bounded context is a region of the codebase where a particular domain model applies consistently. Inside the boundary, terms have precise meanings. Across boundaries, the same word may mean different things.
| Signal | What to Look For | Tool |
|---|---|---|
| Distinct data ownership | Tables/entities not shared with other areas | Entity relationship atlas |
| Separate vocabulary | Terms used only within one area of the code | Keyword frequency analysis |
| Independent lifecycle | Code that changes together but independently of other areas | Git co-change analysis |
| Minimal cross-boundary calls | Few function calls or imports crossing the proposed boundary | Dependency graph |
| Separate UI sections | Distinct pages, menus, or navigation groups | UI inventory |
domains.json with entities, capabilities, and coupling scoresSplitting by Layer
Separating “all controllers” from “all services” from “all repositories” creates distributed monoliths. Split by business capability, not technical layer.
Too Many Contexts
Every entity does not need its own bounded context. Group related entities that share a lifecycle and consistency boundary.
Ignoring Coupling Data
Drawing boundaries on a whiteboard without measuring actual code coupling produces aspirational architecture, not actionable extraction plans.
An aggregate is a cluster of entities treated as a single unit for data changes. The aggregate root is the entry point — all modifications to the cluster go through it. Aggregates define transaction boundaries in the new system.
Look for parent-child table relationships where children cannot exist without the parent:
Example: Invoice (root) owns InvoiceLineItem (member) and TaxDetail (member), but references Customer (separate aggregate).
Examine write operations in the codebase. Code that updates multiple entities within a single transaction reveals an aggregate:
begin_transaction() update(invoice) update(invoice.items) # Same aggregate update(invoice.taxes) # Same aggregate update(gl_entries) # Different aggregate -- cross-aggregate effectcommit()The entities modified together inside a transaction are the aggregate. Cross-aggregate effects should eventually become events, not transactions.
Ask: “If I delete entity X, what else must be deleted?” The answer defines the aggregate boundary.
PurchaseOrder must also delete its PurchaseOrderItem rows, they are the same aggregate.PurchaseOrder should not delete the Supplier, they are separate aggregates.| Size | Typical | Risk |
|---|---|---|
| 1-3 entities | Healthy | None |
| 4-7 entities | Acceptable | Monitor for unnecessary coupling |
| 8-15 entities | Large | Consider splitting into sub-aggregates |
| 16+ entities | Too large | Almost certainly hiding multiple concerns |
Legacy systems, especially web applications, are often organized around UI pages or screens. This creates modules that mix multiple business capabilities because a single page may touch invoicing, payments, customer management, and reporting.
┌─────────────────────────────────────────────────┐│ "Invoice Page" Module ││ ││ ┌───────────┐ ┌───────────┐ ┌────────────────┐ ││ │ Invoicing │ │ Payments │ │ Customer Info │ ││ │ Logic │ │ Logic │ │ Display │ ││ └───────────┘ └───────────┘ └────────────────┘ ││ ┌───────────┐ ┌───────────┐ ││ │ Tax Calc │ │ Reporting │ ││ │ │ │ Summary │ ││ └───────────┘ └───────────┘ │└─────────────────────────────────────────────────┘Five capabilities in one module. Extracting “Invoicing” requires untangling it from four other concerns.
┌──────────────┐ ┌──────────────┐ ┌──────────────┐│ Invoicing │ │ Payments │ │ Taxation ││ │ │ │ │ ││ Create │ │ Record │ │ Calculate ││ Submit │ │ Allocate │ │ Apply rules ││ Cancel │ │ Reconcile │ │ Report │└──────────────┘ └──────────────┘ └──────────────┘┌──────────────┐ ┌──────────────┐│ Customers │ │ Reporting ││ │ │ ││ Profile │ │ Generate ││ Credit limit │ │ Export ││ History │ │ Schedule │└──────────────┘ └──────────────┘Each capability becomes a bounded context with clear data ownership. The “Invoice Page” in the new system assembles data from multiple capabilities at the UI layer.
domains.json as contexts[].capabilities[]For large codebases, manually classifying every file is impractical. Use automated classification to assign each code artifact to a business domain, then refine manually.
| Strategy | How It Works | Accuracy | Effort |
|---|---|---|---|
| Namespace/path | Use directory structure as initial classification | Low-Medium | Automated |
| Keyword matching | Match function/class names to domain glossaries | Medium | Semi-automated |
| Import clustering | Group files that import each other heavily | Medium-High | Automated |
| Co-change analysis | Files that change together in git belong together | High | Automated (needs history) |
| AI-assisted | LLM reads code and assigns domain labels | Medium-High | Semi-automated |
accounts/*.py likely belongs to the Accounts domainaccounts/ that import primarily from stock/ may actually belong to the Stock domainaccounts/tax_calculator.py always changes alongside selling/invoice.py, they may be the same bounded contextERPNext’s 521 doctypes are organized into 21 modules, but module boundaries do not align with domain boundaries:
| Module Boundary | Domain Reality |
|---|---|
accounts/ contains GL Entry, Tax Rule, Payment Entry | These are 3 distinct domains: General Ledger, Taxation, Payments |
stock/ contains Stock Entry, Warehouse, Valuation | Valuation crosses into Accounts (it posts GL entries) |
hr/ contains Payroll Entry | Payroll crosses into Accounts (it creates journal entries) |
selling/ and buying/ are separate modules | Both use the same transaction controller (accounts_controller.py) |
The decomposition revealed 21 modules but approximately 35 distinct bounded contexts when analyzed by actual data ownership and coupling patterns.
In framework-heavy systems, controller classes accumulate cross-cutting concerns through inheritance. A single controller file may contain logic for validation, calculation, persistence, authorization, and event handling — all mixed together.
The controller hierarchy for a Sales Invoice spans four levels:
| Level | Class | Lines | Methods | Concern |
|---|---|---|---|---|
| 1 | Document (Frappe) | ~2,000 | ~80 | Framework ORM, permissions, workflow |
| 2 | TransactionBase | ~600 | ~25 | Shared transaction logic |
| 3 | AccountsController | 4,412 | 168 | Financial calculations, GL posting, tax |
| 4 | SalesInvoice | ~1,800 | ~60 | Invoice-specific behavior |
Methods in AccountsController (level 3) serve every financial transaction type — Sales Invoice, Purchase Invoice, Payment Entry, Journal Entry. This means extracting “Invoicing” as a bounded context requires untangling shared methods from invoice-specific ones.
| Method Category | Extraction Strategy |
|---|---|
Domain-specific (e.g., validate_invoice_dates) | Move to the target bounded context directly |
Shared calculation (e.g., calculate_taxes_and_totals) | Extract as a shared domain service, referenced by multiple contexts |
Framework hook (e.g., on_submit, validate) | Replace with domain events in the new system |
Cross-cutting (e.g., update_stock_ledger) | Extract as an integration event between bounded contexts |
Domain decomposition produces the core data for domains.json:
| Analysis Output | Spec Field | Example |
|---|---|---|
| Bounded context list | contexts[] | { "id": "invoicing", "name": "Invoicing" } |
| Entity grouping | contexts[].entities[] | ["SalesInvoice", "SalesInvoiceItem", "SalesTaxesAndCharges"] |
| Capability mapping | contexts[].capabilities[] | ["create-invoice", "submit-invoice", "cancel-invoice"] |
| Coupling scores | contexts[].coupling[] | { "target": "general-ledger", "score": 0.72 } |
| Cross-domain dependencies | relationships[] | { "from": "invoicing", "to": "taxation", "type": "depends-on" } |