This research surveys the AI-assisted code migration landscape as of February 2026, covering enterprise platforms, open-source tools, documented success stories, and the capability boundaries of current LLMs for migration tasks.
| Tool | Vendor | Maturity | Key Metric |
|---|---|---|---|
| Amazon Q Transform | AWS | Most mature | 1.1B lines analyzed, 810K hours saved |
| IBM watsonx Code Assistant for Z | IBM | Production | COBOL-to-Java with Granite models |
| GitHub Copilot Workspace | Microsoft | GA | Agentic multi-step migration |
| Google Gemini Code Assist + codemod | Google | GA | Large-scale automated codemod |
| Microsoft CAMF | Microsoft Research | Research | Code-Aware Model Framework for migrations |
Amazon Q Transform (also referred to here as AWS Transform) is the most production-proven platform, with 1.1 billion lines analyzed during Amazon’s internal Java migration. Microsoft and IBM are investing in agentic multi-step approaches.
| Tool | Focus | Approach |
|---|---|---|
| OpenRewrite / Moderne | Java/JVM framework upgrades | Deterministic AST-based recipes |
| OpenHands | General-purpose AI coding agent | Agentic loop with sandbox execution |
| vFunction | Monolith decomposition | Automated service boundary detection |
| Metric | Value |
|---|---|
| Scale | 3,500 files |
| Timeline | 6 weeks (estimated 18 months manually) |
| Team | 6 engineers |
| Compression ratio | ~12x faster |
Approach: a state-machine pipeline (analyze, transform, validate, iterate) in which each file retries in a loop and validation errors are fed back to the LLM. Prompts carried rich context: 40K-100K tokens, pulling in up to 50 related files. Simple retry outperformed sophisticated prompt engineering.
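The retry loop described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the case study: `migrate_file`, `MigrationResult`, and the `transform` and `validate` callables are hypothetical names, with `transform` standing in for an LLM call and `validate` for whatever lint, type-check, or test step the pipeline runs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class MigrationResult:
    path: str
    succeeded: bool
    attempts: int
    last_error: Optional[str] = None


def migrate_file(
    path: str,
    source: str,
    transform: Callable[[str, Optional[str]], str],  # hypothetical LLM call
    validate: Callable[[str], Optional[str]],        # returns None when checks pass
    max_attempts: int = 5,
) -> MigrationResult:
    """Run transform -> validate, feeding each validation error into the next attempt."""
    error: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        # The previous error (None on the first pass) is part of the next prompt.
        candidate = transform(source, error)
        error = validate(candidate)
        if error is None:
            return MigrationResult(path, True, attempt)
    # Budget exhausted: report the last error so a human can pick the file up.
    return MigrationResult(path, False, max_attempts, error)
```

Files that exhaust `max_attempts` are returned with their last error rather than retried indefinitely, which keeps the manual follow-up queue explicit.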
| Metric | Value |
|---|---|
| Scale | 30,000+ production applications |
| Savings | 4,500 developer-years of effort |
| Tool | Amazon Q Transform |
| Cost savings | $260M estimated |
The largest documented AI-assisted migration. Same-language version upgrade with well-defined transformation rules.
| Metric | Value |
|---|---|
| Timeline | Hours instead of months |
| Approach | Copilot-assisted with CAMF |
Applied to Xbox platform components. Same-ecosystem upgrade with strong tooling support.
| Organization | Migration | Result |
|---|---|---|
| Commonwealth Bank of Australia | COBOL to Java | 40M lines scoped |
| US Air Force | COBOL to Java | Proof of concept |
| Various (IBM) | COBOL to Java | watsonx Code Assistant |
No publicly documented case of an AI-assisted full ERP system migration exists. The closest examples are module-level extractions and framework upgrades within the same language.
| Capability | Confidence | Notes |
|---|---|---|
| Type mapping (Python to Go/Kotlin) | High | Mechanical, well-understood |
| Simple function translation | High | Pure functions, no side effects |
| Data model extraction from JSON | High | ERPNext DocType JSONs are ideal input |
| Test generation from specifications | Medium-High | Table-driven tests, edge cases |
| Business rule extraction | Medium | Depends on code clarity |
| Controller method translation | Medium | Requires rich context |
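As an illustration of why type mapping and data model extraction rate as high confidence, the sketch below turns a DocType-style JSON document into a Go struct definition. It assumes the JSON carries a `name` and a `fields` list whose entries have `fieldname` and `fieldtype` keys. The `FIELDTYPE_TO_GO` table and the `doctype_to_go_struct` helper are hypothetical and cover only a handful of field types.

```python
import json

# Illustrative mapping only (an assumption); a real migration needs many more field types.
FIELDTYPE_TO_GO = {
    "Data": "string",
    "Int": "int64",
    "Float": "float64",
    "Check": "bool",
    "Date": "time.Time",
}


def doctype_to_go_struct(doctype_json: str) -> str:
    """Render a Go struct definition from a DocType-style JSON document."""
    doc = json.loads(doctype_json)
    struct_name = doc["name"].replace(" ", "")
    lines = ["type " + struct_name + " struct {"]
    for field in doc.get("fields", []):
        go_type = FIELDTYPE_TO_GO.get(field["fieldtype"])
        if go_type is None:
            # Layout-only or unmapped field types (e.g. Section Break) are skipped.
            continue
        fieldname = field["fieldname"]
        go_name = "".join(part.capitalize() for part in fieldname.split("_"))
        lines.append('    {} {} `json:"{}"`'.format(go_name, go_type, fieldname))
    lines.append("}")
    return "\n".join(lines)
```

The mapping is mechanical because the schema is declarative. Behavior attached to those fields through hooks or controllers is not captured by this step.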
| Limitation | Impact | Notes |
|---|---|---|
| Cross-file dependency resolution | Critical | Hooks-driven execution paths are invisible |
| Framework behavior replication | Critical | ORM, permissions, workflow — emergent behavior |
| Dynamic typing inference | High | frappe.get_value() returns untyped data |
| Multi-tenancy architecture | High | Bench/site model is framework-level |
| Regional compliance logic | High | Scattered across hooks and overrides |
| Performance optimization | Medium | AI produces correct but unoptimized code |
| Metric | Range |
|---|---|
| Correct translations (general) | 2.1% - 47.3% |
| Correct translations (specialized/fine-tuned) | 90%+ |
| Scalability cliff | ~100 lines per translation unit |
| Asymmetric proficiency | Better translating TO Python than FROM Python |
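One practical reading of the ~100-line scalability cliff is to split source modules into translation units before sending them to a model and to flag units that exceed the budget. The sketch below does this for Python source with the standard `ast` module. `MAX_UNIT_LINES` and `translation_units` are hypothetical names, and treating each top-level function or class as one unit is an assumption.

```python
import ast
from typing import List, Tuple

MAX_UNIT_LINES = 100  # assumed budget, taken from the ~100-line figure above


def translation_units(source: str) -> List[Tuple[str, int, bool]]:
    """Return (name, line_count, fits_budget) for each top-level def or class."""
    units = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # end_lineno is available on Python 3.8 and later.
            line_count = node.end_lineno - node.lineno + 1
            units.append((node.name, line_count, line_count <= MAX_UNIT_LINES))
    return units
```

Units flagged as over budget are candidates for manual decomposition or method-by-method translation rather than a single prompt.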
Naive AI translation produces syntactically valid but unidiomatic code, dubbed “Jobol” (Java that reads like COBOL). For cross-language ERP migration, this manifests as target-language code that mirrors Python patterns instead of using the target language’s idiomatic constructs.
| Tool | Max Context | Key Feature |
|---|---|---|
| Sourcegraph Cody | 1M tokens | 100K+ repo support, enterprise-grade |
| Augment Code | 200K tokens | 500K files, 70% win rate over Copilot |
| Cursor | 272K tokens | 50K files, IDE-integrated |
| CodeCompass | Qdrant-based | Open-source, MCP-integrated |
| GitHub Copilot | 8K-64K effective | Ubiquitous but limited context |
For complex legacy codebases, standard IDE-level tools see only individual files. Migrating such codebases requires visibility into the full dependency graph, which in practice means structural indexing, semantic search, and cross-file dependency resolution.
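A minimal sketch of structural indexing, assuming Python sources held in memory as a name-to-source mapping: build a static import graph, then walk it breadth-first to pick related modules for a prompt. `import_graph` and `related_modules` are hypothetical helpers, and the default cap of 50 is borrowed from the "up to 50 related files" figure reported earlier.

```python
import ast
from collections import deque
from typing import Dict, List, Set


def import_graph(modules: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each module name to the in-repo modules it imports statically."""
    graph: Dict[str, Set[str]] = {}
    for name, source in modules.items():
        deps: Set[str] = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                deps.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module)
        # Keep only dependencies that live in the repository being migrated.
        graph[name] = {d for d in deps if d in modules and d != name}
    return graph


def related_modules(graph: Dict[str, Set[str]], start: str, limit: int = 50) -> List[str]:
    """Breadth-first walk from `start`, capped at `limit` modules."""
    seen = [start]
    queue = deque([start])
    while queue and len(seen) < limit:
        current = queue.popleft()
        for dep in sorted(graph.get(current, ())):
            if dep not in seen and len(seen) < limit:
                seen.append(dep)
                queue.append(dep)
    return seen
```

Static imports are only part of the picture. Execution paths wired up through hooks or string-based dispatch do not appear in this graph and need a separate index.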
| Approach | Method |
|---|---|
| Quantized Code Llama + LoRA | Automated metamodel generation from code |
| Prompt-to-model pipelines (Qlerify) | Natural language to domain model |
| Mono2Micro + Context Mapper | Trace analysis to service boundaries |
Eric Evans’s key insight: AI works for generic subdomains (CRUD, reporting, basic validation) but core domains need human involvement. For ERP systems, the accounting engine, tax calculation, and stock valuation are core domains requiring human judgment.
| Approach | Tool | Use Case |
|---|---|---|
| Shadow testing / traffic mirroring | Diffy, GoReplay, Speedscale | Run both systems in parallel, diff responses |
| Feature flags (4 modes) | LaunchDarkly, custom | NEW, OLD, MIRROR, PARITY modes for gradual cutover |
| Cross-database data diffing | Datafold | Compare data outputs between systems |
| Dual Run | Google Cloud | Run legacy and modern simultaneously |
| Consumer-driven contracts | Pact | Define expected behavior, verify both comply |
| AI-generated test suites | Custom (3-agent approach) | Extract logic, generate tests, generate passing code |
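The four-mode feature flag in the table can be sketched as a small router. The source names the modes but not their semantics, so the behavior below is an assumption: OLD and NEW serve one system only, MIRROR serves the legacy response while calling the new system in the shadow and recording differences, and PARITY does the same but fails loudly on any mismatch. `route`, `Mode`, and `ParityError` are hypothetical names.

```python
from enum import Enum
from typing import Any, Callable, List, Tuple


class Mode(Enum):
    OLD = "old"
    NEW = "new"
    MIRROR = "mirror"
    PARITY = "parity"


class ParityError(Exception):
    """Raised in PARITY mode when the two systems disagree."""


def route(
    mode: Mode,
    request: Any,
    legacy: Callable[[Any], Any],
    modern: Callable[[Any], Any],
    mismatches: List[Tuple[Any, Any, Any]],
) -> Any:
    if mode is Mode.OLD:
        return legacy(request)
    if mode is Mode.NEW:
        return modern(request)

    # MIRROR and PARITY: legacy stays the source of truth (assumed semantics).
    old_result = legacy(request)
    try:
        new_result = modern(request)
    except Exception as exc:  # the shadow call must not break legacy traffic
        new_result = repr(exc)
    if new_result != old_result:
        mismatches.append((request, old_result, new_result))
        if mode is Mode.PARITY:
            raise ParityError("mismatch for {!r}".format(request))
    return old_result
```

Under these assumptions, MIRROR suits production traffic, where mismatches are collected for later review, and PARITY suits staging or CI, where a disagreement should block the cutover.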
| Migration Type | Manual Timeline | AI-Assisted Timeline | Compression |
|---|---|---|---|
| Test framework migration | 18 months | 6 weeks (Airbnb) | ~12x |
| Java version upgrade (large) | Years | Months (Amazon) | ~5-10x |
| .NET framework upgrade | Months | Hours (Microsoft) | ~100x |
| Legacy codebase migration | 12-26 months | 4-12 months | ~2-3x |
| ERP system migration | 12-24+ months | Unknown | Unknown |
AI compression ratios are highest for homogeneous, repetitive transformations (Java 8 to 17, Enzyme to RTL) and lowest for heterogeneous, business-logic-heavy transformations (ERP migration).
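The compression figures in the table are simple ratios of manual to AI-assisted duration. The sketch below reproduces two rows as a sanity check; the helper names and the 52-week year used to convert months to weeks are assumptions.

```python
WEEKS_PER_YEAR = 52
MONTHS_PER_YEAR = 12


def months_to_weeks(months: float) -> float:
    return months * WEEKS_PER_YEAR / MONTHS_PER_YEAR


def compression_ratio(manual_weeks: float, assisted_weeks: float) -> float:
    """How many times shorter the AI-assisted timeline is than the manual estimate."""
    if assisted_weeks <= 0:
        raise ValueError("assisted_weeks must be positive")
    return manual_weeks / assisted_weeks


# Test framework row: 18 months manual vs 6 weeks assisted.
framework = compression_ratio(months_to_weeks(18), 6)

# Legacy codebase row: compare the slow ends and the fast ends of the two ranges.
legacy_low = compression_ratio(months_to_weeks(26), months_to_weeks(12))
legacy_high = compression_ratio(months_to_weeks(12), months_to_weeks(4))

print(round(framework, 1), round(legacy_low, 1), round(legacy_high, 1))
```

With a 52-week year the first row comes out at about 13x. Counting four weeks per month instead gives 72 / 6 = 12, which matches the ~12x quoted. The legacy row lands between roughly 2.2x and 3x, in line with the ~2-3x shown.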
| Statistic | Source |
|---|---|
| 57% of migrations take longer than anticipated | Industry surveys |
| 83% fail or exceed budgets | Gartner |
| Vendor accuracy claims require verification on your codebase | Microsoft (early 2026) |
1. Multi-Agent Pipelines
Analyze, translate, validate, iterate — the dominant architecture for AI-assisted migration.
2. Retry Over Prompting
Simple retry with error feedback outperforms sophisticated prompt engineering.
3. Rich Context Wins
Choosing the right related files matters more than prompt engineering.
4. Specialized Models
Specialized models outperform general-purpose LLMs for targeted language pairs.
5. Strangler Fig + AI
The lowest-risk combination for incremental legacy replacement.
6. Parity Is Non-Negotiable
Shadow testing and parity validation before production cutover.
7. Smaller Teams, Higher Skills
5-10x fewer people, but they need architectural judgment.
8. Verify Vendor Claims
The gap between demo results and production results is significant.
| Metric | Value |
|---|---|
| AI-in-ERP market (2023) | $4.5B |
| AI-in-ERP market (projected 2033) | $46.5B |
| SAP S/4HANA migration rate | 37% of 35,000 customers |
| Typical ERP migration timeline | 12-24+ months |
No documented case of AI-assisted full ERP system migration exists publicly. PearlThoughts’ ERPNext experiment is among the earliest documented attempts at AI-assisted ERP cross-language migration.