AI Can Migrate Code. Who Validates That the Business Still Works?

Before modernizing legacy systems with AI, assess your verification capacity. In critical systems, modernization requires testing, observability, human judgment, and a methodology to move forward with evidence.

Photo of Federico Toledo with the headline “Migrating Faster Isn’t Enough If Risk Remains Invisible.”

Legacy system modernization is back at the center of the technology conversation.

For years, many organizations knew they had to modernize their core systems. They also knew these projects could take too long, require massive investment, and expose the business to a level of risk that was hard to accept.

AI changes part of that equation.

Today, teams can use agents to read large codebases, reconstruct business logic while migrating to a new technology stack, generate documentation, propose test scenarios, identify dependencies, and give decision-makers more context.

The market reaction brought the topic into focus. In February 2026, Anthropic published two pieces on AI-powered legacy modernization:

  • The Code Modernization Playbook, where it argues that AI agents specialized in software development tasks can help modernize legacy systems that once seemed too complex, risky, or costly. 
  • An article on COBOL modernization with Claude Code, where it states that tools like Claude Code can automate phases of exploration and analysis, map dependencies, document flows, identify risks, and support migration planning for COBOL systems. 

That same month, after the COBOL article, Reuters reported that IBM suffered its biggest one-day stock drop since 2000, with shares falling 13.2% in a single day. 

For those of us who work in software quality, that market reaction reinforces something we have seen in real projects: AI can accelerate code understanding and transformation, but the ability to validate behavior still defines the real risk of a migration.

In critical systems, faster migration creates value when teams can also prove that the business still works as expected.

Are you evaluating a legacy migration with AI?
At Abstracta, we help teams assess their verification capacity before moving forward. AI can accelerate modernization, but in critical systems, control depends on evidence.
Contact us to migrate faster, reduce uncertainty, and define a Quality Engineering strategy proportional to risk.

Why Migrate Legacy Systems

A migration can respond to different technical, operational, economic, and strategic reasons. Sometimes the trigger is a platform reaching end of support. Other times, the cost of maintaining old infrastructure, expensive licenses, or highly specific technical knowledge starts to weigh too heavily.

There may also be reasons related to scale, performance, security, compliance, integration with new systems, vendor changes, user experience improvement, availability, resilience, or digital transformation.

Reason | What Usually Happens
Technological obsolescence | The platform, language, framework, or infrastructure stops receiving support.
Costs | Maintaining licenses, hardware, or specialized knowledge becomes increasingly expensive.
Scale and performance | The system no longer responds well to the current volume of users, data, or transactions.
Security and compliance | The existing technology makes it harder to meet new regulations, audits, or standards.
Integration | The system needs to connect with APIs, digital channels, analytics tools, or modern platforms.
Business continuity | The organization needs greater availability, resilience, and recovery capacity.

“Migration aims to preserve the functional value of a system that is important to the organization while updating its technological foundation,” explained Guillermo Amorin, AI & Development Manager at Abstracta.

He added that this process can affect architecture, data, programs, interfaces, infrastructure, and operations. That’s why every technical decision must be paired with a quality question:

How will we prove that the behavior the business depends on still works?

The Key Before Migrating

The answer to that question starts before any code is transformed. “In a critical system, the organization needs to know which behavior must be preserved, where that behavior is expressed today, and what evidence can be used to compare it against the new version,” Guillermo noted.

That means identifying business rules, flows, data, integrations, exceptions, batch processes, permissions, reports, and control points. It also means defining tests, traces, metrics, and acceptance criteria to decide whether the modernized system can sustain operations.

Overall, before migrating, the organization needs to understand whether it can observe, compare, and verify that the business still works as expected. That verification capacity will define how governable the migration will be and what level of risk the team can take on at each stage.

Legacy Code Still Supports Real Businesses

When we talk about legacy systems, the image that often comes to mind is old software that nobody wants to touch. But the reality is usually more interesting: many legacy systems run every day. They process transactions, calculate interest, move money, manage accounts, handle inventory, connect teams, and support operations that cannot stop.

In banking, insurance, retail, government, healthcare, and telecommunications, a lot of value still runs on legacy technologies: systems written in COBOL or RPG, mainframe platforms, old databases, batch processes for closing periods or large-scale calculations, and monolithic architectures.

Many legacy systems have important strengths in their current operation:

  • They are stable and support critical processes every day.
  • They process large volumes of transactions.
  • They are deeply integrated into the business.
  • They respond to special rules, exceptions, and real situations accumulated over years.
  • They preserve operational knowledge that is often not fully documented.

The challenge usually appears when the organization needs to change something:

  • Migrate to the cloud or to new infrastructure.
  • Replace a platform that is reaching the end of support.
  • Change vendors or reduce technological dependency.
  • Modernize the architecture so the system is easier to maintain.
  • Integrate the system with new digital channels, APIs, or platforms.
  • Reduce operating, licensing, or infrastructure costs.
  • Support a need for scale, availability, or resilience.

That is when difficult questions appear.

Where does the business logic live?
Which flows are truly critical?
Which historical rules need to be preserved?
How do we detect a difference between the current system and the new one?
Who can validate whether an outcome is acceptable for the business?

The answers are usually distributed. Part of them lives in the code. Another part lives with people who know the business. There are also signals in logs, databases, historical incidents, batch processes, integrations, and behaviors the system has accumulated over years.

In a migration, that dispersion of knowledge is a central part of the risk. The harder it is to understand and observe current behavior, the more important it becomes to build evidence before making changes.

The Hard Part Is Knowing What Changed

At this point, it is worth recalling Michael Feathers’s definition in Working Effectively with Legacy Code: legacy code is code without tests.

For a migration, that idea is very concrete. If we don’t have tests, observability, or a clear way to compare behaviors, we depend too much on intuition, memory, and partial validations.

The current system may have technical debt, historical rules, and exceptions that are hard to explain. It’s also the system that supports real operations today.

That’s why every difference between the current system and the new one needs analysis: it may be a new defect, an intended correction, a historical bug that came to light, or a rule the business needs to preserve.

We experienced this closely in a project that still helps us explain the problem.

A Story That Still Explains the Problem

A few years ago, we worked on a core RPG system for a supermarket chain in Latin America.

The company had started as a small store. Over time, it grew to 10 locations. Later, it decided to expand into a new geographic area.

The problem was a historical constraint: the store identifier in the core system could only manage up to 10 locations.

At first glance, it seemed like a limited change: allowing the system to manage one more store. In practice, however, that data appeared in central business flows: accounts, balances, money movements, internal operations, transactions, and daily processes.

Gathering the knowledge was not easy: the documentation was outdated or simply didn’t exist, and there were parts of the system that nobody knew how to explain. Some modules hadn’t been touched for years, some areas of the code made the team cautious, and the behavior of certain features was not completely clear.

We spent a year and a half interviewing key people, reconstructing the system logic, and writing tests. Then came the code change, which took about a month and a half.

More than 90% of the project's time, roughly 18 of 19.5 months, went into understanding the system and testing it.

The project went well, and that experience left us with a very relevant idea: many migrations are postponed or become very expensive because validating the change can be more complex than writing the change.

AI is beginning to change that math. It gives us new capabilities to understand, explore, and verify systems that used to require a lot of manual work.

What Changes with AI: More Capacity to Understand and Verify

With AI, we gain more reach to study legacy systems before modifying them.

In these systems, business logic is often distributed across old files, scripts, batch processes, integrations, incomplete comments, and decisions accumulated over years. In many cases, the code is the most reliable source for reconstructing how the system works.

An agent can help navigate that volume of information, identify dependencies, detect business rules, and turn scattered knowledge into inputs the team can review. That capacity changes the starting point. 

Tero: Agents to Talk to the Code

With Tero, our open source framework for building AI agents that work with context, we developed capabilities to explore large systems through natural language questions.

The idea is to expand the team’s reach. An agent can help understand flows, dependencies, business rules, critical paths, and possible scenarios based on evidence from the system itself.

We can ask concrete questions:

  • What happens when interest is calculated on overdue debt?
  • Which tables are updated when this operation is confirmed?
  • Which parts of the code participate in this flow?
  • Which APIs are involved in this operation?
  • Where is this data validated?
  • Which rules affect this balance?
  • Which processes run before this file is generated?

The answers work as working hypotheses. They help the team investigate better, prioritize critical flows, design scenarios, and decide which behavior needs validation. 

For Quality Engineering, that’s the central point: the better the team understands the system, the more precisely it can aim validation at the behavior the business depends on.

Human analysis remains at the center, while AI helps find signals, organize information, and turn hidden knowledge into concrete inputs to validate behavior.

Before comparing the current system with the migrated one, we need to understand which behavior matters, where the logic lives, and which paths deserve priority. Talking to the code helps build that map.

What Exactly Does Testing Mean in a Legacy Migration?

In a legacy migration, validation focuses on proving equivalence: the new version must preserve the critical behavior of the current system.

That current system is already in production. It processes real operations, supports business processes, and reflects rules accumulated over years. It may also have technical debt, exceptions that are hard to explain, and historical bugs the business has already learned to handle.

That is why migration combines risk, complexity, and uncertainty. The team needs to identify which behavior to preserve, which differences to accept, and which signals to review before moving forward.

Characterization testing gives us a practical way to approach that problem: we capture how the current system behaves, run the same scenarios in the migrated version, and compare results.
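The capture, replay, and compare loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a definitive implementation: `legacy_interest` and `migrated_interest` are hypothetical stand-ins for the current and migrated systems, and the rate, rounding behavior, and scenarios are invented for the example.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

# Hypothetical stand-ins for the two systems under comparison.
def legacy_interest(balance: Decimal, days_overdue: int) -> Decimal:
    """Current system: truncates to cents (an assumed historical quirk)."""
    raw = balance * Decimal("0.0005") * days_overdue
    return raw.quantize(Decimal("0.01"), rounding=ROUND_DOWN)

def migrated_interest(balance: Decimal, days_overdue: int) -> Decimal:
    """Migrated system: rounds half up instead of truncating."""
    raw = balance * Decimal("0.0005") * days_overdue
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

# Invented scenarios: (balance, days overdue).
SCENARIOS = [
    (Decimal("1000.00"), 30),
    (Decimal("333.33"), 7),
    (Decimal("0.00"), 90),
]

def characterize(system, scenarios):
    """Capture what a system actually returns for each scenario."""
    return {scenario: system(*scenario) for scenario in scenarios}

def compare(baseline, candidate):
    """Return only the scenarios where the two captures disagree."""
    return {
        scenario: (baseline[scenario], candidate[scenario])
        for scenario in baseline
        if baseline[scenario] != candidate[scenario]
    }

baseline = characterize(legacy_interest, SCENARIOS)
candidate = characterize(migrated_interest, SCENARIOS)
differences = compare(baseline, candidate)
```

Here only the second scenario differs (1.16 in the legacy capture, 1.17 in the migrated one). The comparison does not say which value is right; it only surfaces the difference so that someone can decide.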

The critical point appears when we compare outputs. Every difference between the current system and the new version needs classification: it may be a defect introduced by the migration, an expected correction, a historical bug that came to light, or a business rule the organization needs to preserve.
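One way to make that classification explicit is to record it as data. The sketch below is illustrative only: the four categories come from the paragraph above, while the `Difference` record, the sample scenarios, and the blocking policy (unreviewed differences, migration defects, and rules to preserve stop progress) are assumptions a real team would define for itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class DiffKind(Enum):
    MIGRATION_DEFECT = "defect introduced by the migration"
    EXPECTED_CORRECTION = "expected correction"
    HISTORICAL_BUG = "historical bug that came to light"
    RULE_TO_PRESERVE = "business rule to preserve"

@dataclass
class Difference:
    scenario: str
    legacy_output: str
    migrated_output: str
    kind: Optional[DiffKind] = None    # None until someone triages it
    reviewed_by: Optional[str] = None  # who validated the classification

def blocking(differences):
    """Differences that should stop progress under the assumed policy."""
    blockers = []
    for d in differences:
        unclassified = d.kind is None or d.reviewed_by is None
        must_fix = d.kind in (DiffKind.MIGRATION_DEFECT, DiffKind.RULE_TO_PRESERVE)
        if unclassified or must_fix:
            blockers.append(d)
    return blockers

# Invented examples of triaged differences.
triaged = [
    Difference("overdue interest, 7 days", "1.16", "1.17",
               DiffKind.EXPECTED_CORRECTION, "finance lead"),
    Difference("closing batch, new store", "REJECTED", "OK",
               DiffKind.HISTORICAL_BUG, "operations lead"),
    Difference("refund over limit", "HOLD", "OK",
               DiffKind.RULE_TO_PRESERVE, "risk team"),
    Difference("zero-balance account", "0.00", ""),
]
blockers = [d.scenario for d in blocking(triaged)]
```

With this sample data, the refund scenario and the still-unclassified zero-balance scenario block progress, while the two reviewed and accepted differences do not.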

Validation Happens at Two Levels

1. Functional level: business behavior

The team needs to understand how the current system behaves in real flows. This includes documenting processes, interviewing users or business stakeholders, validating existing documentation, and designing risk-based test cases.

At this point, the focus is not only on showing equivalence, but also on understanding the business and the system in order to design tests that can verify correctness. This is also where teams may find existing bugs or parts of the system that are no longer in use.

2. Technical level: automation and evidence

The team needs to analyze code, logs, dependencies, data, environment constraints, integrations, and the real possibilities for automation. This level makes it possible to generate tests, review coverage, define traces, and detect technical limits before comparing the current system with the new version.

This level complements the functional one. The focus is on characterization testing: covering the legacy system with tests, saving the initial and final states along with all outputs, and then comparing them with the same execution in the migrated system.
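Saving initial state, outputs, and final state matters because two systems can return the same response while leaving different data behind. A minimal sketch, assuming an in-memory dictionary stands in for the database and that `legacy_post_payment` and `migrated_post_payment` are hypothetical operations invented for the example:

```python
import copy
import json

def legacy_post_payment(state: dict, account: str, amount: int) -> dict:
    """Hypothetical legacy operation: mutates state and returns its outputs."""
    state["balances"][account] -= amount
    state["ledger"].append({"account": account, "amount": -amount})
    return {"status": "OK", "new_balance": state["balances"][account]}

def migrated_post_payment(state: dict, account: str, amount: int) -> dict:
    """Hypothetical migrated operation that forgets the ledger entry."""
    state["balances"][account] -= amount
    return {"status": "OK", "new_balance": state["balances"][account]}

def record(operation, initial_state: dict, **inputs) -> dict:
    """Run one scenario; keep initial state, inputs, outputs, and final state."""
    state = copy.deepcopy(initial_state)  # never mutate the saved initial state
    outputs = operation(state, **inputs)
    return {
        "initial_state": initial_state,
        "inputs": inputs,
        "outputs": outputs,
        "final_state": state,
    }

def replay(operation, recorded: dict) -> list:
    """Re-run a recorded scenario elsewhere and list the parts that differ."""
    fresh = record(operation, recorded["initial_state"], **recorded["inputs"])
    return [key for key in ("outputs", "final_state") if fresh[key] != recorded[key]]

initial = {"balances": {"ACC-1": 500}, "ledger": []}
captured = record(legacy_post_payment, initial, account="ACC-1", amount=120)
# Round-trip through JSON to mimic storing the baseline in a file.
stored_baseline = json.loads(json.dumps(captured))

mismatches = replay(migrated_post_payment, stored_baseline)
```

In this example the outputs match and only the final state differs, which is exactly the kind of difference an output-only comparison would miss.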

AI Helps, but It Doesn’t Replace Judgment

Generating tests from code can provide evidence, but those tests need review. A model can create weak assertions, invent mocks, hardcode data, or cover paths that do not represent real system usage.
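The difference between a weak and a useful assertion is easy to show. In this sketch, `late_fee` is a hypothetical business rule invented for illustration (a 2% fee after a 10-day grace period), and `broken_late_fee` is a faulty migration of it. It is a sketch of the review concern, not a recommended test framework setup.

```python
def late_fee(balance_cents: int, days_overdue: int) -> int:
    """Hypothetical rule: 2% fee after a 10-day grace period, otherwise none."""
    if days_overdue <= 10:
        return 0
    return balance_cents * 2 // 100

def broken_late_fee(balance_cents: int, days_overdue: int) -> int:
    """A faulty migration that ignores the grace period."""
    return balance_cents * 2 // 100

def weak_check(fee_fn) -> bool:
    # Only checks type and sign: passes for almost any implementation.
    result = fee_fn(10_000, 5)
    return isinstance(result, int) and result >= 0

def strong_check(fee_fn) -> bool:
    # Pins exact values on both sides of the grace-period boundary.
    return (fee_fn(10_000, 10), fee_fn(10_000, 11)) == (0, 200)

weak_passes_on_broken = weak_check(broken_late_fee)
strong_passes_on_broken = strong_check(broken_late_fee)
strong_passes_on_correct = strong_check(late_fee)
```

The weak check accepts the broken implementation, while the strong check rejects it and still accepts the correct one. Reviewing generated tests largely means asking which of these two kinds each assertion is.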

In addition, the biggest blocker is often not in the code, but in access to environments, repositories, licenses, real test data, or people with system knowledge who also support daily operations.

That’s why AI adds value when it helps teams understand, prioritize, and generate evidence. In a critical migration, that evidence needs methodology, human experience, and clear criteria to decide which behavior to preserve, which differences to accept, and which signals to review before moving forward.

Before Migrating, We Measure Verification Capacity

Characterization testing helps capture and compare behavior. However, as mentioned above, before executing a migration, we need to know whether the organization has the real conditions to do it with evidence.

To answer that, we look at the system across several dimensions:

Dimension | What We Look to Understand
Functional tests | Whether critical flows are covered and can be repeated.
Integration, APIs, and contracts | Whether the boundaries between systems are clear and validated.
Continuous regression | Whether we can detect behavior changes early.
Performance and other baselines | Whether we have a reference point before migrating.
Observability | Whether we can see logs, traces, data, and effects during execution.
Incidents and traceability | Whether we know the system’s historical failures, causes, and sensitive areas.
Reversibility | Whether there is a realistic way to roll back.
Available knowledge | Whether we know who can validate rules, exceptions, and business decisions.

Not all dimensions carry the same weight. Tests on critical flows, continuous regression, and reversibility usually define the risk ceiling: if there’s no minimum testing safety net, if we cannot run controls in a repeatable way, or if there’s no realistic way to roll back, the migration needs preparation before moving forward.

That diagnosis makes it possible to classify the risk of each part of the system. When verification capacity is low, the first step is to build a minimum evidence network: approval tests on critical flows, a behavior baseline, repeatable execution, basic observability, and a rollback plan.

When verification capacity is more mature, the team can move forward with incremental slices, quality gates, and more automated comparisons between the current system and the new one.
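The idea of a risk ceiling can be expressed as a simple rule: a high score elsewhere does not compensate for a gap in one of the ceiling dimensions. The sketch below is illustrative only. The 0 to 3 scoring scale, the threshold, and the dimension keys are assumptions made for the example, not part of a published scoring model.

```python
# Dimensions that cap the risk the team can take on, per the text above.
CEILING_DIMENSIONS = ("functional_tests", "continuous_regression", "reversibility")

def readiness(scores: dict, minimum: int = 2) -> str:
    """Classify a system from per-dimension scores (assumed scale: 0 to 3)."""
    gaps = [d for d in CEILING_DIMENSIONS if scores.get(d, 0) < minimum]
    if gaps:
        # Strong observability or knowledge cannot offset a ceiling gap.
        return "build evidence first: " + ", ".join(gaps)
    return "proceed with incremental slices"

# Invented assessment of one system.
assessment = {
    "functional_tests": 3,
    "continuous_regression": 1,
    "reversibility": 2,
    "observability": 3,
    "available_knowledge": 3,
}
verdict = readiness(assessment)
```

For this invented assessment the verdict is "build evidence first: continuous_regression", even though every other dimension scores well.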

The conversation with the business also changes. The team can discuss evidence: which behavior is covered, which differences appeared, who can validate them, and what level of risk the organization accepts before moving forward.

A Slice-Based Methodology

A full migration needs a way to govern risk with evidence. At Abstracta, we have been formalizing this approach as a methodology based on phases and slices.

A slice can be a vertical unit of business value, a technical layer, a critical flow, or a part of the system with clear boundaries. Each slice has its own diagnosis, risk profile, and strategy.

The methodology is organized into 6 phases:

  • Discovery and baseline. We build a verifiable picture of the current system.
  • Migration strategy. We define what changes, in what order, and through which approach.
  • Behavior validation. We create the testing and evidence network to compare the current system with the new one.
  • Implementation. We execute the change gradually and use early testing to adjust the strategy.
  • Cutover and rollout. We expand traffic with evidence, monitoring, and a prepared rollback.
  • Stabilization and handover. We close the cycle with operations, learning, documentation, and knowledge transfer.

The key is to treat each part of the system according to its risk. An accounting core, a batch process, a reporting module, and an API gateway can coexist within the same system and require different strategies.

A slice with little evidence first needs a safety net. A slice with tests, CI, observability, and rollback can move forward with more demanding and automated gates.
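As a rough sketch, each slice can carry its own evidence and derive its strategy from what is missing. The `Slice` class, the evidence labels, and the two strategy names below are hypothetical and chosen only to mirror the paragraph above.

```python
from dataclasses import dataclass, field

# Evidence a slice needs before automated gates make sense (assumed set).
REQUIRED_EVIDENCE = {"tests", "ci", "observability", "rollback"}

@dataclass
class Slice:
    name: str
    evidence: set = field(default_factory=set)

    def missing(self) -> set:
        """Evidence this slice still lacks."""
        return REQUIRED_EVIDENCE - self.evidence

    def strategy(self) -> str:
        """Pick the next step from the slice's own diagnosis."""
        return "automated gates" if not self.missing() else "safety net first"

# Invented slices within the same system.
slices = [
    Slice("accounting core", {"tests"}),
    Slice("reporting module", {"tests", "ci", "observability", "rollback"}),
]
plan = {s.name: s.strategy() for s in slices}
```

Here the accounting core gets "safety net first" and the reporting module gets "automated gates", so two parts of one system follow different paths based on their own evidence.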

This way, migration stops being a giant project that is hard to govern. Each part has a diagnosis, required actions, and evidence of progress.

Conclusion: AI-Powered Quality Engineering to Migrate with Control

The risk of a legacy migration is defined by the ability to verify behavior in the new environment. When the team can capture baselines, compare outputs, observe effects, and review differences with people who know the business, the migration moves forward with evidence.

AI expands that capacity: it helps explore codebases, document rules, prioritize scenarios, and generate tests. Control appears when Quality Engineering turns that output into verifiable evidence: characterization testing, regression, observability, performance, rollback, and acceptance criteria.

We are very excited about what AI makes possible in this field. Using it well also requires looking honestly at its limits. That’s why software quality needs to be at the center of the strategy: in diagnosis, planning, implementation, and rollout.

In critical systems, responsibility remains human. Modernizing with control means using AI to accelerate the work and using quality engineering to prove that the business still works as it should.

At Abstracta, we work at that intersection: AI-powered quality engineering, human experience, context-aware agents, observability, and validation methodologies.

Want to modernize critical systems with AI and Quality Engineering? Contact us

FAQs About AI-Powered Legacy Modernization

What Is AI-Powered Legacy Modernization?

AI-powered legacy modernization uses agents and language models to help teams understand, document, test, transform, or migrate legacy systems. In critical systems, its value depends on combining AI with validation, observability, and human review.


How Do You Validate a Legacy System Migration?

A legacy system migration is validated by comparing the behavior of the current system with the behavior of the migrated version. To do this, teams usually need functional tests, data validations, observability, performance baselines, rollback, and business review when unexpected differences appear.


What Is Characterization Testing in a Migration?

Characterization testing consists of capturing how an existing system behaves and using that behavior as a reference during the change. In a migration, it helps detect whether the new version preserves the behavior that matters to the business.


How Do AI Agents Help QA Teams During a Legacy Migration?

AI agents can help explore repositories, understand business rules, identify dependencies, generate test ideas, create documentation, and connect test execution with technical context such as logs, traces, database changes, and code paths.


About Abstracta

With nearly 2 decades of experience and a global presence, Abstracta is a technology company that helps organizations deliver high-quality software faster by combining AI-powered quality engineering with deep human expertise.

Our expertise spans across industries and complex delivery environments. That’s why we’ve built robust partnerships with industry leaders, such as Microsoft, Datadog, Tricentis, Perforce BlazeMeter, Sauce Labs, and PractiTest.
