Making AI work at scale requires coordination across systems, data flows, and controls, along with clear accountability for how decisions are made and acted upon

Imagine inheriting responsibility for a railway network that has been expanded continuously for 40 years without ever shutting down. Over time, new lines were added. Signalling systems were upgraded. Routing changes were introduced using better tools. But better tools never made it any easier to change the network without disrupting it.
This is because the real challenge was never the tools. It was understanding how the network had evolved, where its critical dependencies lay, how disruptions in one part could cascade across the system, and which changes could be made without compromising overall reliability. Enterprise IT environments are built in a similar way, conditioned by years of change that cannot simply be undone or rebuilt from scratch.
Yet much of the current AI conversation assumes that introducing newer tools or the latest GenAI capabilities can make such environments easier to modify overnight. That assumption has only grown stronger with recent demonstrations of GenAI tools that claim to read and modernise the legacy COBOL systems still running critical enterprise workloads.
What most of these narratives miss is that many of the most visible, “pathbreaking” AI demonstrations today take place in what are essentially greenfield environments, where workflows can be modified without constraint.
Enterprise modernisation, however, happens in brownfield systems: environments that have been in continuous use for years while being extended to support new products, higher volumes, and global operations. These systems were originally built to automate record-keeping across industries such as banking, insurance and payments, and were never replaced when digital channels emerged, only layered over. The underlying business logic remained, repeatedly modified and rarely documented, until much of it ended up buried in millions of disorganised lines of code.
Relearning that logic can take years and carries real risk, which is why many modernisation efforts slow down, narrow in scope, or settle for re-hosting and stack refreshes that cut infrastructure costs while leaving the underlying complexity largely untouched.
Claims that GenAI tools can independently modernise legacy environments may hold at the application layer. Translating those changes into production-ready systems, however, typically requires delivery capabilities that account for system-specific constraints across live enterprise environments, and those capabilities remain beyond the scope of standalone GenAI tools today.
This is because changes to legacy systems can only be carried out efficiently within tried-and-tested frameworks that account for runtime dependencies, upstream and downstream integrations, access controls, data governance policies, and compliance requirements. It is within this context that legacy systems can be examined more systematically as sources of insight. By analysing source code alongside configuration files, manuals, audit trails, and historical changes, delivery teams can reconstruct how systems behave and why.
Once comprehension is no longer the bottleneck, modernisation takes on a different shape. It no longer needs to be a risky, one-time exercise. Change can happen incrementally. Intelligence can be separated from execution and reused.
Core systems can be reshaped over time without destabilising what already works. Technical debt stops compounding by default because understanding is continuously refreshed rather than rediscovered.
This also reframes a common anxiety around AI: the fear of replacing engineers.
In high-stakes environments, no one is handing mission-critical systems over to autonomous decision-making overnight. What organisations are asking for instead is speed with reliability. They want to make steady progress on backlogs while running systems that no longer depend on institutional memory.
The deeper impact shows up when these AI capabilities extend beyond development into how systems are operated.
As testing, maintenance, and day-to-day operations become more automated, effort-based commercial models begin to weaken. When issues can be anticipated rather than reacted to, and systems stabilised instead of constantly fixed, pricing based on headcount or ticket volumes starts to lose relevance. Outcomes such as uptime, speed to market, reduced operational risk and business performance begin to matter more. This goes beyond incremental efficiency and changes how value is defined and assessed.
But this kind of progress does not show up through dramatic announcements. We have seen this pattern before with the internet boom, where early market and industry attention focussed on short-term volatility and visible disruption, while the deeper transformation unfolded gradually inside organisations.
AI is following the same arc.
Making AI work at scale, however, requires coordination across systems, data flows, and controls, along with clear accountability for how decisions are made and acted upon. Just as importantly, it requires thoughtful ways of combining automation with human judgement in environments where reliability and resilience cannot be compromised.
The bottom line is that AI is not here to simply write code faster or replace decades of software overnight. It is here to help organisations finally understand what they are running, and to give them a safer, measured path to evolve it.
(The author is the chief executive officer and managing director, Mphasis. Views are personal.)