Inside Visa's AI Engine: Software Innovations That Process 76,000 Transactions Per Second

Updated: March 23 2025 21:27
In a presentation at NVIDIA GTC 2025, Sarah Laszlo, head of the machine learning platform at Visa, pulled back the curtain on how one of the world's largest payment processors has engineered its software infrastructure to thrive in the generative AI era. Visa's journey offers valuable insights for any organization looking to leverage AI while managing finite hardware resources and stringent data security requirements.

The Scale of Visa's Data Challenge

To appreciate Visa's AI innovations, we must first understand the massive scale at which they operate:

Processing up to 76,000 transactions per second
Operating at 99.99% availability
Serving over 4.5 billion Visa cards
Supporting 130+ million merchant locations across 200+ countries
Preventing more than $27 billion in fraud annually

This transaction volume creates a uniquely valuable dataset that spans decades. Unlike the text and image data that powers many AI systems, Visa's data is mixed-type, non-syntactic, and tabular - presenting both extraordinary opportunities and unique challenges.

AI Principles as the Foundation

At Visa, all AI work is guided by five core principles:

Security - Payment data safety is paramount
Control - Both Visa and users maintain control over data
Value - Data is used only when it benefits stakeholders
Fairness - Models must be accurate, explainable, and safe
Accountability - Every employee must uphold these principles

These principles, particularly the commitment to security, directly influence Visa's technology stack. To maintain the highest levels of data security, Visa handles sensitive data exclusively in their own data centers rather than in the cloud. This creates a fundamental constraint: finite hardware resources that cannot be frequently refreshed or easily scaled.

Three Software Innovations That Power Visa's AI

Facing these constraints, Visa has developed innovative software solutions to maximize the performance of their GPU fleet. Laszlo highlighted three key innovations:

1. Virtual GPU (VGPU) Solution

Visa noticed that many users, particularly those running Jupyter notebooks, would claim an entire GPU but use only a fraction of its capacity. This led to a situation where teams would report being unable to get GPU resources despite their overall GPU quota never being fully utilized.

The solution was implementing virtual GPUs, allowing users to provision as little as one-tenth of a physical GPU. This increased the number of notebooks that could run on GPUs by a factor of 10, dramatically improving utilization. Technically, this was achieved by:

Using a fork of the open-source Volcano library
Implementing a Kubernetes-compatible mechanism to isolate memory and compute between processes
Creating a transparent experience where users could select VGPU instances the same way they would select any other instance type

2. Ray Framework Adoption

Visa observed friction at the boundaries between different tools in their AI/ML workflow. Data scientists were using separate tools for data conditioning (Spark), model training (Dask), and model serving (various options).
To streamline this process, Visa adopted the Ray framework across their entire ML stack:

Ray Data for data operations
Ray Train for distributed model training
A custom variant of Ray Serve (called ACE Serve) for low-latency model serving

The results were remarkable, particularly for XGBoost models, which train approximately 8x faster when distributed over a pod of GPUs. What used to take hours now takes minutes, significantly improving the development workflow.

3. Performance Controller with Systalyze

The third innovation addresses the challenge of training foundation-scale models on Visa's unique transaction data. Unlike image or language models where the hyperparameter space is well understood, transaction foundation models represent uncharted territory.

In collaboration with MIT-founded startup Systalyze, Visa implemented a performance controller that:

Manages the rapidly changing state-of-the-art software environment
Helps keep hardware optimized for best performance
Provides guidance on hyperparameter optimization

In a proof of concept, this solution delivered an 8x improvement in training iteration time for large language models compared to base Kubernetes on Q-bray. Over a year, this translates to millions of dollars in GPU hour savings.

Visa's AI Journey

Visa's approach to AI has evolved through distinct eras:

1990s: Pioneering statistical and machine learning models for risk and fraud
2010s: Expanding into gradient boosting and deep learning (the "predictive AI era")
2020s: Early adoption of generative AI techniques
Future: Already exploring small language models and quantum ML

The current generative AI era builds on Visa's unique data advantages. Even a few transactions can reveal significant patterns about consumer behavior. For example, knowing that someone bought premium coffee and concert tickets at midnight provides insights into their preferences and lifestyle.

By building foundation models on this transaction data, Visa can create personalized representations of cardholders without exposing their raw personal information - enabling better fraud detection and personalized services while maintaining privacy.

Visa's journey offers several important takeaways for organizations building AI infrastructure:

Guide technology with principles: Visa's AI principles directly shape their infrastructure decisions.
Innovate to maximize finite resources: When you can't easily add hardware, software innovation becomes essential.
Eliminate friction between tools: Moving to a unified framework (Ray) reduced development time and complexity.
Learn from missteps: Laszlo shared three approaches they've deprecated - automating gaps between tools instead of unifying tools, using Dask instead of Ray, and maintaining a GUI for feature engineering that constantly required backend updates.
Balance on-premises and cloud: While keeping sensitive data on-premises, Visa is strategically exploring cloud options for less sensitive workloads.

Visa's experience demonstrates that even with hardware constraints and strict security requirements, thoughtful software innovation can unlock tremendous AI capabilities. As Sarah Laszlo concluded, even infrastructure work should be guided by principles, and collaboration with open source, academic, and startup communities can help organizations overcome the unique challenges they face in the AI era.

Inside Visa's AI Engine: Software Innovations That Process 76,000 Transactions Per Second

China MagicBot: The Collaborative Humanoid Robot Supports Industrial Automation

Robotic Foundation Models: Transforming How Robots Learn and Adapt

OpenAI Rolls Out Internal Knowledge Feature in ChatGPT Team for Enterprise Search Solutions

Who’s Using AI the Most? Surprising Trends from Anthropic Claude 3.7 Sonnet’s Usage Data

Alibaba Cloud Announced Open Source Qwen2.5-Omni-7B Multimodal AI with End-to-End Capabilities

OpenAI Adopts Anthropic's Model Context Protocol (MCP) for Enhanced Connectivity

Microsoft's New AI Agents Researcher and Analyst Set to Transform How We Work

Google DeepMind Unveiled Gemini 2.5 With Advanced Thinking Capabilities

OpenAI Unveils Revolutionary Image Generation in ChatGPT: A New Era for Visual AI

GTC 2025 Inside Boston Dynamics Humanoid Robotic Breakthroughs