
Most enterprise AI programs hit the same wall around month nine.
The pilots worked. The leadership deck looked great. A small team built something genuinely impressive — maybe a RAG system over policy documents, maybe an agent that drafts responses to customer complaints. Then the organization tried to scale it. And suddenly the same RAG pipeline was being built six times across six departments, each with its own evaluation harness, its own prompt patterns, and its own quietly different definition of "production-ready."
The fix isn't another platform. It's an operating model.
The pattern that consistently works is a three-layer structure: a Center of Excellence (CoE) that builds the foundation, business functional units that translate it, and embedded AI pods that execute against real outcomes. Each layer has a job that the others can't do well. When one of them tries to absorb another's responsibilities, the whole thing slows down.
Here's how the layers actually divide the work.
The CoE's role is to build, govern, and evolve the enterprise AI system. It is not a delivery team. The moment a CoE starts shipping use cases on behalf of business units, it stops being able to do its actual job — which is to make everyone else's delivery faster and safer.
Strategy and governance. The CoE owns the enterprise AI strategy and translates it into policies, risk standards, and compliance guardrails. This is where decisions get made about what model usage is acceptable, what data can flow where, and what ethical lines the organization won't cross. Without this layer, every team negotiates these questions from scratch — usually in the middle of a launch.
Platforms and infrastructure. Shared AI platforms — data infrastructure, MLOps tooling, model hosting — live here. The goal is a scalable, secure, production-ready environment that any pod in the company can build on without reinventing the substrate.
Canonical artifacts and standards. This is the layer most organizations underinvest in, and it's the one with the highest leverage. The CoE develops and maintains reusable assets: agent frameworks, skills libraries, sub-agent architectures, scaffolding, and the standards that govern how things get built, evaluated, and deployed. These artifacts are how the CoE scales its expertise without having to be in every meeting.
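To make "canonical artifact" concrete, here is a minimal sketch of what a versioned skills library might look like. Everything in it is hypothetical: the Skill shape, the SkillRegistry, and the stubbed summarize skill are invented for illustration, not a reference to any particular framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical shape for a CoE-published skill: a named, versioned
# capability with one entry point that pods can compose into agents.
@dataclass(frozen=True)
class Skill:
    name: str
    version: str
    description: str
    run: Callable[[str], str]  # input text -> output text, kept simple here

class SkillRegistry:
    """Canonical skills library: the CoE owns the registry, pods consume it."""

    def __init__(self) -> None:
        self._skills: Dict[str, Skill] = {}

    def register(self, skill: Skill) -> None:
        self._skills[f"{skill.name}@{skill.version}"] = skill

    def get(self, name: str, version: str) -> Skill:
        return self._skills[f"{name}@{version}"]

# The CoE publishes a default skill; the model call is stubbed because
# provider choice is a platform decision, not a skill decision.
registry = SkillRegistry()
registry.register(Skill(
    name="summarize",
    version="1.0",
    description="Summarize a document to roughly three sentences.",
    run=lambda text: f"[summary of {len(text)} chars]",  # stub, no real model
))

print(registry.get("summarize", "1.0").run("some long policy document..."))
```

The versioned lookup is the leverage: the CoE can ship summarize@2.0 without silently breaking pods pinned to 1.0, which is exactly the property that lets one team's expertise scale without meetings.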
Vendor and model landscape. Someone has to decide which models, vendors, and agentic tools are approved for enterprise use, and someone has to track what's emerging that the company should adopt next. That work doesn't belong inside any single business unit.
Evaluation and observability. The CoE defines evaluation methodologies and benchmarks, and provides the monitoring, logging, and tracing tooling that makes production AI legible. If every team invents its own evals, you can't compare anything to anything.
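As a sketch of what a shared evaluation harness can mean in practice: one loop, one record shape, any grader. The EvalCase fields, the run_eval signature, and the exact-match grader below are assumptions for illustration; the property worth copying is that every pod emits comparable records.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    case_id: str
    input: str
    expected: str

def run_eval(
    system: Callable[[str], str],         # the AI system under test
    cases: List[EvalCase],
    grader: Callable[[str, str], float],  # (output, expected) -> score in [0, 1]
) -> List[dict]:
    # Same loop and same record shape for every team, so results
    # from different pods can actually be compared.
    records = []
    for case in cases:
        output = system(case.input)
        records.append({
            "case_id": case.case_id,
            "score": grader(output, case.expected),
            "output": output,
        })
    return records

# Toy usage: a stubbed system and an exact-match grader.
cases = [EvalCase("c1", "What is our refund window?", "30 days")]
print(run_eval(lambda q: "30 days", cases, lambda o, e: float(o == e)))
```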
Capability enablement. Training, expert support, and unblocking complex problems for business units. The CoE earns its keep here — not by gatekeeping, but by being the team people genuinely want to call.
If the CoE is the foundation, functional units are the connective tissue. Their job is to adapt CoE capabilities to a specific business context — finance, supply chain, customer operations, legal — and drive adoption at scale within that domain.
Use case ownership and prioritization. Functional units identify, prioritize, and fund AI use cases. They define what success looks like in business terms — not "model accuracy" but "reduction in claims handling time" or "first-contact resolution rate." This ownership matters because it forces accountability for outcomes, not outputs.
Artifact adaptation and integration. This is where the canonical artifacts from the CoE get domain-specific. The function takes the standard agent framework and customizes it for, say, regulated financial workflows. It extends the skills library with domain-specific skills. Critically, it ensures consistency across pods within its domain — so two teams in supply chain aren't building incompatible versions of the same thing.
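As an illustration of adapt-rather-than-fork, here is a hedged sketch in which a functional unit extends a CoE base agent with one domain guardrail. CoEAgent, FinanceAgent, and the blocked-terms rule are hypothetical stand-ins, not any real framework.

```python
class CoEAgent:
    """Stand-in for the canonical agent framework the CoE publishes."""

    def respond(self, prompt: str) -> str:
        return self._call_model(prompt)

    def _call_model(self, prompt: str) -> str:
        return f"[model response to: {prompt}]"  # stubbed model call

class FinanceAgent(CoEAgent):
    """Functional-unit extension: same contract, plus a domain guardrail."""

    BLOCKED_TERMS = ("guaranteed return", "insider")

    def respond(self, prompt: str) -> str:
        draft = super().respond(prompt)
        # The domain rule the CoE can't know about: regulated-advice
        # screening runs before anything reaches a client-facing workflow.
        if any(term in draft.lower() for term in self.BLOCKED_TERMS):
            return "[escalated to compliance review]"
        return draft

print(FinanceAgent().respond("Draft a note about portfolio performance."))
```

The contract stays the CoE's; only the guardrail belongs to the function. That is what keeps two finance pods compatible with each other, and with everyone else.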
Pod enablement and coordination. Functional units guide the AI pods inside their domain on how to use the standards, tools, and architectures. They coordinate across pods to avoid duplicated work and conflicting approaches.
Adoption and change management. Building the model is the easy part. Embedding it into real business processes — getting the underwriters to actually use the assistant, getting the support team to trust the agent's drafts — is where most AI value is won or lost. This belongs to the function, because only the function has the relationships and context to drive it.
Feedback loop to the CoE. Functional units are the CoE's most important customers. They provide structured feedback on what's working and what isn't in the artifacts, standards, tools, and infrastructure. Without this loop, the CoE drifts into building things nobody uses.
Training and adoption monitoring. Role-specific training, tracking actual usage of approved tools and patterns, and surfacing gaps where teams are going off-script — all sit at the functional layer.
Pods are where AI solutions actually get built and run. They are small, embedded, cross-functional teams — typically a mix of ML/AI engineers, product, and domain experts — sitting close enough to the business to feel its problems directly.
End-to-end solution delivery. Pods translate a business problem into a working AI solution: build, test, deploy. They own the model, the agent, or the workflow from problem statement to production.
Data, modeling, and fine-tuning. Preparing and managing domain-specific data, choosing the right modeling technique for the problem — prompting, RAG, fine-tuning — and applying it. The discipline here is matching the technique to the problem rather than reaching for the most sophisticated tool available.
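One way to picture that discipline is as a triage heuristic. The function below is a rough sketch, not a rule book; the inputs and branch order are invented for illustration, but the shape of the reasoning, choosing the technique from the problem's properties, is the point.

```python
# Rough triage, assuming a pod asks these questions before building anything.
def choose_technique(
    needs_private_knowledge: bool,  # answers depend on internal documents
    knowledge_changes_often: bool,  # the underlying corpus is volatile
    needs_consistent_style: bool,   # strict output format or domain voice
) -> str:
    if needs_private_knowledge and knowledge_changes_often:
        return "RAG"           # retrieve at query time; no retraining on updates
    if needs_consistent_style and not knowledge_changes_often:
        return "fine-tuning"   # bake stable behavior into the weights
    return "prompting"         # cheapest to build, evaluate, and discard

# A claims-handling assistant over a living policy corpus:
print(choose_technique(True, True, False))  # -> RAG
```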
Integration and deployment. Wiring the AI into the applications, systems, and workflows where work actually happens. A model that lives in a notebook is a demo; a model integrated into the case management system is a product.
Evaluation and performance tracking. Pods apply the CoE's evaluation frameworks and measure both technical performance (accuracy, latency, cost) and business performance (the metrics the function defined upfront). Both matter. Either one alone is misleading.
Monitoring and iteration. Production AI degrades. Models drift, data distributions shift, edge cases emerge. Pods own the ongoing work of monitoring, handling errors, and continuously improving the system.
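A pod might catch quiet degradation with something as small as a distribution check. The sketch below compares a live window of quality scores against a reference window using the population stability index; it assumes scores normalized to [0, 1], and the 0.2 alert level is a common rule of thumb rather than a standard.

```python
import numpy as np

def psi(reference: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population stability index between two score distributions."""
    edges = np.linspace(0.0, 1.0, bins + 1)
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    cur_pct = np.histogram(current, bins=edges)[0] / len(current)
    # A small floor keeps empty bins out of the log and the division.
    ref_pct = np.clip(ref_pct, 1e-6, None)
    cur_pct = np.clip(cur_pct, 1e-6, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

rng = np.random.default_rng(0)
baseline = rng.beta(8, 2, 5000)  # the healthy score distribution at launch
drifted = rng.beta(5, 3, 5000)   # quality quietly shifting in production
print(round(psi(baseline, drifted), 3))  # > 0.2 usually warrants a look
```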
Collaboration and reuse. Pods use CoE-approved tools and artifacts rather than building from scratch, and they push learnings and reusable components back upstream. This upward flow is what keeps the canonical artifacts current and grounded in reality.
The three layers aren't just a tidy way to draw an org chart. Each one prevents a specific failure mode that the others can't catch.
A CoE without a functional layer becomes an ivory tower. Frameworks accumulate, nobody adopts them, and the gap between "what's available" and "what's actually used" widens until the CoE loses credibility — which is much harder to recover from than people realize.
A functional layer without pods becomes a planning function with nothing to translate. Roadmaps get written, governance gets refined, and nothing ships.
Pods without a CoE either reinvent infrastructure on every project or, worse, ship solutions that violate governance, can't be observed, and become production liabilities the moment something goes wrong.
All three layers are non-negotiable, because each one is the only thing standing between the organization and a specific way the program can fail.
Beyond the structural protection, a few things distinguish operating models that scale from ones that stall.
The first is separation of concerns without separation of context. The CoE doesn't ship use cases. Functional units don't build platforms. Pods don't define enterprise policy. But all three talk constantly — through the artifacts, the feedback loops, the training programs, and the shared evaluation language. Artifacts flow downward and feedback flows upward. The boundaries are about ownership, not isolation.
The second is artifacts as the unit of leverage. The CoE's reach is determined by how good its canonical artifacts are. A well-designed agent framework, an opinionated skills library, a default evaluation harness — these scale expertise in a way that meetings and office hours never can. If your CoE isn't producing artifacts that pods actually want to use, it's underperforming, regardless of how much governance documentation it's published.
The third is that the feedback loop is structural, not aspirational. Functional units provide structured feedback to the CoE; pods push reusable components upstream. This isn't a nice-to-have — it's how the system stays alive. A CoE that doesn't get challenged by real-world delivery becomes an ivory tower in about eighteen months.
The fourth is that business outcomes are owned at the right altitude. Pods own delivery, but functions own outcomes. This is the right split, because outcomes depend on adoption, process change, and organizational behavior — things pods can't drive alone.
The fifth, and the one that makes the whole thing worth the effort, is compounding. When the layers are working, every solution makes the next solution easier to build. Pods consume canonical artifacts instead of starting from scratch. Functional units accumulate domain-specific extensions that the next pod inherits. The CoE folds production lessons back into the foundation. The system gets cheaper, faster, and safer over time — which is the actual ROI of an operating model, and what no individual project can deliver on its own.
The most common failure mode is a CoE that tries to do everything — strategy, platforms, artifacts, and delivery. It feels efficient at first ("we have all the experts in one place") and then collapses under its own weight. The CoE becomes a bottleneck for every use case, business units lose ownership of outcomes, and the work that only the CoE can do — standards, evaluation, vendor strategy — gets crowded out by delivery pressure.
The opposite failure is a CoE that does only governance. Pure policy with no platform, no artifacts, and no enablement. Pods then build everything themselves, inconsistently, and the governance documents go unread.
The three-layer model works because it forces each layer to do what it's uniquely positioned to do. The CoE builds the system. Functional units make it relevant. Pods make it real. The foundation, the translation, the execution — each indispensable, none sufficient on its own.
That's the operating model. The hard part isn't designing it. It's having the discipline to keep the layers honest as the program grows.
Dr. Rohit Aggarwal is a professor, AI researcher, and practitioner. His research focuses on two complementary themes: how AI can augment human decision-making by improving learning, skill development, and productivity, and how humans can augment AI by embedding tacit knowledge and contextual insight to make systems more transparent, explainable, and aligned with human preferences. He has done AI consulting for many startups, SMEs, and publicly listed companies. He has helped many companies integrate AI-based workflow automation across functional units and has developed conversational AI interfaces that enable users to interact with systems through natural dialogue.