Building a Taboola MCP Server for Claude: The Four-Resource Schema (Plus One Drift Tool) That Stops Site-Level CPL Hallucinations

TL;DR

  • A working MCP server for Taboola campaign reporting in Claude needs four separate resources (campaigns, sites, creatives, daily_performance), not one flat performance blob.
  • Taboola site_ids repeat across campaigns. Without a compound (campaign_id, site_id) key, Claude silently joins them and returns CPL numbers that look clean and are arithmetically meaningless.
  • Hand Claude a 40-row JSON and ask for week-over-week CPL deltas, and it rounds and reorders. A server-side get_drift_candidates tool that returns a pre-ranked list fixes it.
  • Ship v1 read-only. Pause and budget tools belong behind a two-step confirmation tool, not in the agent’s open hand.
  • The same schema extends to Outbrain Amplify and Newsbreak Ads. You rebuild OAuth and dimension names. The agent layer stays untouched.

Most agencies running native ads have already tried the obvious AI move. Export a Taboola Backstage report to CSV, paste it into Claude, ask which sites drifted on CPL this week. The answer looks crisp. It is also, usually, wrong. The numbers do not tie back to the API, the site rankings shift if you ask twice, and creative-level attribution comes out garbled.

That is not a prompting problem. It is a schema problem. Taboola’s reporting shape is campaign to site to creative to day, and that hierarchy does not survive a flat export. The fix is a small Model Context Protocol (MCP) server that exposes Backstage to Claude as four structured resources plus one drift tool, so the agent reads live data through a shape that mirrors how a buyer actually thinks.

This guide is for technical marketers and AI practitioners running real native ad spend who want to build or audit that server. The pattern is reusable. Once it works for Taboola, the same skeleton extends to Outbrain Amplify and Newsbreak Ads without rewriting the agent.

Why Pasting Taboola Exports Into Claude Produces Confidently Wrong CPL Analysis

Pasting a Taboola CSV into Claude breaks for three specific reasons, and none of them are fixable with a better prompt.

First, the hierarchy collapses. Taboola’s Backstage API (see Taboola’s developer docs) returns site performance and creative performance from different endpoints. When you export to CSV, the relationship between a creative and the site it ran on flattens into a single wide row, or worse, two unrelated sheets. Claude cannot reconstruct the join.

Second, site_ids repeat across campaigns. In practice, the same publisher site shows up under Campaign A and Campaign B with the same site_id string, but it represents two different bid landscapes, two different audiences, and two different creative rotations. A flat export looks like one site with two rows. Claude treats it as one site and averages the CPL. The number it returns is arithmetically clean and operationally meaningless.

Third, Claude is bad at arithmetic across rows. Hand it a 40-row JSON payload of daily CPLs and ask for a week-over-week delta ranked by site, and it will silently round, reorder, and sometimes invent a row. Anthropic’s own guidance on tool use is consistent on this: when the math matters, the tool does the math, not the model.

Why better prompting cannot fix a schema problem

You can prompt around one of these failures at a time. You cannot prompt around all three at once. The data shape coming in is wrong, and no instruction will reconstruct a hierarchy that was destroyed in the export step. The only fix is to stop exporting and let Claude read the API directly through a schema that preserves the relationships.

What MCP Actually Is, and Why Resources vs Tools Maps Cleanly to Ad Reporting

Key Concept: Model Context Protocol (MCP) is an open standard from Anthropic that lets an AI agent read structured data and call functions on an outside system through one consistent interface. It has three primitives: resources (data Claude can query), tools (functions Claude can run), and prompts (reusable templates).

The split between resources and tools is the architecture decision that decides whether your agent hallucinates. Resources are read-only and stable. The agent queries them like rows in a database. Tools are computed, parameterized, and run on demand. The agent calls them like functions.

Ad platform reporting maps onto this split cleanly. Campaign metadata, site lists, creative inventories, daily numbers, those are resources. Drift detection, alerting, pausing a site, those are tools. Put everything in one bucket and the agent gets confused about when to compute and when to look up. Split them correctly and behavior gets predictable.

Anthropic’s MCP documentation covers the protocol itself. What it does not cover is which side of the line each ad platform concept belongs on. That is the work below.

Figure: The MCP server for Taboola campaign reporting in Claude, step by step.

The Four-Resource Schema: Campaigns, Sites, Creatives, Daily Performance

The Taboola MCP server exposes four resources, each with a compound key. The keys are the whole point.

| Resource | Key | Backstage endpoint (map to your account’s reporting routes) | Cache strategy |
| --- | --- | --- | --- |
| campaigns | campaign_id | campaigns list endpoint | Cache 1 hour |
| sites | (campaign_id, site_id) | campaign-by-site breakdown report | Cache 15 minutes |
| creatives | (campaign_id, item_id) | campaign-by-item breakdown report | Cache 15 minutes |
| daily_performance | (campaign_id, date) | campaign summary report, grouped by day | Live, no cache |

Each resource wraps one Backstage reporting route. Confirm the exact paths against your account’s Backstage API reference before wiring them up, since dimension names and report routes can vary by account configuration. Authentication uses whichever Backstage auth flow your account is provisioned for (typically a server-to-server token exchange). The server holds the credentials in environment variables, refreshes the access token on its own, and never lets the token enter the model context.

The site_id collision problem in one paragraph

Taboola site_ids are stable identifiers for publisher placements, but they are not analytically unique across campaigns. The site 1234567 running under your insurance lead-gen campaign at a $4.20 bid is a different thing than the same site 1234567 running under your auto warranty campaign at a $1.80 bid. Different audience slice, different creative mix, different CPL profile. If you expose a single sites resource keyed on site_id alone, Claude will see the duplicate and collapse them. The number it returns will look authoritative. It will not be real.

The compound key (campaign_id, site_id) forces every site row to be addressable only within its campaign context. The agent cannot join across campaigns by accident because the keys do not match. Same logic for creatives: (campaign_id, item_id), never just item_id.
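A minimal sketch of the collision, in Python. The class names (SiteKey, SiteRow), the campaign labels, and the spend and lead figures are all illustrative assumptions, not Backstage field names or real data. It shows what happens when the same site_id is indexed with and without its campaign.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SiteKey:
    """Hypothetical compound key: a site is only addressable inside a campaign."""
    campaign_id: str
    site_id: str


@dataclass
class SiteRow:
    key: SiteKey
    spend: float
    leads: int

    @property
    def cpl(self) -> Optional[float]:
        # Cost per lead; None when there are no leads to divide by.
        return self.spend / self.leads if self.leads else None


# Illustrative numbers only: the same site_id under two campaigns.
rows = [
    SiteRow(SiteKey("insurance_leadgen", "1234567"), spend=420.0, leads=10),
    SiteRow(SiteKey("auto_warranty", "1234567"), spend=180.0, leads=12),
]

# Keyed on site_id alone, the second row silently overwrites the first.
by_site_only = {row.key.site_id: row for row in rows}

# Keyed on (campaign_id, site_id), both rows survive and stay separate.
by_compound = {row.key: row for row in rows}

# The "clean but meaningless" number a flat view invites: one blended CPL.
blended_cpl = sum(r.spend for r in rows) / sum(r.leads for r in rows)

print(len(by_site_only), len(by_compound), round(blended_cpl, 2))
```

The blended figure sits between the two real CPLs and describes neither campaign. That is the number a buyer would act on if the key were site_id alone.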

Which resources hit the API live, and which cache

Campaign metadata changes slowly. Cache it for an hour. Site and creative breakdowns change throughout the day as spend rolls in. Fifteen-minute cache is a reasonable floor, longer if you are rate-limited. Daily performance is what the operator is actually asking about, so that one hits the API live every time. The cache layer lives in the server, not in Claude. The agent sees daily numbers live and breakdowns that are at most one cache window old. The API just does not get hammered.
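One way to hold those cache windows in the server is a small per-resource TTL cache. This is a sketch under assumptions: ResourceCache and the fetch callable are hypothetical names, and the TTL values simply restate the hour, fifteen-minute, and live settings described above.

```python
import time
from typing import Any, Callable, Dict, Tuple

# Seconds to keep each resource; 0 means always fetch live.
TTL_SECONDS = {
    "campaigns": 3600,
    "sites": 900,
    "creatives": 900,
    "daily_performance": 0,
}


class ResourceCache:
    """Hypothetical in-process cache that sits between the MCP layer and the API."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: Dict[Tuple[str, tuple], Tuple[float, Any]] = {}

    def get(self, resource: str, key: tuple, fetch: Callable[[], Any]) -> Any:
        ttl = TTL_SECONDS[resource]
        now = self._clock()
        hit = self._store.get((resource, key))
        # Serve from cache only while the entry is younger than its TTL.
        if ttl > 0 and hit is not None and now - hit[0] < ttl:
            return hit[1]
        value = fetch()
        if ttl > 0:
            self._store[(resource, key)] = (now, value)
        return value
```

The clock is injectable so the expiry logic can be tested without sleeping. A production server would likely also want a size bound and eviction, which this sketch leaves out.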

If you have already built something similar for call tracking, the pattern echoes our Ringba MCP server build, which uses the same four-resource discipline for publisher-level call data.

The Fifth Piece: A get_drift_candidates Tool That Computes CPL Deltas Server-Side

This is the single highest-leverage choice in the build.

The operator’s real question is not “show me daily performance.” It is “which sites or creatives spiked on CPL this week versus last week, ranked by how much they moved the campaign.” Asking Claude to compute that across a 40-row JSON is exactly the failure mode the model is worst at.

The fix is a tool named get_drift_candidates that takes a campaign_id, a current date range, a comparison date range, and a threshold. The server pulls both windows, does the math, and returns a ranked list. Claude receives an already-sorted answer and reports it.

Operator Note: Once get_drift_candidates exists, the model’s job collapses from “do arithmetic and rank” to “narrate the ranked list and call out the top three.” That is what Claude is actually good at. Reliability jumps the moment you move the math.

The tool implements three formulas:

  • Week-over-week CPL delta = (CPL_current − CPL_prior) ÷ CPL_prior
  • Site contribution to drift = (site_spend_current ÷ total_campaign_spend_current) × (site_CPL_current − site_CPL_prior) (spend-weighted CPL move, so the site contributions sum to roughly the campaign’s CPL drift)
  • Creative fatigue signal = CTR_last_7_days ÷ CTR_first_7_days (a ratio below 0.7 is a common native-fatigue working threshold)

The return shape is a list of objects, each with site_id, campaign_id, prior CPL, current CPL, absolute delta, percentage delta, and contribution score. The list comes back sorted by contribution descending. The agent does not sort, does not round, does not aggregate. It reads and reports.
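The three formulas and the sorted return shape can be sketched as follows. Several things here are assumptions: the Window type, the function signatures, and the choice to pass in pre-aggregated spend and leads per site instead of date ranges (a real tool would pull both windows from the API first). The threshold is treated as a minimum absolute percentage move, which is one reasonable reading, not the only one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Window:
    """Spend and leads for one site over one date window."""
    spend: float
    leads: int


def cpl(w: Optional[Window]) -> Optional[float]:
    """Cost per lead, or None when there is nothing to divide by."""
    if w is None or w.leads <= 0:
        return None
    return w.spend / w.leads


def get_drift_candidates(
    campaign_id: str,
    current: Dict[str, Window],
    prior: Dict[str, Window],
    threshold: float,
) -> List[dict]:
    """Return sites whose CPL moved by at least `threshold`, ranked by contribution."""
    total_spend = sum(w.spend for w in current.values())
    candidates = []
    for site_id, cur in current.items():
        cur_cpl, prior_cpl = cpl(cur), cpl(prior.get(site_id))
        # Skip sites with no usable baseline; they cannot have a delta.
        if cur_cpl is None or not prior_cpl or total_spend <= 0:
            continue
        abs_delta = cur_cpl - prior_cpl
        pct_delta = abs_delta / prior_cpl
        if abs(pct_delta) < threshold:
            continue
        candidates.append({
            "campaign_id": campaign_id,
            "site_id": site_id,
            "prior_cpl": prior_cpl,
            "current_cpl": cur_cpl,
            "abs_delta": abs_delta,
            "pct_delta": pct_delta,
            # Spend share times CPL move.
            "contribution": (cur.spend / total_spend) * abs_delta,
        })
    # Sort on the server so the model never has to.
    candidates.sort(key=lambda row: row["contribution"], reverse=True)
    return candidates


def creative_fatigue_ratio(ctr_first_7_days: float, ctr_last_7_days: float) -> Optional[float]:
    """CTR_last_7_days / CTR_first_7_days; None when the baseline CTR is zero."""
    if ctr_first_7_days <= 0:
        return None
    return ctr_last_7_days / ctr_first_7_days
```

Sites that are new in the current window have no prior CPL and are dropped here. Whether to surface them separately is a design choice the sketch does not make.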

Why a tool and not a resource: the call is parameterized (date ranges, threshold) and computed on demand. Resources are for stable queryable data. Tools are for computed answers. Get this split wrong and Claude will try to recompute drift itself off the daily_performance resource, which is exactly the failure you were trying to avoid.

Wiring It Into Claude Desktop and Claude Code Without Leaking Credentials

Two wiring paths, depending on who is using the server.

For an analyst working in chat mode, use Claude Desktop. Edit claude_desktop_config.json and register your server under mcpServers with the command to start it and any non-secret environment settings it needs. Restart the app. The four resources show up as queryable, the drift tool shows up as callable.

For an engineer iterating on the server itself, use Claude Code with a project-level .mcp.json. That file lives in the repo, gets versioned, and lets you test the server in agent mode while you build it. Anthropic’s MCP quickstart covers both config formats.
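Both files share the same top-level mcpServers shape. The sketch below builds that entry in Python and prints it as JSON. The server name taboola, the module name taboola_mcp_server, and the TABOOLA_MCP_LOG_LEVEL setting are hypothetical. The guard that rejects secret-looking keys is an illustrative design choice, not part of either config format.

```python
import json
from typing import Dict, List, Optional

# Substrings that suggest a value should not be written to a config file.
SECRET_MARKERS = ("SECRET", "TOKEN", "PASSWORD")


def build_mcp_config(
    server_name: str,
    command: str,
    args: List[str],
    env: Optional[Dict[str, str]] = None,
) -> dict:
    """Build an mcpServers entry, refusing env keys that look like credentials."""
    env = dict(env or {})
    leaked = [k for k in env if any(marker in k.upper() for marker in SECRET_MARKERS)]
    if leaked:
        raise ValueError(f"Refusing to write secret-looking keys to config: {leaked}")
    entry: dict = {"command": command, "args": list(args)}
    if env:
        entry["env"] = env
    return {"mcpServers": {server_name: entry}}


# Hypothetical server module and setting names, for illustration only.
config = build_mcp_config(
    "taboola",
    "python",
    ["-m", "taboola_mcp_server"],
    {"TABOOLA_MCP_LOG_LEVEL": "info"},
)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved as a project-level .mcp.json or merged into claude_desktop_config.json. How the real credentials reach the server process is left to your deployment.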

Where Backstage credentials actually live

In environment variables on the machine running the MCP server. Not in claude_desktop_config.json itself if you can avoid it. Not in the model context, ever. The server reads TABOOLA_CLIENT_ID and TABOOLA_CLIENT_SECRET at startup, exchanges them for an access token against Taboola’s auth endpoint, and refreshes the token on a timer. Claude only sees the responses, never the credentials.

If the token expires mid-conversation, the server’s HTTP client should catch the 401, refresh, and retry once. The agent should not know this happened.
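A sketch of the refresh-and-retry-once behavior. To stay self-contained it takes two injected callables, fetch_token and send, which stand in for the real auth call and HTTP request. Both are hypothetical, as is the BackstageClient name. It also swaps the timer for a lazy expiry check before each request, which is a simpler variant of the same idea.

```python
import os
import time
from typing import Callable, Optional, Tuple

# fetch_token(client_id, client_secret) -> (access_token, ttl_seconds)
FetchToken = Callable[[str, str], Tuple[str, float]]
# send(path, access_token) -> (http_status, parsed_body)
Send = Callable[[str, str], Tuple[int, dict]]


class BackstageClient:
    """Hypothetical client wrapper; the token never leaves this object."""

    def __init__(self, client_id: str, client_secret: str,
                 fetch_token: FetchToken, send: Send,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._client_id = client_id
        self._client_secret = client_secret
        self._fetch_token = fetch_token
        self._send = send
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    @classmethod
    def from_env(cls, fetch_token: FetchToken, send: Send) -> "BackstageClient":
        # Credentials come from the server's environment, never from the model.
        return cls(os.environ["TABOOLA_CLIENT_ID"],
                   os.environ["TABOOLA_CLIENT_SECRET"], fetch_token, send)

    def _refresh(self) -> None:
        token, ttl = self._fetch_token(self._client_id, self._client_secret)
        self._token = token
        self._expires_at = self._clock() + ttl

    def get(self, path: str) -> dict:
        if self._token is None or self._clock() >= self._expires_at:
            self._refresh()
        status, body = self._send(path, self._token)
        if status == 401:
            # Token rejected early: refresh and retry exactly once.
            self._refresh()
            status, body = self._send(path, self._token)
        if status != 200:
            raise RuntimeError(f"Backstage request failed with status {status}")
        return body
```

Only the parsed body is returned to the MCP layer, so neither the credentials nor the token can end up in a tool response by accident.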

Rate limits and what to cache

Backstage rate limits are reasonable but not generous. Cache campaign metadata aggressively. Paginate site and creative breakdowns. Let daily_performance hit live, because that is the resource the operator is actually asking about and stale numbers there are worse than slow numbers.

Read-Only by Default: Where Human Review Belongs

Quick Win: Ship v1 of the server with zero write tools. No pause_site. No update_budget. No kill_creative. Read-only servers cannot cause a Monday morning incident.

The temptation to expose write actions is real. Claude finds a fatigued creative, you want it to kill the creative. But an agent that can change spend without a human in the loop is a liability waiting for a bad inference. Models still hallucinate function-call arguments. They still occasionally misread a date range. The blast radius of a misfire is paused spend on a campaign that was actually fine.

If you do expose write tools in v2, gate each one behind a confirmation tool. The pattern: propose_pause_site returns a summary of what would change, the human reads it in chat, the human calls confirm_pause_site with the proposal ID, and only then does the actual write fire. Two steps, one human in between.
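The two-step gate can be sketched like this. PauseSiteGate and the injected do_pause callable are hypothetical names. The actual write against the ad platform is deliberately left as a stub you supply.

```python
import uuid
from typing import Callable, Dict, Tuple


class PauseSiteGate:
    """Hypothetical propose/confirm pair: nothing is written until confirm runs."""

    def __init__(self, do_pause: Callable[[str, str], None]) -> None:
        self._do_pause = do_pause
        self._pending: Dict[str, Tuple[str, str]] = {}

    def propose_pause_site(self, campaign_id: str, site_id: str) -> dict:
        # Record the intent and describe it; do not touch the platform.
        proposal_id = uuid.uuid4().hex
        self._pending[proposal_id] = (campaign_id, site_id)
        return {
            "proposal_id": proposal_id,
            "summary": (f"Would pause site {site_id} in campaign {campaign_id}. "
                        "Nothing has changed yet."),
        }

    def confirm_pause_site(self, proposal_id: str) -> dict:
        # Each proposal can be confirmed once; unknown IDs are rejected.
        target = self._pending.pop(proposal_id, None)
        if target is None:
            raise KeyError("Unknown or already used proposal_id")
        campaign_id, site_id = target
        self._do_pause(campaign_id, site_id)
        return {"status": "paused", "campaign_id": campaign_id, "site_id": site_id}
```

The sketch does not enforce who calls confirm_pause_site. Keeping that call in the human's hands, for example by requiring the proposal ID to be typed in chat, is a policy you have to add around it. Expiring stale proposals would be a sensible next step.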

Three decisions stay human-owned regardless of how mature the agent gets. Budget changes during high-stakes windows like open enrollment or peak season. Site pauses on any campaign with active learning. Creative kills on items that just hit a fatigue threshold but might still recover with a refresh. The agent surfaces the candidate. The buyer decides.

Extending the Same Pattern to Outbrain and Newsbreak Without Rewriting the Agent

The payoff for this architecture is that swapping platforms does not break the agent.

What carries over from Taboola: the four-resource shape, the compound-key discipline, the drift tool’s math, the read-only-by-default posture. An Outbrain MCP server has campaigns, sections (their word for sites), creatives, and daily performance. A Newsbreak server has the same four. The keys are still compound. The drift tool still does the math server-side.

What you rebuild per platform: the auth flow (Outbrain’s token exchange differs from Taboola’s, Newsbreak’s API is younger and less consistent), the dimension naming (Outbrain section_id instead of site_id), and the rate-limit handling. None of that touches the agent layer. Claude’s prompts, the resources it knows about, the tool signatures, all stay identical from the agent’s point of view.
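One way to keep the agent layer unchanged is to normalize each platform's dimension names at the server boundary. The mapping below is a sketch: DIMENSION_MAP and normalize_rows are hypothetical, and only the site_id and section_id names come from the text above. Confirm the real field names against each platform's API reference.

```python
from typing import Dict, List

# Raw platform field name -> canonical name the agent always sees.
DIMENSION_MAP: Dict[str, Dict[str, str]] = {
    "taboola": {"site_id": "site_id"},
    "outbrain": {"section_id": "site_id"},
}


def normalize_rows(platform: str, campaign_id: str, rows: List[dict]) -> List[dict]:
    """Rename platform dimensions and stamp every row with its campaign_id."""
    mapping = DIMENSION_MAP[platform]
    normalized = []
    for row in rows:
        renamed = {mapping.get(field, field): value for field, value in row.items()}
        # The compound key travels with the row regardless of platform.
        renamed["campaign_id"] = campaign_id
        normalized.append(renamed)
    return normalized
```

With this in place, a row from either platform reaches the agent carrying the same (campaign_id, site_id) pair, so the drift tool and the prompts built on it do not need to know which network the data came from.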

A quick comparison of the build-it-yourself path versus the off-the-shelf options:

| Option | Drift math server-side | Compound keys | When it wins |
| --- | --- | --- | --- |
| Taboola’s realize-mcp | No | No | You want a starting point and will fork it |
| Supermetrics hosted MCP | No, flat schema | No | You want managed auth and accept hallucination risk |
| Custom build | Yes | Yes | You run native at scale and need reliable analysis |

The official Taboola repo is a useful skeleton, but it does not solve the site_id collision problem and it does not pre-compute deltas. Supermetrics’ connector handles auth nicely and reintroduces the flat-schema problem on the way out. The custom build is the only path that closes both gaps.

Frequently Asked Questions

What is MCP, and why does it matter for a native ads buyer?

Model Context Protocol is an open standard from Anthropic that lets Claude read structured data and call functions on an outside system through one consistent interface. For a native ads buyer, it means the agent can query live Backstage data instead of working off stale CSV exports. The shape of the data the server exposes is what decides whether the agent’s answers are reliable.

Why does Claude round numbers when I paste a Taboola export?

Claude rounds and reorders when asked to compute math across roughly 30 or more rows of JSON because numerical reasoning over long structured payloads is a known model weakness. Anthropic’s tool use documentation recommends pushing arithmetic to a function rather than asking the model to compute. A server-side get_drift_candidates tool removes the failure by handing the model a finished, ranked answer instead of raw rows.

Should I build my own server or fork Taboola’s realize-mcp?

Fork realize-mcp as a starting point for auth and endpoint wrappers, then add compound keys and the drift tool yourself. The official repo does not implement either, so the hallucination failure modes still show up out of the box. The custom layer is roughly a week of engineering work for a competent backend developer.

How do I keep Backstage credentials out of the model context?

Store your Backstage credentials as environment variables on the MCP server, exchange them for an access token at startup, and refresh on a timer. The model only sees API responses, never the credentials or the token. If a request returns 401 mid-conversation, the server catches it, refreshes, and retries silently.

Should I expose pause_site and budget tools in v1?

No. Ship v1 strictly read-only. Write tools that change spend are a Monday morning incident waiting for a bad inference, and the operational cost of a wrongly paused site outweighs the convenience. If you add write tools later, gate each one behind a confirmation tool that requires a human to approve a generated change summary.

How do I extend the same server to Outbrain and Newsbreak?

Reuse the four-resource shape, compound keys, and drift tool math. Rebuild only the auth flow, dimension names, and rate-limit handling per platform. Because the agent layer only sees resources and tools, the Claude prompts and skills you have built do not change. Swapping platforms becomes a server-side job, not an agent rewrite.

Talk to Elevarus About Your AI Marketing Workflow

The through-line is simple. A four-resource schema with compound keys stops Claude from collapsing site_ids that should not be joined. A server-side drift tool stops Claude from doing arithmetic it is bad at. Read-only by default keeps the agent from causing incidents while you build trust in its outputs. Together, those three choices turn a plausible summarizer into a reliable analyst.

If you are running real spend on Taboola, Outbrain, or Newsbreak and want to build this server, audit an existing one, or have us architect it end to end, book a free consultation with Elevarus. We will walk through the schema, the integration path, and where human review still belongs in the workflow.



SHANE MCINTYRE

Founder & Executive with a Background in Marketing and Technology | Director of Growth Marketing.