Guides & Tutorials9 min read

Build Server-Side UTM Attribution with OpenClaw Workflow Automation

By ButterGrow Team

TL;DR

This hands on tutorial shows how to implement server side UTM and event attribution using OpenClaw, a compact Postgres schema, and a few small scripts. You will capture UTMs and click IDs on the first touch, persist them as sessions, join conversions to sessions, and optionally forward confirmed purchases to GA4. The result is a dependable foundation for reporting and optimization that avoids pixel fragility and gives your team control over data quality. We will reference OpenClaw workflow automation once so you can map the same pattern to other funnels.

What You Will Build

You will assemble a production ready attribution pipeline that captures UTMs and click IDs from web requests, normalizes them, stores them as sessions, and connects orders to those sessions. The pipeline is designed to be resilient to retries and late arriving events, supports upserts without losing state, and can emit server side conversions to analytics platforms. Along the way, you will see a minimal data model, an OpenClaw Playbook, a TypeScript transformer, and the SQL used to answer business questions like first touch and last non direct touch revenue.

Before starting, skim the AI marketing automation features on the ButterGrow site at what ButterGrow does so you know where this workflow will plug in. If you are new to the product, you can get started in minutes with the onboarding flow. Throughout the tutorial we will point to answers to common questions that teams ask during implementation, and we will link a related post on retries and DLQs for reliable pipelines.

Architecture Overview

At a high level the pipeline has four parts:

  • Ingest a webhook that contains UTMs, click IDs, and metadata from your web tier or tag manager.
  • Normalize and enrich the payload in an OpenClaw transformer, generating a stable idempotency key.
  • Upsert the session into Postgres and emit an event when an order is created or updated.
  • Optionally mirror confirmed conversions to GA4 via the Measurement Protocol for consistency with media reporting.

The end state is visible inside ButterGrow where you can route downstream campaigns or build dashboards without adding client side pixels.

Data Model and Constraints

We will use two small tables: sessions and orders. Sessions represent the first landing event with UTMs. Orders represent a conversion. Orders link to sessions via session_id. If you need user level granularity later, add a users table with a surrogate user_id and link both sessions and orders to the user. The structure below is intentionally compact and supports common analyses while keeping writes simple.

Step 1Create a Minimal Postgres Schema

The following DDL creates the base tables with useful indexes and a natural place to store idempotency keys.

-- sessions: first touch visit with UTMs and click IDs
CREATE TABLE IF NOT EXISTS sessions (
  session_id      TEXT PRIMARY KEY,
  user_id         TEXT,
  first_seen_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
  landing_page    TEXT,
  referrer        TEXT,
  utm_source      TEXT,
  utm_medium      TEXT,
  utm_campaign    TEXT,
  utm_term        TEXT,
  utm_content     TEXT,
  gclid           TEXT,
  fbclid          TEXT,
  idempotency_key TEXT UNIQUE
);

CREATE INDEX IF NOT EXISTS idx_sessions_first_seen ON sessions(first_seen_at);
CREATE INDEX IF NOT EXISTS idx_sessions_utm ON sessions(utm_source, utm_medium, utm_campaign);

-- orders: confirmed conversions
CREATE TABLE IF NOT EXISTS orders (
  order_id        TEXT PRIMARY KEY,
  session_id      TEXT REFERENCES sessions(session_id),
  user_id         TEXT,
  revenue_cents   BIGINT NOT NULL,
  currency        TEXT NOT NULL DEFAULT 'USD',
  status          TEXT NOT NULL,
  created_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  updated_at      TIMESTAMPTZ NOT NULL DEFAULT now(),
  idempotency_key TEXT UNIQUE
);

CREATE INDEX IF NOT EXISTS idx_orders_created ON orders(created_at);

Step 2Define Your Ingest Contract

For consistency, your web tier should POST a well formed JSON payload whenever a session starts and whenever an order is placed. A typical session payload might look like this. Notice that we include a client_id and a session_id generated by the web tier so we can line up analytics platforms and the database.

{
  "event": "session_start",
  "session_id": "s_01HQW8...",
  "client_id": "GA4.123456.78910",
  "user_id": "u_42",
  "landing_page": "https://example.com/pricing",
  "referrer": "https://www.google.com/",
  "utm": {
    "source": "google",
    "medium": "cpc",
    "campaign": "brand-search",
    "term": "acme brand",
    "content": "v1"
  },
  "click_ids": { "gclid": "CjwK...", "fbclid": null }
}

An order payload would then include the session identifier and revenue.

{
  "event": "order_created",
  "order_id": "o_1001",
  "session_id": "s_01HQW8...",
  "user_id": "u_42",
  "revenue_cents": 12900,
  "currency": "USD"
}

Step 3Create an OpenClaw Playbook for Ingest and Upsert

Below is a compact Playbook with an HTTP trigger, a TypeScript transform to normalize data, and two sinks that upsert into Postgres. The same pattern works for other databases with minor changes. If you are new to Playbooks, the product documentation explains how to load secrets and run in a staging space.

name: utm_attribution_pipeline
version: 1

triggers:
  - id: ingest_webhook
    type: http
    path: /webhooks/attribution
    method: POST

nodes:
  - id: normalize
    type: typescript
    source: |
      export default async function run(event) {
        const body = event.payload || {};
        const now = new Date().toISOString();
        const sessionId = body.session_id || `s_${Date.now()}`;
        const utm = body.utm || {};
        const click = body.click_ids || {};
        const keySeed = [body.event, sessionId, body.order_id || '', body.user_id || ''].join(':');
        const idempotencyKey = await hash(keySeed);
        return {
          event: body.event,
          session: {
            session_id: sessionId,
            user_id: body.user_id || null,
            landing_page: body.landing_page || null,
            referrer: body.referrer || null,
            utm_source: utm.source || null,
            utm_medium: utm.medium || null,
            utm_campaign: utm.campaign || null,
            utm_term: utm.term || null,
            utm_content: utm.content || null,
            gclid: click.gclid || null,
            fbclid: click.fbclid || null,
            idempotency_key: idempotencyKey,
            first_seen_at: now
          },
          order: body.event === 'order_created' ? {
            order_id: body.order_id,
            session_id: sessionId,
            user_id: body.user_id || null,
            revenue_cents: body.revenue_cents,
            currency: body.currency || 'USD',
            status: 'confirmed',
            idempotency_key: idempotencyKey
          } : null
        };
      }

  - id: upsert_session
    type: sql
    if: "{{ nodes.normalize.output.event == 'session_start' }}"
    config:
      connection: ${POSTGRES_URL}
      statement: |
        INSERT INTO sessions (
          session_id, user_id, landing_page, referrer,
          utm_source, utm_medium, utm_campaign, utm_term, utm_content,
          gclid, fbclid, idempotency_key, first_seen_at
        ) VALUES (
          :session_id, :user_id, :landing_page, :referrer,
          :utm_source, :utm_medium, :utm_campaign, :utm_term, :utm_content,
          :gclid, :fbclid, :idempotency_key, :first_seen_at
        )
        ON CONFLICT (session_id) DO UPDATE SET
          user_id = EXCLUDED.user_id,
          utm_source = COALESCE(sessions.utm_source, EXCLUDED.utm_source),
          utm_medium = COALESCE(sessions.utm_medium, EXCLUDED.utm_medium),
          utm_campaign = COALESCE(sessions.utm_campaign, EXCLUDED.utm_campaign),
          utm_term = COALESCE(sessions.utm_term, EXCLUDED.utm_term),
          utm_content = COALESCE(sessions.utm_content, EXCLUDED.utm_content),
          gclid = COALESCE(sessions.gclid, EXCLUDED.gclid),
          fbclid = COALESCE(sessions.fbclid, EXCLUDED.fbclid);
      params:
        session_id: "{{ nodes.normalize.output.session.session_id }}"
        user_id: "{{ nodes.normalize.output.session.user_id }}"
        landing_page: "{{ nodes.normalize.output.session.landing_page }}"
        referrer: "{{ nodes.normalize.output.session.referrer }}"
        utm_source: "{{ nodes.normalize.output.session.utm_source }}"
        utm_medium: "{{ nodes.normalize.output.session.utm_medium }}"
        utm_campaign: "{{ nodes.normalize.output.session.utm_campaign }}"
        utm_term: "{{ nodes.normalize.output.session.utm_term }}"
        utm_content: "{{ nodes.normalize.output.session.utm_content }}"
        gclid: "{{ nodes.normalize.output.session.gclid }}"
        fbclid: "{{ nodes.normalize.output.session.fbclid }}"
        idempotency_key: "{{ nodes.normalize.output.session.idempotency_key }}"
        first_seen_at: "{{ nodes.normalize.output.session.first_seen_at }}"

  - id: upsert_order
    type: sql
    if: "{{ nodes.normalize.output.event == 'order_created' }}"
    config:
      connection: ${POSTGRES_URL}
      statement: |
        INSERT INTO orders (
          order_id, session_id, user_id, revenue_cents,
          currency, status, idempotency_key
        ) VALUES (
          :order_id, :session_id, :user_id, :revenue_cents,
          :currency, :status, :idempotency_key
        )
        ON CONFLICT (order_id) DO UPDATE SET
          revenue_cents = EXCLUDED.revenue_cents,
          status = EXCLUDED.status,
          updated_at = now();
      params:
        order_id: "{{ nodes.normalize.output.order.order_id }}"
        session_id: "{{ nodes.normalize.output.order.session_id }}"
        user_id: "{{ nodes.normalize.output.order.user_id }}"
        revenue_cents: "{{ nodes.normalize.output.order.revenue_cents }}"
        currency: "{{ nodes.normalize.output.order.currency }}"
        status: "{{ nodes.normalize.output.order.status }}"
        idempotency_key: "{{ nodes.normalize.output.order.idempotency_key }}"

Step 4Generate a Stable Idempotency Key

The transformer above calls a helper hash to produce a deterministic key. In OpenClaw you can import a standard helper or inline a tiny function. Here is a simple TypeScript example that computes a SHA 256 hash. This prevents duplicate writes when retries or network hiccups occur.

import crypto from 'crypto';

export async function hash(input: string): Promise<string> {
  return crypto.createHash('sha256').update(input).digest('hex');
}

Step 5Store UTM parameters in Postgres

With the Playbook in place, your sessions and orders will be upserted on each event. If you prefer more control over connection pooling, move the SQL nodes into a small service that reads from an OpenClaw queue and performs writes in batches. That approach can reduce load during spikes while keeping the same external contract. Either way, the session table becomes the source of truth for first touch UTMs.

Step 6Send server side GA4 conversions

To keep platform reporting aligned, send a mirrored conversion to GA4 from your backend after the order is confirmed. Use the same client_id or user_id that you observed on the website so sessions line up. Below is a minimal curl example using the Measurement Protocol.

export MEASUREMENT_ID=G-XXXXXXX
export API_SECRET=your_secret

curl -X POST \
  "https://www.google-analytics.com/mp/collect?measurement_id=$MEASUREMENT_ID&api_secret=$API_SECRET" \
  -H "Content-Type: application/json" \
  -d '{
    "client_id": "GA4.123456.78910",
    "user_id": "u_42",
    "timestamp_micros": 1730572800000000,
    "events": [{
      "name": "purchase",
      "params": {
        "currency": "USD",
        "value": 129.00,
        "transaction_id": "o_1001"
      }
    }]
  }'

Step 7Query First Touch and Last Non Direct Touch Revenue

Once data is landing, you can derive useful views with plain SQL. The queries below illustrate two common models. They assume that a later process or a trigger populates orders.session_id from the session that existed when the user placed the order.

-- First touch attribution: credit the first session for each order
WITH first_sessions AS (
  SELECT s.user_id, MIN(s.first_seen_at) AS ft_time
  FROM sessions s
  GROUP BY s.user_id
)
SELECT o.order_id, o.revenue_cents, s.utm_source, s.utm_medium, s.utm_campaign
FROM orders o
JOIN first_sessions f ON f.user_id = o.user_id
JOIN sessions s ON s.user_id = o.user_id AND s.first_seen_at = f.ft_time;

-- Last non direct touch: ignore direct visits where utm_source IS NULL and referrer is empty
WITH candidates AS (
  SELECT s.*, ROW_NUMBER() OVER (
    PARTITION BY s.user_id ORDER BY s.first_seen_at DESC
  ) AS rn
  FROM sessions s
  WHERE NOT (s.utm_source IS NULL AND (s.referrer IS NULL OR s.referrer = ''))
)
SELECT o.order_id, o.revenue_cents, c.utm_source, c.utm_medium, c.utm_campaign
FROM orders o
JOIN candidates c ON c.user_id = o.user_id AND c.rn = 1;

Step 8Build a small replay tool for resilience

Create a short script that replays dead letter events or a date range of webhooks into the pipeline. This makes migrations and backfills safe. If you want more rigor on retries and failure isolation, our related guide on retries and DLQs for reliable pipelines walks through patterns for queues, retry budgets, and idempotent writes.

# replay.js: resend stored events to OpenClaw
node replay.js --from 2026-05-01 --to 2026-05-31 \
  --endpoint https://your-gateway/webhooks/attribution \
  --token $REPLAY_TOKEN

Step 9Connect the data to campaigns and reporting

Inside ButterGrow you can map audiences and lifetime value models to the session and order tables. For example, you might build a segment of first time purchasers sourced from a specific campaign and trigger a welcome series.

Troubleshooting and Validation Checklist

  • Session rows are created exactly once per first touch and updated only to fill missing UTM fields.
  • Orders are upserted, and repeated notifications update revenue and status instead of duplicating rows.
  • GA4 receives a purchase event with the same transaction_id as the database record.
  • Your SQL joins return the expected campaign breakdowns for at least three test orders.
  • The replay tool can resend events safely without creating duplicates.

Practical Long Tail Scenarios

Teams ask variations of these questions during implementation, so we included them here in query style language you can paste into tickets.

  • How to store UTM parameters in Postgres for attribution when users browse across devices.
  • How to track fbclid for attribution and still use first party cookies only.
  • How to build a multi touch attribution model with only two compact tables.

The answers map directly to the schema, transformer, and SQL included earlier.

To place this project in your wider stack, deploy a staging pipeline with your GA4 sandbox and document the payload contracts for your web team.

Your final checklist for rollout is simple. Ensure secrets are loaded, rotate API keys, add a monitor to verify events flowing per minute, and set a monthly cleanup job for sessions older than your retention policy so storage costs stay predictable.

This tutorial focused on UTMs and orders, but the same approach works for trials, demo requests, and usage events. If your product has a free to paid motion, add a plans table and reuse the session link to analyze upgrade paths later. For B2B funnels, enrich sessions with account identifiers from your CRM so attribution can be summarized at the company level.

When your team is satisfied with staging results, copy the Playbook to production and switch the GA4 endpoint to your production property. Keep the staging environment so you can validate changes safely during new campaigns or landing page tests.

To validate reporting consistency, build a saved SQL query that returns daily revenue by utm_source and utm_campaign. Compare it to GA4 for a few days using the same date range and timezone. Differences should be small if you used the same user identity and mirrored purchase events from the backend.

Finally, set up a weekly task to export modeled revenue by campaign to your media platforms for automated budget shifts. This closes the loop and turns data collection into a feedback system for growth.

ButterGrow users commonly move from this baseline to advanced forecasts, but the foundation remains the same. Capture first touch cleanly, keep writes idempotent, and join conversions to sessions with explicit keys. If you encounter edge cases such as orders without sessions, treat them as defects and fix the capture path rather than bolting on heuristics. Clean inputs make every downstream workflow simpler.

ButterGrow can host this pipeline for you and wire it to campaigns, dashboards, and agent workflows. If you want to try it with your data, start in the onboarding flow.

References

Frequently Asked Questions

Which UTM parameters and click IDs should I store for reliable attribution?+

Capture utm_source, utm_medium, utm_campaign, utm_term, utm_content, the landing_page URL, and the referrer if available. Also record gclid and fbclid so you can reconcile ad platform conversions with your server side events.

How do I deduplicate events in this pipeline to avoid double counting?+

Include a deterministic idempotency_key derived from user_id plus timestamp or a stable session identifier, and perform an INSERT...ON CONFLICT upsert in Postgres. In OpenClaw, keep a short lived cache keyed by that id so retries and late arrivals do not create duplicates.

Can I send GA4 conversions from the backend while keeping user identity consistent?+

Yes. Use the GA4 Measurement Protocol with the same user_id or client_id you observed on the website and include session_id when available. Mirror the event_name and revenue parameters so your database and GA4 show aligned totals.

What is the minimal Postgres schema for UTMs and orders that still supports multi touch analysis?+

Use a sessions table with UTM columns and a separate orders table with revenue and a session_id foreign key. You can later derive first touch or last non direct touch attribution with SQL window functions without changing the data model.

How should I handle privacy and retention for UTM and click ID data?+

Store only what you use, set a retention policy, and avoid persisting raw IP addresses unless required for fraud controls. Respect consent preferences from your CMP, and filter collection for users who declined marketing tracking.

What is the best way to test this end to end before going live?+

Start with a staging GA4 property and a test database. Use the campaign URL builder to craft URLs with distinct UTMs, submit test orders with a small revenue amount, and validate that your SQL joins link orders to the correct sessions and that GA4 receives matching events.

Ready to try ButterGrow?

See how ButterGrow can supercharge your growth with a quick demo.

Book a Demo