Cloudflare: Edge Security, CDN, AI Platform, and Performance

A deep technical guide covering DNS management, CDN caching, WAF rules, rate limiting, Workers serverless platform, Pages static deployment, Log Explorer analytics, bot management, Cloudflare Tunnel, Zero Trust, and the full AI platform: Workers AI (Infire engine, 50+ models, Neurons pricing), AI Gateway (Unified Billing, 14+ providers, 70+ models), Vectorize, AI Search, Agents SDK (Think, Artifacts, Sandboxes), Dynamic Workers, MCP servers, EmDash CMS, Containers, and Workflows. Also announced during Agents Week (April 2026): Agent Memory, a managed service giving agents persistent recall that improves over time, and the Agent Readiness Score, a diagnostic tool helping site owners measure how well their websites support AI agent interaction.

DNS, DNSSEC, CDN, WAF, Rate Limiting, Workers, KV, D1, R2, Queues, Pages, Log Explorer, Bot Management, Tunnel, Zero Trust, Email Routing, DDoS, SSL/TLS, Argo, Workers AI, Infire Engine, AI Gateway, Unified Billing, Vectorize, AI Search, Dynamic Workers, MCP Servers, Agents SDK, Containers, Workflows, EmDash CMS

1. DNS Management and DNSSEC

DNS Record Types and Configuration

Cloudflare acts as an authoritative DNS provider with global anycast. The orange cloud icon (proxied) routes traffic through Cloudflare's edge, enabling CDN, WAF, and DDoS protection. The gray cloud (DNS only) returns the origin's IP directly, providing plain DNS resolution with no proxying.

# Terraform - Cloudflare DNS records
resource "cloudflare_record" "root" {
  zone_id = var.zone_id
  name    = "@"
  content = "192.0.2.1"
  type    = "A"
  proxied = true
  ttl     = 1 # Auto when proxied
}

resource "cloudflare_record" "www" {
  zone_id = var.zone_id
  name    = "www"
  content = "josenobile.co"
  type    = "CNAME"
  proxied = true
}

resource "cloudflare_record" "api" {
  zone_id = var.zone_id
  name    = "api"
  content = "api-lb.example.com"
  type    = "CNAME"
  proxied = true
}

# MX records (cannot be proxied)
resource "cloudflare_record" "mx" {
  zone_id  = var.zone_id
  name     = "@"
  content  = "mx.zoho.com"
  type     = "MX"
  priority = 10
}

DNSSEC

DNSSEC adds cryptographic signatures to DNS records, preventing DNS spoofing and cache poisoning. Cloudflare manages the signing keys automatically; you just need to add the DS record at your registrar.

# Enable DNSSEC via API
curl -X PATCH "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/dnssec" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{"status":"active"}'

# Response includes DS record to add at registrar:
# {
#   "ds_record": "example.com. 3600 IN DS 2371 13 2 abc123...",
#   "algorithm": "ECDSAP256SHA256",
#   "digest": "abc123...",
#   "key_tag": 2371
# }
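Once the DS record is published at the registrar, validation can be spot-checked programmatically. A minimal sketch using Cloudflare's public DNS-over-HTTPS JSON API at cloudflare-dns.com, which sets the AD (Authenticated Data) flag only when the full DNSSEC chain of trust verified; the `isDnssecValidated` helper name is my own.

```typescript
// Spot-check DNSSEC validation via Cloudflare's DNS-over-HTTPS JSON API.
// A validating resolver sets AD only when the DNSSEC chain verified.
interface DnsJsonResponse {
  Status: number; // 0 = NOERROR
  AD?: boolean;   // true when the answer is DNSSEC-validated
}

export function isDnssecValidated(resp: DnsJsonResponse): boolean {
  return resp.Status === 0 && resp.AD === true;
}

export async function checkDnssec(name: string): Promise<boolean> {
  const res = await fetch(
    `https://cloudflare-dns.com/dns-query?name=${encodeURIComponent(name)}&type=A`,
    { headers: { accept: 'application/dns-json' } },
  );
  return isDnssecValidated((await res.json()) as DnsJsonResponse);
}
```

Running `checkDnssec('example.com')` against a zone with a live DS record should return true; a SERVFAIL (Status 2) from a broken chain will not.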

2. CDN and Caching Rules

Cache Rules and TTLs

Cloudflare's CDN caches static assets at 300+ edge locations worldwide. Cache Rules (replacing the legacy Page Rules) provide fine-grained control over what gets cached, for how long, and under what conditions.

# Terraform - Cache Rules
resource "cloudflare_ruleset" "cache_rules" {
  zone_id = var.zone_id
  name    = "Cache Rules"
  kind    = "zone"
  phase   = "http_request_cache_settings"

  # Cache static assets aggressively
  rules {
    action = "set_cache_settings"
    action_parameters {
      cache = true
      browser_ttl {
        mode    = "override_origin"
        default = 2592000 # 30 days
      }
      edge_ttl {
        mode    = "override_origin"
        default = 604800 # 7 days
      }
    }
    expression = "(http.request.uri.path.extension in {\"js\" \"css\" \"png\" \"jpg\" \"webp\" \"woff2\" \"svg\"})"
    description = "Cache static assets: 7d edge, 30d browser"
  }

  # API: no cache
  rules {
    action = "set_cache_settings"
    action_parameters {
      cache = false
    }
    expression = "(starts_with(http.request.uri.path, \"/api/\"))"
    description = "Bypass cache for API endpoints"
  }

  # HTML: short edge cache, revalidate
  rules {
    action = "set_cache_settings"
    action_parameters {
      cache = true
      edge_ttl {
        mode    = "override_origin"
        default = 3600 # 1 hour
      }
      browser_ttl {
        mode    = "override_origin"
        default = 0
      }
    }
    expression = "(http.request.uri.path.extension eq \"html\" or http.request.uri.path eq \"/\")"
    description = "Cache HTML: 1h edge, always revalidate browser"
  }
}

Cache Purging Strategies

Purge by URL for targeted invalidation, by tag for grouped assets, or purge everything for full cache clears. API-driven purging integrates with CI/CD for deploy-triggered cache busting.

# Purge specific URLs after deploy
curl -X POST "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{
    "files": [
      "https://josenobile.co/",
      "https://josenobile.co/health/",
      "https://josenobile.co/scripts/lib/search-component.js"
    ]
  }'

# Purge by Cache-Tag header (Enterprise)
curl -X POST "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/purge_cache" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  --data '{"tags": ["health-articles", "dashboard"]}'

Tiered Caching

Tiered Caching uses regional upper-tier data centers to serve cache misses before reaching the origin, reducing origin load and improving cache hit ratios. Smart Tiered Caching dynamically selects the optimal upper-tier topology.

# Terraform - Enable Tiered Caching
resource "cloudflare_tiered_cache" "default" {
  zone_id    = var.zone_id
  cache_type = "smart" # "smart" or "generic"
}

# Tiered Caching topology:
# Client -> Edge PoP (cache miss) -> Upper-Tier PoP (cache hit) -> Response
# Client -> Edge PoP (cache miss) -> Upper-Tier PoP (cache miss) -> Origin
# This reduces origin requests by 30-50% on average
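Whether tiered caching is actually lifting the hit ratio shows up in the CF-Cache-Status response header. A small sketch that buckets the documented header values for hit-ratio monitoring; the bucket names are my own, the status strings are Cloudflare's.

```typescript
// Bucket CF-Cache-Status values to track cache hit ratio.
// HIT/STALE/REVALIDATED/UPDATING were served from cache;
// MISS/EXPIRED went upstream; DYNAMIC/BYPASS are never cached.
type CacheBucket = 'hit' | 'miss' | 'uncacheable';

export function bucketCacheStatus(status: string | null): CacheBucket {
  switch ((status ?? '').toUpperCase()) {
    case 'HIT':
    case 'STALE':
    case 'REVALIDATED':
    case 'UPDATING':
      return 'hit';
    case 'MISS':
    case 'EXPIRED':
      return 'miss';
    default: // DYNAMIC, BYPASS, NONE, or header absent
      return 'uncacheable';
  }
}

// Example: probe a URL and classify the response.
export async function sampleCacheStatus(url: string): Promise<CacheBucket> {
  const res = await fetch(url, { method: 'HEAD' });
  return bucketCacheStatus(res.headers.get('CF-Cache-Status'));
}
```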

3. WAF (Web Application Firewall)

Custom WAF Rules

Cloudflare WAF provides managed rulesets (OWASP Core Ruleset, Cloudflare Managed) and custom rules using wirefilter expressions. Custom rules let you block, challenge, or log traffic based on request attributes.

# Terraform - Custom WAF rules
resource "cloudflare_ruleset" "waf_custom" {
  zone_id = var.zone_id
  name    = "Custom WAF Rules"
  kind    = "zone"
  phase   = "http_request_firewall_custom"

  # Block known bad user agents
  rules {
    action     = "block"
    expression = "(http.user_agent contains \"sqlmap\" or http.user_agent contains \"nikto\" or http.user_agent contains \"nmap\")"
    description = "Block security scanner user agents"
  }

  # Challenge suspicious login attempts
  rules {
    action     = "managed_challenge"
    expression = "(http.request.uri.path eq \"/api/auth/login\" and http.request.method eq \"POST\" and ip.geoip.country ne \"CO\" and ip.geoip.country ne \"US\")"
    description = "Challenge foreign login attempts"
  }

  # Block access to sensitive paths
  rules {
    action     = "block"
    expression = "(http.request.uri.path contains \"/.env\" or http.request.uri.path contains \"/wp-admin\" or http.request.uri.path contains \"/.git\")"
    description = "Block access to sensitive files"
  }

  # Log high-risk requests for analysis
  rules {
    action     = "log"
    expression = "(cf.threat_score gt 30)"
    description = "Log high threat score requests"
  }
}

Managed Rulesets and OWASP

Cloudflare Managed Ruleset and the OWASP Core Ruleset provide automatic protection against common vulnerabilities: SQLi, XSS, RCE, LFI, and protocol anomalies. You can override sensitivity and action per rule or rule group.

# Terraform - Deploy managed WAF rulesets
resource "cloudflare_ruleset" "waf_managed" {
  zone_id = var.zone_id
  name    = "Managed WAF Rulesets"
  kind    = "zone"
  phase   = "http_request_firewall_managed"

  # Cloudflare Managed Ruleset
  rules {
    action = "execute"
    action_parameters {
      id = "efb7b8c949ac4650a09736fc376e9aee" # CF Managed
      overrides {
        rules {
          id      = "5de7edfa648c4d6891dc3e7f84534ffa"
          action  = "log"
          enabled = true
        }
      }
    }
    expression  = "true"
    description = "Deploy Cloudflare Managed Ruleset"
  }

  # OWASP Core Ruleset
  rules {
    action = "execute"
    action_parameters {
      id = "4814384a9e5d4991b9815dcfc25d2f1f" # OWASP
      overrides {
        categories {
          category = "paranoia-level-2"
          action   = "managed_challenge"
        }
      }
    }
    expression  = "true"
    description = "Deploy OWASP Core Ruleset"
  }
}
Jose's Experience: In production, I configured Cloudflare WAF with custom rules to protect the API from automated attacks, credential stuffing on the login endpoint, and SQL injection attempts. The WAF blocks thousands of malicious requests per month with minimal false positives on legitimate traffic.

4. Rate Limiting

Edge Rate Limiting Rules

Rate limiting at the edge stops abusive traffic before it reaches your origin servers. Cloudflare supports rate limiting by IP, IP + path, session, country, and custom keys using headers or cookies.

# Terraform - Rate limiting rules
resource "cloudflare_ruleset" "rate_limit" {
  zone_id = var.zone_id
  name    = "Rate Limiting"
  kind    = "zone"
  phase   = "http_ratelimit"

  # API global rate limit: 100 req/min per IP
  rules {
    action = "block"
    action_parameters {
      response {
        status_code  = 429
        content      = "{\"error\":\"Rate limit exceeded\"}"
        content_type = "application/json"
      }
    }
    ratelimit {
      characteristics     = ["cf.colo.id", "ip.src"]
      period              = 60
      requests_per_period = 100
      mitigation_timeout  = 120
    }
    expression  = "(starts_with(http.request.uri.path, \"/api/\"))"
    description = "API rate limit: 100 req/min per IP"
  }

  # Login brute force protection: 5 attempts/min
  rules {
    action = "managed_challenge"
    ratelimit {
      characteristics     = ["ip.src"]
      period              = 60
      requests_per_period = 5
      mitigation_timeout  = 600
    }
    expression  = "(http.request.uri.path eq \"/api/auth/login\" and http.request.method eq \"POST\")"
    description = "Login rate limit: 5 attempts/min"
  }
}

5. Workers (Serverless at Edge)

Worker Fundamentals

Cloudflare Workers run JavaScript/TypeScript at the edge in V8 isolates. Sub-millisecond cold starts, no containers, deployed to 300+ locations. Workers can intercept, modify, or generate HTTP responses.

// src/index.ts - Cloudflare Worker
export interface Env {
  API_KEY: string;
  RATE_LIMIT_KV: KVNamespace;
  ASSETS: Fetcher;
}

export default {
  async fetch(request: Request, env: Env, ctx: ExecutionContext): Promise<Response> {
    const url = new URL(request.url);

    // Serve static assets
    if (url.pathname.startsWith('/static/')) {
      return env.ASSETS.fetch(request);
    }

    // API proxy with auth header injection
    if (url.pathname.startsWith('/api/')) {
      const origin = new URL(request.url);
      origin.hostname = 'api-internal.example.com';

      const modifiedRequest = new Request(origin.toString(), {
        method: request.method,
        headers: new Headers(request.headers),
        body: request.body,
      });
      modifiedRequest.headers.set('X-Worker-Auth', env.API_KEY);
      modifiedRequest.headers.set('X-Real-IP', request.headers.get('CF-Connecting-IP') || '');

      return fetch(modifiedRequest);
    }

    // A/B testing via cookie
    const variant = request.headers.get('cookie')?.match(/ab_variant=(\w+)/)?.[1]
      || (Math.random() > 0.5 ? 'A' : 'B');

    const response = await fetch(`https://origin.example.com/${variant.toLowerCase()}${url.pathname}`);
    const newResponse = new Response(response.body, response);
    newResponse.headers.set('Set-Cookie', `ab_variant=${variant}; Path=/; Max-Age=86400`);
    return newResponse;
  }
};

Workers KV and Durable Objects

Workers KV is a globally distributed key-value store with eventual consistency. Durable Objects provide strong consistency and single-instance coordination for use cases like rate limiters, counters, and WebSocket rooms.

// Rate limiter using Durable Objects: each object ID maps to exactly one
// instance, giving strongly consistent counts. Note: this in-memory Map
// resets if the object is evicted; use Durable Object storage for durability.
export class RateLimiter implements DurableObject {
  private requests: Map<string, number[]> = new Map();

  async fetch(request: Request): Promise<Response> {
    const ip = request.headers.get('CF-Connecting-IP') || 'unknown';
    const now = Date.now();
    const windowMs = 60_000;
    const limit = 100;

    const timestamps = this.requests.get(ip) || [];
    const recent = timestamps.filter(t => now - t < windowMs);

    if (recent.length >= limit) {
      return new Response(JSON.stringify({ error: 'Rate limit exceeded' }), {
        status: 429,
        headers: {
          'Content-Type': 'application/json',
          'Retry-After': '60',
          'X-RateLimit-Limit': limit.toString(),
          'X-RateLimit-Remaining': '0',
        },
      });
    }

    recent.push(now);
    this.requests.set(ip, recent);

    return new Response(JSON.stringify({ allowed: true }), {
      headers: {
        'X-RateLimit-Remaining': (limit - recent.length).toString(),
      },
    });
  }
}
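KV complements the Durable Object above for read-heavy, eventually consistent data. A sketch of caching origin API responses in KV with a TTL; the CACHE_KV binding name is illustrative, and the inline structural type stands in for @cloudflare/workers-types.

```typescript
// KV-backed response cache. KV is eventually consistent: a write can take
// up to ~60s to propagate to every edge location, fine for read-heavy data.
// Minimal structural type for the binding (real projects use
// @cloudflare/workers-types); the CACHE_KV binding name is an assumption.
interface KVNamespace {
  get(key: string, type: 'text'): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

export interface Env {
  CACHE_KV: KVNamespace;
}

const worker = {
  async fetch(request: Request, env: Env): Promise<Response> {
    const key = new URL(request.url).pathname;

    // Serve from KV when present
    const cached = await env.CACHE_KV.get(key, 'text');
    if (cached !== null) {
      return new Response(cached, { headers: { 'X-KV-Cache': 'HIT' } });
    }

    // Miss: fetch the origin and cache the body for 60 seconds
    const origin = await fetch(`https://api.example.com${key}`);
    const body = await origin.text();
    await env.CACHE_KV.put(key, body, { expirationTtl: 60 });
    return new Response(body, { headers: { 'X-KV-Cache': 'MISS' } });
  },
};
export default worker;
```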

D1, R2, and Queues

D1 is Cloudflare's serverless SQLite database. R2 is S3-compatible object storage with zero egress fees. Queues enable asynchronous message passing between Workers for decoupled architectures.

// D1 - Serverless SQLite at the edge
export interface Env {
  DB: D1Database;
  BUCKET: R2Bucket;
  QUEUE: Queue<{ type: string; payload: unknown }>;
}

// D1: query SQLite
async function getUser(env: Env, id: string) {
  const stmt = env.DB.prepare('SELECT * FROM users WHERE id = ?').bind(id);
  return await stmt.first();
}

// R2: store/retrieve objects (S3-compatible)
async function uploadFile(env: Env, key: string, body: ReadableStream) {
  await env.BUCKET.put(key, body, {
    httpMetadata: { contentType: 'application/octet-stream' },
    customMetadata: { uploadedAt: new Date().toISOString() },
  });
}

async function getFile(env: Env, key: string) {
  const obj = await env.BUCKET.get(key);
  if (!obj) return new Response('Not found', { status: 404 });
  return new Response(obj.body, {
    headers: { 'Content-Type': obj.httpMetadata?.contentType || '' },
  });
}

// Queues: async message passing between Workers
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // Producer: send message to queue
    await env.QUEUE.send({
      type: 'user.signup',
      payload: { email: 'user@example.com', timestamp: Date.now() },
    });
    return new Response('Queued');
  },

  // Consumer: process messages from queue
  async queue(batch: MessageBatch<{ type: string; payload: unknown }>, env: Env) {
    for (const msg of batch.messages) {
      console.log(`Processing: ${msg.body.type}`);
      // Process message...
      msg.ack();
    }
  },
};

6. Pages (Static Site Deployment)

Cloudflare Pages Configuration

Pages deploys static sites directly from Git repositories with automatic builds, preview deployments for branches, and custom domain support. Zero-config for most static site generators.

# wrangler.toml for Pages project
name = "josenobile-co"
compatibility_date = "2024-12-01"
pages_build_output_dir = "./"

# Deploy via CLI
npx wrangler pages deploy ./ --project-name=josenobile-co

# Or connect Git repo for automatic deploys:
# Repository: github.com/josenobile/josenobile.co
# Branch: main -> Production
# All other branches -> Preview URLs

# Custom headers (_headers file)
/*
  X-Frame-Options: DENY
  X-Content-Type-Options: nosniff
  Referrer-Policy: strict-origin-when-cross-origin
  Permissions-Policy: camera=(), microphone=(), geolocation=()
  Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; font-src 'self'

/assets/*
  Cache-Control: public, max-age=31536000, immutable

/*.html
  Cache-Control: public, max-age=0, must-revalidate

Redirects and Custom Routing

The _redirects file handles URL redirects and rewrites. Combined with Pages Functions (server-side Workers), you get full-stack capability on a static site platform.

# _redirects file
/blog/*  /health/:splat  301
/old-page  /new-page  301
/api/*  https://api.example.com/:splat  200
/es/*  /:splat  302

# Pages Function (functions/api/health.ts)
export const onRequestGet: PagesFunction = async (context) => {
  const data = await fetch('https://api.example.com/health');
  return new Response(await data.text(), {
    headers: {
      'Content-Type': 'application/json',
      'Cache-Control': 'public, max-age=60',
    },
  });
};
Jose's Experience: josenobile.co is deployed on Cloudflare Pages with automatic Git-triggered deploys. Preview URLs for every branch enable review before merging to production. Combined with Cloudflare's global CDN, the site loads in under 500ms worldwide.

7. Log Explorer (SQL API for Analytics)

SQL-Based Log Querying

Cloudflare Log Explorer provides a SQL API to query HTTP request logs, firewall events, and Workers analytics. You can run ad-hoc queries for debugging, security investigation, and performance analysis.

# Top 10 paths by request count (last 24h)
SELECT
  ClientRequestPath AS path,
  COUNT(*) AS requests,
  AVG(EdgeTimeToFirstByteMs) AS avg_ttfb_ms,
  SUM(CASE WHEN EdgeResponseStatus >= 500 THEN 1 ELSE 0 END) AS errors_5xx
FROM http_requests
WHERE Datetime > NOW() - INTERVAL '24' HOUR
GROUP BY path
ORDER BY requests DESC
LIMIT 10;

# Detect anomalous traffic patterns
SELECT
  ClientIP,
  ClientCountry,
  COUNT(*) AS requests,
  COUNT(DISTINCT ClientRequestPath) AS unique_paths,
  AVG(EdgeTimeToFirstByteMs) AS avg_ttfb
FROM http_requests
WHERE Datetime > NOW() - INTERVAL '1' HOUR
GROUP BY ClientIP, ClientCountry
HAVING COUNT(*) > 500
ORDER BY requests DESC;

# WAF blocked requests analysis
SELECT
  Action,
  RuleID,
  Description,
  COUNT(*) AS blocks,
  array_agg(DISTINCT ClientCountry) AS countries
FROM firewall_events
WHERE Datetime > NOW() - INTERVAL '7' DAY
  AND Action = 'block'
GROUP BY Action, RuleID, Description
ORDER BY blocks DESC;
Jose's Experience: I use Cloudflare Log Explorer's SQL API to monitor the platform's traffic patterns, investigate security incidents, and optimize caching hit rates. The SQL interface makes it possible to quickly answer questions like "which endpoints have the highest 5xx rate in the last hour?" without setting up separate log aggregation infrastructure.

8. Bot Management

Bot Score and Detection

Cloudflare's bot detection assigns a score from 1 (likely bot) to 99 (likely human) using machine learning, behavioral analysis, and fingerprinting. Rules can take different actions based on the bot score.

# Terraform - Bot management rules
resource "cloudflare_ruleset" "bot_rules" {
  zone_id = var.zone_id
  name    = "Bot Management"
  kind    = "zone"
  phase   = "http_request_firewall_custom"

  # Block definitely automated traffic on API
  rules {
    action     = "block"
    expression = "(cf.bot_management.score lt 10 and starts_with(http.request.uri.path, \"/api/\") and not cf.bot_management.verified_bot)"
    description = "Block definite bots on API"
  }

  # Challenge likely bots on forms
  rules {
    action     = "managed_challenge"
    expression = "(cf.bot_management.score lt 30 and http.request.method eq \"POST\" and not cf.bot_management.verified_bot)"
    description = "Challenge likely bots on POST"
  }

  # Allow verified bots (Googlebot, Bingbot, etc.)
  rules {
    action     = "skip"
    action_parameters { ruleset = "current" }
    expression = "(cf.bot_management.verified_bot)"
    description = "Allow verified search engine bots"
  }
}

# Super Bot Fight Mode (Pro/Business plans; free plans get basic Bot Fight Mode)
resource "cloudflare_bot_management" "default" {
  zone_id                   = var.zone_id
  fight_mode                = true
  enable_js                 = true
  sbfm_definitely_automated = "block"
  sbfm_likely_automated     = "managed_challenge"
  sbfm_verified_bots        = "allow"
}
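On Enterprise plans the same score is exposed to Workers as request.cf.botManagement, so the thresholds above can also be enforced in application code. A sketch of the decision logic only; the cutoffs mirror the Terraform rules above and are tunable assumptions.

```typescript
// Map a bot score (1 = almost certainly a bot, 99 = almost certainly human)
// to an action. Thresholds mirror the WAF rules above.
type BotAction = 'block' | 'challenge' | 'allow';

export function decideBotAction(score: number, verifiedBot: boolean): BotAction {
  if (verifiedBot) return 'allow';    // Googlebot, Bingbot, etc.
  if (score < 10) return 'block';     // definitely automated
  if (score < 30) return 'challenge'; // likely automated
  return 'allow';
}

// Inside a Worker (Enterprise Bot Management populates request.cf):
// const bm = (request.cf as { botManagement?: { score: number; verifiedBot: boolean } }).botManagement;
// const action = decideBotAction(bm?.score ?? 99, bm?.verifiedBot ?? false);
```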

9. Cloudflare Tunnel

Tunnel Configuration

Cloudflare Tunnel (formerly Argo Tunnel) creates encrypted outbound-only connections from your origin to Cloudflare's edge, eliminating the need to expose public IPs or open inbound firewall ports. Traffic is routed through the nearest Cloudflare data center.

# Install and authenticate cloudflared
cloudflared tunnel login
cloudflared tunnel create my-tunnel

# config.yml
tunnel: <TUNNEL_ID>
credentials-file: /root/.cloudflared/<TUNNEL_ID>.json

ingress:
  - hostname: app.example.com
    service: http://localhost:8080
  - hostname: api.example.com
    service: http://localhost:3000
    originRequest:
      noTLSVerify: true
      connectTimeout: 30s
  - hostname: ssh.example.com
    service: ssh://localhost:22
  - service: http_status:404  # Catch-all

# Run as systemd service
cloudflared tunnel --config /etc/cloudflared/config.yml run my-tunnel

# Terraform - Cloudflare Tunnel
resource "cloudflare_tunnel" "main" {
  account_id = var.account_id
  name       = "production-tunnel"
  secret     = base64encode(random_password.tunnel_secret.result)
}

resource "cloudflare_tunnel_config" "main" {
  account_id = var.account_id
  tunnel_id  = cloudflare_tunnel.main.id

  config {
    ingress_rule {
      hostname = "app.example.com"
      service  = "http://localhost:8080"
    }
    ingress_rule {
      service = "http_status:404"
    }
  }
}

10. Zero Trust and Access

Cloudflare Access Policies

Cloudflare Access replaces traditional VPNs with an identity-aware proxy. Every request is authenticated and authorized at the edge, based on identity provider (IdP) signals, device posture, source IP, and country, before it reaches the application.

# Terraform - Zero Trust Access application
resource "cloudflare_access_application" "internal_dashboard" {
  zone_id                   = var.zone_id
  name                      = "Internal Dashboard"
  domain                    = "dashboard.example.com"
  type                      = "self_hosted"
  session_duration          = "24h"
  auto_redirect_to_identity = true
  allowed_idps              = [cloudflare_access_identity_provider.google.id]
}

resource "cloudflare_access_policy" "dashboard_policy" {
  zone_id        = var.zone_id
  application_id = cloudflare_access_application.internal_dashboard.id
  name           = "Allow engineering team"
  precedence     = 1
  decision       = "allow"

  include {
    email_domain = ["example.com"]
  }
  require {
    group = [cloudflare_access_group.engineering.id]
  }
}

resource "cloudflare_access_group" "engineering" {
  zone_id = var.zone_id
  name    = "Engineering Team"

  include {
    email_domain = ["example.com"]
  }
  require {
    group = ["engineering@example.com"]
  }
}

# Identity provider configuration
resource "cloudflare_access_identity_provider" "google" {
  zone_id = var.zone_id
  name    = "Google Workspace"
  type    = "google"
  config {
    client_id     = var.google_client_id
    client_secret = var.google_client_secret
  }
}
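As defense in depth, the protected application should also confirm each request actually traversed Access. Cloudflare injects the Cf-Access-Jwt-Assertion and Cf-Access-Authenticated-User-Email headers; the sketch below only checks their presence. Production code must additionally verify the JWT signature against the team's JWKS endpoint (https://TEAM.cloudflareaccess.com/cdn-cgi/access/certs), which is omitted here.

```typescript
// Defense-in-depth behind Cloudflare Access: reject requests that lack the
// Access assertion headers. Presence check only — a real deployment must
// verify the JWT signature against the team's JWKS endpoint.
export function extractAccessIdentity(headers: Headers): { email: string } | null {
  const jwt = headers.get('Cf-Access-Jwt-Assertion');
  const email = headers.get('Cf-Access-Authenticated-User-Email');
  if (!jwt || !email) return null; // request did not come through Access
  return { email };
}

const worker = {
  async fetch(request: Request): Promise<Response> {
    const identity = extractAccessIdentity(request.headers);
    if (!identity) {
      return new Response('Forbidden: missing Access assertion', { status: 403 });
    }
    return new Response(`Hello ${identity.email}`);
  },
};
export default worker;
```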

11. Email Routing

Email Routing Configuration

Cloudflare Email Routing creates custom email addresses on your domain and forwards them to existing mailboxes. No mail server required. Supports catch-all rules, multiple destinations, and Workers-based email processing.

# Terraform - Email routing rules
resource "cloudflare_email_routing_settings" "default" {
  zone_id = var.zone_id
  enabled = true
}

resource "cloudflare_email_routing_address" "personal" {
  account_id = var.account_id
  email      = "jose.nobile@gmail.com"
}

resource "cloudflare_email_routing_rule" "contact" {
  zone_id = var.zone_id
  name    = "contact"
  enabled = true

  matcher {
    type  = "literal"
    field = "to"
    value = "contact@josenobile.co"
  }
  action {
    type  = "forward"
    value = ["jose.nobile@gmail.com"]
  }
}

resource "cloudflare_email_routing_catch_all" "default" {
  zone_id = var.zone_id
  name    = "catch-all"
  enabled = true

  matcher { type = "all" }
  action {
    type  = "forward"
    value = ["jose.nobile@gmail.com"]
  }
}

# Email Worker for programmatic email processing
export default {
  async email(message: EmailMessage, env: Env) {
    // Forward based on subject
    if (message.headers.get('subject')?.includes('urgent')) {
      await message.forward('alerts@example.com');
    } else {
      await message.forward('inbox@example.com');
    }
  },
};

12. DDoS Protection

DDoS Mitigation and Configuration

Cloudflare provides unmetered, always-on DDoS protection at layers 3, 4, and 7. With hundreds of Tbps of anycast network capacity and automated detection, the network absorbs even the largest recorded attacks. HTTP DDoS rules and Advanced TCP Protection allow tuning sensitivity for specific traffic patterns.

# Terraform - HTTP DDoS Attack Protection overrides
resource "cloudflare_ruleset" "ddos_override" {
  zone_id = var.zone_id
  name    = "DDoS Protection Overrides"
  kind    = "zone"
  phase   = "ddos_l7"

  rules {
    action = "execute"
    action_parameters {
      id = "4d21379b4f9f4bb088e0729962c8b3cf" # HTTP DDoS ruleset
      overrides {
        # Increase sensitivity for API endpoints
        rules {
          id              = "fdfdac75430c4c47a422bdc024f57730"
          sensitivity_level = "high"
          action          = "block"
        }
      }
    }
    expression  = "true"
    description = "HTTP DDoS protection with custom sensitivity"
  }
}

# Network-layer DDoS (L3/L4) is automatic for proxied records
# Monitor via API:
curl -s "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/security/events?filter=kind:ddos" \
  -H "Authorization: Bearer ${CF_TOKEN}" | jq '.result[:5]'

13. SSL/TLS Modes

SSL/TLS Encryption Modes

Cloudflare offers four SSL/TLS modes: Off, Flexible (edge only), Full (edge-to-origin, self-signed OK), and Full (Strict) (edge-to-origin, valid cert required). Always use Full (Strict) in production. Origin certificates issued by Cloudflare provide free, long-lived TLS for the origin connection.

# Terraform - SSL/TLS configuration
resource "cloudflare_zone_settings_override" "ssl" {
  zone_id = var.zone_id

  settings {
    ssl                      = "strict"        # Full (Strict)
    always_use_https         = "on"
    min_tls_version          = "1.2"
    tls_1_3                  = "zrt"           # 0-RTT enabled
    automatic_https_rewrites = "on"
    opportunistic_encryption = "on"
  }
}

# Origin Certificate (15-year, free from Cloudflare)
curl -X POST "https://api.cloudflare.com/client/v4/certificates" \
  -H "Authorization: Bearer ${CF_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{
    "hostnames": ["example.com", "*.example.com"],
    "requested_validity": 5475,
    "request_type": "origin-rsa",
    "csr": "'"$(cat origin.csr)"'"
  }'

# Authenticated Origin Pulls (mTLS: Cloudflare -> Origin)
resource "cloudflare_authenticated_origin_pulls" "default" {
  zone_id = var.zone_id
  enabled = true
}

# On the origin server (nginx):
# ssl_client_certificate /etc/nginx/certs/cloudflare-origin-pull-ca.pem;
# ssl_verify_client on;

14. Argo Smart Routing

Argo Smart Routing and Tiered Cache

Argo Smart Routing uses real-time network intelligence to route traffic through the fastest paths across Cloudflare's private backbone, reducing latency by 30% on average. Combined with Tiered Caching, it optimizes both dynamic and static content delivery.

# Terraform - Enable Argo Smart Routing
resource "cloudflare_argo" "default" {
  zone_id        = var.zone_id
  smart_routing  = "on"
  tiered_caching = "on"
}

# Argo routing benefits:
# - Detects real-time congestion and routes around it
# - Uses Cloudflare's private backbone instead of public internet
# - Reduces TTFB by 30% on average for dynamic content
# - Reduces origin load by ~35% with tiered caching

# Monitor Argo analytics via API
curl -s "https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/analytics/latency?bins=10" \
  -H "Authorization: Bearer ${CF_TOKEN}" | jq '
  {
    argo_enabled_latency: .result.argo_enabled,
    without_argo_latency: .result.without_argo,
    improvement_pct: ((.result.without_argo - .result.argo_enabled) / .result.without_argo * 100)
  }'
Jose's Experience: In production, I manage Cloudflare DNS with Terraform for all zones, WAF custom rules to block scanner traffic and credential stuffing, and edge rate limiting to protect the API. For josenobile.co, I use Cloudflare Pages with Git-triggered deploys and preview URLs for every branch.

15. AI Platform (Workers AI, Infire Engine, Gateway, Vectorize)

Workers AI: Inference at the Edge

Workers AI runs AI inference directly at Cloudflare's edge on GPUs deployed in 180+ cities worldwide, one of the largest distributed AI inference footprints globally. The platform hosts 50+ open-source models across task types: LLMs (Llama 3.1/3.2, DeepSeek-R1, Qwen3, Gemma, Mistral), text embeddings, image generation (FLUX, Stable Diffusion), speech-to-text, text-to-speech, image classification, and object detection. Pricing uses Neurons: $0.011 per 1,000 Neurons on paid plans, with a free allocation of 10,000 Neurons/day. The Infire engine, a Rust-based inference runtime launched in late 2025, delivers 7% faster inference than vLLM and cuts CPU overhead by 82% using Granular CUDA Graphs and JIT compilation. Speculative decoding provides 2-4x speed improvements for LLM inference.

// Workers AI: LLM inference at the edge
export default {
  async fetch(request, env) {
    const { prompt } = await request.json();

    // DeepSeek-R1 reasoning model
    const response = await env.AI.run('@cf/deepseek-ai/deepseek-r1-distill-qwen-32b', {
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: prompt },
      ],
      max_tokens: 1024,
      temperature: 0.7,
    });

    return Response.json({ result: response.response });
  },
};

// Embeddings for semantic search
const embeddings = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
  text: ['gym membership plans', 'workout schedule', 'personal training'],
});
// Returns: { data: [{ values: [0.012, -0.034, ...] }, ...] }

// Image generation with FLUX
const image = await env.AI.run('@cf/black-forest-labs/flux-1-schnell', {
  prompt: 'modern gym interior, natural lighting, photorealistic',
});
// Returns: ArrayBuffer of generated PNG

// Workers AI pricing (Neurons):
// @cf/meta/llama-3.2-1b-instruct:  ~2,457 Neurons/M input, ~18,252/M output
// @cf/meta/llama-3.1-70b-instruct: ~26,668 Neurons/M input, ~204,805/M output
// Free tier: 10,000 Neurons/day (hard limit on free plan)
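The Neurons rates above translate to dollars as follows; a small calculator using the per-million-token figures quoted above and the $0.011 per 1,000 Neurons paid-plan price.

```typescript
// Convert token counts to Neurons and USD using per-million-token rates.
// Example rates are the ones quoted above for @cf/meta/llama-3.2-1b-instruct.
const USD_PER_1K_NEURONS = 0.011;

export function neuronsCost(
  inputTokens: number,
  outputTokens: number,
  neuronsPerMInput: number,
  neuronsPerMOutput: number,
): { neurons: number; usd: number } {
  const neurons =
    (inputTokens / 1_000_000) * neuronsPerMInput +
    (outputTokens / 1_000_000) * neuronsPerMOutput;
  return { neurons, usd: (neurons / 1000) * USD_PER_1K_NEURONS };
}

// 1M input + 1M output tokens on llama-3.2-1b:
// 2,457 + 18,252 = 20,709 Neurons ≈ $0.23
```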

Cloudflare finalized the acquisition of Replicate in April 2026 (originally announced November 2025), bringing over 50,000 production-ready models into the Workers AI ecosystem. Replicate's model registry, fine-tuning infrastructure, and developer community now integrate directly with Workers AI, giving developers access to the largest catalog of ready-to-deploy AI models at the edge. Replicate's existing API remains operational with backward compatibility, while new models are progressively available through the standard Workers AI binding with Neurons-based pricing.

Kimi K2.5 (Agents Week 2026): Moonshot AI's Kimi K2.5 became the first frontier-scale open-source model available on Workers AI, featuring a 256K context window, multi-turn tool calling, vision inputs, and structured outputs. Internal testing showed it cut inference costs by 77% compared to mid-tier proprietary models for agentic coding tasks, making it a strong choice for cost-sensitive agent workloads running on Cloudflare's edge.

Alongside the AI platform expansion, Cloudflare rebuilt the Wrangler CLI with comprehensive API coverage designed for agent-friendly workflows. The new CLI surfaces every Workers AI, Agents SDK, and platform capability as a scriptable command, making it straightforward for AI agents to provision resources, deploy workers, manage Durable Objects, and orchestrate AI pipelines programmatically without dashboard interaction.

Infire: Rust-Based Inference Engine

Infire is Cloudflare's custom LLM inference engine, written in Rust, that powers Workers AI across the global network. It replaces Python-based inference stacks (like vLLM) with a high-performance, memory-efficient runtime optimized for edge deployment. Infire uses Granular CUDA Graphs with JIT compilation to compile a dedicated CUDA graph for every possible batch size on the fly, executing work as a single monolithic GPU structure. This cuts CPU overhead by 82% (25% CPU vs. vLLM's 140%+) and delivers 7% faster inference than vLLM 0.10.0 on H100 NVL GPUs. Model loading from disk takes under 4 seconds for Llama-3-8B-Instruct, and GPU utilization reaches 80%+.

AI Gateway: Proxy, Observability, and Unified Billing

AI Gateway is a proxy and observability layer that sits between your application and 14+ AI providers (OpenAI, Anthropic, Google, Mistral, Cohere, Groq, xAI, Alibaba Cloud, Bytedance, Workers AI, and others), exposing 70+ models through a single API endpoint. It provides an OpenAI-compatible endpoint, per-user and per-key rate limiting (sliding or fixed window), response caching (reducing latency by up to 90%), request/response logging, real-time cost analytics, and model fallback with automatic retries. Unified Billing consolidates all AI provider costs into a single Cloudflare invoice -- load credits once, spend them across any provider, and switch models with a one-line code change. Persistent logs (open beta) store prompts and responses for extended analysis.

// AI Gateway: proxy any AI provider through Cloudflare
const response = await fetch(
  'https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/openai/chat/completions',
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${OPENAI_KEY}`,
      'Content-Type': 'application/json',
      'cf-aig-cache-ttl': '3600', // Cache identical prompts for 1h
    },
    body: JSON.stringify({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Suggest a workout plan' }],
    }),
  }
);
// Gateway adds: caching, rate limiting, logging, cost tracking, fallback
// No code changes needed — just change the base URL

// Unified Billing: pay for OpenAI, Anthropic, Google, etc. via Cloudflare
// 1. Load credits in Cloudflare dashboard
// 2. Remove provider API keys from your app
// 3. AI Gateway handles auth and billing transparently

// Provider fallback: automatic failover between providers
// Configure in dashboard: primary=OpenAI, fallback=Anthropic, fallback=Workers AI

Vectorize: Vector Database for RAG

Vectorize is Cloudflare's globally distributed vector database, now GA and supporting up to 5 million vectors per index (up from 200K at launch) with a median query latency of 31ms (down from 549ms). Purpose-built for Retrieval-Augmented Generation (RAG) pipelines, it stores embeddings generated by Workers AI and enables low-latency similarity search. It integrates natively with Workers AI for end-to-end RAG at the edge: embed documents, store vectors, query similar content, and generate responses. Automatic storage optimization reduces costs as indexes grow.

// RAG pipeline: Workers AI + Vectorize
export default {
  async fetch(request, env) {
    const { question } = await request.json();

    // 1. Embed the question
    const queryEmbedding = await env.AI.run('@cf/baai/bge-base-en-v1.5', {
      text: [question],
    });

    // 2. Search Vectorize for similar documents (31ms median latency)
    const matches = await env.VECTORIZE.query(queryEmbedding.data[0].values, {
      topK: 5,
      returnMetadata: true,
      // Supports up to 5M vectors per index
    });

    // 3. Build context from matched documents
    const context = matches.matches
      .map(m => m.metadata.text)
      .join('\n');

    // 4. Generate answer with context
    const answer = await env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
      messages: [
        { role: 'system', content: `Answer using this context:\n${context}` },
        { role: 'user', content: question },
      ],
    });

    return Response.json({ answer: answer.response, sources: matches.matches });
  },
};
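Conceptually, a topK query ranks stored vectors by similarity to the query embedding. The sketch below illustrates that ranking in plain JavaScript using cosine similarity, one of the distance metrics Vectorize supports (alongside euclidean and dot-product); Vectorize itself does this at scale with an optimized index, not the naive linear scan shown here, and the toy 3-dimensional vectors stand in for real 768-dimensional bge-base-en-v1.5 embeddings.

```javascript
// Illustration only: the cosine ranking a Vectorize topK query performs.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Naive O(n) version of what the index does: score every vector, sort, slice.
function topK(queryVector, index, k) {
  return index
    .map(({ id, values, metadata }) => ({
      id,
      metadata,
      score: cosineSimilarity(queryVector, values),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional "index" (real embeddings are 768-d for bge-base-en-v1.5)
const index = [
  { id: 'doc-1', values: [0.9, 0.1, 0.0], metadata: { text: 'pricing page' } },
  { id: 'doc-2', values: [0.1, 0.9, 0.1], metadata: { text: 'api reference' } },
  { id: 'doc-3', values: [0.8, 0.2, 0.1], metadata: { text: 'billing faq' } },
];

const matches = topK([1, 0, 0], index, 2);
console.log(matches.map((m) => m.id)); // [ 'doc-1', 'doc-3' ]
```

The `returnMetadata: true` option in the Worker above corresponds to carrying the `metadata` field through the ranking, so matched documents arrive with their source text attached.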
Jose's Experience: josenobile.co runs on Cloudflare Pages with CI/CD from GitLab, giving me first-hand experience with Cloudflare's edge platform. The AI platform is compelling because inference runs at the same edge locations that serve the site: no additional networking hops, no separate GPU provider accounts. AI Gateway's Unified Billing simplifies cost management when using multiple AI providers across different projects.

17. Agents SDK, Think, Artifacts, and Sandboxes

Agents SDK: Stateful AI Agents on Durable Objects

The Agents SDK is a TypeScript framework for building production AI agents on Cloudflare's Durable Objects infrastructure. Each agent is a stateful micro-server with its own SQL database, WebSocket connections, and scheduling. AIChatAgent provides streaming AI chat with automatic message persistence and resumable streams. Define server-side tools, client-side tools that run in the browser, and human-in-the-loop approval flows. Agents can call any AI model (Workers AI, OpenAI, Anthropic, Gemini) and stream responses over WebSockets or Server-Sent Events. Tasks can be scheduled on a delay, at a specific time, or on a cron; agents can wake themselves up, do work, and go back to sleep without a user present. Durable Objects are a natural host for the A2A Protocol -- per-task state, native SSE, global anycast for Agent Card fetches, and edge-native x402 support for agent-to-agent payments.

// Agents SDK: AI agent with persistent state and tools
import { AIChatAgent } from 'cloudflare:agents';

export class ResearchAgent extends AIChatAgent {
  // Persistent SQL database per agent instance
  // Automatic message history persistence
  // Resumable streams across disconnections

  async onMessage(message) {
    // Stream response from any AI provider
    const stream = await this.ai.run('@cf/meta/llama-3.1-70b-instruct-fp8-fast', {
      messages: this.getHistory(),
      stream: true,
    });

    // Tool calling with human-in-the-loop approval
    return this.streamResponse(stream, {
      tools: [this.searchDocs, this.createReport],
      requireApproval: ['createReport'],
    });
  }

  @tool('Search documentation')
  async searchDocs(query: string) {
    const results = await this.env.VECTORIZE.query(
      await this.embed(query), { topK: 10 }
    );
    return results.matches;
  }

  @tool('Create report (requires approval)')
  async createReport(title: string, content: string) {
    await this.env.DB.prepare(
      'INSERT INTO reports (title, content) VALUES (?, ?)'
    ).bind(title, content).run();
    return { created: true };
  }

  // Schedule: agents can wake themselves for periodic work
  async onSchedule(trigger) {
    await this.refreshData();
  }
}

Think: Long-Running Multi-Step Agent Framework

Think is a framework within the Agents SDK designed for persistence and long-running, multi-step tasks. Instead of responding to single prompts, Think agents maintain state across steps, pause for external events, resume after hours or days, and coordinate with other agents. Think combines Durable Objects persistence with Workflows' step-by-step retry semantics, making it suitable for agents that need to perform research, gather approvals, execute multi-stage workflows, and report results autonomously over extended periods.
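Think's full API is not shown here, but its core idea -- persisted steps that are replayed rather than re-executed on resume, so a task can crash, pause for days, or sleep and then pick up where it left off -- can be sketched in a few lines. The `CheckpointedTask` class and step names below are illustrative, not the real SDK:

```javascript
// Illustrative sketch of durable step execution (not the real Think API).
// Completed step results are checkpointed; a re-run replays them instead of
// re-executing, so a task interrupted mid-way resumes where it stopped.
class CheckpointedTask {
  constructor(store) {
    this.store = store; // in production: the agent's per-instance SQL database
  }

  async step(name, fn) {
    if (this.store.has(name)) return this.store.get(name); // replay cached result
    const result = await fn();
    this.store.set(name, result); // checkpoint before moving on
    return result;
  }
}

const store = new Map(); // survives across runs in a real durable store
let executions = 0;

async function runTask() {
  const task = new CheckpointedTask(store);
  const sources = await task.step('gather', async () => {
    executions++;
    return 'sources';
  });
  return task.step('summarize', async () => {
    executions++;
    return `summary of ${sources}`;
  });
}

// The first run executes both steps; the second run (a "resume") replays both
// checkpoints, so executions stays at 2.
async function demo() {
  await runTask();
  return runTask();
}

const demoPromise = demo();
demoPromise.then((result) => console.log(executions, result)); // 2 summary of sources
```

In Think, the checkpoint store is durable rather than in-memory, which is what lets a task survive restarts and multi-day pauses between steps.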

Artifacts: Git-Compatible Agent Storage

Artifacts is a Git-compatible storage primitive built for the agent-first era. Developers can create millions of repositories, fork from any remote source, and provide agents with a permanent home for code and data accessible to any standard Git client. Agents use Artifacts to persist generated code, configuration files, research data, and iterative work products across sessions. Because Artifacts is Git-compatible, human developers can review, branch, merge, and collaborate on agent output using standard Git workflows.

Sandboxes: Full OS for Agent Workloads

Sandboxes give agents access to persistent, isolated Linux environments with a shell, filesystem, and background processes. An agent can clone a repository, install Python packages, run builds, execute tests, and iterate with the same feedback loop a human developer uses. Sandboxes are heavier than Dynamic Workers (which use V8 isolates) but provide full OS capabilities when agents need system-level access, package managers, or long-running compilation tasks.

April 2026: Agent Cloud Expansion

On April 13, 2026, Cloudflare announced a major expansion to its agent infrastructure under the Agent Cloud umbrella. Artifacts reached general availability as a Git-compatible storage layer purpose-built for agents, enabling millions of repositories with fork-from-anywhere capability. Sandboxes moved to GA with persistent Linux environments that survive agent restarts, giving agents the same development feedback loop as human developers. The Think framework was integrated into the Agents SDK as a first-class primitive for multi-step, long-running agent tasks with automatic state persistence and retry semantics. Together, these three capabilities form a complete agent runtime: Think for orchestration, Artifacts for persistent storage, and Sandboxes for execution.

Durable Objects Free Tier

Durable Objects are now available on the Workers Free plan. This removes the paid-plan barrier for building real-time applications, WebSocket servers, collaborative tools, and AI agents. The free tier includes 100,000 requests/day and 1 GB storage, sufficient for development and small production workloads.
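A minimal Durable Object is just a class whose instances each own serialized request handling and transactional storage; the sketch below shows a per-object counter. The in-memory `fakeState` shim at the bottom exists only so the example runs outside the Workers runtime, which normally supplies `state.storage`:

```javascript
// Minimal Durable Object: a per-object counter with transactional storage.
// Each named object instance gets its own storage and processes requests
// one at a time -- ideal for counters, chat rooms, and agent state.
// (Export this class from your Worker module in a real deployment.)
class Counter {
  constructor(state, env) {
    this.state = state;
  }

  async fetch(request) {
    let value = (await this.state.storage.get('value')) ?? 0;
    value++;
    await this.state.storage.put('value', value);
    return new Response(String(value));
  }
}

// --- illustration-only shim (the Workers runtime provides the real state) ---
const memory = new Map();
const fakeState = {
  storage: {
    get: async (k) => memory.get(k),
    put: async (k, v) => { memory.set(k, v); },
  },
};

const counter = new Counter(fakeState, {});
const demoRun = counter
  .fetch(new Request('http://do/increment'))
  .then((res) => res.text());
demoRun.then((value) => console.log(value)); // 1
```

On Cloudflare, the class is bound in wrangler configuration and addressed by name, so every client asking for the same name reaches the same instance and the same storage.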

18. Dynamic Workers (AI Code Sandboxing)

Dynamic Workers: 100x Faster AI Code Execution

Dynamic Workers (open beta, March 2026) are an isolate-based runtime for executing AI-generated code in secure sandboxes. Instead of running each AI tool call sequentially, the LLM generates a single TypeScript function that chains multiple API calls together, runs it in a Dynamic Worker, and returns the result. Dynamic Workers use V8 isolates (not containers), starting in milliseconds and using megabytes of memory, making them 100x faster to boot and 10-100x more memory-efficient than traditional container sandboxes. Cap'n Web RPC bridges handle API connectivity across the security boundary, and outbound HTTP interception injects credentials automatically so agent code never sees secrets directly. Pricing: $0.002 per unique Worker loaded per day (waived during beta), plus standard CPU and invocation charges.

// Dynamic Workers: LLM generates and executes code at runtime
// (env.DYNAMIC_WORKER is a Worker Loader binding configured in wrangler.jsonc)

export default {
  async fetch(request, env) {
    const { task } = await request.json();

    // LLM generates a TypeScript function for the task
    const codeResponse = await env.AI.run('@cf/meta/llama-3.1-70b-instruct-fp8-fast', {
      messages: [{
        role: 'system',
        content: `Generate a TypeScript function that accomplishes the task.
                  Available APIs: env.DB (D1), env.KV, env.R2.
                  Return only the function body.`,
      }, {
        role: 'user',
        content: task,
      }],
    });

    // Execute the generated code in an isolated V8 sandbox
    const dynamicWorker = await env.DYNAMIC_WORKER.create({
      code: codeResponse.response,
      // Cap'n Web RPC bridges provide API access
      bindings: { DB: env.DB, KV: env.KV, R2: env.R2 },
    });

    // Starts in milliseconds, isolated from host Worker
    const result = await dynamicWorker.fetch(new Request('http://sandbox/run'));
    return result;
  },
};

// Security: outbound HTTP interception injects credentials
// Agent code never sees API keys or secrets directly
// Each Dynamic Worker runs in its own V8 isolate
// Boot time: ~5ms (vs. ~500ms for containers)

19. MCP Servers and Code Mode

Remote MCP Servers on Cloudflare Workers

Model Context Protocol (MCP) is the open standard for connecting AI agents to external tools and services. Cloudflare enables building and deploying remote MCP servers on Workers, making your tools accessible to any MCP-compatible AI agent (Claude, GPT, Gemini, etc.). Unlike local MCP servers that require installation on each machine, remote MCP servers on Cloudflare run globally at the edge with zero infrastructure management. Cloudflare provides the mcp-remote adapter to bridge MCP clients that only support local connections to remote servers, and an AI Playground as a remote MCP client for testing.

// MCP Server on Cloudflare Workers
import { McpAgent } from 'cloudflare:agents';

export class MyMCPServer extends McpAgent {
  // Expose tools to any MCP-compatible AI agent
  @tool('Query database')
  async queryDB(sql: string) {
    return await this.env.DB.prepare(sql).all();
  }

  @tool('Upload file to R2')
  async uploadFile(key: string, content: string) {
    await this.env.BUCKET.put(key, content);
    return { uploaded: key };
  }

  @tool('Run AI inference')
  async runInference(prompt: string) {
    return await this.env.AI.run('@cf/meta/llama-3.1-8b-instruct', {
      messages: [{ role: 'user', content: prompt }],
    });
  }
}
// Deploy: npx wrangler deploy
// Connect from Claude Desktop, Cursor, etc. via mcp-remote

Code Mode: Entire APIs in 1,000 Tokens

Code Mode solves a core tension in MCP: agents need many tools to do useful work, but each tool fills the context window. Instead of describing every operation as a separate tool, Code Mode lets the model write code against a typed SDK and execute it safely in a Dynamic Worker. The Cloudflare API MCP server uses Code Mode to provide access to the entire Cloudflare API (hundreds of endpoints) while consuming only ~1,000 tokens of context. This enables agents to manage DNS, Workers, R2, D1, and all Cloudflare services through a single, context-efficient MCP tool.
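The mechanics can be sketched in plain JavaScript: expose one typed SDK object instead of N tool schemas, have the model emit a function body written against it, and execute that body in isolation. The SDK shape, method names, and generated string below are all illustrative, and `new Function` is used here only for demonstration -- it is not a security boundary; Code Mode runs the generated code in an isolated Dynamic Worker, not in-process:

```javascript
// Code Mode sketch: one "execute code" tool replaces hundreds of tool schemas.
// The SDK surface is what the model sees (roughly the size of its type
// signatures); the generated code chains calls with no per-call model round-trip.
const sdk = {
  dns: {
    // Stub returning canned records; a real SDK would call the Cloudflare API.
    listRecords: async (zone) => [
      { name: 'www', type: 'CNAME' },
      { name: '@', type: 'A' },
    ],
  },
  kv: {
    put: async (key, value) => ({ key, stored: true }),
  },
};

// In reality this string comes from the LLM; hardcoded here for illustration.
const generatedCode = `
  const records = await sdk.dns.listRecords('example.com');
  const cnames = records.filter(r => r.type === 'CNAME').map(r => r.name);
  await sdk.kv.put('cname-audit', JSON.stringify(cnames));
  return cnames;
`;

// Wrap the generated body in an async function bound to the SDK and run it.
// NOT a sandbox: production Code Mode executes this in a Dynamic Worker.
const runGenerated = new Function('sdk', `return (async () => { ${generatedCode} })();`);
runGenerated(sdk).then((result) => console.log(result)); // [ 'www' ]
```

Three API operations cost one model generation instead of three tool-call round-trips, which is where both the token savings and the latency savings come from.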

Cloudflare's Own MCP Servers

Cloudflare provides official MCP servers for managing Cloudflare services directly from AI agents: manage Workers deployments, query D1 databases, interact with R2 storage, configure DNS records, inspect analytics, and more. Install the Cloudflare MCP server in your AI IDE (Claude Desktop, Cursor, Windsurf) and manage your entire Cloudflare infrastructure conversationally through your AI agent.
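For clients that only speak local (stdio) MCP, the mcp-remote adapter bridges to a remote server. A typical client configuration looks like the fragment below; the server name and URL are illustrative, so substitute the endpoint of the Cloudflare MCP server (or your own Workers-hosted server) you want to connect:

```json
{
  "mcpServers": {
    "cloudflare": {
      "command": "npx",
      "args": ["mcp-remote", "https://your-mcp-server.example.workers.dev/sse"]
    }
  }
}
```

Clients with native remote MCP support can skip the adapter and point directly at the server URL.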

20. Containers

Cloudflare Containers at the Edge

Cloudflare Containers run OCI (Docker) containers on the Durable Objects infrastructure across Cloudflare's global network. Workers serve as the API gateway, routing requests to long-running container instances. This bridges the gap between serverless Workers (stateless, sub-millisecond cold starts) and traditional container workloads (stateful, long-running processes). Containers support TCP networking, persistent file systems, full Linux userspace, and account-level limits of approximately 6 TiB memory, 1,500 vCPU, and 30 TB disk for concurrent workloads. Use containers for workloads that need system packages, long-running processes, or languages beyond JavaScript/TypeScript.

// wrangler.jsonc — Container configuration
{
  "name": "my-container-app",
  "containers": [{
    "name": "ml-inference",
    "image": "./Dockerfile",
    "max_instances": 5,
    "instance_type": "standard"
  }]
}

// Worker as API gateway for containers
export default {
  async fetch(request, env) {
    const container = await env.ML_CONTAINER.get("inference-1");
    // Forward request to the container
    const result = await container.fetch(request);
    return result;
  },
};

// Account limits (2026):
// ~6 TiB memory, ~1,500 vCPU, ~30 TB disk concurrent
// Full Docker/OCI compatibility
// Persistent filesystem across restarts

21. Workflows (Durable Execution)

Workflows (GA): Durable Execution Engine

Workflows is Cloudflare's durable execution engine for multi-step applications, now generally available. Each step is persisted and automatically retried on failure, with configurable retry policies and timeouts. Workflows supports human-in-the-loop approval steps via the waitForEvent API, parallel execution, and long-running processes (hours, days, or weeks). State is automatically persisted between steps, so workflows survive Worker restarts and infrastructure failures. Workflows is the backbone for AI agent orchestration: chain tool calls, wait for human approval, handle API rate limits with automatic retry, and coordinate multi-agent workflows.

// Workflows: durable multi-step execution
import { WorkflowEntrypoint, WorkflowStep } from 'cloudflare:workers';

export class MemberOnboarding extends WorkflowEntrypoint {
  async run(event, step: WorkflowStep) {
    // Step 1: Create account (auto-retried on failure)
    const member = await step.do('create-account', async () => {
      return await this.env.API.createMember(event.payload);
    });

    // Step 2: Send welcome email
    await step.do('send-welcome', async () => {
      await this.env.EMAIL.send(member.email, 'Welcome!');
    });

    // Step 3: Wait for human approval (pauses workflow)
    await step.waitForEvent('admin-approval', { timeout: '7 days' });

    // Step 4: Activate subscription
    await step.do('activate', async () => {
      await this.env.API.activateSubscription(member.id);
    });

    return { memberId: member.id, status: 'onboarded' };
  }
}

22. EmDash CMS (WordPress Successor)

EmDash: AI-Native CMS on Cloudflare Workers

EmDash (v0.1, April 2026) is Cloudflare's open-source CMS positioned as the spiritual successor to WordPress, rewritten entirely in TypeScript on Astro 6.0. It runs on Cloudflare Workers (scales to zero when idle), uses Dynamic Workers to sandbox plugins with explicit permissions (addressing the root cause of 96% of WordPress security vulnerabilities), and includes AI-native features: Agent Skills for AI-assisted plugin/theme development, a CLI for programmatic site management, and a built-in MCP server for direct AI tool integration. EmDash is designed to be managed by AI agents, not just humans. For portability, the database layer uses Kysely for SQL (supporting SQLite, D1, Turso, and PostgreSQL) and storage uses the S3 API (R2, AWS S3, local files). WordPress migration imports posts, pages, media, and taxonomies from WXR exports, the WordPress REST API, or WordPress.com.

# EmDash: TypeScript CMS on Cloudflare Workers
npm create emdash@latest my-site
cd my-site

# Configure for Cloudflare (also works on Node.js, Deno, etc.)
# astro.config.ts
import { defineConfig } from 'astro/config';
import emdash from '@emdash/astro';
import cloudflare from '@astrojs/cloudflare';

export default defineConfig({
  integrations: [emdash()],
  adapter: cloudflare(),
});

# Deploy to Cloudflare Workers
npx wrangler deploy

# Features included:
# - Admin panel, REST API, authentication, media library
# - Plugin system (sandboxed in Dynamic Workers)
# - Agent Skills for AI-assisted development
# - Built-in MCP server for AI agent management
# - WordPress import (WXR, REST API, WordPress.com)
# - Database: D1/SQLite/PostgreSQL via Kysely
# - Storage: R2/S3/local via S3 API abstraction

More Guides