
Changelog

New updates and improvements at Cloudflare.

  1. Custom Errors are now generally available for all paid plans — bringing a unified and powerful experience for customizing error responses at both the zone and account levels.

    You can now manage Custom Error Rules, Custom Error Assets, and redesigned Error Pages directly from the Cloudflare dashboard. These features let you deliver tailored messaging when errors occur, helping you maintain brand consistency and improve user experience — whether it’s a 404 from your origin or a security challenge from Cloudflare.

    What's new:

    • Custom Errors are now GA – Available on all paid plans and ready for production traffic.
    • UI for Custom Error Rules and Assets – Manage your zone-level rules from the Rules > Overview and your zone-level assets from the Rules > Settings tabs.
    • Define inline content or upload assets – Create custom responses directly in the rule builder, or upload new assets and reuse previously stored ones.
    • Refreshed UI and new name for Error Pages – Formerly known as “Custom Pages,” Error Pages now offer a cleaner, more intuitive experience for both zone and account-level configurations.
    • Powered by Ruleset Engine – Custom Error Rules support conditional logic and override Error Pages for 500 and 1000 class errors, as well as errors originating from your origin or other Cloudflare products. You can also configure Response Header Transform Rules to add, change, or remove HTTP headers from responses returned by Custom Error Rules.

    Learn more in the Custom Errors documentation.
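    As a rough sketch of what a zone-level Custom Error Rule might look like through the Rulesets API, the example below serves inline HTML for 500 responses on a specific path. The zone ID, API token, expression, and content are placeholders, and the exact request shape may differ, so treat this as an illustration and check the Custom Errors documentation for the authoritative format.

    // Illustrative sketch: create a Custom Error Rule via the Rulesets API.
    // Note: PUT on the phase entrypoint replaces the existing rules in that phase.
    const ZONE_ID = "<zone-id>";      // placeholder
    const API_TOKEN = "<api-token>";  // placeholder

    const resp = await fetch(
      `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/rulesets/phases/http_custom_errors/entrypoint`,
      {
        method: "PUT",
        headers: {
          Authorization: `Bearer ${API_TOKEN}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          rules: [
            {
              action: "serve_error",
              expression: 'http.request.uri.path contains "/shop" and http.response.code eq 500',
              action_parameters: {
                content: "<html><body>Something went wrong on our side.</body></html>",
                content_type: "text/html",
                status_code: 500,
              },
            },
          ],
        }),
      },
    );
    console.log(await resp.json());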

  1. You can now create Python Workers which are executed via a cron trigger.

    This works much like it does in JavaScript Workers: simply define a scheduled event handler in your Worker:

    from workers import handler

    @handler
    async def on_scheduled(event, env, ctx):
        print("cron processed")

    Define a cron trigger configuration in your Wrangler configuration file:

    {
      "triggers": {
        "crons": [
          "*/3 * * * *",
          "0 15 1 * *",
          "59 23 LW * *"
        ]
      }
    }

    Then test your new handler by using Wrangler with the --test-scheduled flag and making a request to /cdn-cgi/handler/scheduled?cron=*+*+*+*+*:

    npx wrangler dev --test-scheduled
    curl "http://localhost:8787/cdn-cgi/handler/scheduled?cron=*+*+*+*+*"

    Consult the Workers Cron Triggers page for full details on cron triggers in Workers.

  1. You can now filter AutoRAG search results by folder and timestamp using metadata filtering to narrow down the scope of your query.

    This makes it easy to build multitenant experiences where each user can only access their own data. By organizing your content into per-tenant folders and applying a folder filter at query time, you ensure that each tenant retrieves only their own documents.

    Example folder structure:

    customer-a/logs/
    customer-a/contracts/
    customer-b/contracts/

    Example query:

    const response = await env.AI.autorag("my-autorag").search({
      query: "When did I sign my agreement contract?",
      filters: {
        type: "eq",
        key: "folder",
        value: "customer-a/contracts/",
      },
    });
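    The filters parameter can also combine conditions, which is how you narrow results by both folder and timestamp. The sketch below assumes the compound ("and") and comparison ("gte") filter types described in the metadata filtering documentation; the folder path and cutoff value are placeholders.

    const response = await env.AI.autorag("my-autorag").search({
      query: "When did I sign my agreement contract?",
      filters: {
        type: "and",
        filters: [
          { type: "eq", key: "folder", value: "customer-a/contracts/" },
          // Illustrative Unix-epoch cutoff; check the metadata filtering docs for the expected format.
          { type: "gte", key: "timestamp", value: 1745193600000 },
        ],
      },
    });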

    You can use metadata filtering by creating a new AutoRAG or reindexing existing data. To reindex all content in an existing AutoRAG, update any chunking setting and select Sync index. Metadata filtering is available for all data indexed on or after April 21, 2025.

    If you are new to AutoRAG, get started with the AutoRAG Get started guide.

  1. The Access bulk policy tester is now available in the Cloudflare Zero Trust dashboard. The bulk policy tester allows you to simulate Access policies against your entire user base before and after deploying any changes. The policy tester will simulate the configured policy against each user's last seen identity and device posture (if applicable).


  1. Custom Fields now support logging both raw and transformed values for request and response headers in the HTTP requests dataset.

    These fields are configured per zone and apply to all Logpush jobs in that zone that include request headers or response headers. Each header can be logged in only one format, either raw or transformed, not both.

    By default:

    • Request headers are logged as raw values
    • Response headers are logged as transformed values

    These defaults can be overridden to suit your logging needs.
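    Custom fields for the HTTP requests dataset are defined as a rule in the http_log_custom_fields phase of the Ruleset Engine. The sketch below shows the general shape of such a rule being set via the API; the header names are examples only, and the parameters that select raw versus transformed values for an individual header are not shown here, so refer to the Custom fields documentation below for the exact schema.

    // Illustrative sketch: define custom fields for the HTTP requests dataset.
    // Header names are examples; see the Custom fields docs for how to choose
    // raw vs. transformed logging for a specific header.
    const ZONE_ID = "<zone-id>";      // placeholder
    const API_TOKEN = "<api-token>";  // placeholder

    const resp = await fetch(
      `https://api.cloudflare.com/client/v4/zones/${ZONE_ID}/rulesets/phases/http_log_custom_fields/entrypoint`,
      {
        method: "PUT",
        headers: {
          Authorization: `Bearer ${API_TOKEN}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          rules: [
            {
              action: "log_custom_field",
              expression: "true",
              action_parameters: {
                request_fields: [{ name: "content-type" }],
                response_fields: [{ name: "server" }],
              },
            },
          ],
        }),
      },
    );
    console.log(await resp.json());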

    For more information, refer to the Custom fields documentation.

  1. You can now retrieve up to 100 keys in a single bulk read request made to Workers KV using the binding.

    This makes it easier to request multiple KV pairs within a single Worker invocation. Retrieving many key-value pairs using the bulk read operation is more performant than making individual requests since bulk read operations are not affected by Workers simultaneous connection limits.

    // Read a single key
    const key = "key-a";
    const value = await env.NAMESPACE.get(key);

    // Read multiple keys (up to 100 keys per request)
    const keys = ["key-a", "key-b", "key-c", ...];
    const values: Map<string, string | null> = await env.NAMESPACE.get(keys);

    // Print the value of "key-a" to the console.
    console.log(`The first key is ${values.get("key-a")}.`);

    Consult the Workers KV Read key-value pairs API for full details on Workers KV's new bulk reads support.

  1. Queues pull consumers can now pull and acknowledge up to 5,000 messages / second per queue. Previously, pull consumers were rate limited to 1,200 requests / 5 minutes, aggregated across all queues.

    Pull consumers allow you to consume messages over HTTP from any environment—including outside of Cloudflare Workers. They’re also useful when you need fine-grained control over how quickly messages are consumed.

    To set up a new queue with a pull-based consumer using Wrangler, run:

    Create a queue with a pull-based consumer
    npx wrangler queues create my-queue
    npx wrangler queues consumer http add my-queue

    You can also configure a pull consumer using the REST API or the Queues dashboard.

    Once configured, you can pull messages from the queue using any HTTP client. You'll need a Cloudflare API Token with queues_read and queues_write permissions. For example:

    Pull messages from a queue
    curl "https://api.cloudflare.com/client/v4/accounts/${CF_ACCOUNT_ID}/queues/${QUEUE_ID}/messages/pull" \
    --header "Authorization: Bearer ${API_TOKEN}" \
    --header "Content-Type: application/json" \
    --data '{ "visibility_timeout": 10000, "batch_size": 2 }'

    To learn more about how to acknowledge messages, pull batches at once, and set up multiple consumers, refer to the pull consumer documentation.

    As always, Queues doesn't charge for data egress. Pull operations continue to be billed at the existing rate of $0.40 per million operations. The increased limits are available now on all new and existing queues. If you're new to Queues, get started with the Cloudflare Queues guide.

  1. Happy Developer Week 2025! Workers AI is excited to announce a couple of new features and improvements available today. Check out our blog for all the announcement details.

    Faster inference + New models

    We’re rolling out some in-place improvements to our models that can help speed up inference by 2-4x! Users of the models below will enjoy an automatic speed boost starting today:

    • @cf/meta/llama-3.3-70b-instruct-fp8-fast gets a speed boost of 2-4x, leveraging techniques like speculative decoding, prefix caching, and an updated inference backend.
    • @cf/baai/bge-small-en-v1.5, @cf/baai/bge-base-en-v1.5, @cf/baai/bge-large-en-v1.5 get an updated back end, which should improve inference times by 2x.
      • With the bge models, we're also announcing a new parameter called pooling, which can be set to cls or mean. We highly recommend using pooling: cls, which helps generate more accurate embeddings. However, embeddings generated with cls pooling are not backwards compatible with mean pooling, so to avoid a breaking change the default remains mean pooling. Specify pooling: cls to get more accurate embeddings going forward (see the sketch below).
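    For example, a minimal sketch of requesting cls pooling from one of the bge models in a Worker (model choice and input text are just examples):

    // Sketch: generate embeddings with cls pooling, assuming a Workers AI binding named AI.
    const embeddings = await env.AI.run("@cf/baai/bge-base-en-v1.5", {
      text: ["Tell me about Workers AI embeddings"],
      pooling: "cls", // omit to keep the default (and backwards-compatible) mean pooling
    });
    console.log(embeddings.data[0].length); // dimensions of the first embedding vector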

    We’re also excited to launch a few new models in our catalog to help round out your experience with Workers AI. We’ll be deprecating some older models in the future, so stay tuned for a deprecation announcement. Today’s new models include:

    • @cf/mistralai/mistral-small-3.1-24b-instruct: a 24B parameter model achieving state-of-the-art capabilities comparable to larger models, with support for vision and tool calling.
    • @cf/google/gemma-3-12b-it: well-suited for a variety of text generation and image understanding tasks, including question answering, summarization and reasoning, with a 128K context window, and multilingual support in over 140 languages.
    • @cf/qwen/qwq-32b: a medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.
    • @cf/qwen/qwen2.5-coder-32b-instruct: the current state-of-the-art open-source code LLM, with its coding abilities matching those of GPT-4o.

    Batch Inference

    Introducing a new batch inference feature that allows you to send us an array of requests, which we will fulfill as quickly as possible and return as an array of responses. This is really helpful for large workloads such as summarization or embeddings, where you don't have a human in the loop. Using the batch API guarantees that your requests are fulfilled eventually, rather than erroring out if we don't have enough capacity at a given time.

    Check out the tutorial to get started! Models that support batch inference today include:

    Expanded LoRA support

    We've upgraded our LoRA experience to include 8 newer models, and we can now support ranks of up to 32 with a 300MB safetensors file limit (previously limited to a rank of 8 and 100MB safetensors). Check out our LoRAs page to get started. Models that support LoRAs now include:

  1. You can now use more flexible redirect capabilities in Cloudflare One with Gateway.

    • A new Redirect action is available in the HTTP policy builder, allowing admins to redirect users to any URL when their request matches a policy. You can choose to preserve the original URL and query string, and optionally include policy context via query parameters.
    • For Block actions, admins can now configure a custom URL to display when access is denied. This block page redirect is set at the account level and can be overridden in DNS or HTTP policies. Policy context can also be passed along in the URL.

    Learn more in our documentation for HTTP Redirect and Block page redirect.

  1. Cloudflare Stream has completed an infrastructure upgrade for our Live WebRTC beta support which brings increased scalability and improved playback performance to all customers. WebRTC allows broadcasting directly from a browser (or supported WHIP client) with ultra-low latency to tens of thousands of concurrent viewers across the globe.

    Additionally, as part of this upgrade, the WebRTC beta now supports Signed URLs to protect playback, just like our standard live stream options (HLS/DASH).

    For more information, learn about the Stream Live WebRTC beta.