Progressive Delivery in Kubernetes: A Comprehensive Analysis
This article was originally published on Empathy.co Engineering Blog and has been updated with current practices for 2025.
Progressive Delivery is a modern approach to deploying applications that extends Continuous Delivery by gradually rolling out changes to a subset of users while evaluating key metrics and incorporating automated rollbacks.
The Problem
When deploying applications in Kubernetes, the native Deployment object with Rolling Update strategy presents several limitations:
- No control over the rollout speed
- No traffic flow control to new versions
- Readiness probes are not suitable for deeper checks
- No external metrics verification
- No automated rollback capabilities
These limitations make Rolling Updates risky in production environments because:
- We can't control the blast radius
- The rollout can be too aggressive
- There's no automated rollback mechanism
Requirements
For our Progressive Delivery solution, we need:
- GitOps Approach
- Declarative configurations
- Version control integration
- No manual interventions
- NGINX Ingress Support
- Compatible with our infrastructure
- No ingress controller changes
- Prometheus Integration
- Metric analysis capabilities
- Query-based validations
- Service Mesh Independence
- Flexibility for future changes
- No vendor lock-in
- Visual Interface
- Deployment visualization
- Progress tracking
Affected/Related Systems
- Kubernetes deployment methods
- Application delivery strategies from Teams
Current Design
Native Kubernetes Deployment Objects:
- Rolling Update: A Rolling Update slowly replaces the old version with the new version. This is the default strategy of the Deployment object
- Recreate: Deletes the old version of the application before bringing up the new version. This ensures that two versions of the application never run at the same time, but there is downtime during the deployment.
Proposed Design
The aspirational goal is to add extra deployment capabilities to the current Kubernetes cluster and, therefore, increase the agility and confidence of application teams by reducing the risk of outages when deploying new releases.
The main benefits would be:
- Safer Releases: Reduce the risk of introducing a new software version in production by gradually shifting traffic to the new version while measuring metrics like request success rate and latency.
- Flexible Traffic Routing: Shift and route traffic between app versions with the possibility of using a service mesh (Linkerd, Istio, Kuma...) or not (Contour, NGINX, Traefik...)
- Extensible Validation: Extend the application analysis with custom metrics and webhooks for acceptance tests, load tests or any other custom validation.
- Progressive Delivery: Alternatives deployment strategies:
- Canary (progressive traffic shifting)
- A/B Testing (HTTP headers and cookies traffic routing): Called experiments by Argo Rollouts, although Canaries could have specific headers too.
- Blue/Green (traffic switching and mirroring)
Concepts
- Blue/Green
It has both the new and old version of the application deployed at the same time. During this time, only the old version of the application will receive production traffic. This allows the developers to run tests against the new version before switching the live traffic to the new version.

- Canary
A Canary deployment exposes a subset of users to the new version of the application while serving the rest of the traffic to the old version. Once the new version is verified as being correct, it can gradually replace the old version. Ingress controllers and service meshes, such as NGINX and Istio, enable more sophisticated traffic shaping patterns for canarying than what is natively available (e.g. achieving very fine-grained traffic splitting, or splitting based on HTTP headers).

The picture above shows a Canary with two stages (25% and 75% of traffic goes to the new version), but this is just an example. Argo Rollouts allow multiple stages and percentages of traffic to be defined for each use case.
Analysis of Options
Argo Rollouts (v1.7)
Pros:
- Excellent UI with rapid feedback
- Strong ArgoCD integration
- Simple deployment resource
- Comprehensive documentation
- Active community support
- Native Kubernetes Gateway API support (new in 2025)
Cons:
- UI lacks RBAC/auth mechanisms
- Manual loadtest integration required
- Non-native Kubernetes resources
Argo Rollouts is a Kubernetes Controller and set of CRDs which provide advanced deployment capabilities such as Blue/Green, Canary, Canary analysis, experimentation and progressive delivery features to Kubernetes. A UI is deployed to see the different Rollouts.
Two kinds of rollouts:
- Canary
- Blue/Green
Argo Rollouts offers experiments that allow users to have ephemeral runs of one or more ReplicaSets and run AnalysisRuns along those ReplicaSets to confirm everything is running as expected. Some use cases of experiments could be:
- Deploying two versions of an application for a specific duration to enable the analysis of the application.
- Using experiments to enable A/B/C testing by launching multiple experiments with a different version of their application for a long duration.
- Launching a new version of an existing application with different labels to avoid receiving traffic from a Kubernetes service. The user can run tests against the new version before continuing the Rollout.
- A/B Testing could be performed using Argo Rollouts experiments
There are several ways to perform analysis to drive progressive delivery.
- AnalysisRuns are like Jobs, in that they eventually complete; the result of the run affects if the Rollout's update will continue, abort or pause. AnalysisRuns accept templating, making it easy to parametrize analysis.
- AnalysisRuns accepts multiple data sources like:
- Prometheus, querying over the applications metrics to foresee if the service has a degraded performance during the deployment
- Cloudwatch, querying over AWS metrics to check if everything is fine during the deployment
- Web, perform an HTTP request and compare against the result of a JSON response
- Job, execute a custom script in order to success/fail
- Traffic Management
- NGINX Ingress Controller
- Service Mesh Interface(SMI)
- Observability
- Grafana Dashboard
- Migration
- Instead of modifying and creating a new rollout from scratch, Argo Rollouts allows reference Deployment from Rollout. This will reduce effort in the event of migration.
- Pain Points
- RBAC & Authentication
- Non-native integration: Argo Rollouts use their own CRD Rollout, not Kubernetes native
Example Configuration:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: example-rollout
spec:
replicas: 5
strategy:
canary:
steps:
- setWeight: 20
- pause: {duration: 1h}
- setWeight: 40
- analysis:
templates:
- templateName: success-rate
- setWeight: 60
- pause: {duration: 30m}
- setWeight: 80
- pause: {duration: 30m}
Flagger (v1.15)
Pros:
- Kubernetes native approach
- Integrated load testing
- Native resource management
- Enhanced metric providers (new in 2025)
- Multi-cluster support
Cons:
- No dedicated UI
- Limited ArgoCD feedback
- Documentation gaps
- Simplified Blue/Green implementation
Flagger is part of the Flux family of GitOps tools. Flagger is pretty similar to Argo Rollouts and its main highlights are:
- Native integration: It watches Deployment resources, not need to handle it using a CRD
- Highly extensible and comes with batteries included: It provides a load-tester to run basic or complex-scenarios
When you create a deployment, Flagger generates duplicate resources of your app (including configmaps and secrets). It creates Kubernetes objects with <targetRef.name>-primary and a service endpoint to the primary deployment.
It employs the same concepts about Canary, Blue/Green and A/B Testing as Argo Rollouts does.
- Observability
- Grafana Dashboard
- Pain Points
- No UI, so no RBAC and authentication are needed, but it's complex to have fast feedback from the current status of the rollouts. Checking the logs or checking the status of Canary resources is the only way.
- No kubectl plugin to check how the deployment is going; necessary to deal with
kubectl logs -f flagger-controller
to see how kubectl describes Canary in order to check the progress. - Documentation could be better.
- Blue/Green is an adapted Canary (same as a Canary but with 100% weight)
Example Configuration:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
name: example-app
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: example-app
service:
port: 80
targetPort: 8080
gateways:
- public-gateway
analysis:
interval: 1m
threshold: 10
maxWeight: 50
stepWeight: 5
metrics:
- name: request-success-rate
threshold: 99
interval: 1m
Questions
- What happens if the controller is down?
- Argo Rollouts
- If there are Rollouts changes while the Argo Rollouts Controller is down, the controller will receive the latest changes; it's not going to start from where the Rollout was.
- If there is no new commit while the Controller is down, the Controller reconciles the status automatically. If the Rollout is in step 3 and the Controller is down, when it is back up, it will pick up from the same spot.
- Flagger
- Like Argo Rollouts, it reconciles fine enough.
- The difference is that it follows the steps, instead of the previous changes and then the latest changes.
- New rollouts/deployments will be blocked, but the pods and HPA will remain up and running, even if it breaks in the middle of a rollout/deployment. Both Controllers will reconcile automatically after recovery.
- Argo Rollouts
- What happens with the dashboards? Any changes?
- Argo Rollouts
- Although we don't have a Deployment resource, metrics from deployments won't disappear.
- Flagger
- Deployment resource is there, so no changes are expected.
- No changes
- Argo Rollouts
- What happens when a Canary is paused on the GUI or command line? Is the GitOps setup going to override the change?
- Argo Rollouts
- It can be done from the GUI and from the kubectl command line easily; the RolloutAbort will be notified by ArgoCD.
- It can be retried from the GUI easily or from kubectl commands; ArgoCD will mark the Rollout in progress
- Flagger
- It looks like it's not possible to pause the deployment using the command line. It's needed to have Flagger Tester API deployed
- Argo Rollouts
- What happens when a rollback occurs? What happens with the GitOps setup?
- Argo Rollouts
- Argo Rollouts is integrated with ArgoCD and the progress of the Rollout can be seen from ArgoCD UI.
- Flagger is not integrated with ArgoCD as seamlessly as Argo Rollouts, so a bunch of resources have been created and are visible in the ArgoCD UI, but there is no feedback.
- Argo Rollouts
- What happens in lower environments with a Canary deployment if there is not enough traffic?
- Argo Rollouts
- Argo Rollouts doesn't have a current way to do a loadtest directly but, as, a workaround it can be used with the webhooks to launch a k6 loadtest, as seen in this issue in their project.
- The loadtest has to be controlled out of the box; it specifically stops the loadtest when Canary reaches the step required.
- Flagger
- It has integration with k6 loadtests through a webhook and offers a flagger-loadtest tool; more information on webhooks can be found here.
- Argo Rollouts
- How does Canary traffic management work without a service mesh?
- In the absence of a traffic routing provider, both options can handle the Canary weights using NGINX capabilities. Besides, both options handle SMI and offer a broad selection related to service meshes. Then, whichever tool fits best and is not a blocker can be used to select one service mesh or another.
- What happens when a configMap or secret used by the Deployment (as volume mounts, environment variables) are changed?
- Argo Rollouts
- There is no support for that in Argo Rollouts, but there is an open issue in their Project
- Some workaround should be done, to be able to have rollout and rollback available when only a configMap changes. The workaround consists:
- Random suffix in the configMap name
- ConfigMap and Deployment definition in the same .yaml to avoid creating multiple random suffixes
- Flagger
- Using the Helm annotation trick for automatically rolling out deployments when the configMap changes works well enough in the event of a rollout. But, for a rollback after the rollout, the same issue as the Deployments and ConfigMaps may appear because there is only one configMap, not multiple. That means the workaround for the rollback would have to be done in the same way as Argo Rollouts
- Argo Rollouts
To Sum Up
Both tools will help us to get alternative deployments, while there are some tradeoffs related to each tool:
Argo Rollouts
Pros
- Great UI, fast feedback
- Great integration and feedback with ArgoCD, indicating if the Rollout is in progress
- Easy integration with current Deployment resources
- Documentation
Cons
- UI without RBAC or auth
- Loadtest not integrated, it has to be added ad-hoc using a webhook
- Non-Kubernetes native, Rollout resource added by the CRD
Flagger
Pros
- Kubernetes native, doesn't introduce new Kubernetes resources
- Loadtest integrated
Cons
- No UI; feedback needs to be gathered through the K8s API
- Zero feedback from ArgoCD; Flagger integrates better with Flux, based on their documentation
- Documentation could be better
- Main differences with Argo Rollouts
- Feedback using kubectl commands
- Blue/Green is an adapted Canary (same as a Canary but with 100% weight, after some tests)
At Empathy, the tool chosen was Argo Rollouts. It fits the needs pretty well, offers faster feedback, has great integration with ArgoCD, and is open to more complex strategies.
What's next?
- Choose your fighter, adapt the strategies to your applications. Likely some apps fit better with a Blue/Green approach and others with a Canary approach.
- Demo Session in lower environments.
- Plan migration with Teams.
- Capabilities could be improved in the future if/when a Service Mesh is added to the Platform.
2025 Updates
Recent developments have enhanced Progressive Delivery capabilities:
- Gateway API Support
- Native integration
- Enhanced traffic management
- Multi-cluster routing
- Enhanced Metrics
- OpenTelemetry integration
- Custom metric providers
- Advanced analysis capabilities
- Security Improvements
- RBAC enhancements
- Audit logging
- Security policy enforcement
- Multi-cluster Features
- Cross-cluster deployments
- Unified management
- Global traffic control
Resources
- Argo Rollouts
- Flagger
- More Problems with GitOps and How to Fix Them
- Kubernetes Canary Deployments
- Flagger Load Testing
- Argo Rollouts Migrating
About the Author
I'm a Platform Engineer Architect specializing in cloud-native technologies and engineering leadership. I focus on building scalable, reliable deployment pipelines and cloud infrastructure.
Connect with me on LinkedIn or contact me for more information.