Value Stream Mapping¶

Document Type: Business Architecture
Status: Draft
Version: 1.0
Last Updated: 2024-12-30
Owner: Architecture Team

Purpose¶

This document maps the end-to-end value streams that flow through Dokploy, from user need to delivered value. Value stream mapping helps identify bottlenecks, waste, and optimization opportunities in how the platform delivers value to stakeholders.

Value Stream Overview¶

Dokploy delivers value through five primary value streams:

Application Deployment: From code to running application
Infrastructure Provisioning: From resource need to available capacity
Incident Response: From problem detection to resolution
User Onboarding: From new user to productive user
Feature Development: From idea to deployed feature

Value Stream 1: Application Deployment¶

Overview¶

Trigger: Developer has new code to deploy
Outcome: Application running in production, accessible to users
Frequency: 10-100+ times per day per team
Critical Success Factor: Speed and reliability

Current State Map¶

graph LR
    A[Code Change] -->|Push| B[Git Repository]
    B -->|Webhook| C[Dokploy Receives Event]
    C -->|Queue| D[Build Process]
    D -->|Image| E[Container Registry]
    E -->|Pull| F[Docker Swarm]
    F -->|Deploy| G[Running Container]
    G -->|Health Check| H[Traffic Routing]
    H -->|Live| I[Users Access App]

    style A fill:#e1f5ff
    style I fill:#c8e6c9

Detailed Steps¶

Step 1: Code Commit¶

Actor: Developer
Actions:
Write code
Run local tests
Commit to Git
Push to repository
Duration: 5-60 minutes (variable)
Value-Add: Yes (creating feature/fix)
Wait Time: 0

Step 2: Git Webhook Trigger¶

Actor: Git provider (GitHub, GitLab)
Actions:
Detect push event
Call Dokploy webhook endpoint
Duration: 1-5 seconds
Value-Add: No (waiting)
Wait Time: 1-5 seconds
Automation: Fully automated

Step 3: Webhook Receipt & Validation¶

Actor: Dokploy API
Actions:
Verify webhook signature
Parse payload
Identify target application
Queue deployment job
Duration: 100-500ms
Value-Add: Yes (security, routing)
Wait Time: 0
Automation: Fully automated
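
The signature check in this step can be sketched as follows. This is a minimal illustration that assumes a GitHub-style scheme, where the provider sends an `X-Hub-Signature-256` header containing `sha256=` followed by the hex HMAC-SHA256 of the raw request body. The function name and the secret value are hypothetical and are not taken from Dokploy's code.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, payload: bytes, signature_header: str) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the raw payload."""
    expected = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


# Hypothetical secret and payload, for illustration only.
secret = b"example-secret"
body = b'{"ref": "refs/heads/main"}'
good_header = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()

print(verify_signature(secret, body, good_header))
print(verify_signature(secret, body, "sha256=" + "0" * 64))
```

The check must run against the raw bytes of the request body; re-serializing parsed JSON can change whitespace and invalidate the digest.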

Step 4: Build Queuing¶

Actor: Deployment queue
Actions:
Add job to queue
Wait for worker availability
Duration: 0-60 seconds (depends on queue depth)
Value-Add: No (waiting)
Wait Time: 0-60 seconds
Bottleneck: High load periods
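
To see how the size of the worker pool drives this wait, a small simulation can help. This is a rough sketch that assumes a FIFO queue, jobs listed in arrival order, and a fixed pool of identical workers; `queue_waits` and its parameters are hypothetical names, not part of Dokploy.

```python
import heapq


def queue_waits(arrivals, durations, workers):
    """Return the wait in seconds for each job in a FIFO queue.

    arrivals  -- arrival times in seconds, sorted ascending
    durations -- build duration in seconds for each job
    workers   -- number of build workers
    """
    free_at = [0.0] * workers  # time at which each worker becomes free
    heapq.heapify(free_at)
    waits = []
    for arrive, duration in zip(arrivals, durations):
        earliest = heapq.heappop(free_at)      # next worker to become free
        start = max(float(arrive), earliest)   # job cannot start before it arrives
        waits.append(start - arrive)
        heapq.heappush(free_at, start + duration)
    return waits


# Four 60-second builds pushed at the same moment.
print(queue_waits([0, 0, 0, 0], [60, 60, 60, 60], workers=2))
print(queue_waits([0, 0, 0, 0], [60, 60, 60, 60], workers=4))
```

With two workers the third and fourth jobs each wait 60 seconds; with four workers no job waits at all.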

Step 5: Image Build¶

Actor: Build worker
Actions:
Clone repository
Detect build context (Dockerfile, Buildpack)
Build Docker image
Tag image
Push to registry
Duration: 30 seconds - 10 minutes
Value-Add: Yes (creating deployable artifact)
Wait Time: 0
Automation: Fully automated
Bottleneck: Large dependencies, slow network

Step 6: Registry Push¶

Actor: Docker Registry
Actions:
Receive image layers
Store image
Confirm receipt
Duration: 5-60 seconds
Value-Add: No (storage)
Wait Time: 0
Automation: Fully automated

Step 7: Service Update¶

Actor: Docker Swarm
Actions:
Pull new image
Create new container
Start container
Wait for health check
Stop old container (rolling)
Duration: 10-120 seconds
Value-Add: Yes (deploying)
Wait Time: Health check period (10-30s)
Automation: Fully automated
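
The health check wait in this step amounts to polling until the new container reports healthy or a deadline passes. The sketch below illustrates that loop only; Docker Swarm performs its own health checking, and `wait_until_healthy`, `probe`, and the default timings are assumptions made for this example.

```python
import time


def wait_until_healthy(probe, timeout_s=30.0, interval_s=2.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Call probe() until it returns True or timeout_s elapses.

    probe -- zero-argument callable returning True when the service is healthy
    sleep, clock -- injectable so the loop can be tested without real delays
    """
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Simulated service that becomes healthy on the third probe.
responses = iter([False, False, True])
print(wait_until_healthy(lambda: next(responses), sleep=lambda s: None))
```

A shorter polling interval reduces the time between a container becoming healthy and the rollout noticing it, at the cost of more probe traffic.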

Step 8: Traffic Routing¶

Actor: Traefik
Actions:
Detect new container
Update routing rules
Start routing traffic
Duration: 1-5 seconds
Value-Add: Yes (making available)
Wait Time: 0
Automation: Fully automated

Step 9: Verification¶

Actor: Developer
Actions:
Check deployment status
Verify application works
Monitor for errors
Duration: 1-10 minutes
Value-Add: Yes (quality assurance)
Wait Time: 0

Metrics¶

Metric	Target	Current	Gap
Lead Time (commit to live)	<5 minutes	2-15 minutes	Optimize build
Process Time (actual work)	~2 minutes	~2 minutes	✅
Wait Time (queuing, health checks)	<30 seconds	0-90 seconds	Reduce queue
Deployment Success Rate	>95%	~92%	Improve health checks
Rollback Time	<2 minutes	1-3 minutes	Tighten upper bound (3 minutes exceeds target)

Value Stream Efficiency¶

Process Efficiency = Process Time / Lead Time
= 2 minutes / 8 minutes (average)
= 25%

Target: 40%+
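
The calculation above, and what the 40% target implies for lead time, can be reproduced with a few lines. This is a small sketch using the figures from this section; the function names are illustrative.

```python
def process_efficiency(process_time_min: float, lead_time_min: float) -> float:
    """Process efficiency = process time / lead time."""
    if lead_time_min <= 0:
        raise ValueError("lead time must be positive")
    return process_time_min / lead_time_min


def max_lead_time(process_time_min: float, target_efficiency: float) -> float:
    """Longest lead time that still meets the target efficiency."""
    if not 0 < target_efficiency <= 1:
        raise ValueError("target efficiency must be in (0, 1]")
    return process_time_min / target_efficiency


print(f"Current efficiency: {process_efficiency(2, 8):.0%}")
print(f"Lead time needed for 40%: {max_lead_time(2, 0.40):.1f} minutes")
```

With process time held at 2 minutes, reaching 40% efficiency requires an average lead time of 5 minutes, which lines up with the lead time target in the metrics table.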

Waste Identification¶

Type 1: Waiting
Queue wait time (0-60s)
Health check wait (10-30s)
Improvement: Increase worker pool, optimize health checks

Type 2: Overprocessing
Rebuild unchanged dependencies every time
Improvement: Layer caching, dependency caching

Type 3: Defects
Failed deployments due to config errors
Improvement: Pre-deployment validation, config templates

Type 4: Transportation
Pushing large images to registry
Improvement: Use smaller base images, multi-stage builds

Improvement Opportunities¶

Quick Wins (Implement in v1.5)¶

Build caching: 40% faster builds
Parallel builds: Handle multiple simultaneous deployments
Smarter health checks: Reduce wait time by 50%
Pre-flight validation: Catch errors before deployment

Medium Term (v2.0)¶

Predictive scaling: Pre-scale before traffic spikes
Progressive delivery: Canary deployments for safer updates
Build analytics: Identify slow build steps

Long Term (v3.0)¶

Edge deployment: Deploy closer to users
Smart caching: AI-powered cache optimization

Value Stream 2: Infrastructure Provisioning¶

Overview¶

Trigger: Need for new compute/storage capacity
Outcome: Resources available and ready for workloads
Frequency: 5-20 times per week per team
Critical Success Factor: Speed and cost-efficiency

Current State Map¶

graph LR
    A[Resource Need] -->|Request| B[Provision API]
    B -->|Create| C[Docker Service]
    C -->|Allocate| D[Container]
    D -->|Mount| E[Volumes]
    E -->|Configure| F[Network]
    F -->|Start| G[Ready]

    style A fill:#e1f5ff
    style G fill:#c8e6c9

Detailed Steps¶

Step 1: Identify Need¶

Actor: Developer/Team Lead
Actions: Determine resource requirements
Duration: 5-30 minutes
Value-Add: Yes (planning)

Step 2: Configure Resource¶

Actor: Developer
Actions:
Open Dokploy UI
Select resource type (database, application)
Configure settings (size, replicas, etc.)
Review estimated cost
Duration: 2-10 minutes
Value-Add: Yes (configuration)

Step 3: Submit Request¶

Actor: Dokploy API
Actions:
Validate configuration
Check quotas
Authorize request
Duration: 100-500ms
Value-Add: Yes (validation)
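
The validation and quota check in this step might look like the following. This is a hedged sketch: the `ResourceRequest` fields, the memory-based quota, and the error messages are all assumptions made for illustration, not Dokploy's actual request schema.

```python
from dataclasses import dataclass

ALLOWED_KINDS = {"database", "application"}  # resource types named in Step 2


@dataclass(frozen=True)
class ResourceRequest:
    kind: str
    replicas: int
    memory_mb: int  # memory per replica


def validate_request(req: ResourceRequest, used_mb: int, quota_mb: int) -> list[str]:
    """Return a list of problems; an empty list means the request may proceed."""
    errors = []
    if req.kind not in ALLOWED_KINDS:
        errors.append(f"unknown resource type: {req.kind!r}")
    if req.replicas < 1:
        errors.append("replicas must be at least 1")
    if req.memory_mb <= 0:
        errors.append("memory_mb must be positive")
    # Only check the quota once the sizing fields themselves are sane.
    if not errors and used_mb + req.replicas * req.memory_mb > quota_mb:
        errors.append("request exceeds the project's memory quota")
    return errors


print(validate_request(ResourceRequest("database", 2, 512), used_mb=1024, quota_mb=4096))
print(validate_request(ResourceRequest("cache", 0, 512), used_mb=1024, quota_mb=4096))
```

Returning every problem at once, rather than failing on the first, lets the UI show the user all fixes in a single round trip.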

Step 4: Provision Service¶

Actor: Docker Swarm
Actions:
Pull image
Create service
Schedule containers
Allocate resources
Duration: 10-60 seconds
Value-Add: Yes (provisioning)

Step 5: Configure Networking¶

Actor: Docker Swarm + Traefik
Actions:
Assign IP address
Configure DNS
Set up load balancing
Configure TLS
Duration: 5-20 seconds
Value-Add: Yes (networking)

Step 6: Storage Setup¶

Actor: Docker Volumes
Actions:
Create volume
Mount to container
Set permissions
Duration: 1-10 seconds
Value-Add: Yes (persistence)

Step 7: Health Verification¶

Actor: Dokploy
Actions:
Run health checks
Verify connectivity
Test access
Duration: 5-30 seconds
Value-Add: Yes (verification)

Step 8: Notify & Document¶

Actor: Dokploy
Actions:
Send notification to requester
Update inventory
Generate connection info
Duration: 1-5 seconds
Value-Add: Yes (communication)

Metrics¶

Metric	Target	Current	Gap
Time to Available	<2 minutes	1-3 minutes	Tighten upper bound (3 minutes exceeds target)
Configuration Errors	<5%	~8%	Improve validation
Resource Utilization	70-85%	~65%	Better sizing
Cost per Resource	Minimize	Baseline	Optimize

Improvement Opportunities¶

Resource templates: Pre-configured common setups
Smart sizing: ML-based resource recommendations
Cost analytics: Real-time cost tracking
Auto-cleanup: Remove unused resources

Value Stream 3: Incident Response¶

Overview¶

Trigger: Application error or outage detected
Outcome: Service restored, root cause identified
Frequency: 1-10 times per week (varies)
Critical Success Factor: Mean time to resolution (MTTR)

Current State Map¶

graph LR
    A[Issue Occurs] -->|Detect| B[Alert Triggered]
    B -->|Notify| C[Team Notified]
    C -->|Investigate| D[Log Analysis]
    D -->|Identify| E[Root Cause]
    E -->|Fix| F[Deploy Fix]
    F -->|Verify| G[Resolved]

    style A fill:#ffccbc
    style G fill:#c8e6c9

Detailed Steps¶

Step 1: Issue Detection¶

Actor: Monitoring system
Actions:
Health check fails
Error rate spike detected
Resource exhaustion
Duration: 30 seconds - 5 minutes (detection lag)
Value-Add: Yes (detection)
Bottleneck: Alert delay

Step 2: Alert Generation¶

Actor: Alerting system
Actions:
Evaluate alert rules
Determine severity
Route to appropriate channel
Duration: 5-30 seconds
Value-Add: Yes (notification)
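
The three actions in this step (evaluate rules, determine severity, route) can be sketched as a simple threshold evaluator. The rule structure, metric names, and channel mapping below are hypothetical; they only mirror the description that critical issues page the on-call engineer while others go to a chat channel.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    metric: str
    threshold: float
    severity: str  # "warning" or "critical"


# Assumed routing: critical alerts page on-call, warnings go to chat.
ROUTES = {"critical": "pager", "warning": "slack"}


def evaluate(rules, metrics):
    """Return (metric, severity, channel) for every rule whose threshold is exceeded."""
    fired = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            fired.append((rule.metric, rule.severity, ROUTES[rule.severity]))
    return fired


rules = [
    AlertRule("error_rate", 0.05, "critical"),
    AlertRule("cpu_percent", 85, "warning"),
]
print(evaluate(rules, {"error_rate": 0.12, "cpu_percent": 40}))
```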

Step 3: Team Notification¶

Actor: Notification system
Actions:
Send email/Slack/webhook
Page on-call (critical issues)
Duration: 1-5 minutes (includes human response time)
Value-Add: No (waiting for human)
Bottleneck: Human availability

Step 4: Initial Triage¶

Actor: On-call engineer
Actions:
Acknowledge alert
Assess severity
Determine if escalation needed
Duration: 2-10 minutes
Value-Add: Yes (assessment)

Step 5: Investigation¶

Actor: Engineer
Actions:
Review logs (Dokploy log viewer)
Check metrics (Grafana dashboards)
Review recent changes (deployment history)
Check resource utilization
Duration: 5-30 minutes
Value-Add: Yes (diagnosis)
Bottleneck: Log accessibility, tool switching

Step 6: Root Cause Identification¶

Actor: Engineer
Actions:
Correlate symptoms
Identify root cause
Determine fix strategy
Duration: 5-60 minutes (highly variable)
Value-Add: Yes (diagnosis)

Step 7: Remediation¶

Actor: Engineer
Actions:
Immediate: Rollback, restart, scale up
Short-term: Config change, hotfix deployment
Long-term: Code fix, architecture change
Duration: 1-30 minutes (immediate), hours-days (long-term)
Value-Add: Yes (resolution)

Step 8: Verification¶

Actor: Engineer
Actions:
Verify metrics recovered
Check error rates
Confirm user impact resolved
Duration: 5-15 minutes
Value-Add: Yes (verification)

Step 9: Post-Mortem¶

Actor: Team
Actions:
Document incident
Identify prevention measures
Create follow-up tasks
Duration: 30-60 minutes
Value-Add: Yes (learning)

Metrics¶

Metric	Target	Current	Gap
MTTD (Mean Time to Detect)	<2 minutes	1-5 minutes	Improve monitoring
MTTR (Mean Time to Resolve)	<30 minutes	15-120 minutes	Varies widely
False Positive Rate	<10%	~20%	Tune alerts
Repeat Incidents	<5%	~12%	Better root cause analysis
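
MTTD and MTTR can be derived from three timestamps per incident. The sketch below assumes MTTD is measured from occurrence to detection and MTTR from detection to resolution; teams that measure MTTR from occurrence would adjust the second function. The `Incident` record and the sample data are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Incident:
    occurred: datetime
    detected: datetime
    resolved: datetime


def mttd_minutes(incidents):
    """Mean time from occurrence to detection, in minutes."""
    return mean((i.detected - i.occurred).total_seconds() for i in incidents) / 60


def mttr_minutes(incidents):
    """Mean time from detection to resolution, in minutes."""
    return mean((i.resolved - i.detected).total_seconds() for i in incidents) / 60


# Invented sample incidents.
incidents = [
    Incident(datetime(2024, 12, 2, 10, 0), datetime(2024, 12, 2, 10, 2), datetime(2024, 12, 2, 10, 32)),
    Incident(datetime(2024, 12, 9, 14, 0), datetime(2024, 12, 9, 14, 4), datetime(2024, 12, 9, 14, 54)),
]
print(f"MTTD: {mttd_minutes(incidents):.1f} min, MTTR: {mttr_minutes(incidents):.1f} min")
```

Because resolution times vary widely (15-120 minutes in the table above), a median or percentile may describe typical performance better than the mean.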

Improvement Opportunities¶

v1.5 (Quick Wins)¶

One-click rollback: Reduce resolution time by 50%
Integrated log viewer: Eliminate tool switching
Smart alerts: Reduce false positives
Runbooks: Guided troubleshooting

v2.0 (Medium Term)¶

AIOps: Anomaly detection, predictive alerts
Auto-remediation: Automatic scaling, restarts
Correlation engine: Link related events
Incident timeline: Automatic chronology

v3.0 (Long Term)¶

Self-healing: Automatic issue resolution
Chaos engineering: Proactive resilience testing
AI assistant: Guided troubleshooting

Value Stream 4: User Onboarding¶

Overview¶

Trigger: New user signs up
Outcome: User successfully deploys first application
Frequency: Varies by growth (100s-1000s per month at scale)
Critical Success Factor: Time to first value

Current State Map¶

graph LR
    A[Sign Up] -->|Create| B[Account Created]
    B -->|Setup| C[Connect Git]
    C -->|Configure| D[Create App]
    D -->|Deploy| E[First Deployment]
    E -->|Verify| F[App Running]
    F -->|Use| G[Productive User]

    style A fill:#e1f5ff
    style G fill:#c8e6c9

Detailed Steps¶

Step 1: Discovery & Sign Up¶

Actor: Potential user
Actions:
Find Dokploy (search, referral, etc.)
Visit website
Read documentation
Decide to try
Sign up (email, OAuth)
Duration: 5-60 minutes
Value-Add: Yes (discovery)
Bottleneck: Documentation clarity

Step 2: Initial Setup¶

Actor: New user
Actions:
Complete registration
Verify email
Set password
Configure profile
Duration: 2-5 minutes
Value-Add: No (necessary friction)

Step 3: Environment Setup¶

Actor: User
Actions:
Install Dokploy (if self-hosted)
Or provision server
Configure DNS
Set up TLS
Duration: 10-60 minutes
Value-Add: Yes (preparation)
Bottleneck: Technical complexity

Step 4: First Project Creation¶

Actor: User
Actions:
Create project
Invite team members (optional)
Set project settings
Duration: 2-5 minutes
Value-Add: Yes (organization)

Step 5: Connect Git Repository¶

Actor: User
Actions:
Authenticate with Git provider
Select repository
Configure webhook
Duration: 3-10 minutes
Value-Add: Yes (integration)
Bottleneck: OAuth complexity

Step 6: Application Configuration¶

Actor: User
Actions:
Detect build settings (auto or manual)
Configure environment variables
Set resource limits
Configure domain
Duration: 5-20 minutes
Value-Add: Yes (configuration)
Bottleneck: Too many options, unclear defaults

Step 7: First Deployment¶

Actor: User + Dokploy
Actions:
Trigger deployment
Watch build logs
Wait for completion
Duration: 2-10 minutes
Value-Add: Yes (deployment)
Bottleneck: Build time, unclear errors

Step 8: Verification & Success¶

Actor: User
Actions:
Visit deployed application
Verify it works
Celebrate! 🎉
Duration: 1-5 minutes
Value-Add: Yes (verification)

Step 9: Explore & Learn¶

Actor: User
Actions:
Explore other features
Read advanced docs
Join community
Duration: Ongoing
Value-Add: Yes (education)

Metrics¶

Metric	Target	Current	Gap
Time to First Deployment	<30 minutes	45-90 minutes	Simplify setup
Setup Success Rate	>80%	~65%	Reduce friction
Activation Rate (deploy within 7 days)	>70%	~55%	Improve onboarding
Retention (active after 30 days)	>60%	~45%	Prove value faster
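
The activation rate in the table is defined as the share of users who deploy within 7 days of signing up. A minimal sketch of that calculation follows; the data layout (a signup date paired with an optional first-deployment date) and the sample cohort are assumptions for illustration.

```python
from datetime import date


def activation_rate(users, window_days=7):
    """Fraction of users whose first deployment falls within window_days of signup.

    users -- list of (signup_date, first_deploy_date or None)
    """
    if not users:
        raise ValueError("cohort is empty")
    activated = sum(
        1
        for signup, first_deploy in users
        if first_deploy is not None and (first_deploy - signup).days <= window_days
    )
    return activated / len(users)


# Invented cohort of four sign-ups.
cohort = [
    (date(2024, 12, 1), date(2024, 12, 1)),   # deployed same day
    (date(2024, 12, 1), date(2024, 12, 8)),   # deployed on day 7, counts
    (date(2024, 12, 1), date(2024, 12, 11)),  # deployed on day 10, too late
    (date(2024, 12, 1), None),                # never deployed
]
print(f"Activation rate: {activation_rate(cohort):.0%}")
```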

Improvement Opportunities¶

v1.0 (Launch)¶

Quick start guide: 5-minute deployment tutorial
Sample applications: Pre-configured examples
One-click deploy: Deploy from template
Better error messages: Help users fix issues

v1.5 (Enhanced)¶

Interactive tutorial: In-app guided setup
Video walkthroughs: Visual learning
Auto-detection: Detect framework, auto-configure
Hosted demo: Try without installing

v2.0 (Advanced)¶

AI setup assistant: Conversational setup
Instant preview: Deploy to preview environment
Migration tools: Import from Heroku, Vercel
Onboarding analytics: Identify drop-off points

Value Stream 5: Feature Development¶

Overview¶

Trigger: User need or strategic initiative
Outcome: Feature deployed and adopted by users
Frequency: Continuous (sprints)
Critical Success Factor: Time to market, user adoption

Current State Map¶

graph LR
    A[Idea/Need] -->|Validate| B[Requirement]
    B -->|Design| C[Architecture]
    C -->|Develop| D[Code]
    D -->|Test| E[QA]
    E -->|Deploy| F[Production]
    F -->|Measure| G[Adoption]

    style A fill:#e1f5ff
    style G fill:#c8e6c9

Detailed Steps¶

Step 1: Ideation & Validation¶

Actor: Product team + users
Actions:
Gather user feedback
Analyze usage data
Identify pain points
Prioritize features
Duration: 1-7 days
Value-Add: Yes (validation)

Step 2: Requirements Definition¶

Actor: Product manager
Actions:
Write user stories
Define acceptance criteria
Create mockups
Review with stakeholders
Duration: 1-3 days
Value-Add: Yes (definition)

Step 3: Architecture & Design¶

Actor: Architects + engineers
Actions:
Design solution
Create technical spec
Review alternatives
Get approval
Duration: 1-5 days
Value-Add: Yes (planning)

Step 4: Development¶

Actor: Engineers
Actions:
Write code
Local testing
Code review
Merge to main
Duration: 2-10 days
Value-Add: Yes (building)

Step 5: Testing¶

Actor: QA + engineers
Actions:
Unit tests
Integration tests
Manual testing
Security scanning
Duration: 1-3 days
Value-Add: Yes (quality)

Step 6: Documentation¶

Actor: Technical writer + engineers
Actions:
Update user docs
Create API docs
Write release notes
Update examples
Duration: 1-2 days
Value-Add: Yes (enablement)

Step 7: Release¶

Actor: Release manager
Actions:
Deploy to staging
Smoke testing
Deploy to production
Monitor for issues
Duration: 2-4 hours
Value-Add: Yes (delivery)

Step 8: Announcement¶

Actor: Marketing + product
Actions:
Write blog post
Social media announcement
Email newsletter
Update website
Duration: 1-2 days
Value-Add: Yes (awareness)

Step 9: Adoption & Feedback¶

Actor: Users + team
Actions:
Users try feature
Collect feedback
Monitor usage metrics
Iterate based on learnings
Duration: Ongoing (2-4 weeks active monitoring)
Value-Add: Yes (learning)

Metrics¶

Metric	Target	Current	Gap
Lead Time (idea to production)	<14 days	14-30 days	Streamline process
Cycle Time (development to production)	<7 days	5-14 days	Reduce upper range (14 days exceeds target)
Feature Adoption (used within 30 days)	>40%	~30%	Better communication
User Satisfaction	>4.5/5	~4.2/5	Better quality

Improvement Opportunities¶

Feature flags: Gradual rollout, A/B testing
Telemetry: Automatic usage tracking
In-app announcements: Notify users of new features
Feedback loops: In-app feedback collection

Cross-Stream Patterns¶

Pattern 1: Automation¶

Benefit: Reduce manual steps, increase consistency
Implementation: Webhooks, CI/CD, auto-scaling
Impact: 50% reduction in manual tasks

Pattern 2: Observability¶

Benefit: Faster issue detection and resolution
Implementation: Metrics, logs, traces, alerts
Impact: 40% reduction in MTTR

Pattern 3: Self-Service¶

Benefit: Reduce bottlenecks, empower users
Implementation: UI, API, documentation
Impact: 10x increase in capacity

Pattern 4: Validation¶

Benefit: Catch errors early, reduce failures
Implementation: Pre-flight checks, configuration validation
Impact: 60% reduction in failed deployments

Value Stream Optimization Roadmap¶

Phase 1: v1.0 (Foundation)¶

Focus: Core value streams functional
✅ Basic deployment pipeline
✅ Manual provisioning
✅ Basic monitoring
✅ Documentation

Phase 2: v1.5 (Automation)¶

Focus: Reduce manual work
Git webhooks
Build caching
Alert automation
Quick-start templates

Phase 3: v2.0 (Intelligence)¶

Focus: Smart optimization
Auto-scaling
Predictive alerts
Smart caching
AI-powered troubleshooting

Phase 4: v3.0 (Self-Optimization)¶

Focus: Continuous improvement
Self-healing systems
Automatic optimization
Proactive capacity management
AI-driven onboarding

Success Metrics Dashboard¶

Overall Platform Health¶

Deployment Lead Time: <5 minutes (avg)
Deployment Success Rate: >95%
Platform Uptime: >99.9%
MTTR: <30 minutes

User Experience¶

Time to First Deployment: <30 minutes
User Activation Rate: >70%
User Retention (30 days): >60%
NPS Score: >40

Efficiency¶

Process Efficiency: >40%
Resource Utilization: 70-85%
Cost per Deployment: Minimize
Support Ticket Volume: <10% of users/month

Business Capability Model: Capabilities that enable these value streams
Architecture Principles: Principles guiding optimization decisions
Stakeholder Analysis: Stakeholders impacted by each value stream
Deployment Diagram: Technical infrastructure supporting value streams

Document Version: 1.0
Last Updated: 2024-12-30
Next Review: 2025-03-30
Reviewed By: Architecture Team, Product Team, Operations Team
