OLM (Operator Lifecycle Manager) Workflow Examples¶

This document provides comprehensive examples for managing and analyzing OLM-deployed operators using k8s-datamodel.

Table of Contents¶

Basic OLM Operations
ClusterServiceVersion Analysis
Operator Lifecycle Management
RBAC and Security Analysis
Upgrade and Version Management
Multi-Environment OLM Management
Troubleshooting and Diagnostics
Integration with Monitoring

Basic OLM Operations¶

Discovering OLM-Managed Operators¶

# List all ClusterServiceVersions
k8s-datamodel olm list

# List with rich formatting for better readability
k8s-datamodel olm list --output rich

# Filter by installation status
k8s-datamodel olm list --phase Succeeded
k8s-datamodel olm list --phase Failed
k8s-datamodel olm list --phase Installing

# List OLM operators in specific namespace
k8s-datamodel olm list --namespace operators

Expected Output:

┌─────────────────────────────────────┬────────────────┬────────────────────────┬───────────┬────────────────────────────────────────┐
│ Name                                │ Namespace      │ Display Name           │ Version   │ Phase                                  │
├─────────────────────────────────────┼────────────────┼────────────────────────┼───────────┼────────────────────────────────────────┤
│ azure-service-operator.v1.0.28631   │ operators      │ Azure Service Operator │ 1.0.28631 │ Succeeded                              │
│ cloudnative-pg.v1.27.0              │ operators      │ CloudNativePG          │ 1.27.0    │ Succeeded                              │
│ mariadb-operator.v25.8.3            │ operators      │ MariaDB Operator       │ 25.8.3    │ Succeeded                              │
│ oracle-database-operator.v1.2.0     │ operators      │ Oracle DB Operator     │ 1.2.0     │ Succeeded                              │
└─────────────────────────────────────┴────────────────┴────────────────────────┴───────────┴────────────────────────────────────────┘

Getting Detailed CSV Information¶

# Get detailed information about a specific CSV
k8s-datamodel olm get azure-service-operator.v1.0.28631 --namespace operators

# Get CSV details in JSON format for processing
k8s-datamodel olm get cloudnative-pg.v1.27.0 --namespace operators --output json

# Get CSV details in YAML format for human readability
k8s-datamodel olm get mariadb-operator.v25.8.3 --namespace operators --output yaml

OLM Statistics and Health¶

# Get comprehensive OLM statistics
k8s-datamodel olm stats

# Get statistics with rich formatting
k8s-datamodel olm stats --output rich

# Get statistics in JSON format for monitoring
k8s-datamodel olm stats --output json

Expected Output:

╭─────────────────────────────────────────────────────── OLM Statistics ────────────────────────────────────────────────────────╮
│ Total ClusterServiceVersions: 33                                                                                               │
│ Succeeded: 29                                                                                                                  │
│ Failed: 2                                                                                                                      │
│ Installing: 2                                                                                                                  │
│                                                                                                                                │
│ Total Owned CRDs: 156                                                                                                         │
│ Total Required CRDs: 23                                                                                                       │
│                                                                                                                                │
│ Unique Providers: 12                                                                                                          │
│ Namespaces with OLM: 4                                                                                                        │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

ClusterServiceVersion Analysis¶

Analyzing CSV Metadata and Configuration¶

# Store current OLM state for analysis
k8s-datamodel database store --notes "OLM Analysis - $(date +%Y-%m-%d)"

# Export the stored snapshot for detailed analysis
# (1 is the snapshot ID; substitute the ID of the snapshot you just stored)
k8s-datamodel database export 1 --file olm-analysis.json

# Analyze CSV providers
echo "=== OLM Provider Analysis ==="
jq -r '.csvs[] | "\(.provider): \(.display_name) (\(.version))"' olm-analysis.json | sort | uniq -c | sort -nr

# Analyze installation strategies
echo "=== Installation Strategy Breakdown ==="
jq -r '.csvs[] | .install_strategy' olm-analysis.json | sort | uniq -c

# Find CSVs that support the AllNamespaces install mode
echo "=== CSVs Supporting AllNamespaces Install Mode ==="
jq -r '.csvs[] | select(any(.spec.spec.installModes[]?; .type == "AllNamespaces" and .supported == true)) | 
       "\(.name): \(.display_name) - Supports AllNamespaces"' olm-analysis.json

CRD Ownership Analysis¶

# Analyze CRD ownership patterns
echo "=== CRD Ownership Analysis ==="

# Find CSVs with the most owned CRDs
echo "## Top CRD Owners:"
jq -r '.csvs[] | "\(.owned_crds | length) \(.name) \(.display_name)"' olm-analysis.json | 
    sort -nr | head -10

# Find CRDs managed by multiple operators
echo "## CRDs with Multiple Owners:"
jq -r '.csvs[] | .owned_crds[] as $crd | "\($crd) \(.name)"' olm-analysis.json | 
    sort | uniq | cut -d' ' -f1 | sort | uniq -d | 
    while read crd; do
        echo "CRD: $crd"
        grep "^$crd " <(jq -r '.csvs[] | .owned_crds[] as $crd | "\($crd) \(.name)"' olm-analysis.json) | 
            sed 's/^[^ ]* /  Owned by: /'
        echo
    done

# Analyze CRD dependencies  
echo "## CRD Dependencies:"
jq -r '.csvs[] | select(.required_crds | length > 0) | 
       "\(.name) requires: \(.required_crds | join(", "))"' olm-analysis.json

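The multi-owner check above re-runs `jq` once for every duplicated CRD. As a rough alternative, the same pairs can be grouped in a single pass. The sketch below assumes input lines of the form `<crd> <csv-name>` (the format the filter above already prints). `multi_owner_crds` is a hypothetical helper name, not part of k8s-datamodel.

```shell
# Hypothetical helper: read "<crd> <owner>" pairs on stdin and print every CRD
# that has more than one distinct owner, followed by its owners.
multi_owner_crds() {
    # sort -u removes repeated pairs and groups the lines by CRD name
    sort -u | awk '
        function emit() { if (n > 1) printf "CRD: %s\n%s\n", crd, list }
        NF < 2 { next }                      # skip blank or malformed lines
        $1 != crd { emit(); crd = $1; n = 0; list = "" }
        { n++; list = list "  Owned by: " $2 "\n" }
        END { emit() }
    '
}
```

It can be fed directly from the export, for example `jq -r '.csvs[] | .owned_crds[] as $crd | "\($crd) \(.name)"' olm-analysis.json | multi_owner_crds`.
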
CSV Resource Requirements Analysis¶

# Analyze resource requirements from CSV specs
echo "=== CSV Resource Requirements Analysis ==="

# Extract deployment specifications from CSVs
jq -r '.csvs[] | select(.spec.spec.install.strategy == "deployment") | 
       {name: .name, deployments: .spec.spec.install.spec.deployments[].spec.template.spec.containers[0].resources}' \
       olm-analysis.json > csv-resources.json

# Find CSVs without resource limits
echo "## CSVs without Resource Limits:"
jq -r 'select(.deployments.limits == null) | .name' csv-resources.json

# List resource requests per CSV
echo "## Resource Requests by CSV:"
jq -r 'select(.deployments.requests != null) | 
       "\(.name): CPU=\(.deployments.requests.cpu // "none"), Memory=\(.deployments.requests.memory // "none")"' \
       csv-resources.json

# Find high-resource CSVs
echo "## High Resource CSVs:"
jq -r 'select(.deployments.requests.memory != null) | 
       select(.deployments.requests.memory | test("Gi")) | 
       "\(.name): \(.deployments.requests.memory)"' csv-resources.json

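The resource listing above prints each CSV's requests individually and does not add them up. A minimal sketch of how memory quantities might be normalized and summed follows. It assumes only integer values with the binary suffixes Ki, Mi, Gi, or Ti; Kubernetes also accepts decimal suffixes, fractions, and plain byte counts, which this sketch rejects. `to_mib` and `sum_memory_mib` are hypothetical helper names.

```shell
# Hypothetical helper: convert a quantity such as 512Mi or 2Gi to whole MiB.
# Ki values are rounded down to the nearest MiB.
to_mib() {
    local qty num
    qty=$1
    num=${qty%[KMGT]i}
    case $num in
        '' | *[!0-9]*) echo "unsupported quantity: $qty" >&2; return 1 ;;
    esac
    # Strip leading zeros so the shell does not read the number as octal.
    while [ "${num#0}" != "$num" ] && [ -n "${num#0}" ]; do num=${num#0}; done
    case $qty in
        *Ki) echo $(( num / 1024 )) ;;
        *Mi) echo "$num" ;;
        *Gi) echo $(( num * 1024 )) ;;
        *Ti) echo $(( num * 1024 * 1024 )) ;;
        *)   echo "unsupported quantity: $qty" >&2; return 1 ;;
    esac
}

# Hypothetical helper: sum quantities read one per line from stdin.
# Unsupported values are skipped with a warning on stderr.
sum_memory_mib() {
    local total qty mib
    total=0
    while read -r qty; do
        [ -n "$qty" ] || continue
        mib=$(to_mib "$qty") || continue
        total=$(( total + mib ))
    done
    echo "$total"
}
```

One possible use is `jq -r '.deployments.requests.memory // empty' csv-resources.json | sum_memory_mib`, which prints the total in MiB.
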
Operator Lifecycle Management¶

Monitoring Operator Health¶

# Create OLM health monitoring script
cat > olm-health-monitor.sh << 'EOF'
#!/bin/bash
# OLM Health Monitoring Script

DATE=$(date +%Y-%m-%d-%H-%M)
LOG_FILE="olm-health-$DATE.log"

echo "=== OLM Health Check - $(date) ===" | tee $LOG_FILE

# Check overall OLM status
echo "## Overall OLM Status" | tee -a $LOG_FILE
k8s-datamodel olm stats --output table | tee -a $LOG_FILE
echo "" | tee -a $LOG_FILE

# Check failed CSVs
echo "## Failed ClusterServiceVersions" | tee -a $LOG_FILE
FAILED_CSVS=$(k8s-datamodel olm list --phase Failed --output json)
if [ "$(echo "$FAILED_CSVS" | jq '. | length')" -gt 0 ]; then
    echo "$FAILED_CSVS" | jq -r '.[] | "❌ \(.name) in \(.namespace) - \(.phase)"' | tee -a $LOG_FILE
else
    echo "✅ No failed CSVs detected" | tee -a $LOG_FILE
fi
echo "" | tee -a $LOG_FILE

# Check installing CSVs (potential stuck installations)
echo "## Installing ClusterServiceVersions" | tee -a $LOG_FILE  
INSTALLING_CSVS=$(k8s-datamodel olm list --phase Installing --output json)
if [ "$(echo "$INSTALLING_CSVS" | jq '. | length')" -gt 0 ]; then
    echo "⚠️ CSVs currently installing:" | tee -a $LOG_FILE
    echo "$INSTALLING_CSVS" | jq -r '.[] | "   \(.name) in \(.namespace)"' | tee -a $LOG_FILE
else
    echo "✅ No CSVs currently installing" | tee -a $LOG_FILE
fi
echo "" | tee -a $LOG_FILE

# Check for version mismatches
echo "## Version Consistency Check" | tee -a $LOG_FILE
k8s-datamodel database export $(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id') --file current-olm.json
jq -r '.csvs[] | select(.replaces != null and .replaces != "") | 
       "Upgrade detected: \(.name) replaces \(.replaces)"' current-olm.json | tee -a $LOG_FILE

echo "Health check complete. Results saved to $LOG_FILE"
EOF

chmod +x olm-health-monitor.sh
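
The monitor script above ends with an `echo`, so it exits 0 whether or not problems were found. If it is run from cron or a CI job, a status code can make failures visible to the scheduler. The sketch below is one way to map the two counts to an exit status. The 0/1/2 convention (OK, warning, critical) and the function name `olm_health_status` are assumptions for illustration.

```shell
# Hypothetical helper: turn CSV counts into a message and an exit status.
#   $1 = number of Failed CSVs, $2 = number of Installing CSVs
olm_health_status() {
    local failed installing
    failed=$1
    installing=$2
    if [ "$failed" -gt 0 ]; then
        echo "CRITICAL: $failed failed CSV(s)"
        return 2
    elif [ "$installing" -gt 0 ]; then
        echo "WARNING: $installing CSV(s) still installing"
        return 1
    fi
    echo "OK: all CSVs settled"
    return 0
}
```

The counts can come from the same `jq '. | length'` calls the script already makes on the `--phase Failed` and `--phase Installing` JSON output.
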

Upgrade Planning and Tracking¶

# Create upgrade planning workflow
cat > olm-upgrade-planner.sh << 'EOF'
#!/bin/bash
# OLM Upgrade Planning Script

UPGRADE_PLAN_FILE="olm-upgrade-plan-$(date +%Y-%m-%d).md"

echo "# OLM Upgrade Plan - $(date)" > $UPGRADE_PLAN_FILE
echo "" >> $UPGRADE_PLAN_FILE

# Store pre-upgrade snapshot
k8s-datamodel database store --notes "Pre-upgrade baseline - $(date +%Y-%m-%d)"
BASELINE_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')

# Export current state
k8s-datamodel database export $BASELINE_ID --file pre-upgrade-olm.json

echo "## Current OLM State" >> $UPGRADE_PLAN_FILE
echo "- Total CSVs: $(jq '.csvs | length' pre-upgrade-olm.json)" >> $UPGRADE_PLAN_FILE
echo "- Succeeded: $(jq '[.csvs[] | select(.phase == "Succeeded")] | length' pre-upgrade-olm.json)" >> $UPGRADE_PLAN_FILE
echo "- Failed: $(jq '[.csvs[] | select(.phase == "Failed")] | length' pre-upgrade-olm.json)" >> $UPGRADE_PLAN_FILE
echo "" >> $UPGRADE_PLAN_FILE

echo "## Operators Ready for Upgrade" >> $UPGRADE_PLAN_FILE
# Find CSVs that have newer versions available (based on replaces field analysis)
jq -r '.csvs[] | select(.replaces != null and .replaces != "") | 
       "- **\(.display_name)**: \(.version) (replaces: \(.replaces))"' pre-upgrade-olm.json >> $UPGRADE_PLAN_FILE
echo "" >> $UPGRADE_PLAN_FILE

echo "## Upgrade Dependencies" >> $UPGRADE_PLAN_FILE
# Analyze required CRDs for upgrade compatibility
jq -r '.csvs[] | select(.required_crds | length > 0) | 
       "- **\(.display_name)**: requires \(.required_crds | join(", "))"' pre-upgrade-olm.json >> $UPGRADE_PLAN_FILE
echo "" >> $UPGRADE_PLAN_FILE

echo "## Pre-Upgrade Checklist" >> $UPGRADE_PLAN_FILE
echo "- [ ] Backup cluster state" >> $UPGRADE_PLAN_FILE
echo "- [ ] Verify all CSVs are in Succeeded state" >> $UPGRADE_PLAN_FILE  
echo "- [ ] Check for stuck installations" >> $UPGRADE_PLAN_FILE
echo "- [ ] Review upgrade dependencies" >> $UPGRADE_PLAN_FILE
echo "- [ ] Schedule maintenance window" >> $UPGRADE_PLAN_FILE
echo "" >> $UPGRADE_PLAN_FILE

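# Save the baseline ID so olm-upgrade-verify.sh can pick it up by default
echo "$BASELINE_ID" > .pre-upgrade-snapshot-id

echo "Upgrade plan generated: $UPGRADE_PLAN_FILE"
echo "Baseline snapshot ID: $BASELINE_ID"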
EOF

chmod +x olm-upgrade-planner.sh
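
The checklist item "Verify all CSVs are in Succeeded state" can be automated rather than ticked by hand. The following is a small sketch of such a gate. It reads one phase per line, and `all_csvs_succeeded` is a hypothetical name.

```shell
# Hypothetical helper: read one CSV phase per line on stdin.
# Prints a count for every phase other than "Succeeded" and returns
# non-zero if any such phase was seen.
all_csvs_succeeded() {
    awk '
        NF && $0 != "Succeeded" { blocked[$0]++; bad = 1 }
        END {
            for (phase in blocked)
                printf "%d CSV(s) in phase %s\n", blocked[phase], phase
            exit (bad ? 1 : 0)
        }'
}
```

With the export produced by the planner, this could be run as `jq -r '.csvs[].phase' pre-upgrade-olm.json | all_csvs_succeeded` before starting the maintenance window.
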

Post-Upgrade Verification¶

# Create post-upgrade verification script
cat > olm-upgrade-verify.sh << 'EOF'
#!/bin/bash
# OLM Post-Upgrade Verification Script

BASELINE_ID=${1:-$(cat .pre-upgrade-snapshot-id 2>/dev/null)}
if [ -z "$BASELINE_ID" ]; then
    echo "Error: Please provide baseline snapshot ID"
    echo "Usage: $0 <baseline_snapshot_id>"
    exit 1
fi

VERIFICATION_REPORT="olm-upgrade-verification-$(date +%Y-%m-%d).md"

echo "# OLM Upgrade Verification Report - $(date)" > $VERIFICATION_REPORT
echo "" >> $VERIFICATION_REPORT

# Store post-upgrade snapshot
k8s-datamodel database store --notes "Post-upgrade verification - $(date +%Y-%m-%d)"
POST_UPGRADE_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')

# Export both snapshots
k8s-datamodel database export $BASELINE_ID --file pre-upgrade.json
k8s-datamodel database export $POST_UPGRADE_ID --file post-upgrade.json

echo "## Upgrade Summary" >> $VERIFICATION_REPORT
echo "- Baseline Snapshot: $BASELINE_ID" >> $VERIFICATION_REPORT
echo "- Post-Upgrade Snapshot: $POST_UPGRADE_ID" >> $VERIFICATION_REPORT
echo "" >> $VERIFICATION_REPORT

# Compare CSV counts
PRE_COUNT=$(jq '.csvs | length' pre-upgrade.json)
POST_COUNT=$(jq '.csvs | length' post-upgrade.json)
echo "## CSV Count Comparison" >> $VERIFICATION_REPORT
echo "- Pre-upgrade: $PRE_COUNT CSVs" >> $VERIFICATION_REPORT
echo "- Post-upgrade: $POST_COUNT CSVs" >> $VERIFICATION_REPORT
echo "- Change: $(($POST_COUNT - $PRE_COUNT))" >> $VERIFICATION_REPORT
echo "" >> $VERIFICATION_REPORT

# Check for failed CSVs
echo "## Failed CSVs After Upgrade" >> $VERIFICATION_REPORT
FAILED_CSVS=$(jq -r '.csvs[] | select(.phase == "Failed") | .name' post-upgrade.json)
if [ -n "$FAILED_CSVS" ]; then
    echo "⚠️ Failed CSVs detected:" >> $VERIFICATION_REPORT
    echo "$FAILED_CSVS" | while read csv; do echo "- ❌ $csv"; done >> $VERIFICATION_REPORT
else
    echo "✅ No failed CSVs detected" >> $VERIFICATION_REPORT
fi
echo "" >> $VERIFICATION_REPORT

# Check version changes
echo "## Version Changes" >> $VERIFICATION_REPORT
echo "### Upgraded CSVs" >> $VERIFICATION_REPORT
comm -13 <(jq -r '.csvs[] | "\(.name) \(.version)"' pre-upgrade.json | sort) \
         <(jq -r '.csvs[] | "\(.name) \(.version)"' post-upgrade.json | sort) | \
while read csv_version; do echo "- ✅ $csv_version"; done >> $VERIFICATION_REPORT

echo "" >> $VERIFICATION_REPORT
echo "### New CSVs Added" >> $VERIFICATION_REPORT
comm -13 <(jq -r '.csvs[].name' pre-upgrade.json | sort) \
         <(jq -r '.csvs[].name' post-upgrade.json | sort) | \
while read csv; do echo "- ➕ $csv"; done >> $VERIFICATION_REPORT

echo "" >> $VERIFICATION_REPORT
echo "### CSVs Removed" >> $VERIFICATION_REPORT
comm -23 <(jq -r '.csvs[].name' pre-upgrade.json | sort) \
         <(jq -r '.csvs[].name' post-upgrade.json | sort) | \
while read csv; do echo "- ➖ $csv"; done >> $VERIFICATION_REPORT

echo "" >> $VERIFICATION_REPORT
echo "## Verification Status" >> $VERIFICATION_REPORT
if [ -z "$FAILED_CSVS" ]; then
    echo "✅ Upgrade verification PASSED" >> $VERIFICATION_REPORT
else
    echo "⚠️ Upgrade verification REQUIRES ATTENTION" >> $VERIFICATION_REPORT
fi

echo "Verification report generated: $VERIFICATION_REPORT"
EOF

chmod +x olm-upgrade-verify.sh
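
Because the CSV names shown earlier embed the version (for example `cloudnative-pg.v1.27.0`), an upgraded operator appears in a name-based comparison as one removed CSV plus one added CSV. The sketch below pairs the two by package name instead. It assumes names follow the `<package>.v<version>` pattern, which is a common convention but not guaranteed. `list_upgrades` is a hypothetical helper.

```shell
# Hypothetical helper: compare two files of CSV names (one name per line) and
# print "<package>: <old> -> <new>" for packages whose version changed.
#   $1 = file with pre-upgrade names, $2 = file with post-upgrade names
list_upgrades() {
    awk '
        # Everything before the first ".v<digit>" is treated as the package name
        function pkg(n) { sub(/\.v[0-9].*$/, "", n); return n }
        function ver(n) { return match(n, /\.v[0-9]/) ? substr(n, RSTART + 2) : "" }
        NF == 0 { next }
        FILENAME == ARGV[1] { old[pkg($0)] = ver($0); next }
        {
            p = pkg($0)
            if ((p in old) && old[p] != ver($0))
                printf "%s: %s -> %s\n", p, old[p], ver($0)
        }
    ' "$1" "$2"
}
```

The two input files can be produced with `jq -r '.csvs[].name' pre-upgrade.json > pre-names.txt` and the matching command for `post-upgrade.json`.
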

RBAC and Security Analysis¶

Comprehensive RBAC Analysis¶

# Create comprehensive RBAC analysis script  
cat > olm-rbac-analyzer.sh << 'EOF'
#!/bin/bash
# OLM RBAC Analysis Script

REPORT_FILE="olm-rbac-analysis-$(date +%Y-%m-%d).md"

echo "# OLM RBAC Security Analysis - $(date)" > $REPORT_FILE
echo "" >> $REPORT_FILE

# Store current state for analysis
k8s-datamodel database store --notes "RBAC Security Analysis - $(date +%Y-%m-%d)"
SNAPSHOT_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')
k8s-datamodel database export $SNAPSHOT_ID --file rbac-analysis.json

echo "## Executive Summary" >> $REPORT_FILE
TOTAL_CSVS=$(jq '.csvs | length' rbac-analysis.json)
CSVS_WITH_CLUSTER_PERMS=$(jq '[.csvs[] | select(.spec.spec.install.spec.clusterPermissions | length > 0)] | length' rbac-analysis.json)
echo "- Total CSVs analyzed: $TOTAL_CSVS" >> $REPORT_FILE
echo "- CSVs with cluster permissions: $CSVS_WITH_CLUSTER_PERMS" >> $REPORT_FILE
echo "- Security risk level: $([ $CSVS_WITH_CLUSTER_PERMS -gt $(($TOTAL_CSVS / 2)) ] && echo "HIGH" || echo "MODERATE")" >> $REPORT_FILE
echo "" >> $REPORT_FILE

echo "## Cluster-Level Permissions Analysis" >> $REPORT_FILE
echo "### CSVs with Cluster Permissions" >> $REPORT_FILE
jq -r '.csvs[] | select(.spec.spec.install.spec.clusterPermissions | length > 0) | 
       "- **\(.display_name)** (\(.name)): \(.spec.spec.install.spec.clusterPermissions | length) cluster permissions"' \
       rbac-analysis.json >> $REPORT_FILE
echo "" >> $REPORT_FILE

echo "### High-Risk Permissions" >> $REPORT_FILE
echo "#### Wildcard Resource Access (*)" >> $REPORT_FILE
jq -r '.csvs[] | select(any(.spec.spec.install.spec.clusterPermissions[]?.rules[]?.resources[]?; . == "*")) | 
       "- 🚨 **\(.display_name)**: Has wildcard (*) resource access"' rbac-analysis.json >> $REPORT_FILE

echo "" >> $REPORT_FILE
echo "#### Wildcard Verb Access (*)" >> $REPORT_FILE
jq -r '.csvs[] | select(any(.spec.spec.install.spec.clusterPermissions[]?.rules[]?.verbs[]?; . == "*")) | 
       "- 🚨 **\(.display_name)**: Has wildcard (*) verb access"' rbac-analysis.json >> $REPORT_FILE

echo "" >> $REPORT_FILE  
echo "#### Dangerous Resource Access" >> $REPORT_FILE
DANGEROUS_RESOURCES=("nodes" "persistentvolumes" "clusterroles" "clusterrolebindings" "secrets")
for resource in "${DANGEROUS_RESOURCES[@]}"; do
    echo "##### Access to $resource" >> $REPORT_FILE
    jq -r --arg res "$resource" '.csvs[] | 
           select(any(.spec.spec.install.spec.clusterPermissions[]?.rules[]?.resources[]?; . == $res)) | 
           "- ⚠️ **\(.display_name)**: Can access \($res)"' rbac-analysis.json >> $REPORT_FILE
done

echo "" >> $REPORT_FILE
echo "## Namespace-Level Permissions Analysis" >> $REPORT_FILE
jq -r '.csvs[] | select(.spec.spec.install.spec.permissions | length > 0) | 
       "- **\(.display_name)**: \(.spec.spec.install.spec.permissions | length) namespace permissions"' \
       rbac-analysis.json >> $REPORT_FILE

echo "" >> $REPORT_FILE
echo "## Security Recommendations" >> $REPORT_FILE
echo "1. **Review Wildcard Permissions**: CSVs with wildcard access should be carefully reviewed" >> $REPORT_FILE
echo "2. **Implement Least Privilege**: Ensure operators only have necessary permissions" >> $REPORT_FILE  
echo "3. **Monitor Privilege Escalation**: Track changes in operator permissions over time" >> $REPORT_FILE
echo "4. **Audit Dangerous Resources**: Special attention to operators accessing sensitive resources" >> $REPORT_FILE
echo "5. **Regular Security Reviews**: Schedule periodic RBAC permission audits" >> $REPORT_FILE

echo "RBAC analysis complete: $REPORT_FILE"
EOF

chmod +x olm-rbac-analyzer.sh
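
The analyzer above reports wildcard and sensitive-resource access per CSV. When reviewing individual rules, a per-rule classification can be handy. The sketch below applies the same two checks to a single rule. It takes comma-separated lists as arguments, and `classify_rule` is a hypothetical name. The list of sensitive resources is copied from `DANGEROUS_RESOURCES` in the script.

```shell
# Hypothetical helper: classify one RBAC rule.
#   $1 = comma-separated resources, $2 = comma-separated verbs
# Prints WILDCARD, SENSITIVE:<resource>, or OK.
classify_rule() {
    local resources verbs res
    resources=$1
    verbs=$2
    # A "*" entry in either list means unrestricted access
    case ",$resources,$verbs," in
        *,\*,*) echo "WILDCARD"; return 0 ;;
    esac
    for res in nodes persistentvolumes clusterroles clusterrolebindings secrets; do
        case ",$resources," in
            *,"$res",*) echo "SENSITIVE:$res"; return 0 ;;
        esac
    done
    echo "OK"
}
```

For example, `classify_rule "pods,secrets" "get,list"` reports the rule as sensitive because it includes `secrets`.
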

Security Context Analysis¶

# Analyze security contexts of OLM-managed operators
cat > olm-security-context-analyzer.sh << 'EOF'
#!/bin/bash
# OLM Security Context Analysis Script

REPORT_FILE="olm-security-contexts-$(date +%Y-%m-%d).md"

echo "# OLM Security Context Analysis - $(date)" > $REPORT_FILE
echo "" >> $REPORT_FILE

# Get current cluster state
k8s-datamodel database store --notes "Security Context Analysis - $(date +%Y-%m-%d)"
SNAPSHOT_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')
k8s-datamodel database export $SNAPSHOT_ID --file security-context-analysis.json

echo "## Security Context Summary" >> $REPORT_FILE

# Find operators managed by OLM (deployed via CSVs)
OLM_OPERATORS=$(jq -r '.csvs[] | .spec.spec.install.spec.deployments[]?.name' security-context-analysis.json | sort | uniq)

echo "### OLM-Managed Operators Security Analysis" >> $REPORT_FILE
while read operator; do
    if [ -n "$operator" ]; then
        echo "#### $operator" >> $REPORT_FILE

        # Find corresponding operator in operators list
        OPERATOR_INFO=$(jq --arg op "$operator" '.operators[] | select(.name == $op)' security-context-analysis.json)

        if [ "$OPERATOR_INFO" != "null" ] && [ -n "$OPERATOR_INFO" ]; then
            # Check security context
            PRIVILEGED=$(echo "$OPERATOR_INFO" | jq -r '.spec.spec.template.spec.containers[0].securityContext.privileged // false')
            RUN_AS_ROOT=$(echo "$OPERATOR_INFO" | jq -r '.spec.spec.template.spec.containers[0].securityContext.runAsUser // "not_set"')
            HOST_NETWORK=$(echo "$OPERATOR_INFO" | jq -r '.spec.spec.template.spec.hostNetwork // false')

            echo "- Privileged: $PRIVILEGED" >> $REPORT_FILE
            echo "- Run as root: $([ "$RUN_AS_ROOT" = "0" ] && echo "true" || echo "false ($RUN_AS_ROOT)")" >> $REPORT_FILE
            echo "- Host network: $HOST_NETWORK" >> $REPORT_FILE

            # Security score
            SECURITY_ISSUES=0
            [ "$PRIVILEGED" = "true" ] && ((SECURITY_ISSUES++))
            [ "$RUN_AS_ROOT" = "0" ] && ((SECURITY_ISSUES++))  
            [ "$HOST_NETWORK" = "true" ] && ((SECURITY_ISSUES++))

            if [ $SECURITY_ISSUES -eq 0 ]; then
                echo "- **Security Level: ✅ GOOD**" >> $REPORT_FILE
            elif [ $SECURITY_ISSUES -eq 1 ]; then
                echo "- **Security Level: ⚠️ MODERATE RISK**" >> $REPORT_FILE
            else
                echo "- **Security Level: 🚨 HIGH RISK**" >> $REPORT_FILE
            fi
        else
            echo "- **Status: Not found in operator inventory**" >> $REPORT_FILE
        fi
        echo "" >> $REPORT_FILE
    fi
done <<< "$OLM_OPERATORS"

echo "## High-Risk Security Configurations" >> $REPORT_FILE
echo "### Privileged Containers" >> $REPORT_FILE
jq -r '.operators[] | select(.spec.spec.template.spec.containers[0].securityContext.privileged == true) | 
       "- 🚨 **\(.name)** (namespace: \(.namespace)): Running privileged"' security-context-analysis.json >> $REPORT_FILE

echo "" >> $REPORT_FILE  
echo "### Host Network Access" >> $REPORT_FILE
jq -r '.operators[] | select(.spec.spec.template.spec.hostNetwork == true) | 
       "- ⚠️ **\(.name)** (namespace: \(.namespace)): Has host network access"' security-context-analysis.json >> $REPORT_FILE

echo "" >> $REPORT_FILE
echo "### Root User Execution" >> $REPORT_FILE
jq -r '.operators[] | select(.spec.spec.template.spec.containers[0].securityContext.runAsUser == 0) | 
       "- ⚠️ **\(.name)** (namespace: \(.namespace)): Running as root user"' security-context-analysis.json >> $REPORT_FILE

echo "Security context analysis complete: $REPORT_FILE"
EOF

chmod +x olm-security-context-analyzer.sh
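
The scoring rule in the script above (no issues is good, one is moderate, two or more is high risk) can be pulled into a function so it is easy to test and reuse. This is a sketch of that rule only. `security_level` is a hypothetical name, and the three inputs are the values the script already extracts.

```shell
# Hypothetical helper: score a container's security settings.
#   $1 = privileged ("true"/"false")
#   $2 = runAsUser value ("0" means root)
#   $3 = hostNetwork ("true"/"false")
security_level() {
    local issues
    issues=0
    [ "$1" = "true" ] && issues=$((issues + 1))
    [ "$2" = "0" ] && issues=$((issues + 1))
    [ "$3" = "true" ] && issues=$((issues + 1))
    case $issues in
        0) echo "GOOD" ;;
        1) echo "MODERATE RISK" ;;
        *) echo "HIGH RISK" ;;
    esac
}
```

Using `issues=$((issues + 1))` instead of `((issues++))` avoids a non-zero exit status on the first increment, which matters if the script is ever run with `set -e`.
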

Upgrade and Version Management¶

Version Tracking and Analysis¶

# Create version tracking script
cat > olm-version-tracker.sh << 'EOF'
#!/bin/bash
# OLM Version Tracking Script

REPORT_FILE="olm-version-report-$(date +%Y-%m-%d).md"

echo "# OLM Version Tracking Report - $(date)" > $REPORT_FILE
echo "" >> $REPORT_FILE

# Store current state
k8s-datamodel database store --notes "Version tracking - $(date +%Y-%m-%d)"
CURRENT_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')
k8s-datamodel database export $CURRENT_ID --file current-versions.json

echo "## Current Version Inventory" >> $REPORT_FILE
echo "| Operator | Current Version | Replaces | Min Kube Version |" >> $REPORT_FILE
echo "|----------|----------------|----------|------------------|" >> $REPORT_FILE

jq -r '.csvs[] | "| \(.display_name) | \(.version) | \(.replaces // "None") | \(.min_kube_version // "Not specified") |"' \
   current-versions.json >> $REPORT_FILE

echo "" >> $REPORT_FILE
echo "## Version Management Analysis" >> $REPORT_FILE

# Check for upgrade chains
echo "### Active Upgrade Chains" >> $REPORT_FILE
jq -r '.csvs[] | select(.replaces != null and .replaces != "") | 
       "- **\(.display_name)**: \(.replaces) → \(.version)"' current-versions.json >> $REPORT_FILE

echo "" >> $REPORT_FILE
echo "### Skipped Versions" >> $REPORT_FILE
jq -r '.csvs[] | select(.skips | length > 0) | 
       "- **\(.display_name)**: skips versions \(.skips | join(", "))"' current-versions.json >> $REPORT_FILE

# Compare with previous snapshot if available
PREVIOUS_ID=$(k8s-datamodel database list --offset 1 --limit 1 --output json | jq -r '.[0].id // empty')
if [ -n "$PREVIOUS_ID" ]; then
    echo "" >> $REPORT_FILE
    echo "## Changes Since Last Snapshot (ID: $PREVIOUS_ID)" >> $REPORT_FILE

    k8s-datamodel database export $PREVIOUS_ID --file previous-versions.json

    echo "### Version Updates" >> $REPORT_FILE
    comm -13 <(jq -r '.csvs[] | "\(.name) \(.version)"' previous-versions.json | sort) \
             <(jq -r '.csvs[] | "\(.name) \(.version)"' current-versions.json | sort) | \
    while read update; do echo "- ✅ $update"; done >> $REPORT_FILE

    echo "" >> $REPORT_FILE
    echo "### New Operators" >> $REPORT_FILE  
    comm -13 <(jq -r '.csvs[].name' previous-versions.json | sort) \
             <(jq -r '.csvs[].name' current-versions.json | sort) | \
    while read new_op; do echo "- ➕ $new_op"; done >> $REPORT_FILE

    echo "" >> $REPORT_FILE
    echo "### Removed Operators" >> $REPORT_FILE
    comm -23 <(jq -r '.csvs[].name' previous-versions.json | sort) \
             <(jq -r '.csvs[].name' current-versions.json | sort) | \
    while read removed_op; do echo "- ➖ $removed_op"; done >> $REPORT_FILE
fi

echo "Version tracking report generated: $REPORT_FILE"
EOF

chmod +x olm-version-tracker.sh
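
The tracker above reports that a version changed but not in which direction. A small comparison helper could flag downgrades. The sketch below handles purely numeric dotted versions such as `1.27.0`; pre-release or build suffixes (`1.0.0-rc1`) are rejected and not ordered. `compare_versions` is a hypothetical name.

```shell
# Hypothetical helper: compare two dotted numeric versions.
# Prints "older", "newer", or "same", describing $1 relative to $2.
compare_versions() {
    local a b x y
    a=$1
    b=$2
    case "$a$b" in
        *[!0-9.]*) echo "unsupported version format" >&2; return 1 ;;
    esac
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading component of each version; missing parts count as 0
        x=${a%%.*}
        y=${b%%.*}
        x=${x:-0}
        y=${y:-0}
        if [ "$x" -lt "$y" ]; then echo "older"; return 0; fi
        if [ "$x" -gt "$y" ]; then echo "newer"; return 0; fi
        # Drop the component just compared
        case $a in *.*) a=${a#*.} ;; *) a="" ;; esac
        case $b in *.*) b=${b#*.} ;; *) b="" ;; esac
    done
    echo "same"
}
```

For instance, `compare_versions 1.26.0 1.27.0` reports that the first version is older than the second.
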

Multi-Environment OLM Management¶

Cross-Environment OLM Comparison¶

# Create multi-environment OLM comparison script
cat > olm-multi-env-compare.sh << 'EOF'
#!/bin/bash
# Multi-Environment OLM Comparison Script

ENVIRONMENTS=("prod-cluster" "staging-cluster" "dev-cluster")
REPORT_FILE="olm-multi-env-comparison-$(date +%Y-%m-%d).md"

echo "# Multi-Environment OLM Comparison - $(date)" > $REPORT_FILE
echo "" >> $REPORT_FILE

# Store snapshots for each environment
declare -A ENV_SNAPSHOTS
for env in "${ENVIRONMENTS[@]}"; do
    echo "Collecting OLM data for $env..."
    k8s-datamodel --context $env database store --notes "Multi-env comparison - $env - $(date +%Y-%m-%d)"
    ENV_SNAPSHOTS[$env]=$(k8s-datamodel database list --cluster-context $env --limit 1 --output json | jq -r '.[0].id')
    k8s-datamodel database export ${ENV_SNAPSHOTS[$env]} --file "olm-$env.json"
done

echo "## Environment Summary" >> $REPORT_FILE
echo "| Environment | Total CSVs | Succeeded | Failed | Installing |" >> $REPORT_FILE
echo "|-------------|------------|-----------|---------|------------|" >> $REPORT_FILE

for env in "${ENVIRONMENTS[@]}"; do
    TOTAL=$(jq '.csvs | length' "olm-$env.json")
    SUCCEEDED=$(jq '[.csvs[] | select(.phase == "Succeeded")] | length' "olm-$env.json")
    FAILED=$(jq '[.csvs[] | select(.phase == "Failed")] | length' "olm-$env.json")  
    INSTALLING=$(jq '[.csvs[] | select(.phase == "Installing")] | length' "olm-$env.json")

    echo "| $env | $TOTAL | $SUCCEEDED | $FAILED | $INSTALLING |" >> $REPORT_FILE
done

echo "" >> $REPORT_FILE
echo "## Operator Consistency Analysis" >> $REPORT_FILE

# Find operators present in all environments
echo "### Operators in All Environments" >> $REPORT_FILE
ALL_OPERATORS=""
for env in "${ENVIRONMENTS[@]}"; do
    if [ -z "$ALL_OPERATORS" ]; then
        ALL_OPERATORS=$(jq -r '.csvs[].name' "olm-$env.json" | sort)
    else
        ALL_OPERATORS=$(comm -12 <(echo "$ALL_OPERATORS") <(jq -r '.csvs[].name' "olm-$env.json" | sort))
    fi
done

while read operator; do
    if [ -n "$operator" ]; then
        echo "#### $operator" >> $REPORT_FILE
        for env in "${ENVIRONMENTS[@]}"; do
            VERSION=$(jq -r --arg op "$operator" '.csvs[] | select(.name == $op) | .version' "olm-$env.json")
            PHASE=$(jq -r --arg op "$operator" '.csvs[] | select(.name == $op) | .phase' "olm-$env.json")
            echo "- $env: $VERSION ($PHASE)" >> $REPORT_FILE
        done
        echo "" >> $REPORT_FILE
    fi
done <<< "$ALL_OPERATORS"

# Find environment-specific operators
echo "### Environment-Specific Operators" >> $REPORT_FILE
for env in "${ENVIRONMENTS[@]}"; do
    echo "#### Unique to $env" >> $REPORT_FILE
    OTHER_OPS=""

    for other_env in "${ENVIRONMENTS[@]}"; do
        if [ "$other_env" != "$env" ]; then
            # Keep one name per line so exact-line matching works below
            OTHER_OPS+="$(jq -r '.csvs[].name' "olm-$other_env.json")"$'\n'
        fi
    done

    jq -r '.csvs[].name' "olm-$env.json" | while read -r op; do
        # -x matches whole lines and -F disables regex, so "foo.v1" cannot match "foo.v10"
        if ! grep -qxF -- "$op" <<< "$OTHER_OPS"; then
            echo "- $op" >> $REPORT_FILE
        fi
    done
    echo "" >> $REPORT_FILE
done

echo "## Version Drift Analysis" >> $REPORT_FILE
echo "### Operators with Version Differences" >> $REPORT_FILE
while read operator; do
    if [ -n "$operator" ]; then
        VERSIONS=""
        for env in "${ENVIRONMENTS[@]}"; do
            VERSION=$(jq -r --arg op "$operator" '.csvs[] | select(.name == $op) | .version' "olm-$env.json")
            VERSIONS="$VERSIONS $VERSION"
        done

        # Check if all versions are the same
        UNIQUE_VERSIONS=$(echo "$VERSIONS" | tr ' ' '\n' | grep -v '^$' | sort -u | wc -l)
        if [ $UNIQUE_VERSIONS -gt 1 ]; then
            echo "#### $operator" >> $REPORT_FILE
            for env in "${ENVIRONMENTS[@]}"; do
                VERSION=$(jq -r --arg op "$operator" '.csvs[] | select(.name == $op) | .version' "olm-$env.json")
                echo "- $env: $VERSION" >> $REPORT_FILE
            done
            echo "" >> $REPORT_FILE
        fi
    fi
done <<< "$ALL_OPERATORS"

# Cleanup temp files
for env in "${ENVIRONMENTS[@]}"; do
    rm "olm-$env.json"
done

echo "Multi-environment comparison complete: $REPORT_FILE"
EOF

chmod +x olm-multi-env-compare.sh
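
The script above finds operators common to all environments by chaining `comm -12` across the name lists. An alternative that works for any number of lists is to count how many lists each name appears in. The sketch below assumes one name per line with no embedded whitespace, and `common_lines` is a hypothetical helper name.

```shell
# Hypothetical helper: print the lines that occur in every file given as an
# argument. Each file is de-duplicated first, so a name counts once per file.
common_lines() {
    local n f
    n=$#
    for f in "$@"; do
        sort -u "$f"
    done | sort | uniq -c | awk -v n="$n" '$1 == n { print $2 }'
}
```

Given per-environment name files written with `jq -r '.csvs[].name' "olm-$env.json"`, calling `common_lines` on them prints the shared names in sorted order.
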

Troubleshooting and Diagnostics¶

OLM Troubleshooting Toolkit¶

# Create comprehensive OLM troubleshooting script
cat > olm-troubleshoot.sh << 'EOF'
#!/bin/bash
# OLM Troubleshooting Toolkit

ISSUE_TYPE=${1:-"general"}
OPERATOR_NAME=${2:-""}
NAMESPACE=${3:-"operators"}

REPORT_FILE="olm-troubleshoot-$(date +%Y-%m-%d-%H-%M).log"

echo "=== OLM Troubleshooting Report - $(date) ===" | tee $REPORT_FILE
echo "Issue Type: $ISSUE_TYPE" | tee -a $REPORT_FILE
echo "Operator: ${OPERATOR_NAME:-"All"}" | tee -a $REPORT_FILE
echo "Namespace: $NAMESPACE" | tee -a $REPORT_FILE
echo "" | tee -a $REPORT_FILE

case $ISSUE_TYPE in
    "stuck-install")
        echo "## Diagnosing Stuck Installation" | tee -a $REPORT_FILE

        # Check installing CSVs
        echo "### CSVs in Installing State" | tee -a $REPORT_FILE
        k8s-datamodel olm list --phase Installing | tee -a $REPORT_FILE

        # Check for resource conflicts
        echo "### Checking for Resource Conflicts" | tee -a $REPORT_FILE
        if [ -n "$OPERATOR_NAME" ]; then
            CSV_INFO=$(k8s-datamodel olm get "$OPERATOR_NAME" --namespace "$NAMESPACE" --output json 2>/dev/null)
            if [ $? -eq 0 ]; then
                echo "Owned CRDs:" | tee -a $REPORT_FILE
                echo "$CSV_INFO" | jq -r '.owned_crds[]? // "None"' | sed 's/^/  /' | tee -a $REPORT_FILE

                echo "Required CRDs:" | tee -a $REPORT_FILE  
                echo "$CSV_INFO" | jq -r '.required_crds[]? // "None"' | sed 's/^/  /' | tee -a $REPORT_FILE
            fi
        fi
        ;;

    "failed-csv")
        echo "## Diagnosing Failed CSV" | tee -a $REPORT_FILE

        # List all failed CSVs
        echo "### Failed CSVs" | tee -a $REPORT_FILE
        k8s-datamodel olm list --phase Failed | tee -a $REPORT_FILE

        if [ -n "$OPERATOR_NAME" ]; then
            echo "### Specific CSV Analysis: $OPERATOR_NAME" | tee -a $REPORT_FILE
            CSV_DETAILS=$(k8s-datamodel olm get "$OPERATOR_NAME" --namespace "$NAMESPACE" --output json 2>/dev/null)
            if [ $? -eq 0 ]; then
                echo "Display Name: $(echo "$CSV_DETAILS" | jq -r '.display_name')" | tee -a $REPORT_FILE
                echo "Version: $(echo "$CSV_DETAILS" | jq -r '.version')" | tee -a $REPORT_FILE
                echo "Provider: $(echo "$CSV_DETAILS" | jq -r '.provider')" | tee -a $REPORT_FILE
                echo "Install Strategy: $(echo "$CSV_DETAILS" | jq -r '.install_strategy')" | tee -a $REPORT_FILE
            fi
        fi
        ;;

    "permission-issues")
        echo "## Diagnosing Permission Issues" | tee -a $REPORT_FILE

        # Store current state for RBAC analysis
        k8s-datamodel database store --notes "Permission troubleshooting - $(date)"
        SNAPSHOT_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')
        k8s-datamodel database export $SNAPSHOT_ID --file troubleshoot-rbac.json

        if [ -n "$OPERATOR_NAME" ]; then
            echo "### RBAC Analysis for $OPERATOR_NAME" | tee -a $REPORT_FILE

            # Find CSV for the operator
            CSV_RBAC=$(jq --arg name "$OPERATOR_NAME" '.csvs[] | select(.name == $name or .display_name == $name)' troubleshoot-rbac.json)

            if [ "$CSV_RBAC" != "null" ] && [ -n "$CSV_RBAC" ]; then
                echo "Cluster Permissions:" | tee -a $REPORT_FILE
                # Default missing fields to [] so rules without resources (e.g. nonResourceURLs) do not break join
                echo "$CSV_RBAC" | jq -r '.spec.spec.install.spec.clusterPermissions[]?.rules[]? |
                    "  - Resources: \((.resources // []) | join(", ")) | Verbs: \((.verbs // []) | join(", "))"' | tee -a $REPORT_FILE

                echo "Namespace Permissions:" | tee -a $REPORT_FILE
                echo "$CSV_RBAC" | jq -r '.spec.spec.install.spec.permissions[]?.rules[]? |
                    "  - Resources: \((.resources // []) | join(", ")) | Verbs: \((.verbs // []) | join(", "))"' | tee -a $REPORT_FILE
            else
                echo "CSV not found for operator: $OPERATOR_NAME" | tee -a $REPORT_FILE
            fi
        fi

        rm troubleshoot-rbac.json
        ;;

    "general"|*)
        echo "## General OLM Health Check" | tee -a $REPORT_FILE

        # Overall OLM statistics
        echo "### OLM Statistics" | tee -a $REPORT_FILE
        k8s-datamodel olm stats | tee -a $REPORT_FILE

        # Check for problematic CSVs
        echo "### Problematic CSVs" | tee -a $REPORT_FILE
        echo "Failed CSVs:" | tee -a $REPORT_FILE
        k8s-datamodel olm list --phase Failed --output table | tee -a $REPORT_FILE

        echo "Installing CSVs (potential stuck installations):" | tee -a $REPORT_FILE
        k8s-datamodel olm list --phase Installing --output table | tee -a $REPORT_FILE
        ;;
esac

echo "" | tee -a $REPORT_FILE
echo "## Recommendations" | tee -a $REPORT_FILE

case $ISSUE_TYPE in
    "stuck-install")
        echo "1. Check if required CRDs are available in the cluster" | tee -a $REPORT_FILE
        echo "2. Verify sufficient RBAC permissions" | tee -a $REPORT_FILE
        echo "3. Check for resource conflicts with existing operators" | tee -a $REPORT_FILE
        echo "4. Review OLM operator logs for detailed error messages" | tee -a $REPORT_FILE
        ;;
    "failed-csv") 
        echo "1. Review CSV conditions for specific error messages" | tee -a $REPORT_FILE
        echo "2. Check if all required CRDs are satisfied" | tee -a $REPORT_FILE
        echo "3. Verify CSV syntax and format" | tee -a $REPORT_FILE
        echo "4. Check for conflicting CSV versions" | tee -a $REPORT_FILE
        ;;
    "permission-issues")
        echo "1. Verify ServiceAccount has necessary ClusterRole bindings" | tee -a $REPORT_FILE
        echo "2. Check for missing permissions in CSV RBAC definition" | tee -a $REPORT_FILE  
        echo "3. Review namespace-level permissions" | tee -a $REPORT_FILE
        echo "4. Consider if operator requires cluster-admin privileges" | tee -a $REPORT_FILE
        ;;
    *)
        echo "1. Monitor CSV phases for state changes" | tee -a $REPORT_FILE
        echo "2. Regular health checks with 'k8s-datamodel olm stats'" | tee -a $REPORT_FILE
        echo "3. Keep snapshots for historical analysis" | tee -a $REPORT_FILE
        echo "4. Implement automated monitoring for failed CSVs" | tee -a $REPORT_FILE
        ;;
esac

echo "Troubleshooting report saved to: $REPORT_FILE"
EOF

chmod +x olm-troubleshoot.sh

# Usage examples:
# ./olm-troubleshoot.sh general
# ./olm-troubleshoot.sh stuck-install cert-manager.v1.12.0 cert-manager  
# ./olm-troubleshoot.sh failed-csv my-operator.v1.0.0 operators
# ./olm-troubleshoot.sh permission-issues cloudnative-pg.v1.27.0 operators

Integration with Monitoring¶

Prometheus Metrics for OLM¶

# Create OLM metrics exporter for Prometheus
cat > olm-metrics-exporter.sh << 'EOF'
#!/bin/bash
# OLM Metrics Exporter for Prometheus

METRICS_FILE="/var/lib/node_exporter/olm_metrics.prom"

# Store current OLM state
k8s-datamodel database store --notes "Metrics collection - $(date)"
SNAPSHOT_ID=$(k8s-datamodel database list --limit 1 --output json | jq -r '.[0].id')
k8s-datamodel database export $SNAPSHOT_ID --file metrics-olm.json

# Generate Prometheus metrics
cat > $METRICS_FILE << METRICS
# HELP olm_csvs_total Total number of ClusterServiceVersions
# TYPE olm_csvs_total gauge
olm_csvs_total $(jq '.csvs | length' metrics-olm.json)

# HELP olm_csvs_succeeded Number of succeeded CSVs
# TYPE olm_csvs_succeeded gauge
olm_csvs_succeeded $(jq '[.csvs[] | select(.phase == "Succeeded")] | length' metrics-olm.json)

# HELP olm_csvs_failed Number of failed CSVs
# TYPE olm_csvs_failed gauge
olm_csvs_failed $(jq '[.csvs[] | select(.phase == "Failed")] | length' metrics-olm.json)

# HELP olm_csvs_installing Number of installing CSVs
# TYPE olm_csvs_installing gauge
olm_csvs_installing $(jq '[.csvs[] | select(.phase == "Installing")] | length' metrics-olm.json)

# HELP olm_owned_crds_total Total number of CRDs owned by OLM operators
# TYPE olm_owned_crds_total gauge
olm_owned_crds_total $(jq '[.csvs[].owned_crds | length] | add // 0' metrics-olm.json)

# HELP olm_required_crds_total Total number of CRDs required by OLM operators
# TYPE olm_required_crds_total gauge
olm_required_crds_total $(jq '[.csvs[].required_crds | length] | add // 0' metrics-olm.json)

# HELP olm_cluster_permissions_total Total cluster permissions across all CSVs
# TYPE olm_cluster_permissions_total gauge
olm_cluster_permissions_total $(jq '[.csvs[] | .spec.spec.install.spec.clusterPermissions | length] | add // 0' metrics-olm.json)

# HELP olm_namespace_permissions_total Total namespace permissions across all CSVs
# TYPE olm_namespace_permissions_total gauge
olm_namespace_permissions_total $(jq '[.csvs[] | .spec.spec.install.spec.permissions | length] | add // 0' metrics-olm.json)

# HELP olm_high_risk_csvs Number of CSVs with wildcard permissions
# TYPE olm_high_risk_csvs gauge
olm_high_risk_csvs $(jq '[.csvs[] | select(any(.spec.spec.install.spec.clusterPermissions[]?.rules[]? | (.resources[]?, .verbs[]?); . == "*"))] | length' metrics-olm.json)

METRICS

echo "OLM metrics exported to $METRICS_FILE"
rm metrics-olm.json
EOF

chmod +x olm-metrics-exporter.sh
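
The exporter writes straight into the file that node_exporter's textfile collector reads, so a scrape that lands mid-write can pick up a partial file, and any jq expression that yields a non-numeric value produces a line Prometheus cannot parse. The following is a minimal sketch of one way to guard against both, assuming the textfile collector is in use. The helper name `publish_metrics` is hypothetical, and the validation pattern only covers plain `name value` samples without labels or scientific notation, which is all the exporter above emits.

```shell
#!/bin/bash
# Hypothetical helper: publish_metrics <target.prom>
# Reads Prometheus text-format metrics on stdin, validates them,
# and moves them into place with a single rename.
publish_metrics() {
    local target="$1"
    local tmp

    # Stage in the same directory so the final mv is a rename on one filesystem
    tmp=$(mktemp "$(dirname "$target")/.olm_metrics.XXXXXX") || return 1
    cat > "$tmp"

    # Every non-comment, non-blank line must look like "<metric_name> <number>"
    if grep -Ev '^(#.*|[[:space:]]*)$' "$tmp" \
        | grep -Evq '^[a-zA-Z_:][a-zA-Z0-9_:]* -?[0-9]+(\.[0-9]+)?$'; then
        echo "Refusing to publish: non-numeric sample value found" >&2
        rm -f "$tmp"
        return 1
    fi

    # mktemp creates the file with mode 600; make it readable by node_exporter
    chmod 644 "$tmp"
    mv "$tmp" "$target"
}
```

With this helper defined near the top of the exporter, `cat > $METRICS_FILE << METRICS` could become `publish_metrics "$METRICS_FILE" << METRICS`, leaving the rest of the here-document unchanged.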

Alerting Rules for OLM¶

# Create Prometheus alerting rules for OLM
cat > olm-alerting-rules.yml << 'EOF'
groups:
- name: olm.rules
  rules:
  - alert: OLMCSVFailed
    expr: olm_csvs_failed > 0
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "OLM ClusterServiceVersion(s) failed"
      description: "{{ $value }} ClusterServiceVersion(s) are in Failed state"

  - alert: OLMCSVStuckInstalling  
    expr: olm_csvs_installing > 0
    for: 30m
    labels:
      severity: warning
    annotations:
      summary: "OLM ClusterServiceVersion(s) stuck installing"
      description: "{{ $value }} ClusterServiceVersion(s) have been installing for >30 minutes"

  - alert: OLMHighRiskPermissions
    expr: olm_high_risk_csvs > 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: "OLM operators with high-risk permissions detected"
      description: "{{ $value }} ClusterServiceVersion(s) have wildcard (*) permissions"

  - alert: OLMCSVCountChanged
    expr: abs(delta(olm_csvs_total[1h])) > 0
    for: 0m  
    labels:
      severity: info
    annotations:
      summary: "OLM ClusterServiceVersion count changed"
      description: "Number of CSVs changed by {{ $value }} in the last hour"
EOF
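
Alerting rules are easy to break with a typo, so it can help to check them before loading them into Prometheus. The sketch below assumes `promtool` (shipped with Prometheus) is installed and that `olm-alerting-rules.yml` is in the current directory. The test file name `olm-alerting-rules-test.yml` and the sample series values are illustrative choices, and only the `OLMCSVFailed` rule is exercised.

```shell
# Write a promtool unit test for the OLMCSVFailed rule
cat > olm-alerting-rules-test.yml << 'EOF'
rule_files:
  - olm-alerting-rules.yml

evaluation_interval: 1m

tests:
  - interval: 1m
    # One failed CSV reported continuously for 10 minutes
    input_series:
      - series: 'olm_csvs_failed'
        values: '1+0x10'
    alert_rule_test:
      # The rule uses "for: 5m", so it should be firing by 6m
      - eval_time: 6m
        alertname: OLMCSVFailed
        exp_alerts:
          - exp_labels:
              severity: warning
            exp_annotations:
              summary: "OLM ClusterServiceVersion(s) failed"
              description: "1 ClusterServiceVersion(s) are in Failed state"
EOF

# Run the checks only when promtool and the rules file are present
if command -v promtool >/dev/null 2>&1 && [ -f olm-alerting-rules.yml ]; then
    promtool check rules olm-alerting-rules.yml
    promtool test rules olm-alerting-rules-test.yml
else
    echo "promtool or olm-alerting-rules.yml not found; skipping rule validation"
fi
```

The same pattern extends to the other alerts by adding further `input_series` and `alert_rule_test` entries.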

The examples in this document cover:

Basic Operations: Discovery, analysis, and health monitoring
Advanced Analysis: RBAC security, resource requirements, and CRD ownership
Lifecycle Management: Upgrade planning, verification, and version tracking
Security: Permission analysis and security context evaluation
Multi-Environment: Cross-environment comparison and consistency checking
Troubleshooting: Diagnostic tools for common OLM issues
Monitoring: Prometheus metrics and alerting integration

Together, these examples cover day-to-day OLM management with k8s-datamodel, from operator discovery through upgrades, troubleshooting, and monitoring.