
DevOps & Platform Engineering · Infrastructure

A platform team of one, running 3 K8s clusters and 40 microservices.

Senior DevOps engineers owning multi-cluster fleets are buried in Terraform, runbooks and SOC2 evidence. Bricolage writes infrastructure code, optimises Dockerfiles, consolidates pipelines and produces audit-grade runbooks while you keep production decisions.


Agent cast

Infrastructure Engineer, Security Auditor, CI/CD Specialist, Runbook Writer


The team

One platform engineer. 40 services. Three EKS clusters.

Forty microservices across dev, staging, prod. Terraform IaC. GitHub Actions pipelines. Eighty-plus container images in ECR. Prometheus + Grafana + PagerDuty.

Plus internal tooling, developer support, security compliance and an endless queue of "my deployment is broken" questions from forty-five engineers. One person doing the work of three.


Step 01 · Runbooks

Eight runbooks in twenty-five minutes.

Cluster operations, deployment procedures, incident response, database operations, CI/CD troubleshooting, container registry, monitoring and alerting, and security operations.

Each references actual infrastructure: real cluster names, real namespace conventions, real Helm chart paths, real AWS commands specific to this company's setup. Judgment calls (rollback thresholds, escalation tiers) flagged for review rather than guessed.


Step 02 · Developer ops

"ImagePullBackOff" — answered once, answered forever.

When a developer's deploy failed, the system traced the root cause: the CI pipeline tagged images with git SHA but the Helm values specified a semantic version tag — a classic mismatch.

The fix was coded (update the pipeline to tag with both), and a developer troubleshooting guide was created with exact kubectl and AWS CLI commands for ImagePullBackOff, CrashLoopBackOff, no traffic and slow deployments. Next time a developer asks, they can be pointed to the guide.
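To make the "tag with both" fix concrete, here is a minimal sketch of how a pipeline step might compute both image references. The function name, the registry host and the version check are illustrative assumptions, not the actual pipeline code.

```python
import re

# Assumption: versions follow plain MAJOR.MINOR.PATCH (no pre-release suffix).
SEMVER = re.compile(r"^\d+\.\d+\.\d+$")


def image_tags(repository: str, git_sha: str, version: str) -> list[str]:
    """Return both image references, so a Helm value using either tag resolves."""
    if not SEMVER.match(version):
        raise ValueError(f"not a semantic version: {version!r}")
    return [f"{repository}:{git_sha}", f"{repository}:{version}"]


if __name__ == "__main__":
    # Hypothetical repository and commit, for illustration only.
    for ref in image_tags("registry.example.com/checkout", "a1b2c3d", "1.4.2"):
        print(ref)
```

Pushing the image under both references means the Helm values can keep the semantic version while the pipeline keeps the SHA for traceability.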


Step 03 · Image audit

8.2GB → 3.1GB across forty services.

Five services over 300MB each. Twelve services running as root containers (a SOC2 finding). Eighteen services missing health checks. Three pinning to :latest.

Optimized Dockerfiles cut the largest image from 380MB to 120MB, a reduction of about sixty-eight percent, and brought the portfolio total down sixty-two percent. Multi-stage builds, non-root users, health checks and distroless bases were applied across the portfolio. CI pipeline savings: forty-five minutes per full run.
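The three Dockerfile findings above (root containers, missing health checks, :latest pins) can be approximated with a simple text scan. This is a heuristic sketch, not the audit that was actually run. It assumes "health check" means a Dockerfile HEALTHCHECK instruction (Kubernetes probes live in the manifests and are not checked here), and the function name is hypothetical.

```python
def audit_dockerfile(text: str) -> list[str]:
    """Flag root user, missing HEALTHCHECK and :latest base images (heuristic)."""
    # Keep non-empty, non-comment lines only.
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]

    findings = []

    # Only the final build stage ends up in the shipped image.
    from_idx = [i for i, ln in enumerate(lines) if ln.upper().startswith("FROM ")]
    final_stage = lines[from_idx[-1]:] if from_idx else lines

    # The last USER instruction in the final stage decides the runtime user.
    users = [ln.split()[1] for ln in final_stage
             if ln.upper().startswith("USER ") and len(ln.split()) > 1]
    if not users or users[-1].split(":")[0] in ("root", "0"):
        findings.append("runs as root")

    if not any(ln.upper().startswith("HEALTHCHECK ") for ln in final_stage):
        findings.append("no HEALTHCHECK")

    # Check every stage's base image, skipping flags such as --platform.
    for i in from_idx:
        tokens = [t for t in lines[i].split()[1:] if not t.startswith("--")]
        if tokens and tokens[0].endswith(":latest"):
            findings.append(f"pins {tokens[0]}")
    return findings


if __name__ == "__main__":
    sample = 'FROM node:latest\nCOPY . /app\nCMD ["node", "/app/server.js"]\n'
    print(audit_dockerfile(sample))
```

A base image that already sets a non-root user would be reported as root by this scan, so findings like these are prompts for review rather than verdicts.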


Step 04 · CI/CD consolidation

3,200 lines of duplicated YAML → 320 lines.

Before: forty services with copy-pasted GitHub Actions workflows, 3,200 lines of duplicated YAML. After: one template and forty callers (five lines each), 320 lines total — ninety percent reduction.

When vulnerability scanning needs to be added, it goes into the template once and applies to all forty services automatically.
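As a rough illustration of the template-plus-callers layout, the sketch below renders a five-line caller workflow per service and checks the line arithmetic. The repository path in TEMPLATE_REF and the service names are hypothetical. The 120-line template size is inferred from the figures above (320 total minus forty callers at five lines each), not stated in the text.

```python
# Hypothetical location of the shared reusable workflow.
TEMPLATE_REF = "example-org/platform-ci/.github/workflows/service-ci.yml@main"


def caller_workflow(service: str) -> str:
    """Render a five-line GitHub Actions workflow that calls the shared template."""
    return "\n".join([
        f"name: {service}-ci",
        "on: [push]",
        "jobs:",
        "  ci:",
        f"    uses: {TEMPLATE_REF}",
    ]) + "\n"


def total_lines(services: list[str], template_lines: int) -> int:
    """Template size plus the size of every caller workflow."""
    return template_lines + sum(
        len(caller_workflow(s).splitlines()) for s in services
    )


if __name__ == "__main__":
    services = [f"service-{n:02d}" for n in range(1, 41)]  # placeholder names
    after = total_lines(services, template_lines=120)      # inferred template size
    print(after, f"{1 - after / 3200:.0%} fewer lines than 3,200")
```

Because each caller only points at the template, a new step such as a scan is a change to one file rather than forty.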


Step 05 · Incident

A 502 storm. Diagnosed and patched in minutes.

Critical service throwing 502s. Structured diagnostics: pod logs, pod events, resource usage versus limits, correlation with recent deploy.

Diagnosis: OOMKilled, memory limit exceeded. Emergency fix coded (kubectl patch), postmortem template generated, service's facts updated with incident details for future reference. The platform engineer made the decision; the system provided the analysis.
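The diagnosis step can be sketched against the JSON that `kubectl get pod <name> -o json` returns: an OOM kill shows up as a terminated container state with reason OOMKilled. The helper names and the example limit below are assumptions for illustration. Choosing the new limit remains the engineer's decision.

```python
import json


def oom_killed_containers(pod: dict) -> list[str]:
    """Names of containers whose current or previous state is an OOMKilled termination."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        for key in ("state", "lastState"):
            terminated = status.get(key, {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                names.append(status["name"])
                break
    return names


def memory_limit_patch(container: str, limit: str) -> str:
    """Strategic-merge patch body that raises one container's memory limit."""
    patch = {"spec": {"template": {"spec": {"containers": [
        {"name": container, "resources": {"limits": {"memory": limit}}}
    ]}}}}
    return json.dumps(patch)


if __name__ == "__main__":
    # Trimmed, made-up pod status for illustration.
    pod = {"status": {"containerStatuses": [
        {"name": "api", "state": {"running": {}},
         "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}},
    ]}}
    for name in oom_killed_containers(pod):
        print(name, memory_limit_patch(name, "768Mi"))  # limit chosen by the engineer
```

The printed body could then be passed to `kubectl patch deployment <name> --patch '<body>'` once the engineer has approved the new limit.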


The outcome

SOC2-ready in 2 hours. 60–80 hours back per month.

Security posture assessment identified six compliance gaps. Three remediated immediately (PodSecurity policies, network egress, audit logging). Three more with remediation plans. A security controls matrix mapped infrastructure controls to SOC2 criteria — audit prep that normally takes a week, structured in two hours.

Runbook writing: three days → twenty-five minutes. Dockerfile optimization: two weeks → two hours. CI/CD template: one week → thirty minutes. Developer support: one hour/day → ten minutes/day. Sixty to eighty hours per month back for architecture, performance and scaling work that requires human judgment.


What changes for the team

8 operational runbooks written in 25 minutes (estimated 3 days manually)
Container images cut 62% (8.2 GB → 3.1 GB) across 40 services
CI/CD YAML duplication reduced 90% via a single reusable template
Security posture review surfaced 6 SOC2 audit gaps
