Production that runs itself
AI agents that work while you sleep: they investigate, fix, prevent, optimize, and understand, directly in Slack.


What Codewolf agents do
Investigate
Tag @Codewolf in Slack when issues strike. AI agents analyze deployments, correlate logs across services, trace requests, and identify root causes. From alert to diagnosis in minutes.

Elevated error rate on checkout (>5%)
Current Value: 12.8% | Threshold: 5%
error_rate(checkout-service, env:prod) > 5%

@Codewolf can you investigate this alert?
On it, Sarah! Assigning agents to investigate the checkout error rate.
I'll report back with findings shortly.
Here's what I already know from recent events:
Investigation Complete
🔴 Checkout service confirmed DOWN, returning 504 Gateway Timeout
1️⃣ Root Cause Analysis: Backend service failures cascading to the frontend (92% confidence)
Evidence:
- flagd feature flag service: resolveBoolean operations failing (14 error traces)
- product-reviews service: OpenAI gpt-4o-mini API calls failing (5 error traces)
- frontend: getServerSideProps / failing when calling downstream services (12 error traces)
- frontend-proxy: 44 error traces, ~4.3% error rate
Error Propagation Chain:
flagd → resolveBoolean failures
  ↓
product-reviews → OpenAI API failures
  ↓
frontend (getServerSideProps /)
  ↓
frontend-proxy → 504 Gateway Timeout
Affected Pods:
- frontend-proxy-b6fb5b44f-5wzr9
- frontend-78cf8ff86d-pxvfr
- flagd-7bd55d47bb-fmcqm
- product-reviews-cdc7c8c5f-mpmhk
Recommended Actions:
- 1️⃣ Immediate: Check flagd pod health; feature flag failures are affecting multiple services
- 2️⃣ Immediate: Verify OpenAI API key/quota for product-reviews service
- 3️⃣ Consider restarting flagd if it's in a bad state
Would you like me to dig deeper into the flagd or product-reviews failures?


@Codewolf can you fix the auth bug that causes mobile users to get logged out randomly?
On it! Created a fix for the mobile auth logout issue.
- auth/jwt.ts - Extended expiry from 15m to 7 days
- middleware/auth.ts - Added mobile client detection
- mobile/session.ts - Session persistence logic
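For illustration only, here's a minimal sketch of what a change like the auth/jwt.ts one might look like, assuming a standard jsonwebtoken setup. The helper name, the User-Agent check, and where the mobile detection lives are assumptions, not the actual patch:

// auth/jwt.ts (illustrative sketch, not the actual PR)
import jwt from 'jsonwebtoken';

// Hypothetical helper: identify mobile clients from the User-Agent header
export function isMobileClient(userAgent: string): boolean {
  return /iPhone|iPad|Android|Mobile/i.test(userAgent);
}

export function signSession(userId: string, userAgent: string): string {
  // Mobile sessions get the extended 7-day expiry; web keeps the original 15 minutes
  const expiresIn = isMobileClient(userAgent) ? '7d' : '15m';
  return jwt.sign({ sub: userId }, process.env.JWT_SECRET as string, { expiresIn });
}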
@Codewolf the api-service pods keep getting OOMKilled in prod, can you bump the memory limits?
On it! Checking K8s pod restarts first, then updating Helm values for api-service.
✅ PR #2848 - Increase api-service memory limits to 2Gi
Updated helm/api-service/values.yaml memory limit from 1Gi to 2Gi based on current usage patterns.
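For reference, the values.yaml change behind a PR like #2848 typically amounts to a few lines; the key layout below is an assumption about the chart's structure, not the real file:

# helm/api-service/values.yaml (sketch; key layout assumed)
resources:
  limits:
    memory: "2Gi"   # raised from 1Gi based on observed usage

In a typical chart these values flow into the Deployment's container resources, so the new limit takes effect on the next rollout.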
@Codewolf can you update the checkout-service to use the new payment-gateway-v2 endpoint? need to update all the downstream dependencies too
On it! Updating service endpoints and dependencies.
✅ PR #2849 - Migrate checkout-service to payment-gateway-v2
Updated checkout-service config to use v2 endpoint, modified order-service and notification-service to handle new webhook format, added feature flag for gradual rollout, and updated Kubernetes service mesh routes.
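The gradual-rollout piece of a migration like this usually comes down to a flag check around the endpoint choice. A minimal TypeScript sketch, where the flag name, URLs, and flag-client interface are illustrative rather than the actual PR #2849 changes:

// checkout-service (sketch): pick the payment gateway endpoint behind a rollout flag
const PAYMENT_GATEWAY_V1 = 'http://payment-gateway/api/charge';      // assumed URL
const PAYMENT_GATEWAY_V2 = 'http://payment-gateway-v2/api/charge';   // assumed URL

// Minimal interface standing in for whatever feature-flag client the service uses
interface FlagClient {
  getBooleanValue(flag: string, fallback: boolean): Promise<boolean>;
}

export async function paymentGatewayUrl(flags: FlagClient): Promise<string> {
  // Defaults to v1 if the flag is off or the flag service is unreachable,
  // so the rollout can be reversed instantly
  const useV2 = await flags.getBooleanValue('checkout.use-payment-gateway-v2', false);
  return useV2 ? PAYMENT_GATEWAY_V2 : PAYMENT_GATEWAY_V1;
}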

Fix
AI agents don't just find problems; they fix them. With full production context and live telemetry, agents generate robust PRs with code patches and configuration changes that resolve issues quickly and correctly.
Prevent
Catch issues before they reach production. AI agents review PRs for risks and run post-deployment checks. Stop problems before they become incidents.

PR #2891 opened by alex: Add transaction status column to payment-service
Reviewed this PR and found potential production impact
- Migration adds non-nullable column to transactions table (847M rows)
- Will lock table during backfill, estimated 12-15 minutes
- Could cause payment processing downtime during peak hours
View full review on GitHub
frontend-service v2.4.1 deployed to production • View commit
On it! I see frontend-service v2.4.1 just deployed. Running post-deployment checks:
Will report back in 15 minutes if I see anything suspicious.
PR #2903 opened by jordan: Scale RDS instance for production database
Reviewed Terraform changes and found infrastructure impact
- Scales RDS from db.r6g.xlarge → db.r6g.2xlarge
- Cost increase: ~$620/month additional AWS spend
- Requires 3-5 minute downtime window for instance modification
- Recommend scheduling during maintenance window (Sunday 2-4 AM UTC)
View full analysis on GitHub


Good morning! Weekly cost analysis found $2,840/month in savings. Analyzed 847 AWS resources across 3 regions: 12 underutilized EC2 nodes (avg 18% CPU), 3 idle ALBs in staging, 340GB of unattached EBS volumes, and 47 over-provisioned K8s pods requesting 4x their actual usage. Also received a billing email from Anthropic showing a 34% increase in token usage and model costs compared to last week.
Performance alert: checkout-service P99 latency increased 45% over the past 3 hours
- Average response time: 180ms → 340ms
- Slow queries detected on orders table
Opened 2 PRs to address this:
✅ PR #3142 - Add composite index to orders table
✅ PR #3143 - Add Redis caching to featured products endpoint
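A cache-aside change like PR #3143 generally looks something like the sketch below; the cache key, TTL, Redis client (ioredis), and the database fetch function are assumptions, not the actual diff:

// featured-products caching (illustrative sketch)
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL ?? 'redis://localhost:6379');
const CACHE_KEY = 'featured-products';   // illustrative key
const TTL_SECONDS = 60;                  // illustrative TTL

export async function getFeaturedProducts(
  fetchFromDb: () => Promise<unknown[]>,  // the existing (slow) query, passed in for the sketch
): Promise<unknown[]> {
  const cached = await redis.get(CACHE_KEY);
  if (cached) return JSON.parse(cached);   // cache hit: skip the slow query
  const products = await fetchFromDb();    // cache miss: hit the database once
  await redis.set(CACHE_KEY, JSON.stringify(products), 'EX', TTL_SECONDS);
  return products;
}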
Daily traffic trends from PostHog: 12.4K unique visitors (18% change vs yesterday). /pricing page load time increased to 2.8s (was 1.4s), contributing to a 34% bounce rate. Product signup conversion dropped from 8.2% to 6.1% after yesterday's checkout UI update.

Optimize
Continuously improve your infrastructure. AI agents analyze costs, identify resource waste, recommend performance improvements, and track efficiency trends. Get daily insights without manual analysis.
Understand
Codewolf AI agents understand your entire system. Not just your codebase, but production behavior, infrastructure, service dependencies, deployment patterns, and system relationships. This deep context enables smarter investigations, better fixes, and proactive prevention.

@Codewolf can you show me the full dependency map for checkout-service? need to see all downstream services, databases, and external APIs it depends on
api-gateway
  └── checkout-service          (k8s: prod-cluster/checkout-ns, 8 replicas)
        ├── payment-service
        │     ├── Stripe API (external)
        │     └── RDS payments
        ├── inventory-service
        │     └── RDS inventory-db
        ├── user-service
        │     └── ElastiCache (Redis) session-cache
        └── shipping-service
              └── RDS shipping-db

Infrastructure breakdown: checkout-service orchestrates orders on K8s (8 replicas), payment-service processes via Stripe + RDS, ElastiCache handles session caching. Total: 3 RDS instances, 1 ElastiCache cluster, 4 K8s services, 1 external API.


SOC 2 TYPE II CERTIFIED
Enterprise Security
Your data is never combined with other customers' data. Read-only access to your integrations, with full audit logs. Your data is never used for model training.
Agents run in dedicated sandboxes
Data encrypted in transit and at rest
Enterprise authentication support
Granular permissions and audit logs
Pricing
Start with included credits. Scale as you need with pay-as-you-go.
Growth
For growing teams shipping fast
- 200 Included Agent Unit Credits
- Dedicated agent workspaces
- Unlimited users
- Standard support
Enterprise
Custom solutions for large organizations
- Custom integrations
- SSO/SAML
- Dedicated support
Careers
We're replacing engineering toil with AI agents. Join us to build the future of how engineering teams work.
Join our pack