Learning Objectives
By the end of this lesson, you will be able to:
- Use Cloudflare's native debugging tools (
wrangler tail, Dashboard analytics) - Add structured logging with
console.log/warn/errorin Workers - Use
workway executionsandworkway logsto identify failing workflows - Reproduce failures locally with
wrangler devand payload replay - Configure alerts for error rate spikes, latency issues, and stale workflows
- Implement health checks and integration status monitoring
Workflows in production need visibility. When something goes wrong—and it will—you need the tools to understand what happened and fix it fast.
WORKWAY workflows run on Cloudflare Workers, which means you have access to Cloudflare's native debugging tools alongside WORKWAY's workflow-specific monitoring.
Cloudflare Workers Logging
Cloudflare Workers use the standard Web Console API for logging. In the V8 isolate runtime, console.log, console.warn, and console.error are your primary logging tools.
Console API in Workers
export default defineWorkflow({
name: 'meeting-notes',
async execute({ trigger, integrations }) {
// Standard console methods work in Workers
console.log('Workflow started', { triggerId: trigger.id });
console.warn('Potential issue detected');
console.error('Critical failure', { error: 'details' });
// Objects are automatically serialized
console.log({
nested: {
data: trigger.data,
timestamp: new Date().toISOString()
}
});
return { success: true };
}
});
Viewing Logs with `wrangler tail`
For real-time log streaming from deployed Workers:
# Stream logs from a deployed Worker
wrangler tail meeting-intelligence-private
# Filter by status
wrangler tail --status error
# Filter by sampling rate (for high-traffic workers)
wrangler tail --sampling-rate 0.1
# JSON output for parsing
wrangler tail --format json
Example output:
[2024-01-15 10:00:01] GET https://workway.co/webhooks/zoom
[2024-01-15 10:00:01] (log) Workflow started { triggerId: "wh_abc123" }
[2024-01-15 10:00:02] (log) Meeting fetched { meetingId: "123", duration: 45 }
[2024-01-15 10:00:03] (error) Failed to create Notion page { error: "Rate limited" }
[2024-01-15 10:00:03] (log) Retrying with backoff...
[2024-01-15 10:00:05] (log) Workflow completed { pageId: "page_456" }
Local Development Logging
During wrangler dev, logs appear directly in your terminal:
# Start local dev server
wrangler dev
# In another terminal, trigger the workflow
curl -X POST localhost:8787/execute \
-H "Content-Type: application/json" \
-d '{"meetingId": "123"}'
All console.log output streams to your terminal in real-time.
Cloudflare Dashboard Analytics
The Cloudflare Dashboard provides aggregate analytics for your Workers:
Workers Analytics (dashboard.cloudflare.com)
Navigate to Workers & Pages → Your Worker → Analytics:
| Metric | Description |
|---|---|
| Requests | Total invocations over time |
| Success rate | Percentage of non-error responses |
| Median CPU time | Typical CPU usage per request |
| P99 CPU time | Worst-case CPU usage |
| Duration | Wall-clock execution time |
| Errors | Breakdown by error type |
Useful Dashboard Views
- Request chart: Identify traffic patterns and spikes
- Error chart: Spot failure rate trends
- CPU time distribution: Find performance outliers
- Geographic distribution: Understand where requests originate
Invocation Logs
For detailed per-request logs:
- Go to Workers & Pages → Your Worker → Logs
- Click any log entry to see:
- Request headers and body
- Response status and body
- All
console.logoutput - Execution duration
- Error stack traces (if failed)
Step-by-Step: Debug a Failing Workflow
Step 1: Identify the Failure
Check recent executions:
workway executions --status failed --limit 10
Output:
ID TRIGGER STATUS DURATION TIME
ex_abc123 webhook failed 0.8s 10:15
ex_def456 webhook failed 1.2s 10:12
ex_ghi789 webhook success 2.3s 10:10
Step 2: Get Execution Details
Inspect a specific failure:
workway execution ex_abc123
Output:
Execution: ex_abc123
Status: failed
Duration: 0.8s
Trigger: webhook (recording.completed)
Steps:
1. ✓ zoom.getMeeting (0.3s)
2. ✓ ai.summarize (0.4s)
3. ✗ notion.createPage (0.1s)
Error: API rate limited (429)
Logs:
[10:15:01] INFO Starting workflow...
[10:15:02] INFO Meeting fetched: mtg-123
[10:15:03] ERROR Notion rate limited - retry in 60s
Step 3: Reproduce Locally
Export the trigger payload and test locally:
# Get the original payload
workway execution ex_abc123 --output-payload > payload.json
# Start local dev server
workway dev
# Replay the execution
curl localhost:8787/execute -d @payload.json
Step 4: Add Diagnostic Logging
Enhance your workflow with detailed logs:
async execute({ trigger, context, integrations }) {
const start = Date.now();
context.log.info('Workflow started', {
triggerId: trigger.id,
meetingId: trigger.data?.object?.id,
});
// Step 1
const stepStart = Date.now();
const meeting = await integrations.zoom.getMeeting(trigger.data.object.id);
context.log.info('Step: Fetch meeting', {
duration: Date.now() - stepStart,
success: meeting.success,
});
// Step 2
const notionStart = Date.now();
const page = await integrations.notion.pages.create({ /* ... */ });
context.log.info('Step: Create page', {
duration: Date.now() - notionStart,
success: page.success,
pageId: page.data?.id,
});
context.log.info('Workflow complete', {
totalDuration: Date.now() - start,
});
return { success: true };
}
Step 5: Fix the Issue
Based on the error (rate limiting), add retry logic:
import { withRetry } from './utils';
async execute({ trigger, inputs, integrations }) {
// ... previous steps ...
// Add retry for rate-limited API
const page = await withRetry(
() => integrations.notion.pages.create({
parent: { database_id: inputs.notionDatabase },
properties: { /* ... */ },
}),
{ maxAttempts: 3, delayMs: 2000 }
);
return { success: true, pageId: page.data?.id };
}
Step 6: Deploy and Verify
# Deploy the fix
workway deploy
# Monitor logs in real-time
workway logs --tail --level info
# Watch for new executions
workway executions --limit 5 --watch
Step 7: Set Up Alerts
Configure alerts to catch future issues:
# In your workflow metadata
metadata: {
alerts: [
{
name: 'High error rate',
condition: 'error_rate > 0.05',
window: '5m',
channels: ['slack:alerts'],
},
],
}
The Observability Stack
| Layer | What It Shows |
|---|---|
| Metrics | Aggregate health (success rates, latency) |
| Logs | Individual execution details |
| Traces | Request flow through steps |
| Alerts | Proactive problem notification |
WORKWAY Dashboard
Workflow Overview
Access at workway.co/workflows/{workflow-id}/analytics:
- Executions: Total runs over time
- Success Rate: Percentage of successful completions
- Avg Duration: Mean execution time
- Errors: Breakdown by error type
Execution History
View individual runs:
workway.co/workflows/{workflow-id}/executions
| ID | Trigger | Status | Duration | Time |
|----|---------|--------|----------|------|
| ex_001 | webhook | success | 2.3s | 10:00 |
| ex_002 | webhook | failed | 0.8s | 10:15 |
| ex_003 | cron | success | 4.1s | 11:00 |
Click any execution to see:
- Full request/response
- Step-by-step execution
- Logs generated
- Error details (if failed)
Structured Logging
Log Levels
async execute({ context }) {
context.log.debug('Detailed info for debugging');
context.log.info('Standard operational info');
context.log.warn('Something unexpected but not fatal');
context.log.error('Something failed');
}
Contextual Logging
Always include relevant context:
// Bad: No context
context.log.info('Processing meeting');
// Good: Includes identifiers
context.log.info('Processing meeting', {
meetingId: meeting.id,
topic: meeting.topic,
participants: meeting.participants.length,
});
Log at Key Points
async execute({ trigger, config, integrations, context }) {
// Start
context.log.info('Workflow started', {
triggerId: trigger.id,
triggerType: trigger.type,
});
// Before external calls
context.log.debug('Fetching meeting from Zoom', { meetingId });
const meeting = await integrations.zoom.getMeeting(meetingId);
context.log.info('Meeting fetched', {
meetingId: meeting.id,
duration: meeting.duration,
});
// After processing
context.log.info('AI summary generated', {
inputLength: transcript.length,
outputLength: summary.length,
});
// End
context.log.info('Workflow completed', {
pageId: page.id,
totalDuration: Date.now() - start,
});
return { success: true, pageId: page.id };
}
Error Logging
Capture full error context:
try {
await notion.createPage(pageData);
} catch (error) {
context.log.error('Failed to create Notion page', {
error: error.message,
errorCode: error.code,
stack: error.stack,
pageData: JSON.stringify(pageData).slice(0, 500), // Truncate for logs
meetingId: meeting.id,
});
throw error;
}
Real-Time Log Streaming
CLI Log Tail
workway logs --tail
# Filter by workflow
workway logs --workflow meeting-notes --tail
# Filter by level
workway logs --level error --tail
Output Example
[10:00:01] INFO Workflow started { triggerId: "wh_123", triggerType: "webhook" }
[10:00:01] DEBUG Fetching meeting from Zoom { meetingId: "123" }
[10:00:02] INFO Meeting fetched { meetingId: "123", duration: 45 }
[10:00:03] INFO AI summary generated { inputLength: 5000, outputLength: 200 }
[10:00:04] ERROR Failed to create Notion page { error: "Rate limited", errorCode: 429 }
[10:00:05] INFO Retrying Notion API call { attempt: 2 }
[10:00:06] INFO Workflow completed { pageId: "page_456", totalDuration: 5000 }
Alerting
Built-in Alerts
Configure at workway.co/workflows/{workflow-id}/alerts:
| Alert Type | Trigger |
|---|---|
| Error spike | >10% error rate in 5 minutes |
| Execution failure | Any execution fails |
| Latency | Avg duration >10s |
| No executions | No runs in 24 hours |
Custom Alert Conditions
metadata: {
alerts: [
{
name: 'High failure rate',
condition: 'error_rate > 0.05',
window: '5m',
channels: ['slack:alerts', 'email:[email protected]'],
},
{
name: 'Stale workflow',
condition: 'last_execution > 24h',
channels: ['slack:alerts'],
},
],
},
Alert Destinations
metadata: {
alertChannels: {
slack: {
webhook: 'https://hooks.slack.com/services/xxx',
channel: '#workflow-alerts',
},
email: {
to: ['[email protected]'],
},
pagerduty: {
serviceKey: 'xxx',
},
},
},
Debugging Failed Executions
Step 1: Find the Failure
workway executions --status failed --limit 10
Step 2: Get Execution Details
workway execution ex_abc123
# Output:
# Execution: ex_abc123
# Status: failed
# Duration: 2.3s
# Trigger: webhook (meeting.ended)
#
# Steps:
# 1. ✓ zoom.getMeeting (0.8s)
# 2. ✓ ai.summarize (1.2s)
# 3. ✗ notion.createPage (0.3s)
# Error: API rate limited (429)
#
# Logs:
# [10:00:01] INFO Starting workflow...
# [10:00:03] ERROR Rate limited by Notion API
Step 3: Reproduce Locally
# Copy the trigger payload
workway execution ex_abc123 --output-payload > payload.json
# Test locally
workway dev
curl localhost:8787/execute -d @payload.json
Step 4: Fix and Redeploy
// Add retry logic
const page = await withRetry(
() => notion.createPage(pageData),
{ maxAttempts: 3, delayMs: 2000 }
);
workway deploy
Step 5: Verify
workway logs --tail --workflow meeting-notes
Debugging Techniques
Replay Failed Executions
# Retry a specific execution
workway retry ex_abc123
Inspect Integration State
async execute({ integrations, context }) {
// Log integration health
const zoomHealth = await integrations.zoom.healthCheck();
context.log.debug('Zoom integration health', { status: zoomHealth });
// Log token expiry
const tokenInfo = await integrations.zoom.getTokenInfo();
context.log.debug('Zoom token', {
expiresIn: tokenInfo.expires_in,
scopes: tokenInfo.scopes,
});
}
Debug Mode
Enable verbose logging:
async execute({ context }) {
const debug = context.env.DEBUG_MODE === 'true';
if (debug) {
context.log.debug('Full trigger payload', {
payload: JSON.stringify(trigger.payload, null, 2),
});
}
}
Set via environment:
workway env set DEBUG_MODE true
Step-Through Execution
For complex workflows, log each step:
async execute({ context }) {
const steps = [
{ name: 'fetch_meeting', fn: () => zoom.getMeeting(id) },
{ name: 'get_transcript', fn: () => zoom.getTranscript(id) },
{ name: 'summarize', fn: () => ai.summarize(transcript) },
{ name: 'create_page', fn: () => notion.createPage(data) },
];
const results = {};
for (const step of steps) {
const start = Date.now();
try {
results[step.name] = await step.fn();
context.log.info(`Step ${step.name} completed`, {
duration: Date.now() - start,
});
} catch (error) {
context.log.error(`Step ${step.name} failed`, {
duration: Date.now() - start,
error: error.message,
});
throw error;
}
}
return { success: true, results };
}
Health Checks
Workflow Health Endpoint
// src/health.ts
export async function healthCheck(integrations: Integrations) {
const checks = {
zoom: false,
notion: false,
slack: false,
};
try {
await integrations.zoom.getUser();
checks.zoom = true;
} catch {}
try {
await integrations.notion.getDatabases();
checks.notion = true;
} catch {}
try {
await integrations.slack.getChannels();
checks.slack = true;
} catch {}
const healthy = Object.values(checks).every(Boolean);
return {
healthy,
checks,
timestamp: new Date().toISOString(),
};
}
Scheduled Health Checks
metadata: {
healthCheck: {
enabled: true,
schedule: '*/15 * * * *', // Every 15 minutes
alertOnFailure: true,
},
},
Metrics Export
Custom Metrics
async execute({ context }) {
// Track custom metric
context.metrics.increment('meetings_processed');
context.metrics.gauge('transcript_length', transcript.length);
context.metrics.histogram('processing_time', Date.now() - start);
}
Metrics Dashboard
View at workway.co/workflows/{workflow-id}/metrics:
- Execution count by status
- Duration percentiles (p50, p95, p99)
- Custom metrics over time
- Error rate trends
Post-Incident Analysis
Incident Timeline
## Incident: Meeting notes workflow failure
**Duration**: 10:00 - 10:45 (45 min)
**Impact**: ~30 meetings not processed
### Timeline
- 10:00 - Notion API rate limit changes deployed by Notion
- 10:05 - Workflow error rate spikes to 80%
- 10:10 - Alert fires to Slack
- 10:15 - On-call acknowledges
- 10:25 - Root cause identified (rate limit)
- 10:35 - Fix deployed (added backoff)
- 10:45 - Error rate returns to 0%
### Action Items
- [ ] Add rate limit monitoring for all integrations
- [ ] Implement automatic backoff on 429 responses
- [ ] Create runbook for integration rate limits
Praxis
Set up comprehensive monitoring for your workflow:
Praxis: Ask Claude Code: "Help me add structured logging and alerting to my workflow"
Implement the observability stack:
async execute({ context }) {
const start = Date.now();
// 1. Log at key points with context
context.log.info('Workflow started', {
triggerId: trigger.id,
triggerType: trigger.type,
});
// 2. Track step durations
const stepStart = Date.now();
const meeting = await zoom.getMeeting(meetingId);
context.log.info('Step: Fetch meeting', {
duration: Date.now() - stepStart,
meetingId: meeting.id,
});
// 3. Log errors with full context
try {
await notion.createPage(pageData);
} catch (error) {
context.log.error('Step: Create page failed', {
error: error.message,
errorCode: error.code,
meetingId: meeting.id,
});
throw error;
}
// 4. Final summary
context.log.info('Workflow complete', {
duration: Date.now() - start,
success: true,
});
}
Configure alerts at workway.co/workflows/{id}/alerts for:
- Error rate > 5%
- Latency > 10 seconds
- No executions in 24 hours
Reflection
- How quickly can you identify a failing workflow today?
- What alerts would have caught past incidents sooner?
- How do you balance logging detail vs. noise?