Operations

Everything an admin needs to know to run and maintain an inboxes instance.

Background workers

The backend runs several background workers as goroutines alongside the HTTP server. All workers respect graceful shutdown via context cancellation (SIGINT/SIGTERM triggers a 30-second shutdown window).

Worker	What it does
sync-worker	Processes sync jobs from the Redis queue
email-worker	Processes outbound send, inbound fetch, and sent-email fetch jobs
inbox-poller	Auto-syncs emails by polling the Resend API for environments without webhooks
trash-collector	Purges expired trash items (disabled by default, enable with `TRASH_COLLECTOR_ENABLED=true`)
domain-heartbeat	Periodically checks Resend domain verification status
event-pruner	Removes old WebSocket events
status-recovery	Polls Resend for stale outbound email delivery statuses
grace-period	Transitions expired plans to free tier

Stale job recovery

Two companion goroutines detect jobs stuck in the running state whose last heartbeat is more than 90 seconds old, then either re-enqueue them or mark them permanently failed.

The email stale recovery also recovers orphaned pending jobs: jobs in the pending state whose updated_at is older than 5 minutes.

Per-org rate limiting

Outbound Resend API calls are rate-limited per organization (default: 2 requests/second with a 15% safety margin). The limit applies to email sending, sync operations, status recovery, and inbox polling.

Rate limits are loaded from the orgs table (resend_rps column) and can be adjusted per org.
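
As a worked example of the default, the sketch below assumes the safety margin is applied by scaling the configured rate down, so 2 requests/second with a 15% margin becomes 1.7 requests/second. That interpretation and the function names are assumptions; the backend may apply the margin differently.

```go
package main

import (
	"fmt"
	"time"
)

// effectiveRPS applies a safety margin (0.15 means 15%) to a configured rate.
// Assumption: the margin reduces the rate multiplicatively.
func effectiveRPS(rps, margin float64) float64 {
	return rps * (1 - margin)
}

// minInterval is the spacing between calls that keeps one org at or
// below its effective rate.
func minInterval(rps, margin float64) time.Duration {
	return time.Duration(float64(time.Second) / effectiveRPS(rps, margin))
}

func main() {
	fmt.Printf("effective rate: %.2f req/s\n", effectiveRPS(2, 0.15))
	fmt.Println("min interval:", minInterval(2, 0.15).Round(time.Millisecond))
	// prints:
	// effective rate: 1.70 req/s
	// min interval: 588ms
}
```

Under this reading, raising an org's resend_rps to 10 would give an effective 8.5 requests/second.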

Health checks

GET /api/health

Returns 200 with { "status": "ok" } when all dependencies are healthy. Returns 503 with { "status": "degraded" } if any dependency is down.
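
The response contract above can be sketched as a small handler. Which dependencies the real endpoint probes is not stated here, so the `postgres` and `redis` checks, the `Check` type, and the function names below are assumptions for illustration only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// Check probes one dependency and returns an error if it is down.
type Check func() error

// errDown is a placeholder failure used by the example.
var errDown = errors.New("connection refused")

// evaluate returns the HTTP status code and status string for a set of checks:
// 200/"ok" when every check passes, 503/"degraded" when any check fails.
func evaluate(checks map[string]Check) (int, string) {
	for _, check := range checks {
		if err := check(); err != nil {
			return http.StatusServiceUnavailable, "degraded"
		}
	}
	return http.StatusOK, "ok"
}

// healthHandler serves the result as JSON, e.g. {"status":"ok"}.
func healthHandler(checks map[string]Check) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		code, status := evaluate(checks)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		json.NewEncoder(w).Encode(map[string]string{"status": status})
	}
}

func main() {
	checks := map[string]Check{
		"postgres": func() error { return nil },
		"redis":    func() error { return errDown },
	}
	code, status := evaluate(checks)
	fmt.Println(code, status) // prints: 503 degraded

	_ = healthHandler(checks) // would be mounted at GET /api/health
}
```

Because a degraded instance returns a non-2xx code, a load balancer or uptime monitor can act on the status code alone without parsing the body.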

Schema management

Migrations are managed by goose and run automatically on server startup.

Migration rollback

1. Stop the backend to prevent auto-migration.
2. Back up the database.
3. Check the current migration version: `SELECT * FROM goose_db_version ORDER BY id DESC LIMIT 5;`
4. Roll back the most recent migration: `goose -dir internal/db/migrations postgres "$DATABASE_URL" down`
5. Deploy the matching backend version and restart.

Migrations with DROP TABLE or DROP COLUMN in their down section will cause data loss on rollback. Always test in staging first.
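
One way to spot such migrations before rolling back is to scan the section that follows goose's `-- +goose Down` annotation. The sketch below is a simple line-based heuristic, not a SQL parser: it misses statements where `DROP` and `TABLE` sit on different lines and it does not skip comments. The function name and the sample table and column are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// destructive matches DROP TABLE and DROP COLUMN, case-insensitively.
var destructive = regexp.MustCompile(`(?i)\bDROP\s+(TABLE|COLUMN)\b`)

// destructiveDown returns the lines in the "-- +goose Down" section of a
// migration that contain DROP TABLE or DROP COLUMN.
func destructiveDown(migration string) []string {
	var hits []string
	inDown := false
	for _, line := range strings.Split(migration, "\n") {
		trimmed := strings.TrimSpace(line)
		switch {
		case strings.HasPrefix(trimmed, "-- +goose Down"):
			inDown = true
		case strings.HasPrefix(trimmed, "-- +goose Up"):
			inDown = false
		case inDown && destructive.MatchString(trimmed):
			hits = append(hits, trimmed)
		}
	}
	return hits
}

// sample is an illustrative migration; the table and column are made up.
const sample = `-- +goose Up
ALTER TABLE emails ADD COLUMN snippet TEXT;

-- +goose Down
ALTER TABLE emails DROP COLUMN snippet;
`

func main() {
	for _, hit := range destructiveDown(sample) {
		fmt.Println("data loss on rollback:", hit)
	}
}
```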

Backup & restore

What to back up:

PostgreSQL database — contains all emails, users, settings, encrypted API keys
ENCRYPTION_KEY value — without it, stored Resend API keys are unrecoverable

Redis data does not need to be backed up — rate limit counters and job queues are self-healing.

Backup commands

# Docker
docker exec $(docker ps -qf name=postgres) \
  pg_dump -U inboxes -Fc inboxes > inboxes_$(date +%Y%m%d).dump

# Bare metal
pg_dump -Fc inboxes > inboxes_$(date +%Y%m%d).dump

Restore commands

# Docker
cat backup.dump | docker exec -i $(docker ps -qf name=postgres) \
  pg_restore -U inboxes -d inboxes --clean --if-exists

# Bare metal
pg_restore -d inboxes --clean --if-exists backup.dump

Automated backups

# Daily backup at 3 AM, keep 7 days
0 3 * * * pg_dump -Fc inboxes > /backups/inboxes_$(date +\%Y\%m\%d).dump && find /backups -name "inboxes_*.dump" -mtime +7 -delete

Monitoring & logging

All backend logs use Go's slog package with structured key-value pairs. Worker logs are prefixed with the worker name.

Key log patterns to monitor

Pattern	Meaning
`domain heartbeat: API key invalid`	Org's Resend API key was revoked
`domain heartbeat: DNS verification degraded`	SPF or DKIM records lost verification
`email worker: job permanently failed`	Email send exhausted retries
`inbox poller: failed to fetch`	Polling failed for an org
`grace period worker: transitioned org to free`	Subscription grace period expired
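
For alerting, the patterns in this table can be matched as plain substrings of each log line. The sketch below is one hypothetical way to do that; the `classify` function is not part of the backend, and the sample line only approximates slog's text output (the `job_id` key is invented for the example).

```go
package main

import (
	"fmt"
	"strings"
)

// alerts pairs each log pattern from the table above with its meaning.
var alerts = []struct{ Pattern, Meaning string }{
	{"domain heartbeat: API key invalid", "Org's Resend API key was revoked"},
	{"domain heartbeat: DNS verification degraded", "SPF or DKIM records lost verification"},
	{"email worker: job permanently failed", "Email send exhausted retries"},
	{"inbox poller: failed to fetch", "Polling failed for an org"},
	{"grace period worker: transitioned org to free", "Subscription grace period expired"},
}

// classify returns the meaning of the first known pattern found in line.
func classify(line string) (string, bool) {
	for _, a := range alerts {
		if strings.Contains(line, a.Pattern) {
			return a.Meaning, true
		}
	}
	return "", false
}

func main() {
	// Illustrative log line; the attributes are assumptions.
	line := `level=ERROR msg="email worker: job permanently failed" job_id=42`
	if meaning, ok := classify(line); ok {
		fmt.Println("alert:", meaning) // prints: alert: Email send exhausted retries
	}
}
```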

Encryption key management

Stored Resend API keys and system settings are encrypted with AES-256-GCM using the ENCRYPTION_KEY environment variable. Each encrypted value has its own IV and authentication tag.

If you lose the ENCRYPTION_KEY:

All stored Resend API keys become unrecoverable
Users must re-enter their keys via onboarding or settings
No email data is lost (emails are stored in plaintext)
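
To show why a lost key cannot be worked around, here is a minimal AES-256-GCM sketch using Go's standard library. It assumes a raw 32-byte key and a `nonce || ciphertext || tag` layout; how the backend actually decodes ENCRYPTION_KEY and stores the IV and tag is not documented here, so treat those details as assumptions.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"errors"
	"fmt"
)

// newGCM builds an AES-256-GCM AEAD from a 32-byte key.
func newGCM(key []byte) (cipher.AEAD, error) {
	if len(key) != 32 {
		return nil, errors.New("key must be 32 bytes for AES-256")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	return cipher.NewGCM(block)
}

// encrypt seals plaintext under a fresh random nonce and returns
// nonce || ciphertext || tag (layout is an assumption of this sketch).
func encrypt(key, plaintext []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	// Seal appends the ciphertext and tag to the nonce prefix.
	return gcm.Seal(nonce, nonce, plaintext, nil), nil
}

// decrypt splits off the nonce and verifies the tag while decrypting.
func decrypt(key, blob []byte) ([]byte, error) {
	gcm, err := newGCM(key)
	if err != nil {
		return nil, err
	}
	if len(blob) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, sealed := blob[:gcm.NonceSize()], blob[gcm.NonceSize():]
	return gcm.Open(nil, nonce, sealed, nil)
}

func main() {
	key := make([]byte, 32)
	wrongKey := make([]byte, 32)
	rand.Read(key)
	rand.Read(wrongKey)

	blob, _ := encrypt(key, []byte("re_example_api_key")) // placeholder value
	plain, err := decrypt(key, blob)
	fmt.Println(string(plain), err)

	_, err = decrypt(wrongKey, blob)
	fmt.Println("wrong key:", err) // authentication fails, nothing is recovered
}
```

Because GCM authenticates as well as encrypts, decrypting with any other key fails outright instead of returning garbage, which is why affected users must re-enter their keys.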

Troubleshooting

Problem	Solution
Emails stuck in “received” status	Status recovery polls every 5 min automatically. Check logs for errors and verify the org's API key.
Sync jobs stuck in “running”	Stale recovery detects and re-enqueues automatically. Verify Redis is reachable.
Domains marked disconnected unexpectedly	The domain heartbeat marks a domain disconnected when it is missing from Resend or the org's API key returns 403. Domains self-heal if they reappear.

Graceful shutdown

On SIGINT or SIGTERM:

1. The HTTP server stops accepting new connections.
2. In-flight requests have 30 seconds to complete.
3. Context cancellation propagates to all workers.
4. Workers finish their current operation and exit.
5. Database and Redis connections are closed.