March 16, 2026 · 12 min read · performance.qa

The Pre-Launch Performance Checklist: 40 Items to Verify Before Going Live

Comprehensive pre-launch performance checklist covering infrastructure, application, frontend, and testing - 40 items to verify before your next release.


Launch day performance problems are not surprises - they are predictable failures that were present before launch but went undetected. A slow database query that is unnoticeable at 10 requests per minute becomes catastrophic at 10,000 requests per minute. An endpoint that works fine uncached in staging will collapse in production when every user hits it cold simultaneously.

This pre-launch performance checklist covers 40 items across four categories: infrastructure, application, frontend, and testing. Work through each section with your engineering team before any significant launch - a new product, a major feature release, a marketing campaign, or an event that will significantly increase traffic.

Why Launch Problems Are Predictable

Three patterns cause the majority of launch-day performance failures:

Pattern 1: Testing at wrong scale. Staging environments typically run with 10% of production capacity and sample data volumes. A query that takes 2ms with 10,000 rows takes 8 seconds with 10 million rows. Testing at staging scale gives false confidence.

Pattern 2: Cold-start blindness. Staging environments stay warm between tests. Production services start cold on launch day: empty caches, code not yet JIT-compiled, unwarmed connection pools. The first 5-10 minutes of a launch are the hardest on performance.

Pattern 3: Integration dependencies. Your service may be well-optimized, but the third-party payment processor, email service, or CDN origin may become a bottleneck under real launch traffic. These are invisible until launch.

Infrastructure Checklist (10 Items)

1. Auto-scaling is configured and tested

Your compute layer should automatically add capacity when load increases. Verify the auto-scaling policy is enabled, the scale-out threshold is appropriate (typically 60-70% CPU or request rate), the scale-in cooldown is configured, and the maximum instance count is high enough to handle your expected peak.

Test it: Run a load test that triggers scale-out and verify new instances become healthy and serve traffic within 3 minutes.

2. Load balancer health checks are tuned

Health check intervals and thresholds determine how quickly unhealthy instances are removed from rotation. Default health check intervals (30 seconds, 2 failures before removal) mean a failed instance serves traffic for up to 60 seconds before removal.

Verify: Health check interval is 10-15 seconds, failure threshold is 2-3 checks, and the health endpoint is lightweight (no database queries).
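The arithmetic behind these thresholds is worth making explicit. A quick sketch, using the default and tuned values above:

```python
# Worst-case window during which an unhealthy instance still receives
# traffic: the instance must fail `failure_threshold` consecutive checks,
# spaced `interval_s` apart, before the load balancer removes it.
def worst_case_detection_seconds(interval_s: int, failure_threshold: int) -> int:
    return interval_s * failure_threshold

print(worst_case_detection_seconds(30, 2))  # 60 - the default window above
print(worst_case_detection_seconds(10, 3))  # 30 - tuned values
```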

3. Database connection pool is sized correctly

Every application instance needs connections to the database. With 10 instances each requesting 20 connections, you need 200 database connections plus overhead. Verify your PostgreSQL max_connections is high enough and that PgBouncer or an equivalent connection pooler is in front of it.

Check: SHOW max_connections; in PostgreSQL. Monitor connection count in staging at peak simulated load.
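The budget check is simple enough to script. This sketch uses the example numbers above; the overhead figure is an assumption covering migrations, cron jobs, and admin sessions:

```python
# Connection-budget check: total app-side demand versus the server limit.
def required_db_connections(instances: int, pool_size: int, overhead: int = 20) -> int:
    return instances * pool_size + overhead

needed = required_db_connections(instances=10, pool_size=20)
print(needed)  # 220

max_connections = 200  # example value - read yours from SHOW max_connections;
if needed > max_connections:
    print("demand exceeds max_connections: add PgBouncer or raise the limit")
```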

4. Database read replicas are routing correctly

If you have read replicas, verify that read queries are actually routing to them. A misconfigured read/write split sends all queries to the primary, eliminating the benefit of replicas and potentially overwhelming the primary.

Check: Monitor query counts per instance. Read replicas should be handling the majority of SELECT traffic.

5. Redis/caching layer is deployed and accessible

Verify the caching layer is deployed, accessible from application instances, and that cache TTLs are configured appropriately. A missing cache configuration that defaults to “no caching” is a common launch failure.

Check: Run a cache hit rate query against Redis (INFO stats > keyspace_hits / (keyspace_hits + keyspace_misses)). Expect > 80% hit rate for hot data.
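The hit-rate formula is easy to wire into a pre-launch script. The keyspace_hits and keyspace_misses fields are real Redis INFO counters; the numbers below are illustrative:

```python
# Cache hit rate from Redis INFO stats counters.
def cache_hit_rate(keyspace_hits: int, keyspace_misses: int) -> float:
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0

rate = cache_hit_rate(keyspace_hits=940_000, keyspace_misses=60_000)
print(f"{rate:.1%}")  # 94.0%
assert rate > 0.80, "hit rate below the 80% target for hot data"
```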

6. CDN is configured and caching correctly

Static assets (CSS, JavaScript, images) should be served from a CDN, not your origin servers. Verify cache-control headers are set correctly: long TTLs for versioned assets, short TTLs for HTML.

Check: Curl a static asset and verify X-Cache: Hit header. Check cache-control headers match your intent.

7. DNS TTLs are set appropriately

High DNS TTLs mean failover is slow during an incident. Low TTLs mean more DNS lookups. Before launch, lower TTLs to 60-300 seconds to enable faster DNS changes if needed. After launch stabilizes, you can increase them.

Check: dig yourdomain.com +noall +answer and verify TTL values.

8. Logging and monitoring are operational

Production monitoring must be ready before launch, not configured reactively during an incident. Verify dashboards are loading, metrics are flowing, and alerts are configured.

Check: Trigger a test error and verify it appears in your error tracking tool within 60 seconds. Verify at least one person’s phone is configured to receive on-call alerts.

9. Database indexes match production query patterns

Indexes in staging may not match production query patterns. Verify that the queries your application will run under real load have appropriate indexes.

Check: Run EXPLAIN ANALYZE on your 10 most common query patterns. No sequential scans on tables larger than 10,000 rows.

10. Network security groups allow necessary traffic

Firewall rules that work in staging can fail in production if VPCs or security groups differ. Verify application instances can reach the database, cache, and external services.

Check: Test connectivity from application instances to each dependency before go-live.

Application Checklist (10 Items)

11. Error handling returns appropriate HTTP status codes

Application errors should return proper 5xx status codes so load balancers, health checks, and monitoring detect them. Applications that swallow exceptions and return 200 with an error body are invisible to infrastructure-level monitoring.

Check: Test your error paths. Simulate a database unavailability and verify the application returns 503, not 200.

12. Timeouts are configured for all external calls

Every call to a database, cache, external API, or internal service needs a timeout. Without timeouts, slow dependencies cause thread/goroutine exhaustion, cascading the failure system-wide.

Check: Search your codebase for HTTP client, database client, and cache client configurations. Every client should have a connection timeout and a read timeout.
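For clients that do expose timeout settings, configure them directly. For blocking calls that do not, a generic guard can cap the wait. This is a sketch only, and the names here are illustrative:

```python
import concurrent.futures
import time

# Generic timeout guard for blocking calls whose client library lacks a
# native timeout option. Prefer the client's own connect/read timeout
# settings where they exist.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def call_with_timeout(fn, timeout_s, *args, **kwargs):
    future = _pool.submit(fn, *args, **kwargs)
    return future.result(timeout=timeout_s)  # raises TimeoutError when exceeded

try:
    call_with_timeout(time.sleep, 0.05, 0.5)  # a 500ms "dependency" capped at 50ms
except concurrent.futures.TimeoutError:
    print("dependency call timed out")
```

Note that the underlying thread keeps running after the timeout, so this protects the caller, not the worker pool - another reason native client timeouts are preferable.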

13. Circuit breakers are implemented for critical dependencies

Circuit breakers prevent cascading failures by fast-failing requests when a dependency is unhealthy. Without them, a slow payment gateway can hold all your application threads waiting for responses.

Verify: For each critical external dependency, there is a circuit breaker that opens after a configurable failure threshold and closes after the dependency recovers.
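Libraries exist for this in most ecosystems, but the mechanism fits in a few lines. A minimal sketch, not tied to any particular stack:

```python
import time

# Minimal circuit breaker: opens after `failure_threshold` consecutive
# failures, fast-fails while open, and allows a probe request after
# `reset_timeout_s` has elapsed.
class CircuitBreaker:
    def __init__(self, failure_threshold: int = 5, reset_timeout_s: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one probe through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit
        return result
```

While the circuit is open, callers get an immediate error instead of holding a thread for the full dependency timeout.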

14. Retry logic includes exponential backoff

Retries with fixed intervals can amplify load on a struggling dependency (the “thundering herd” problem). Retries must use exponential backoff with jitter.

Check: Find every retry configuration in your codebase. Verify: initial retry delay >= 100ms, exponential backoff factor >= 2, jitter applied, maximum retry count <= 3.
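A schedule meeting those thresholds can be sketched directly. This uses the "full jitter" strategy, where each delay is drawn uniformly below an exponentially growing ceiling:

```python
import random

# Backoff schedule matching the thresholds above: >= 100ms initial delay,
# factor >= 2, full jitter, at most 3 retries.
def backoff_delays(base_s: float = 0.1, factor: float = 2.0, max_retries: int = 3):
    delays = []
    for attempt in range(max_retries):
        ceiling = base_s * (factor ** attempt)  # 0.1s, 0.2s, 0.4s ceilings
        delays.append(random.uniform(0, ceiling))
    return delays

print(backoff_delays())  # three randomized delays, each below its ceiling
```

The jitter is the important part: it spreads retries from many clients across time instead of synchronizing them into waves.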

15. Background jobs have concurrency limits

Background job processors without concurrency limits will spin up unlimited workers when a queue backs up, consuming all available database connections and memory.

Check: Every job queue consumer has a configured maximum concurrency that you have tested under load.

16. File uploads have size limits and validation

Unrestricted file uploads allow users to upload files that exhaust disk space or memory. Verify file size limits are enforced at the network layer (nginx/load balancer), not just the application layer.

Check: Test uploading a 1GB file. It should be rejected quickly, not after consuming server resources.

17. Rate limiting is implemented on public endpoints

Without rate limiting, a single aggressive client can exhaust your application capacity. Verify rate limiting is implemented on authentication endpoints, API endpoints, and any endpoint with external access.

Check: Send 1000 requests to your public API in 10 seconds from a single IP. Verify 429 responses before server resources are saturated.
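The usual mechanism behind that 429 is a token bucket: steady refill with a bounded burst. A per-client sketch (in production this state typically lives in Redis so all instances share it):

```python
import time

# Token-bucket limiter: `rate` tokens per second, burst up to `capacity`.
class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # caller should respond 429 Too Many Requests

bucket = TokenBucket(rate=10, capacity=20)  # e.g. per client IP
results = [bucket.allow() for _ in range(25)]
print(results.count(False))  # requests beyond the burst capacity are rejected
```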

18. Database queries are parameterized (no string interpolation)

SQL injection aside, string-interpolated queries force the database to parse and plan each statement from scratch. Parameterized queries enable prepared-statement plan reuse in PostgreSQL, reducing CPU overhead significantly at scale.

Check: Review your ORM and raw SQL usage. No string formatting in SQL queries.

19. Memory limits are set on application containers

Containers without memory limits can consume all available node memory, triggering OOM kills on other containers or the entire node. Set both requests and limits in Kubernetes.

Check: kubectl get pods -o yaml | grep -A 3 resources - verify both requests and limits are set for all containers.

20. Graceful shutdown is implemented

When rolling deployments terminate old pods, those pods should complete in-flight requests before shutting down. Without graceful shutdown, rolling deployments cause visible error spikes.

Verify: Deploy a new version while sending traffic. Verify zero 5xx errors during the rollout.
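The shutdown sequence can be sketched in a few lines; the names here are illustrative. On SIGTERM, fail the readiness probe so the load balancer stops routing to the pod, then drain in-flight work before exiting:

```python
import signal
import time

shutting_down = False
in_flight = 0  # incremented/decremented by the request handler

def handle_sigterm(signum, frame):
    global shutting_down
    shutting_down = True  # readiness endpoint should now return 503

signal.signal(signal.SIGTERM, handle_sigterm)

def ready() -> bool:
    return not shutting_down

def drain(timeout_s: float = 30.0) -> None:
    # Wait for in-flight requests to complete, up to a hard deadline.
    deadline = time.monotonic() + timeout_s
    while in_flight > 0 and time.monotonic() < deadline:
        time.sleep(0.1)
```

In Kubernetes, pair this with a terminationGracePeriodSeconds longer than the drain timeout so the pod is not force-killed mid-drain.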

Frontend Checklist (10 Items)

21. JavaScript bundles are code-split

Loading your entire JavaScript bundle on every page increases load time for pages that only use a fraction of the code. Verify route-based code splitting is implemented so each page loads only the JavaScript it needs.

Check: Open the Network tab, navigate between pages, and verify that different JavaScript chunks load for different routes.

22. Images are in modern formats (WebP or AVIF)

JPEG and PNG images are typically 25-50% larger than equivalent WebP images. Verify all images are served in WebP or AVIF format with appropriate fallbacks.

Check: Inspect image requests in the Network tab. Responses should have Content-Type: image/webp or image/avif, not image/jpeg.

23. Images have explicit width and height attributes

Missing dimensions cause layout shift (CLS) as the browser cannot reserve space before the image loads.

Check: document.querySelectorAll('img:not([width])').length in the browser console should be 0.

24. Critical CSS is inlined

Above-the-fold content requires CSS to render. If that CSS is in an external stylesheet, the browser must fetch and parse the stylesheet before rendering. Inlining critical CSS enables immediate rendering.

Check: View the page source. The styles needed for the visible content should appear in a <style> tag in <head>.

25. Third-party scripts are loaded asynchronously

Analytics, chat widgets, and marketing scripts loaded synchronously in <head> block rendering. Load them with async or defer, or use a tag manager that loads scripts after page load.

Check: View page source. No <script src="..."> without async or defer attributes in <head>.

26. Fonts are preloaded and use font-display: swap

Web fonts that block text rendering delay LCP and cause layout shift. Preload critical fonts and use font-display: swap or font-display: optional.

Check: View page source. Critical fonts should have <link rel="preload" as="font"> in <head>.

27. The LCP image is preloaded with high priority

The Largest Contentful Paint image should load as early as possible. Add fetchpriority="high" and <link rel="preload"> for the LCP image.

Check: Run Lighthouse. The “Preload Largest Contentful Paint image” audit should pass.

28. Cache-control headers are set correctly

Static assets (versioned JS, CSS, images) should be immutably cached for a long time. HTML should not be cached (or cached briefly). Verify cache-control headers are correct.

Check: curl -I https://yourdomain.com/app.js should show Cache-Control: public, max-age=31536000, immutable.

29. Error pages are served from CDN

If your origin is down, users see a CDN-served error page rather than a timeout. Verify your CDN is configured with custom error pages for 502 and 503 responses.

Check: Stop your origin server and verify a branded error page is returned, not a generic CDN error.

30. Content Security Policy is configured without performance issues

A permissive CSP that allows many external domains invites resources from many origins, each adding DNS lookup and connection overhead. Verify your CSP is tight and does not include unnecessary external domains.

Check: curl -I https://yourdomain.com | grep -i content-security-policy. Verify the policy does not contain wildcard origins (*).

Testing Checklist (10 Items)

31. Load test at 2x expected peak traffic

Run a load test that targets 200% of your expected peak traffic. You want headroom above your expected maximum so you are not operating at capacity limits during your peak.

Required: A load test report showing successful operation at 2x peak, with p99 latency within acceptable bounds.
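When reading that report, make sure you know how the percentiles were computed. A sketch of the nearest-rank method on illustrative latency samples:

```python
# Nearest-rank percentile over load-test latency samples (illustrative data).
def percentile(samples, p):
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100 * len(ordered))) - 1)
    return ordered[rank]

latencies_ms = [12, 14, 15, 15, 16, 18, 20, 22, 25, 480]  # one slow outlier
print(percentile(latencies_ms, 50))  # 16
print(percentile(latencies_ms, 99))  # 480
```

Note how a single outlier dominates p99 while leaving p50 untouched - which is exactly why the acceptance criterion is stated in terms of p99, not the median.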

32. Soak test at expected peak for 1 hour

Short load tests do not reveal memory leaks, connection pool exhaustion under sustained load, or disk space issues from log accumulation. Run a 1-hour soak test at peak load.

Required: 1-hour soak test with stable memory usage (no steady upward trend), stable latency, and zero OOM events.

33. Database failover has been tested

If you are running with a database replica for failover, verify that failover actually works and your application reconnects. Many teams discover their auto-failover is broken only during an actual outage.

Required: Trigger a database failover in staging or pre-production and verify application reconnects within 60 seconds.

34. Cache eviction behavior has been tested

Simulate a cache restart or flush and verify the application handles cache misses gracefully without overwhelming the database.

Required: Flush the Redis cache and observe application behavior. Latency spike should be brief (under 30 seconds) and self-resolving.

35. Third-party dependency timeout behavior is tested

Simulate your third-party dependencies becoming slow or unavailable and verify your application handles it gracefully.

Required: Test with: payment processor unavailable (checkout should degrade gracefully), email service unavailable (email queuing should work), analytics service unavailable (should not affect core functionality).

36. Deployment rollout is tested for zero-downtime

Rolling deployments should produce zero 5xx errors for end users. Test a rolling deployment while sending continuous traffic.

Required: Zero 5xx errors during a rolling deployment. Latency blip under 100ms during rollout.

37. Alerting fires at the right thresholds

Alerts should fire before users are significantly impacted. Test your alerting by simulating conditions that should trigger each alert.

Required: List of all production alerts and their tested behavior. No alert should require manual investigation to interpret.

38. Runbooks exist for the top 5 failure scenarios

When an incident happens during launch, engineers are under time pressure. Runbooks for the most common failure scenarios accelerate response significantly.

Required: Written runbooks for: database unavailable, cache unavailable, application OOM, traffic spike (scaling), dependency timeout.

39. Rollback procedure is documented and tested

You need to be able to roll back a problematic deployment in under 5 minutes. Test the rollback procedure before launch day.

Required: Documented rollback procedure. Tested in staging. Rollback time under 5 minutes.

40. Performance baseline is documented

Document your current performance metrics before launch so you can compare during and after the launch. Without a baseline, you cannot tell whether post-launch performance is acceptable.

Required: Documented baseline for: p50/p95/p99 latency, error rate, throughput, CPU and memory utilization, database query latency.

Working through this checklist before every major launch reduces launch-day incidents significantly. If you do not have time to address every item, prioritize the infrastructure and testing sections - they catch the highest-severity problems.

Need help running this checklist or executing the required load tests? Our pre-launch performance review covers all 40 items with a two-day engagement and delivers a go/no-go recommendation with specific remediation steps for any items that fail.
