How I Scaled a PHP Monolith to Handle 1,000 Concurrent Users Without Rewriting It
When I joined Onest Tech as Tech Lead, the flagship ERP was struggling at 80 concurrent users. The team's instinct was to rewrite it as microservices. Instead, I spent three weeks profiling before touching a single line of code.
Step 1: Measure Before You Guess
We ran `EXPLAIN ANALYZE` on every slow query and used New Relic APM to find the real bottlenecks. 80% of latency came from three N+1 query patterns in the inventory module, not the architecture.
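The N+1 shape we kept finding looks like the sketch below: one query for a list, then one query per row inside a loop. This is a minimal, self-contained illustration using an in-memory SQLite database as a stand-in for the real MySQL tables; the table and column names are hypothetical, not the actual ERP schema.

```php
<?php
// Stand-in for the inventory tables (schema is hypothetical).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)');
$db->exec('CREATE TABLE stock (product_id INTEGER, qty INTEGER)');
$db->exec("INSERT INTO products VALUES (1, 'widget'), (2, 'gadget')");
$db->exec('INSERT INTO stock VALUES (1, 10), (2, 5)');

// N+1 pattern: one query for the list, then one query per row.
$products = $db->query('SELECT id, name FROM products ORDER BY id')
               ->fetchAll(PDO::FETCH_ASSOC);
foreach ($products as &$p) {
    $stmt = $db->prepare('SELECT qty FROM stock WHERE product_id = ?'); // runs N times
    $stmt->execute([$p['id']]);
    $p['qty'] = (int) $stmt->fetchColumn();
}
unset($p);

// Fix: a single JOIN returns the same data in one round trip.
$joined = $db->query(
    'SELECT p.id, p.name, s.qty FROM products p
     JOIN stock s ON s.product_id = p.id ORDER BY p.id'
)->fetchAll(PDO::FETCH_ASSOC);
```

With a list of N rows, the loop version issues N+1 queries while the JOIN issues one; at scale that difference dominates latency long before the architecture does.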
Step 2: Redis for the Right Things
We added Redis caching at two layers: a 60-second TTL on read-heavy dashboard queries, and a persistent cache for permission lookups that were hitting the DB on every request. This alone cut DB load by 40%.
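The 60-second TTL layer followed the standard cache-aside pattern. In production the backing store was Redis (via phpredis `get()`/`setEx()`); in this self-contained sketch an in-process array stands in so the pattern itself is visible, and the key and callback names are hypothetical.

```php
<?php
// Cache-aside: return the cached value if fresh, otherwise run the
// producer (the slow DB query), store its result with a TTL, return it.
function cacheAside(array &$cache, string $key, int $ttl, callable $produce)
{
    $now = time();
    if (isset($cache[$key]) && $cache[$key]['expires'] > $now) {
        return $cache[$key]['value'];      // cache hit: the DB is skipped
    }
    $value = $produce();                   // cache miss: run the real query
    $cache[$key] = ['value' => $value, 'expires' => $now + $ttl];
    return $value;
}

$cache   = [];
$dbCalls = 0;
$loadDashboard = function () use (&$dbCalls) {
    $dbCalls++;                            // stands in for the heavy dashboard query
    return ['open_orders' => 42];
};

$a = cacheAside($cache, 'dashboard:tenant1', 60, $loadDashboard); // miss -> DB
$b = cacheAside($cache, 'dashboard:tenant1', 60, $loadDashboard); // hit  -> cached
```

A 60-second window means dashboard data is at most a minute stale, which was acceptable for reporting views while eliminating the vast majority of repeated reads.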
Step 3: Database Connection Pooling
PHP-FPM was opening a new MySQL connection on every request. Routing connections through ProxySQL (roughly the MySQL equivalent of PgBouncer's connection pooling) cut per-request connection overhead from ~12ms to ~0.8ms.
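ProxySQL is configured through its SQL-like admin interface. A minimal setup looks roughly like the fragment below; the hostnames, credentials, and hostgroup IDs are placeholders, not our actual values.

```sql
-- Run against the ProxySQL admin interface (port 6032 by default).
INSERT INTO mysql_servers (hostgroup_id, hostname, port)
VALUES (0, '10.0.0.5', 3306);

INSERT INTO mysql_users (username, password, default_hostgroup)
VALUES ('erp_app', 'secret', 0);

LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL USERS TO RUNTIME;   SAVE MYSQL USERS TO DISK;
```

The application then connects to ProxySQL's client port (6033 by default) instead of MySQL directly; ProxySQL multiplexes the short-lived PHP-FPM connections onto a small pool of persistent backend connections.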
Step 4: Horizontal Scaling with Nginx + Session Offload
Once the DB was no longer the bottleneck, we added a second app server behind Nginx. The blocker was PHP sessions stored on local disk; we moved them to Redis, making the app stateless and safe to load-balance.
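The two pieces of configuration involved are small. The Nginx side is a plain round-robin upstream, and the session move is two `php.ini` settings using the phpredis session handler; the upstream name and IP addresses below are placeholders.

```nginx
# nginx.conf: round-robin across both app servers
upstream erp_app {
    server 10.0.0.11:80;
    server 10.0.0.12:80;
}
server {
    listen 80;
    location / {
        proxy_pass http://erp_app;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

```ini
; php.ini: store sessions in Redis via the phpredis session handler
session.save_handler = redis
session.save_path = "tcp://10.0.0.20:6379"
```

With sessions in Redis, either app server can handle any request, so no sticky sessions are needed and servers can be added or drained freely.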
Result
After six weeks: 1,000+ concurrent users, p95 response time under 400ms, and a zero-downtime migration. The lesson: profiling beats intuition every time. A rewrite would have taken 12 months and introduced new bugs; the optimization took six weeks and made the existing codebase genuinely fast.