> For now we've found that restarting the httpd and mysqld services brings things back to normal almost immediately.
You need to examine what the restart actually does and work out why it resolves the issue. If it helps because it clears out dead or runaway processes and leaked memory, you need to find the cause of those and correct it. If it helps because the restart unceremoniously drops all the current transactions, you need to increase capacity.
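One rough way to tell the two cases apart is to record how many httpd and mysqld processes exist, and how much resident memory they hold, just before and just after a restart. The sketch below is only an illustration under some assumptions: a Linux box where `ps -eo pid,rss,comm` reports RSS in KiB, and process names of exactly `httpd` and `mysqld`. The function names `rss_by_command` and `snapshot` are made up for this example.

```python
import subprocess
from collections import defaultdict


def rss_by_command(ps_output, names=("httpd", "mysqld")):
    """Count processes and sum resident memory (KiB) per command name.

    Expects text in the shape produced by `ps -eo pid,rss,comm`.
    """
    totals = defaultdict(lambda: {"procs": 0, "rss_kib": 0})
    for line in ps_output.strip().splitlines():
        parts = line.split(None, 2)
        # Skip the header row and anything that is not "pid rss command".
        if len(parts) < 3 or not parts[1].isdigit():
            continue
        _pid, rss, comm = parts
        comm = comm.strip()
        if comm in names:
            totals[comm]["procs"] += 1
            totals[comm]["rss_kib"] += int(rss)
    return dict(totals)


def snapshot():
    """Take a live snapshot (assumes a Linux-style `ps` is on the PATH)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return rss_by_command(out)
```

Call `snapshot()` shortly before the restart and again a few minutes after. If the process count or memory total is far higher before the restart than it is under comparable load afterwards, that points toward leaks or stuck processes rather than a plain capacity shortfall.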
> Can anyone think of any downside to this (used at least as a temporary measure)?
I certainly can -- a bunch of really irritated visitors whose transactions are abandoned. But that's only a downside if dropped transactions are actually what's going on. Make sure you don't have software issues that are preventing efficient operation. If you don't, you need to grow with your customer base -- increase server capacity.
> Like I said in the OP, we've identified MySQL as the primary bottleneck and are already working on resolving this.
Ah, yes. I remember from your prior post that you have very large databases and tables and are considering (or have begun) partitioning the largest tables. It turns out there is a native partitioning scheme built into the most recent MySQL versions, but it has to be compiled into the running binary by way of a compiler flag:
Yes, we did investigate MySQL's internal partitioning option briefly before deciding to roll our own scheme (which, after working quite well initially, is now beginning to create problems of its own).
Perhaps it is time for us to revisit this. Thanks again, Paul.