With help from our consulting partner, we have been digging into the database and resolving some of the biggest database hogs. This has relieved some of the pressure, but obviously it does not fix the times when traffic simply overwhelms the site. Since this is a big issue, there are many smaller things going on to defeat Goliath.
First, we have put together a specification for a new machine. This is a very straightforward approach: throwing more "iron" at the problem. It's a stopgap measure and certainly not the final solution. However, it accomplishes several goals:
- Creates a fresh install (always good)
- Gives us time to reconfigure the database so it runs faster
- Gives the database extra oomph with more disk space, memory, etc.
The second point is actually the most important. Currently the database is configured in such a way that it causes the disk drives to work harder than they need to. By dedicating disks to certain tasks, like backups, we ultimately speed up the database.
In addition to this, we're still scouring the logs, especially during Server Too Busy times, to tweak the most common queries that take too long. Since the Server Too Busy errors happen when there are too many queries in the queue, speeding up these queries will shorten the queue and help resolve the issue.
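To illustrate the kind of log scouring described above, here is a minimal sketch in Python. It assumes the slow queries have already been pulled out of the logs as (query text, seconds) pairs. The record format, the table names, and the one-second threshold are all made up for the example, and this is not the actual tooling used on the site. It groups queries that differ only in their literal values and ranks each group by the total time it consumed, since those are the ones that do the most to fill the queue.

```python
import re
from collections import defaultdict


def normalize(query):
    """Collapse literal values so queries that differ only in values group together."""
    q = re.sub(r"'[^']*'", "?", query)   # replace quoted string literals
    q = re.sub(r"\b\d+\b", "?", q)       # replace standalone numeric literals
    return re.sub(r"\s+", " ", q).strip().lower()


def rank_slow_queries(records, threshold=1.0):
    """Rank query patterns by total time spent.

    records: iterable of (query_text, seconds) pairs (hypothetical format).
    threshold: queries faster than this many seconds are ignored (assumed value).
    Returns a list of (pattern, count, total_seconds), slowest total first.
    """
    stats = defaultdict(lambda: [0, 0.0])
    for query, seconds in records:
        if seconds < threshold:
            continue
        entry = stats[normalize(query)]
        entry[0] += 1
        entry[1] += seconds
    ranked = [(pattern, count, total) for pattern, (count, total) in stats.items()]
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked


# Hypothetical sample records; the table and column names are invented.
sample = [
    ("SELECT * FROM caches WHERE id = 17", 2.5),
    ("SELECT * FROM caches WHERE id = 42", 3.0),
    ("SELECT name FROM users WHERE name = 'bob'", 1.5),
    ("SELECT 1", 0.01),
]
result = rank_slow_queries(sample)
for pattern, count, total in result:
    print(f"{total:6.2f}s  x{count}  {pattern}")
```

Ranking by total time rather than by the single slowest query is a deliberate choice here: a moderately slow query that runs constantly can back up the queue more than a very slow one that runs once a day.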
We're also starting a documentation stage for Geocaching.com, which will look at the entire site and recommend a new version of the back-end code that helps to fix this for good. This will be a 6-8 month project, so in parallel we'll be working hard to keep the site functioning well on the current system while we create a permanent solution.
Keep on geocaching - and remember - Sundays and Mondays are the slowest times of the week. If you visit Tuesday-Thursday you'll have a better experience logging caches. It isn't the right solution, but it helps if you get aggravated by the slow times.
And as always, thanks everyone for your patience and support. We're working hard on the problem even though it may seem like nothing on the site is changing.