Wednesday, May 16, 2007

WRITELOG waits

My team has been assisting another team off and on for several months with performance tuning their overnight DB processes. There are many possible bottlenecks in their process, but since the app writing into the DB is highly complex and multi-tiered, we're trying to focus on the internal DB issues first.

The first big win was dropping triggers from the tables being written to, which resulted in a 20% reduction in overall job time. Now the biggest WAITTYPE during the process is WRITELOG, with about 50-60% of the overall time in both QA and Prod. NETWORKIO is next, and I'd really like to SET NOCOUNT ON, since these writes are created as individual transactions and dropping the count should reduce the NETWORKIO substantially, but that's been problematic so far.

Digging into the WRITELOG waits, the disk queues are pretty low (0-5 range going against 80 or so physical disks on a big Hitachi SAN with a ton of cache), so the storage guys have dismissed the disk as the problem. However, I think it's still the disk, just not in the usual way.

The standard problem is that the disk subsystem runs out of write cache and has to start de-staging data to disk, and the physical disks can't write data as fast as it's coming in. That's not the case here, because the cache on the SAN is huge (70+ GB, according to the SAN guys), it can accept data faster than the logs are created, and it hasn't gotten close to the high watermark, when it starts de-staging. So what's the problem?

Well, in this case, the disks attached to the SQL Server have been attached as a single logical drive, so all the IOs get queued by Windows against that drive. Therein lies the problem, I believe. The SAN is fast, but every operation goes against that same queue, so one slow read for a piece of data not in the cache can slow down all the writelog IOs queued up behind it. This is what I believe is creating the WRITELOG waits.

The proposed solution is simple: add another logical drive dedicated to logs. Even if it's hitting the same physical disks, the addition of another disk queue should allow the log IOs to flow through without being blocked by slower data IOs or data reads. Stay tuned to see if it works.

Addendum: We tried this before at Capital IQ, splitting up sets of disks into multiple metaLUNs and comparing them against the same number of disks in one metaLUN and showed barely any performance gain (within the margin of measurement error). However, the difference between that situation and this one was the amount of cache involved - we were always past cache on the CapIQ SAN, so the disks were the bottleneck. Here I suspect it's the queue.

1 comment:

Bobby Fu said...

Ira,
I am having the same symptum - high %WRITELOG and very low disk queue length - as you have described in your blob.

What's your outcome by adding another logical drive? Thanks.