Friday, March 30, 2012

Another hazard of Laptopistan - slow WiFi

This is hardly unexpected, but one of the side effects of everyone using Starbucks as a de facto workplace and internet cafe is that their WiFi is heavily stressed. People watching online video, using Remote Desktop (like me), probably even BitTorrent. I don't know what sort of routers they put in these coffee shops, but they often seem to be unequal to the load they're handling.

This morning I'm in a Starbucks, and my connection went from perfectly fine, at 8:30 AM before anyone was here, to barely usable now that I'm surrounded by laptops. One more reason for actual workspace.

Friday, March 16, 2012

Sliding Window partitioning

Implemented my first "sliding window" partitioned table. Wound up being easier than I expected, especially since the table will only have 2 partitions for now - Current and Archive.

Most sliding window schemes I've seen out there on the intertubes make use of a date as the partition key and update the sliding window based on the current date. In this case, the data comes in somewhat sporadically, so although it's date-based (a month key), I made the partitioning data-driven instead of date-driven. So for starters we have a RANGE LEFT scheme like this:

Partition Month Key
Current -
Archive 88

Say we're currently on month 90. So months 89 and 90 are in Current, and 1-88 are in Archive. When the sliding window proc runs, it does essentially this:

SELECT @NewBoundary = MAX(monthKey)-2 FROM tbPartitionedFoo
SELECT @CurBoundary = CAST(prv.value as smallint) FROM sys.partition_functions AS pf
JOIN sys.partition_range_values as prv
ON prv.function_id = pf.function_id
WHERE pf.name = 'partfn_foo'

(If there was more than one partition range value, a MIN() or something would be needed to grab the appropriate value.)

IF @NewBoundary > @CurBoundary
BEGIN
ALTER PARTITION FUNCTION partfn_foo() SPLIT RANGE (@NewBoundary)
ALTER PARTITION FUNCTION partfn_foo() MERGE RANGE (@CurBoundary)
END

So when we start getting monthKey 91 data and the maintenance proc runs, it will do this:

IF 89 > 88
BEGIN
ALTER PARTITION FUNCTION partfn_foo() SPLIT RANGE (89)

Now we have 3 partitions:

Partition Month Key
Current -
temp 89
Archive 88

ALTER PARTITION FUNCTION partfn_foo() MERGE RANGE (88)
END

And now we're back to 2:

Partition Month Key
Current -
Archive 89

Monday, March 5, 2012

Python Gets Things Done

A quick example of why I've grown to love Python.

My girlfriend had a laboriously scanned document in two pieces that needed to be combined. I had a .NET command line utility that I wrote a couple years ago that does just that, but because one of the scanned pages was a different size, it got cut off in the resulting output file. So I could either try to dig up some docs on the library I used, play around with Intellisense in Visual Studio, or see if there was a Python library that might work.

2 minutes of Googling later I had pyPDF, and maybe 10 minutes after that, I had reset the mediaBox on the outsized page, written it to a new pdf file, and emailed it off. Q.E.D.

DATEs aren't INTs, at least not any more

I updated a database column today to utilize the SQL 2008 DATE type, upgrading from a SMALLDATETIME which always had a 00:00:00 hh:mm:ss component. I was fairly certain that this would not break any existing code, so I did not perform an exhaustive code search. My mistake.

A function broke with the following message:

Msg 206, Level 16, State 2, Line 1
Operand type clash: date is incompatible with int

"Strange," I thought. "How would an int be getting used as a date?"

Of course there's the basic method of datetime construction from a string: CAST("2012-01-01" as datetime), but that doesn't involve any INTs. Did someone try to use a similar INT (20120101) in place of the string? No, because that would cause an arithmetic overflow...

Well, it turns out that there's another way of using INTs as DATETIMEs:

select cast(CAST('2012-01-10' as datetime) as int)

Apparently this doesn't work any more:

select cast(CAST('2012-01-10' as date) as int)

And this is what was being performed in the function in question:

...
CASE WHEN io_end > 40947 THEN 1 ELSE 0 END
...

Which promptly broke when I changed the column to a DATE. So let this be a warning: even something as innocuous as a DATETIME to DATE change needs QA and code search.