Thursday, March 27, 2008

Returning rows from a stored procedure with pyodbc

I'll start with the executive summary for those of you short on time or who don't care about the background:

When a SQL Server stored procedure executes multiple statements, including a SELECT whose rows you expect to fetch via pyodbc, it is necessary to SET NOCOUNT ON.

Backstory:

I ran into a new issue with pyodbc yesterday - an inability to retrieve rows returned by a SQL Server stored procedure. The proc in question executed a number of queries, logged its progress as rows in a table variable, and then returned the log via a select against the table variable.

However, when I executed the proc from a pyodbc connection, no rows came back.

This was a fairly complex proc that caused changes in the database when executed and I didn't want to run it repeatedly, so I mocked up "spTest" as follows:

create proc spTest as
select 'a' as msg
union all select 'b'
go

When executed from Python:

db = pyodbc.connect(CNXNSTRING, True)  # True enables autocommit (see the INSERT post below)
rows = db.execute("exec spTest")
for row in rows:
    print row

This returned rows. So I added a table variable:

create proc spTest2 as
declare @msgs table (msg varchar(255))

insert into @msgs
select 'a' as msg
union all select 'b'

select * from @msgs
go

This, being more similar to the actual proc, did not work. I changed a few things, tried a temp table instead of a table variable, etc., but didn't get any better results. Then my esteemed colleague Pijus asked if it was possible that pyodbc was reading some other channel of information from SQL that was obscuring the results. Aha!

SQL Server emits a rows-affected count for every statement by default. I'm not sure exactly where these turn up in the ODBC world, but I know they do get passed to the ODBC driver as additional results, and that extra channel of information is what was confusing pyodbc. So I SET NOCOUNT ON, and voila, the final select returns results regardless of how many other operations occur within the procedure.
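
Here is the mock proc again with the fix applied (spTest3 is just a name for this sketch; the only new line is SET NOCOUNT ON):

create proc spTest3 as
set nocount on

declare @msgs table (msg varchar(255))

insert into @msgs
select 'a' as msg
union all select 'b'

select * from @msgs
go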

Wednesday, March 12, 2008

Case Sensitive GROUP BY

This is fairly trivial, but it took a few minutes to work out, so I figured I'd post it. Someone on my team needed to perform a case-sensitive GROUP BY on a table with a case-insensitive collation. I had performed case-sensitive SELECTs before by "casting" the field in question to an alternate collation, so I tried that, but couldn't get it to work at first. Of course, had I thought about it, I would have realized that the same COLLATE clause needs to be applied in both the GROUP BY and the SELECT.

Ex:

create table foo
(bar varchar(20))

insert into foo select 'Banana'
insert into foo select 'banana'
insert into foo select 'Banana'

select * from foo

select
bar COLLATE SQL_Latin1_General_CP1_CS_AS
, count(*)
from foo
group by bar COLLATE SQL_Latin1_General_CP1_CS_AS

Alternatively, one could use Microsoft's recommended method and convert the field to binary and then back. I haven't performance-tested the two methods, so I'm not sure which is faster. I leave that as an exercise for the reader. ;)

select convert(varchar(20),x.bar), x.ct
from
(select
convert(binary(20), bar) as bar
, count(*) as ct
from foo
group by convert(binary(20),bar)
) as x

Monday, March 10, 2008

INSERT problems via pyodbc

This is really stupid, but I just wasted about an hour of dev time trying to figure out why my INSERT statements weren't working from a Python script using pyodbc even though SELECTs were fine. As it turns out, they were getting rolled back because I wasn't explicitly committing the transactions and hadn't set the AUTOCOMMIT option to true for the connection.
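
For illustration, here's a minimal sketch of the two fixes, assuming CNXNSTRING holds a valid connection string and borrowing the foo table from the post above; either one alone would have saved me the hour:

import pyodbc

# Option 1: leave autocommit off (the default) and commit explicitly
db = pyodbc.connect(CNXNSTRING)
db.execute("insert into foo select 'Banana'")
db.commit()  # without this, the INSERT is rolled back when the connection closes

# Option 2: turn on autocommit for the whole connection
db = pyodbc.connect(CNXNSTRING, autocommit=True)
db.execute("insert into foo select 'banana'")  # committed immediately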

So if you happen to search for "pyodbc INSERT problems" (as I did), hopefully you'll stumble across this (as opposed to nothing, which is what I found) and slap yourself in the head (as I did).

SQL Server CLR Remote Debugging

My CLR stored procs have matured to the point that they're being deployed, which is to say that the bugs are now insidious rather than blatant. It's a royal pain to make changes, deploy the project locally, run the post-deploy correction script to fix the decimal precision on everything (see previous post), debug any issues, and then do the whole deployment again to the dev server. It's much easier to just debug it straight on the server.

Fortunately, remote debugging is a straightforward process. Predictably, Microsoft's docs (here's one of many) on the subject are somewhat convoluted, although this msdn blog post helped.

All you really need to do is install the Remote Debugging Monitor on the server and run it under the same login you use to connect to SQL. It comes with its own standalone installer on the Visual Studio CD; just look for the Remote Debugger directory and run the setup app found there. Then you can run the monitor either manually or as a service.

Next, place a breakpoint in your code somewhere and deploy it to the server. I've had trouble with my test scripts when deploying/debugging all at once - basically, they hang without ever running - so what I've started doing is this:

1) Start without debugging (Ctrl-F5) to deploy to the remote server
2) Run my decimal parameter correction script
3) Attach to Process in Visual Studio to sqlservr.exe on the remote server
4) Execute the proc or UDF in a SQL Management Studio window (see the example below)
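
Step 4 is just an ordinary call from Management Studio; a hypothetical example, assuming a deployed CLR proc named spMyClrProc:

-- hypothetical proc name and parameter; once Visual Studio is attached,
-- execution pauses here when the breakpoint is hit
exec dbo.spMyClrProc @inputValue = 1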

Then I can step through the code, look at locals, and generally debug to my heart's content.