Thursday, September 20, 2007

Regex for linked servers

As part of the DTS migration I mentioned in my last post, I wrote a few functions to find less-than-ideal code. One thing we wanted to get rid of was linked-server joins, and the easiest way to find those was to check each Execute SQL task using a regular expression (long live Regulazy). So once again I'm using this blog as my personal source repository and re-producing the regex for this.

Regex for linked server queries:
(\S+)\.(\S+)\.dbo\.(\S+)

I tried to generate an expression to find "SELECT...INTO" where ... did not contain @ or FROM, but couldn't figure it out with the meager regex capabilities of VBScript. Seems like a lookahead or lookbehind would probably work for this purpose, but according to the 4 Guys From Rolla, VBScript regex does not support these.

UPDATE: Spent a little more time on it and figured out a suitable regular expression to find instances of SELECT INTO that excludes INSERT INTO and SELECT FROM.

Regex for SELECT INTO:
select((?!from)(?!insert).\n)*into

UPDATE 2: My first SELECT INTO regex was foiled by something like "SELECT foo, bar, admintool from ...", so I fixed it to make sure there's whitespace before and after the SELECT and the INTO.

select[\s]((?!from)(?!insert).\n)*[\s]into[\s]

Tuesday, September 11, 2007

Parsing the filesystem for files

I'm sure this has been done many times before, but this is my implementation. I needed it to run a large DTS migration from files stored on disk, so first I needed a list of those files. They were contained in a directory tree in no particular order, and in many levels of subdirectories. I wanted a generalized solution to the parsing, so here it is.

Pasting it into Blogger butchered my formatting, but it feels like a waste of time to fix it, since it will probably not paste properly back into an editor anyway. I made the (minimal) comments green just to break it up a bit.

As usual, no warranty or even functionality of this code is expressed or implied. Use at your own risk, especially since I'm not sure the xp_subdirs extended proc is supported and xp_cmdshell can definitely be dangerous.


/*************************************************
** get list of dirs to search for files
*************************************************/


declare @currentdir varchar(1000)
declare @fullpath varchar(1000)
declare @basepath varchar(1000)

set @basepath = 'c:\stuff'
set @fullpath = @basepath

if exists (select 1 from tempdb.dbo.sysobjects where name like '#dirs___%')
drop table #dirs

create table #dirs
(
directory varchar(1000)
, done bit
)

if exists (select 1 from tempdb.dbo.sysobjects where name like '#dirtemp___%')
drop table #dirtemp

create table #dirtemp
( directory varchar(1000) )

insert into #dirtemp exec master.dbo.xp_subdirs @fullpath

insert into #dirs
select
'\' + directory as directory
, 0 as done
from #dirtemp


Loop: -- xp_subdirs spits out errors when a dir has no subdirs, so they have to be caught with this GOTO Loop construct
while exists (select 1 from #dirs where done = 0)
begin
if exists (select 1 from tempdb.dbo.sysobjects where name like '#subdirs___%')
drop table #subdirs

create table #subdirs
( directory varchar(255) )

select top 1 @currentdir = directory from #dirs where done = 0

set @fullpath = @basepath + @currentdir

exec master.dbo.xp_subdirs @fullpath
if (@@ERROR <> 0)
begin
update #dirs set done = 1 where @currentdir = directory
GOTO Loop
end

insert into #subdirs exec master.dbo.xp_subdirs @fullpath
insert into #dirs select @currentdir + '\' + s.directory, 0 as done from #subdirs s
update #dirs set done = 1 where @currentdir = directory
end

if exists (select 1 from tempdb.dbo.sysobjects where name like '#dirtemp___%')
drop table #dirtemp

if exists (select 1 from tempdb.dbo.sysobjects where name like '#subdirs___%')
drop table #subdirs

/*************************************************
** get list of files
*************************************************/


declare @command varchar(1000)
declare @fileextension varchar(5)

set @fileextension = 'dts'

if exists (select 1 from tempdb.dbo.sysobjects where name like '#files___%')
drop table #files

create table #files
(
filepath varchar(1000)
)

update #dirs set done = 0

while exists (select 1 from #dirs where done = 0)
begin
if exists (select 1 from tempdb.dbo.sysobjects where name like '#filetemp___%')
drop table #filetemp

create table #filetemp
(
filepath varchar(1000)
)

select top 1 @currentdir = directory from #dirs where done = 0

set @command = 'dir ' + @basepath + @currentdir + '\*.' + @fileextension + ' /B'

insert into #filetemp exec master.dbo.xp_cmdshell @command

insert into #files select @basepath + @currentdir + '\' + filepath from #filetemp

update #dirs set done = 1 where directory = @currentdir
end

if exists (select 1 from tempdb.dbo.sysobjects where name like '#filetemp___%')
drop table #filetemp

delete from #files where ISNULL(filepath, 'x') not like '%' + @fileextension + '%'

select * from #files