
Wednesday, June 26, 2013

GetBytes error when using a DATE column

Quick troubleshooting post. Given this error from a colleague upon execution of a pretty simple function that used a DATE column:

An error occurred while executing batch. Error message is: Invalid attempt to GetBytes on column 'RelDate'.  The GetBytes function can only be used on columns of type Text, NText, or Image.

I couldn't reproduce the problem on my machine - the same function worked fine for me. I checked all his query execution settings and didn't see anything, and then he realized he was running SQL Management Studio 2005 connected to a 2008 R2 instance. As soon as he fired up SMS 2008 R2, the function worked fine. (That makes sense in hindsight: the DATE type was introduced in SQL Server 2008, so the 2005 client tools don't know what to do with it.)

Wednesday, May 11, 2011

The Rare, Simple SQL Wait Fix

A "power user" reported some slowness on one of our big servers today while it was being pummeled by about 60 simultaneous executions of the same script. The wait type was PAGELATCH_UP, which I vaguely recalled to mean something about allocation contention. The Microsoft Waits and Queues tuning guide confirmed that, so I checked the wait resources. They all started with 2:6 and 2:9, so it definitely had to do with tempdb.

This server is pretty beefy, a 4-socket quad-core with Xeon 7350s, but tempdb only had 8 files. I know that the rule of thumb of 1 file per core is no longer quite so hard and fast, but I figured it probably wouldn't hurt here. Created an extra 8 files, equalized the file sizes on the current ones, and had the user kick off the process again.
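The file changes were along these lines - a sketch only, since the real logical names, sizes, and paths have to match whatever tempdb already looks like on the server:

-- add one of the new files (name, path, and size are placeholders)
ALTER DATABASE tempdb ADD FILE (NAME = tempdev9, FILENAME = 'T:\tempdb\tempdev9.ndf', SIZE = 4GB, FILEGROWTH = 0)
GO
-- equalize an existing file to the same size
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 4GB)
GO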

No waits! Or at least, no PAGELATCH_UP waits. Some SOS_SCHEDULER_YIELDs and CXPACKETs, but I took that to mean that we had successfully shifted the bottleneck off of allocations and onto CPU, where it should be.

It's rare that 5 minutes of configuration change can effect a significant gain in process speed, but it's pretty satisfying when it happens.

Tuesday, January 19, 2010

Nested CTEs

Had my first need for nested CTEs today - for a table with prices by object and another with adjustment factors by object, I needed to multiply the prices by all factors occurring after the price date. This could be done in the C++ code I was working on via an ugly for loop for each price date, but I knew there had to be a better, set-based way within SQL.

The answer was nested Common Table Expressions (CTEs). You can't literally nest one WITH block inside another - Microsoft doesn't support that - but you can get the same effect by defining multiple CTEs (with a comma between each) and referencing the first one from within the second.

For this application, the first CTE orders the object ids and adjustment factors and adds row numbers. An ORDER BY would have sufficed to order everything, but in order to avoid inefficient subqueries in the next CTE, I needed consistent row numbering. The second CTE aggregates the adjustment factors for each date by recursing down the ordered factors for each object and multiplying them as it goes.

This solution condenses all the selection logic into one set of SQL statements; it's compact, fairly readable, and has a better complexity profile than the for-loop alternative.

(Table and column names changed to protect confidentiality.)

WITH AdjFactorsOrdered (ObjectId, AdjDate, AdjFactor, ord) AS
(SELECT ObjectId, AdjDate, AdjFactor, ROW_NUMBER() OVER (PARTITION BY ObjectId ORDER BY AdjDate desc)
FROM vwAdjFactors
WHERE AdjDate > '2009-01-01'
AND AdjDate <= '2010-01-01'
AND ISNULL(AdjFactor, 1) != 1
AND SubTypeId != 1
),
AdjFactorsCombined (ObjectId, AdjDate, AdjFactor, ord) AS
(SELECT ObjectId, AdjDate, AdjFactor, ord
FROM AdjFactorsOrdered
WHERE ord = 1
UNION ALL
SELECT mul.ObjectId, mul.AdjDate, base.AdjFactor * mul.AdjFactor as AdjFactor, mul.ord
FROM AdjFactorsCombined base
INNER JOIN AdjFactorsOrdered mul
ON mul.ord = base.ord + 1
AND mul.ObjectId = base.ObjectId
)
SELECT ObjectId, AdjDate, AdjFactor
FROM AdjFactorsCombined
ORDER BY ObjectId, AdjDate

Wednesday, October 21, 2009

Notification Services doesn't like version mismatches

I was pinged today to check out a failed installation of Notification Services with a vendor app. The vendor consultant was telling us that there was a mismatch in the version of SQL Tools installed, which sounded odd to me, since generally the tools are consistent between versions and the engines are different.

I tried to start up the Notification Services instance and got a 1359 error:

---------------------------
Services
---------------------------
Could not start the NSxxxx service on Local Computer.

Error 1359: An internal error occurred.
---------------------------
OK
---------------------------

Inspecting the error log, I found this:

Description:
The Notification Services instance encountered an error in one of its components and must stop.
EventParameters:
Instance Name: xxxxx
Problem Description: The database was created with or upgraded to a different edition of Notification Services. Use the Notification Services edition that the database expects.
Notification Services Edition: Standard Edition
Database Edition: Enterprise Edition


So it turns out that the consultant was partially correct - Notification Services won't start if the edition is mismatched between the engine installation and the instance database it points at. Good to know.

Thursday, January 22, 2009

SSIS transactions, or Another Reason To Hate Windows Firewall

I created an SSIS package today to archive a bunch of tables from a source server into one table on a destination server and then drop them on the source. I wanted the process to be idempotent, such that the tables would either be fully archived and dropped or the whole thing rolled back. Thus the process could be re-run if it failed in the middle without fear of duplicating data or prematurely dropping tables.

The obvious solution was a transaction. I set the TransactionOption property on my ForEach loop to Required and ran the package. However, MSDTC was not cooperating, and I got this failure:


With the following error message: "The transaction has already been implicitly or explicitly committed or aborted."

I checked MSDTC on both servers involved, and they each seemed to be set up correctly, albeit with slightly different settings. MSDTC on my machine was also configured correctly. Then I remembered that my desktop has Windows Firewall running. Don't ask me why, it wasn't my idea, but I can't turn it off.

So I added an exception to Windows Firewall for c:\windows\system32\msdtc.exe (port 135) and the package started working. Hooray!


Except that when I killed it midway through to simulate a network or SSIS host failure, it rolled back the whole thing. I wanted the ForEach loop to commit each table load and drop as it went so that I wouldn't have to start from scratch if the package failed after a few tables. So I created a Sequence container inside the ForEach container, set the TransactionOption to Required for that container, moved the Transfer and Drop tasks inside it, and set the TransactionOption to Supported for the ForEach.

Wednesday, January 7, 2009

sql_variant causes SSIS metadata validation error

Just a quick note about an error that stumped me for a minute.

I was setting up a Lookup task in SSIS 2005 using the results of a SQL query as the lookup, but I kept getting the following error when attempting to create the column mappings:

TITLE: Microsoft Visual Studio
------------------------------

Error at Load Position Limit Data [Lookup [49]]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x8000FFFF.

Error at Load Position Limit Data [Lookup [49]]: The call to Lookup transform method, ReinitializeMetadata, failed.

------------------------------
ADDITIONAL INFORMATION:

Exception from HRESULT: 0xC02090F0 (Microsoft.SqlServer.DTSPipelineWrap)


This didn't seem like a very helpful error, until I thought about what ReinitializeMetadata probably did, and then realized that my query selected a sql_variant column without casting it. Casting the sql_variant to a more appropriate data type (varchar, in this case) resolved the error.
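In other words, something like this (made-up table and column names, but that's the shape of the fix):

SELECT PositionId, CAST(LimitValue AS varchar(50)) AS LimitValue
FROM tbPositionLimits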

Monday, October 27, 2008

Using local disks in a cluster

With the advent of (relatively) cheap, fast Solid State Disks, DBAs everywhere are realizing that they could potentially see huge performance gains by incorporating these new disks into their database servers. This is straightforward if you've got a monolithic box: buy 2 SSDs, create a new RAID 1, move tempdb and/or logs onto it, restart SQL, shake well, and watch your IO waits go to 0! Hopefully, anyway.

However, if you've got a cluster, things are a bit trickier. The SAN vendors are still ramping up their SSD support, so it can be difficult to get an SSD into a shared storage enclosure unless you roll your own. Especially if you pick up one of the most promising new disks, the FusionIO ioDrive. This disk gets attached to a PCIe bus directly rather than a SAS or SATA backplane, so it can't be put into most commercially available shared storage devices (SANs, NASs, etc.).

Even so, we were sufficiently impressed by the IO stats on this disk to pick up a couple of them, but we still wanted to put them in our production cluster. We can't use them for shared data, obviously, as the data wouldn't persist between machines. But in theory they could be perfect for tempdb, which gets wiped out on each restart and so only needs to see the same path on each machine. I figured there was probably a way to get Microsoft Cluster Services to see the local SSDs on each machine as the same disk.

To test this theory, we (coworker Doug H. and I) took a pair of virtual machines, added drives to simulate a standard local disk, shared SAN storage, and the new SSD, built them into a cluster, and installed SQL Server 2005 on the R: drive.

Disk 0 (C:) - local - boot disk
Disk 1 (Q:) - shared - quorum
Disk 2 (R:) - shared - SQL data
Disk 3 (D:) - local - simulated SSD

When the VMs were initially configured, the fake SSDs were created on the same SCSI bus as the boot disk. The cluster administrator quickly informed us that this would not fly - disks on the same bus as the boot disk cannot be used as shared resources. So we moved the D: drives to a new bus, separate from both the C: (simulated backplane) and Q:/R: (simulated SAN) drives.

The next step was to add the D: drive to the cluster. I assumed this wouldn't work without some hacking, but I decided to try a "baseline" attempt without any modifications. The Cluster Admin New Resource wizard allowed me to add the D: drive on one of the nodes, but it wouldn't come online. So much for the baseline attempt.

I knew that Windows 2003 (though maybe not 2008) clustering relies on disk signatures to identify disks for cluster operations, so I decided to synchronize the signatures of the two fake SSDs. To this end, I found a utility - MBRWizard - that would let me manually set the signature of an MBR disk.

Using this utility, I found the signature of the D: drive on machine #1:

mbrwiz /Disk=3 /Signature

and then overwrote the signature of the D: drive on machine #2:

mbrwiz /Disk=3 /Signature=xxxxxxxx

Then I added the disk as a cluster resource again, and brought it online. Success! Failover worked, too, so I failed the SQL Server group (Group 0 in the image below) back and forth between nodes and created a Data directory on each D: drive.


The next step was to make sure it worked with SQL Server. I took SQL offline, added Disk D: as a dependency, and brought it back up. Then I altered tempdb's location to use the D: drive after a restart.

alter database tempdb
modify file (NAME=tempdev, FILENAME = 'D:\Data\tempdb.mdf')
GO
alter database tempdb
modify file (NAME=templog, FILENAME = 'D:\Data\templog.ldf')
GO

Finally I took SQL offline and brought it back up, with fingers crossed. And it worked! Failover worked too.

So my proof of concept for using a local SSD in a cluster (for tempdb only!) was successful. Stay tuned to see if it works in the real world.

Sunday, September 28, 2008

CPU Affinity Masking plus High CPU = Login Timeouts

Recently we saw an alarming issue with one of our database servers. Under heavy but not prohibitive load (60-80% CPU), it stopped accepting new connections intermittently. Obviously this is bad for any server, especially one that handles thousands of client connections simultaneously.

Some of the hard drives were under heavy load, especially tempdb, but another anomaly was a long-running query using many linked server connections (11!) that had been killed but was stuck in rollback several hours later. It was consuming nearly 100% of the cycles on one CPU core. We tried a variety of the usual DBA tricks to get rid of this spid, but nothing worked. It wasn't clear to us how this could cause the server to stop accepting connections, however.

Another oddity was the appearance in the System logs of messages like this:

The time stamp counter of CPU on scheduler id 2 is not synchronized with other CPUs.

I asked our Systems guys about this, and they said that this had been noticed a few months ago and a workaround had been put in place as per KB 931279 - the CPU affinity mask had been set in SQL Server.

Hmm.

This happened again a week later, minus the spid stuck in rollback, but with one CPU slammed at 100% again. We had engaged Microsoft PSS to assist with the problem, but so far all they had told us was that we had tempdb IO issues, which we knew. (DVNation, please finish the Windows drivers for your IODrives so we can use them for tempdb!)

So here's my theory, cooked up with one of the other DBAs: the login issues are being caused by the combination of affinity masking and one CPU at 100%. This could happen because schedulers are affinitized by the mask to a single CPU, making them unable to hop CPUs when one is under heavy load. User logins round-robin between schedulers, so if a scheduler is stuck to a single CPU and that CPU is not making enough cycles available to log someone in, eventually the login will time out and fail.

Plausible? Anyone else seen this kind of issue?

UPDATE:

I was right. After removing the affinity masking, we no longer saw login timeouts, even when the server was near 100% CPU load. Be careful with those affinity masks.
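For anyone looking for the knob in question, the mask lives in sp_configure; clearing it looks roughly like this (it's an advanced option, hence the first call):

EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'affinity mask', 0   -- 0 = let SQL Server schedule across all CPUs
RECONFIGURE
GO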

Tuesday, September 9, 2008

SSIS Data Flow Task Error caused by too many Data Flow tasks

I just threw together a very basic SQL 2005 SSIS package, the main component of which is a Data Flow task with 25 sub-flows. The flows are very simple, just direct transfers of a table on one server to the same table name on another server. As an aside, it took WAY too long to create this package - for each little flow, I had to select the source connection, wait for it to populate the table drop-down, then scroll through that drop-down through hundreds of tables to find the one I wanted, then do it again for the destination. This could be better in a number of ways:
  1. Allow typing in the table field. It would have been much quicker to allow me to type or copy and paste the table name rather than waiting for the drop-down and then scrolling through it. 50 times.
  2. Improve copy and paste of flow tasks. If I could have constructed one and duplicated it 25 times, it would have been much quicker, but when I attempted that, I got strange warnings about field truncation that were cleared up only by deleting the source and re-creating it.
  3. Create a wizard that facilitates auto-generating this kind of simple package better than the Transfer Objects wizard.
Maybe one or more of these has been implemented in SQL 2008, but I can't use that yet, since my office has no plans to upgrade our SSIS runtime any time soon.

Anyway, when attempting to execute the package, I was consistently getting a failure from one flow with this error:

TCP Provider: An existing connection was forcibly closed by the remote host

This didn't make much sense to me, as it worked fine when run by itself. I found this article and added that registry key, but it did not fix the problem (although it did seem to increase throughput).

Then I noticed that the two largest flows in the Data Flow task (on the order of millions of rows) were executing at the time when the problem task was throwing an error. I thought SSIS would be able to handle the sequencing of parallel tasks, but I decided to enforce my own execution order by placing the two large flows in a separate Data Flow task.

Voila. Works fine. So if you're having issues with an overpopulated Data Flow task throwing that TCP error, one simple workaround is to manually sequence/reduce the parallel execution of the tasks.

Friday, July 18, 2008

Automated linked server test

Got a big production event scheduled, so I wrote this quick automated test to be incorporated into the post-event testing. I'll probably improve on it incrementally as I notice shortcomings, but at the moment it's just a basic test of data access via all linked servers set up on a machine.


/*** Linked Server Test ***/

declare @linked table (name sysname, done bit, retval int)
declare @srvr sysname
declare @retval int

-- queue up every server known to this instance
insert into @linked select name, 0 as done, NULL as retval from sys.servers

while exists (select 1 from @linked where done = 0)
begin
select top 1 @srvr = name from @linked where done = 0
-- sp_testlinkedserver raises an error if the connection fails, so trap it and record a flag
begin try
exec @retval = sp_testlinkedserver @srvr
end try
begin catch
set @retval = sign(@@error)
end catch
update @linked set done = 1, retval = @retval where name = @srvr
end

select * from @linked

Monday, March 10, 2008

SQL Server CLR Remote Debugging

My CLR stored procs have matured to the point that they're being deployed, which is to say that the bugs are now insidious rather than blatant. It's a royal pain to make changes, deploy the project locally, run the post-deploy correction script to fix the decimal precision on everything (see previous post), debug any issues, and then do the whole deployment again to the dev server. It's much easier to just debug it straight on the server.

Fortunately, remote debugging is a straightforward process. Predictably, Microsoft's docs (here's one of many) on the subject are somewhat convoluted, although this msdn blog post helped.

All you really need to do is install the Remote Debugging Monitor on the server and run it under the same login you use to connect to SQL. It comes with its own standalone installer on the Visual Studio CD; just look for the Remote Debugger directory and run the setup app found there. Then you can run it either manually or as a service.

Next, place a breakpoint in your code somewhere and deploy it to the server. I've had trouble with my test scripts when deploying/debugging all at once - basically, they hang without ever running - so what I've started doing is this:

1) Start without debugging (Ctrl-F5) to deploy to the remote server
2) Run my decimal parameter correction script
3) Attach to Process in Visual Studio to sqlservr.exe on the remote server
4) Execute the proc or UDF in a SQL Management Studio window

Then I can step through the code, look at locals, and generally debug to my heart's content.

Friday, February 22, 2008

SqlDecimal truncation oddity

I've been working with all sorts of SQL CLR integration lately - User Defined Types (which are a bad idea), stored procs, and UDFs. In general, I'm pretty happy with the stored procedures, although I've run up against a number of irritating (but understandable) limitations, like the inability to dynamically load libraries and the requirement that only SQL data types be passed in and out of the CLR stored procs. But I haven't really broken any new ground, so I haven't written about it.

Today's issue is also not ground-breaking, but it is odd, so I thought I'd mention it. I wrote a function to convert a date into a decimal. Better that you not ask why. Here's the code:

public partial class UserDefinedFunctions
{
[Microsoft.SqlServer.Server.SqlFunction]
public static SqlDecimal fnTimeSeriesCalcFracYrFromDateS(SqlDateTime inputDT)
{
DateTime dt = (DateTime)inputDT;
double temp = (dt.Year + (Math.Round(((double)dt.Month * (365 / 12) + (double)dt.Day) * 1000000 / 365) / 1000000));
SqlDecimal temp2 = (SqlDecimal)temp;
return temp2;
}
};

Obviously I could have dispensed with almost all of that by casting the data types within a single line, but I broke it out for debugging purposes.

Here's the funny part: when placing a breakpoint at the return statement, here are the values of the variables:

inputDT {1/2/2008 12:00:00 AM} System.Data.SqlTypes.SqlDateTime
dt {1/2/2008 12:00:00 AM} System.DateTime
temp 2008.087671 double
temp2 {2008.0876710000000} System.Data.SqlTypes.SqlDecimal

All as they should be. But the output of the function is 2008.

Why? I have no idea. I'm guessing it has something to do with the SqlDecimal declaration of the function, because when I changed it to SqlDouble, it returned 2008.087671 as it should. Anybody know why this is happening?

UPDATE: Figured it out. Visual Studio deploys the UDF without specifying a precision and scale, which causes the scale to be set to 0. One can circumvent this by dropping and re-creating the function manually within SQL Server and declaring the decimal data type with the desired precision and scale (in this case, (10,6)).
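The manual drop-and-recreate looks roughly like this - the assembly name is a placeholder for whatever Visual Studio actually deployed:

DROP FUNCTION dbo.fnTimeSeriesCalcFracYrFromDateS
GO
CREATE FUNCTION dbo.fnTimeSeriesCalcFracYrFromDateS (@inputDT datetime)
RETURNS decimal(10,6)
AS EXTERNAL NAME [MyClrAssembly].[UserDefinedFunctions].[fnTimeSeriesCalcFracYrFromDateS]
GO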

However, this is annoying because it turns a one-button deploy/debug process into a multi-step process, since one can't use the test script functionality in Visual Studio to run a script with multiple "GOs" to drop and re-create the function and then run it. The workaround I devised was to

1) Deploy the function to SQL via Visual Studio.
2) Drop and re-create the function manually in SQL Management Studio.
3) Set a breakpoint in VS.
4) Attach VS to the sqlservr.exe process.
5) Run the function. It should hit the breakpoint, allowing you to debug when it has the correct variable type settings.

But it would be nice to get back to where one press of F5 deployed and debugged the whole thing. Maybe I can figure out a way to set up a custom deploy script from VS...

UPDATE 2: When attempting to deploy the CLR functions after manually dropping them and re-creating them with the desired precision on the decimal return types, you get a Deploy error like the following:

The assembly module 'fnTimeSeriesCalcFracYrFromDateS' cannot be re-deployed because it was created outside of Visual Studio. Drop the module from the database before deploying the assembly.

To get around this, you can write a SQL script to drop the functions if they exist, save it into your project directory, and use the Pre-build event command line on the Build Events tab of the project properties to execute that drop script before each deploy. One less step per build!
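The drop script itself is nothing fancy - one guarded DROP per function, something like:

IF OBJECT_ID('dbo.fnTimeSeriesCalcFracYrFromDateS') IS NOT NULL
DROP FUNCTION dbo.fnTimeSeriesCalcFracYrFromDateS
GO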

Next I have to figure out how to get the post-build to execute the script to drop and recreate the functions with the proper precision.

Tuesday, December 11, 2007

More fun with ranking functions

I don't think I really understood why Mike F at Capital IQ was so excited about the SQL 2005 ranking functions until just recently, but I'm using them all over the place now. Minutes after kicking off the time series load I described in my last post, I came across another use for RANK.

EDIT: As I review this, I recognize that this is pretty esoteric and may not have much practical application for most people. You've been warned.

Like many financial companies, we get data from multiple vendors, and we have to decide which vendor's data to use in each case. So I've got a set of data provided by multiple Vendors, grouped by ObjectId and Date, that is ranked for each date by some business logic built into the procedure that loads the data. The best (lowest, in this case) ranked row for each ObjectId and Date gets a bit flag set; we'll call it IsBest.

However, it was uncovered that there was an extra negation in the business logic for many cases, so many of the rows had the bit flag set incorrectly. I needed to find a way to determine what portion of the data had the flag set incorrectly. This could be accomplished using any number of data warehousing techniques - Analysis Services is well-suited to it, for example, or Microstrategy, or even a complicated series of GROUP BYs. But I didn't want to go to all that trouble just to produce a single ad hoc report, so I found another way.

Some (totally made up, to prevent my employer from having me whacked) sample data, in which only objectId 4 has the correct settings for the bit flag:


I needed to find rows in which the highest-ranked VendorId had the IsBest flag set for a given ObjectId and Date, and any rows in which a lower-ranked VendorId did not have the IsBest flag set. Any other combination is incorrect. If I'm only worried about how many rows need to be fixed, I can stop there, or I can divide the sum of those two rowcounts by the total number of rows to get a percentage of data that is correct.

A query to return the rowcounts follows. I'm using DENSE_RANK rather than RANK to keep the ranking correct despite duplicate rows (as in objectId 3).

SELECT
IsBest, DRank, count(*)
FROM
(SELECT
[Date]
, ObjectId
, VendorId
, IsBest
, DENSE_RANK() OVER (PARTITION BY Date, ObjectId ORDER BY Priority) as DRank
from tbSource
) as allRows
group by IsBest, DRank
order by IsBest, DRank

Additional code could be added to this query to automatically sum the correct cases and determine the percentages, but I will leave that as an exercise for the reader. Here are the raw counts, with the correct cases highlighted:
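As for the exercise I left for the reader, the percentage calculation would look roughly like this, against the same made-up tbSource:

SELECT
100.0 * SUM(CASE WHEN (DRank = 1 AND IsBest = 1) OR (DRank > 1 AND IsBest = 0) THEN 1 ELSE 0 END) / COUNT(*) as PctCorrect
FROM
(SELECT
IsBest
, DENSE_RANK() OVER (PARTITION BY Date, ObjectId ORDER BY Priority) as DRank
from tbSource
) as allRows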

Another cool use for RANK() - Time Series

I've got a big set of time series data across many object instances that needs to be inserted into a table that uses a trigger to set the "End Time" of each segment in the series. In other words, data like this:



When a new row for a given ObjectId is inserted, the previous row in the series is given an EndTime equal to the StartTime for the new row. Usually, this is done one row at a time, with only one row coming into the table for each ObjectId at a time.

However, in this case, I need to backfill a lot of data, which will cause the trigger to not behave as expected. I have a few options to work around this:
  1. Insert each row one at a time. Turns out there are over a million rows to be inserted and this is a heavily-used table, so this option is undesirable.
  2. Write a script to generate the EndTimes for each row ahead of time. This could be tricky and CPU-intensive, and I'd rather avoid it if possible.
  3. Partition the data such that the trigger will work as it's supposed to, but large batches of data can be inserted instead of inserting one row at a time.
I'm currently investigating option 3, especially because it lets me use the RANK() function. The basic idea is this: for each ObjectId, rank the StartTimes, then loop through and insert all rows where RANK = 1, then 2, 3, etc. This will insert all the ObjectIds with their initial StartTime, then the next StartTime, and so on, allowing the trigger to fire on each round and update the EndTimes en masse.

The query I'm using (generalized for mass consumption):

INSERT INTO tbTemp
SELECT
ObjectId
, Value
, StartTime
, RANK() OVER (PARTITION BY ObjectId ORDER BY StartTime) as Ord
FROM
tbSource


Then the data in tbTemp can be looped through by rank and inserted into the destination table.
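The loop itself is trivial - a sketch, with tbDestination standing in for the real trigger-bearing table:

DECLARE @r int, @maxr int
SELECT @maxr = MAX(Ord) FROM tbTemp
SET @r = 1

WHILE @r <= @maxr
BEGIN
INSERT INTO tbDestination (ObjectId, Value, StartTime)
SELECT ObjectId, Value, StartTime
FROM tbTemp
WHERE Ord = @r   -- one rank per pass, so the trigger closes out the previous row for each ObjectId

SET @r = @r + 1
END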

And the resulting data should look like this:

Thursday, November 29, 2007

Many-to-many physical table design

I was creating a new set of tables (we'll say 2 for simplicity) that will contain rows that can have many-to-many relationships, and I was pondering how best to design the association table so as to optimize IO.

In this table scenario, joins can come from either direction, so it could be one row from the first table that joins to many in the second, or vice versa. I'm generally a big fan of a clustered index on some column, especially an integer ID if it's useful, but in this case, I think the best performance would be achieved by dispensing with the clustered index and ID column altogether. The basic table design would be as follows:

CREATE TABLE User
(
UserId INT IDENTITY(1,1)
, ...
)

CREATE TABLE Group
(
GroupId INT IDENTITY(1,1)
, ...
)

CREATE TABLE UserToGroup
(
UserId INT
, GroupId INT
, ?
)

A clustered index on UserId would speed up queries starting with a User and joining to Groups, but it would fragment as old users added groups, splitting pages to insert UserIds. The reverse would be true of clustering on GroupId. Unless the query pattern supported such an unbalanced design, which would require frequent index defragmentation, either of these clustered indices would be less than ideal.

Adding a surrogate key identity column would be no better; it would prevent index fragmentation, but the index wouldn't be used for seeks, and the column would add 50% more space on disk between the columns that we really care about, making I/O proportionately slower.

The best design should be a heap, with nonclustered indices on (UserId, GroupId) and (GroupId, UserId). Or more specifically, in SQL Server 2005, indices on UserId and GroupId alone, with the other column set up as an "included column" to make each index covering.
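Concretely, the indexes on the association table would look something like this (a sketch; I've left off whatever extra columns end up in the table):

CREATE NONCLUSTERED INDEX IX_UserToGroup_UserId ON UserToGroup (UserId) INCLUDE (GroupId)
CREATE NONCLUSTERED INDEX IX_UserToGroup_GroupId ON UserToGroup (GroupId) INCLUDE (UserId)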

I haven't tested this yet, though, so I welcome any comments or contradictions.

Tuesday, November 27, 2007

Odd SSIS bug

Here's what happened: I added a column to a source table, went into the Data Flow source, refreshed the table list, selected the table, and checked all the output columns that I wanted. Then I resolved all the naming issues with the downstream tasks.

But I'm getting an error (#1 below) when I try to build, because for some reason SSIS will not automatically add the new column to the Error Output, and it won't let me add a column to that output, even in the Advanced Editor. This is annoying. I'm probably going to need to delete and add this data source, which is going to screw up all my transformations.


Error Message #1
Error at Data Flow Task [tbCA Source [1]]: The output column "CaxSubTypeId" (1052) on the non-error output has no corresponding output column on the error output.

Error Message #2
TITLE: Microsoft Visual Studio
------------------------------

Error at Data Flow Task [tbCA Source [1]]: The component does not allow adding columns to this input or output.

Tuesday, August 28, 2007

SQL Management Studio Setup

First off, I'm back at a desk after an excellent time in Europe - a whirlwind tour through Budapest, Bratislava, Krakow, and Vienna. I may have to move to Vienna one of these days, and I may be seeking a trophy wife in Bratislava. So today is my second day at my new employer, an entity I have been informed prefers not to be named publicly, so all references to it from here out will be oblique, or possibly substituted with "The Black Chamber."

In any case, I needed to start setting up my new desktop (that was already built when I got here, by the way, and with three (!) 19" monitors mounted on extremely solid articulating arms), and realized that I can never remember the miscellaneous settings I change in SMS to get it set up the way I like for maximum productivity. Generally I just stumble into them after I try to do something the way I usually do and it doesn't work. So like many other posts on this blog, this is purely for my reference purposes, and if it helps anyone else out, even better.




  1. ProFont - Mike Forman introduced me to this, and I find it absolutely essential these days. I use it as my main text editing and printing font, at 9 point, and it's clearer and smaller at that font size than anything else I've found. I'll make it available for download as soon as I manage to get access to my own FTP server again. Until then, I'm sure it can be googled.

  2. Environment->Keyboard - I add the following:

    • Ctrl+F1: sp_helptext

    • Ctrl+3: sp_who2

    • Ctrl+4: sp_spaceused

  3. Text Editor->Plaintext - I change the tab spacing to 2. Just my personal preference.

  4. Query Results->SQL Server->Results to Grid - I check "Display results in a separate tab" and "Switch to results tab after the query executes." I prefer having as much screen real estate as possible for both query and results, and I find it quick to switch between tabs with F6.

  5. The Change Connection button - Another one from Mike, I use the Customize menu option to display the Change Connection icon as Image and Text, and change the Name from "C&hange Connection..." to "Chan&ge Connection...". This allows quick switching of the connection via the keyboard shortcut of Alt-G. The default name needs to be changed because Alt-H brings up the Help menu.

I know there are a few others, but I can't recall them at the moment.

UPDATE 2012-02-16
SET ARITHABORT OFF - refer to http://nyc-dba.blogspot.com/2012/02/query-plan-troubleshooting.html

UPDATE 2013-11-05
Fix the Management Studio tab text - http://www.brentozar.com/archive/2011/06/fixing-sql-server-management-studios-tab-text/