We're moving our Dev environment to a new domain for some unfathomable corporate reason, and it's going to be a major headache. Among our 20+ SQL Servers in the environment, we've got a few SQL 2005 clusters. According to Microsoft, we can't move them without uninstalling SQL.
Needless to say, this would add more pain to the migration process. It wouldn't be the end of the world, but this is weekend work, and uninstalling and re-installing a couple active/active SQL clusters is just the thing to turn 1 day of work into 2+.
However, I think they're wrong.
I built two virtual machines on VMWare Server, added them to a cluster (see Repeatable Read for instructions on creating shared disks in VMWare) on our current domain, and created some databases. Verified failover capability and connectivity.
Then I moved both nodes to the new domain using the following steps:
1) Take all cluster resources offline (except the quorum, which cannot be taken offline)
2) Stop the cluster service on both nodes
3) Change the cluster service startup type to Manual
4) Change the domain of each machine to the new domain and reboot
5) After reboot, on each machine, change the cluster and SQL service accounts to accounts in the new domain
6) Run gpedit.msc or otherwise access Local Security Policy Settings (see below), and grant the following rights:
Cluster Service Account
Act as part of the operating system
Adjust memory quotas for a process
Increase scheduling priority
Manage auditing and security log
Restore files and directories
SQL Service Account
Adjust memory quotas for a process
Lock pages in memory
Log on as a batch job
Log on as a service
Replace a process level token
7) Add the cluster and SQL service accounts to the local Adminstrators group. NOTE: This should not be necessary for SQL, and I will update this with the minimum required permissions as soon as I sort them out. It is necessary for the cluster account, however.
8) Start the cluster service on both machines
9) Bring cluster resources online
10) Go enjoy the rest of your weekend
If you missed some permissions, the cluster service will likely fail to start with an error 7023 or 1321, and will helpfully output an error in the system log with eventId 1234 that contains a list of the necessary user rights that still need to be assigned. Now that's error reporting!
Comprehensive testing is still pending, but the preliminary results look good. After this process, SQL Server comes online on my test cluster, as do SQL Agent and Fulltext. I don't have any machines on the new domain with SQL Management Studio installed, but I could connect to SQL using osql directly on one of the cluster nodes. If anyone out there has any different experiences or comments, I'd love to hear them.
My previous post left out one small but significant detail: the domain groups under which the SQL Server service accounts run. When one installs SQL 2005 on a cluster, the setup program requires domain groups to be entered for each service account. So for example:
SQL Server service account: OLDDOMAIN\SQLService
SQL Agent service account: OLDDOMAIN\SQLAgentService
SQL Browser service account: OLDDOMAIN\SQLBrowserService
Then it comes time to move your cluster, and you've followed my steps above or done your own hacking, and you've changed the service accounts to NEWDOMAIN\SQLService and so on. But the domain groups remain the same. Your cluster will come online and fail over and operate fine, but you won't be able to change it.
This was made evident when I tried to add a node to an existing cluster after moving it to a new domain. It gave me a message like "Cannot add NEWDOMAIN\SQLService to OLDDOMAIN\SQLServiceGroup." Arrgh. Microsoft had already claimed that this was not supported, so I suppose I shouldn't have been surprised.
So I started searching for the reference to OLDDOMAIN\SQLServiceGroup. And couldn't find it. Not in a config file, or a system table (I know, that was dumb, but I was desperate), or the registry, where I expected to find it. Eventually, I started combing the registry key by key within the SQL hives and came across this in HKLM\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.1\Setup...
(okay, I tried to upload another image here to illustrate, but Blogger hates me, so screw it.)
The keys AGTGroup, FTSGroup, and SQLGroup contain the SIDs for whatever OLDDOMAIN groups you set up when installing SQL. Find the SIDs for your new domain groups (the VBScript below is how I did it), enter those in place of the old ones, restart SQL, and your cluster is moved. You should now be able to add or remove nodes, install hotfixes, etc. You'll need to update the SIDs for each SQL installation (MSSQL.2, MSSQL.3, etc.)
As with any unsupported operation, your mileage may vary, but let me know if you have a different experience with this or you run into additional problems.