Saturday, 1 November 2008

Overview To Disaster Recovery

Most people who talk about disaster recovery talk about restoring systems to the same or similar hardware. Almost half of the reason for this is the operating system not being flexible about hardware changes, and the rest is the fault of the backup software.

A good number of backup programs restore to the same hardware and leave you to fix any hardware-related issues. Some backup software vendors have, however, developed add-on modules that will do this for you and can drastically improve your disaster recovery time.

However, we all know that if you spend enough money anything is possible, and I’m not writing about which software is best. As with all things there is more than one way to skin a cat, and today I’m going to show you how to solve the problem above by design.

The way we are going to approach this problem is through design: we are going to admit that we have some weaknesses in our network and, instead of relying on software to save us, design our way out of the issues encountered.

Let’s start at the beginning. In a live environment the very first thing you need to restore is the domain. For a lot of disaster recoveries the domain is the make-or-break part: if you can’t restore the domain then you have nothing, as it’s pointless having an application or file that you can’t open.

“But I can open files without a domain,” I hear you say. Oh dear, comes my reply: if you can open files without a domain then you have a rather large security problem, but that is something I’ll address later in another post.


Domain
So let’s get down to business. We know that we need the most stable and flexible solution for our domain controllers. In a nutshell: virtualization. Domain controllers are generally processor heavy in large environments but not a lot else; I’ve seen a 90,000-account domain run on a server with only 1GB of memory.

Since these servers are not resource hungry they naturally lend themselves to being virtualized. This also removes the domain controller’s dependency on hardware and makes it recoverable to any hardware that can run the image, making you vendor independent for the hardware. Hypervisor is the key word in all this.

Having one or more forms of virtual technology available to you also lets you run your own DR tests and iron out the bugs without having the real disaster.
VirtualBox is a nice free one from Sun Microsystems if you are wondering where to start or just don’t want to buy new software.


Network Access
Static IPs are not always a good thing, and in a disaster they can be a real problem. For example, when the hardware changes, the new network adapter is detected as a new device and all IP addressing is lost. You have two options. The first is to use DHCP wherever possible, and I recommend using it as much as you can.

The second is to use netsh to back up and restore the IP configuration against a common interface name such as “LAN”, so that all you have to do is rename the new interface to LAN and execute netsh exec c:\ip.txt to restore your IP settings. This is extremely handy if you have 6 or more network cards and you don’t want to spend half an hour setting them up.
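As a rough illustration of that backup, rename and restore sequence, here is a minimal Python sketch that only builds the netsh command lines as text so they can be reviewed or saved to a script. The function names and the adapter name "Local Area Connection 2" are made up for the example, and it assumes `netsh interface ip dump` and `netsh interface set interface ... newname=...` behave this way on your Windows version, so check before relying on it.

```python
def ip_backup_commands(dump_file=r"c:\ip.txt"):
    """Return the command that saves the current IP configuration as a netsh script."""
    return [f"netsh interface ip dump > {dump_file}"]


def ip_restore_commands(detected_name, common_name="LAN", dump_file=r"c:\ip.txt"):
    """Return the commands that rename the new adapter, then replay the saved script.

    detected_name is whatever Windows called the adapter after the hardware
    change (hypothetical example: "Local Area Connection 2").
    """
    return [
        f'netsh interface set interface name="{detected_name}" newname="{common_name}"',
        f"netsh exec {dump_file}",
    ]
```

Printing `ip_restore_commands("Local Area Connection 2")` line by line gives the two commands to run on the restored server, in order.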

Network card drivers can often be an issue in a restore. If you have the space, keep a store of network card drivers on the system disk so you can easily reinstall the driver. It’s quite common for network cards to be Intel based these days, so the store of drivers shouldn’t be large and this is easy to implement.

Access to network resources via IP and policy-based security can also be time consuming to restore, so backing up your IAS, RAS and DHCP scope configuration using netsh is a way to save some time on the restore. After all, you can make 30 or 40 clicks or you can just type netsh exec file.txt. This might not sound like much of a time saver at first, but when you have 8 to 10 servers it can save more than an hour and cut the time spent down to less than a minute. If you also have PsTools you can execute the restore simultaneously on all servers.
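One possible way to get the "all servers at once" effect is to start one psexec per server in parallel. The sketch below is only an assumption about how you might wire that up: the function names and server names are hypothetical, psexec is assumed to be on the PATH, and the netsh script is assumed to already exist at the same path on every target server.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_restore_command(server, script_path):
    """Build the psexec command that replays a netsh script on one server."""
    return ["psexec", rf"\\{server}", "netsh", "exec", script_path]


def restore_all(servers, script_path, run=subprocess.call):
    """Start the restore on every server in parallel; return {server: exit code}.

    run is injectable so the fan-out can be rehearsed without touching a
    real server; by default it launches psexec and waits for its exit code.
    """
    commands = [build_restore_command(s, script_path) for s in servers]
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        exit_codes = list(pool.map(run, commands))
    return dict(zip(servers, exit_codes))
```

Calling `restore_all(["dhcp01", "dhcp02"], r"c:\dhcp.txt")` would then launch both restores together; any server with a non-zero exit code in the result needs a closer look.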


Scripting
Scripting is an important part of IT: not just documenting how tasks are done, but using the tools to hand to reduce the time taken to do them.

Imagine, for example, that I want to install a Windows hotfix on 50 workstations and I don’t have WSUS or internet access on the LAN segment for whatever reason. Do I spend 10 minutes logging on to each workstation and installing said hotfix, and ultimately lose several precious hours, or do I sit at my desk and execute c:\pstools\psexec @C:\workstationlist.txt \\fileserver\share\critical\patch.exe /quiet /passive /norestart and thus never have to leave my chair?
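If you also want a record of which workstations took the patch, a small wrapper around psexec can sort the exit codes for you. This is a sketch under assumptions rather than a finished tool: the function names are invented, psexec is assumed to be on the PATH and to pass back the installer's exit code, and 3010 is assumed to mean "installed, reboot required" as it does for Windows Installer packages.

```python
import subprocess

REBOOT_REQUIRED = 3010  # assumed: installer succeeded but wants a restart


def read_hosts(text):
    """Turn the contents of a workstation list into host names, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def deploy_patch(hosts, patch_command, run=subprocess.call):
    """Run patch_command on each host via psexec and group hosts by outcome."""
    results = {"ok": [], "reboot": [], "failed": []}
    for host in hosts:
        code = run(["psexec", rf"\\{host}"] + list(patch_command))
        if code == 0:
            results["ok"].append(host)
        elif code == REBOOT_REQUIRED:
            results["reboot"].append(host)
        else:
            results["failed"].append(host)
    return results
```

Feeding it the contents of C:\workstationlist.txt and the patch command from the example above gives three lists: done, done but waiting on a reboot, and failed.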

This kind of scripting can be used just as efficiently in restores, as you may need to redeploy patches and updates that are missing from the restore media.
Remember, whatever tools you choose to use, they should be used across the entire environment, as differing restore techniques only add to the complications and time needed to restore the environment.


Applications
Now that we have finished with the scripting side of things, let’s move on to the applications. Using Active Directory based authentication is not just easier to administer centrally but also better for application restores, as it saves time restoring SQL-based logins where there is a mismatch in SID, along with file permissions and other such lovely security-identifier-related issues. So check that all existing applications use domain authentication and identify any that might be a problem to restore.

Another largely overlooked point is that some applications are outside of the backup software’s ability to restore, so look into the best practices of each software vendor you use and check that the application has a recovery path, as you might need to delete locked files and/or restore the application using its own set of backup tools.
