Saturday, 23 April 2016

Virtualised Environments and Shared Storage

In my experience, the database disasters that fall outside the DBA's direct control have mostly been caused by SAN failures. Combine a SAN with virtual machines and too little forward planning, and the result is a system that will fail on a long enough timeline.

SANs are a practical storage solution, offering convenient capacity with decent disk redundancy. However, I've found that some administrators set up the production server on one VM and the DR or "offsite" backup server on another VM without realising that both VMs sit on the same SAN. If that SAN fails, both VMs are lost and the disaster recovery plan fails with them. The organisation is then dependent on either a week-old tape backup or, by chance, a database backup that happened to be copied elsewhere, whether for a recent database refresh or for testing. The same can be seen with some Oracle RAC installations: both nodes are virtualised, yet they have been implemented on the very same disk array outside of the ASM storage. If the SAN dies, both nodes fail, which defeats one of the main incentives for using RAC.

I write specifically about SANs, but the same arrangement can exist on any disk array used to house a multitude of virtual machines. Anyone who believes that SANs and disk arrays are safe from failure may not have accounted for administration error: server engineers can destroy partitions or neglect periodic disk checks, leading to the slow collapse of a disk system.

What this means is that the underlying storage tends to be obscured behind layers of virtual machines and disk partitions, and I find that it is in the organisation's best interest to question and document the underlying foundation of the disk structure. If the engineers have set up the system in this fashion, I make alternative arrangements for the backups to be synchronised to storage that is separate from the production disks.
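Once the storage layout has been documented, checking it for this problem is simple enough to script. The following is only a minimal sketch in Python, and it assumes the documentation has been reduced to a list of records naming each VM, the system it belongs to, its role and the array it lives on. The VM names, system names, array labels and the `shared_storage_risks` function are all hypothetical; in practice the inventory would have to come from whatever the storage and virtualisation engineers can tell you.

```python
from collections import defaultdict

# Hypothetical inventory, written up after questioning the engineers.
# Every name here is illustrative, not taken from a real system.
INVENTORY = [
    {"vm": "db-prod-01", "system": "orders", "role": "production", "array": "SAN-A"},
    {"vm": "db-dr-01", "system": "orders", "role": "dr", "array": "SAN-A"},
    {"vm": "db-prod-02", "system": "billing", "role": "production", "array": "SAN-A"},
    {"vm": "db-dr-02", "system": "billing", "role": "dr", "array": "SAN-B"},
]


def shared_storage_risks(inventory):
    """Return {system: [arrays]} where production shares an array with DR/backup."""
    # system -> role -> set of arrays used by VMs in that role
    arrays_by_system = defaultdict(lambda: defaultdict(set))
    for entry in inventory:
        arrays_by_system[entry["system"]][entry["role"]].add(entry["array"])

    risks = {}
    for system, roles in arrays_by_system.items():
        production = roles.get("production", set())
        # Every array used by a non-production role (DR, backup, and so on).
        recovery = set()
        for role, arrays in roles.items():
            if role != "production":
                recovery |= arrays
        overlap = production & recovery
        if overlap:
            risks[system] = sorted(overlap)
    return risks


if __name__ == "__main__":
    for system, arrays in shared_storage_risks(INVENTORY).items():
        print(f"{system}: production and DR both depend on {', '.join(arrays)}")
```

With the sample inventory above, only the "orders" system is flagged, because its production and DR VMs both sit on the same array. The check is only as good as the inventory behind it, which is why documenting the real disk layout comes first.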

