Disaster Recovery

Disaster recovery is an important part of our system. It includes backups, mirrors, and clustering services. This page explains how we implement each of these technologies so that we can recover in the shortest time possible.

 

Server Backups-Introduction

To provide a workable disaster recovery solution for on-site catastrophes, we run tape backups every night. We currently use Veritas Backup Exec 9.0 to back up the servers and services we offer. The files are backed up to an IBM R3600 LTO tape library in the Computer Building. In the future we plan to take sets of tapes off-site and rotate the sets in the tape library.

How it works

A variety of backup jobs, both full and incremental, run during the week. This ensures that any daily changes are recoverable. Below is a list of all the data that is stored on the tape library, with the type of backup, the day it runs, and the estimated time to completion.

 

GB columns give the estimated job size and Time columns the estimated time to complete (HH:MM) as of the date shown. A dash (-) marks a value not listed for that date.

Job Name             Backup Method  Day Executed     Time Executed  Drive Used  GB 1/30/03  Time 1/30/03  GB 7/30/03    Time 7/30/03  GB 10/20/03  Time 10/20/03
Udrive A-J Full      Full           Saturday         18:00          IBM 1       186.29      17:27         248.39        16:40         321.9        21:08
Udrive K-Z Full      Full           Saturday         18:05          IBM 2       184.43      18:21         243.12        16:12         318.7        21:11
Udrive A-J Inc.      Incremental    Sunday-Friday    04:30          IBM 1       3.73        08:55         -             -             39.9         5:48
Udrive K-Z Inc.      Incremental    Sunday-Friday    04:31          IBM 2       3.71        02:08         -             -             38.3         5:45
Profiles A-J Full    Full           Sunday           23:00          IBM 1       81.49       41:28         Total: 235.5  15:53         154.1        24:14
Profiles K-Z Full    Full           Sunday           23:00          IBM 2       77.52       39:48         -             -             152.6        21:22
Servers Full         Full           Friday           16:05          IBM 1       167.91      30:00         79.5          1:35          260.4        07:17
Servers Incremental  Incremental    Sunday-Thursday  10:00          IBM 1       6.58        00:11         -             -             26.0         00:44

The files that are included in the backups are self-explanatory. The Profiles are backed up from \\REMORA2\S: and the Udrive is backed up from \\REMORA1\Users on the REMORA NAS Cluster. You can see a list of the files backed up by the Servers Full and Servers Incremental jobs here.
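As an illustration only, the weekly schedule in the table above can be written down as data and queried by weekday. This is a minimal Python sketch, not how Backup Exec stores its schedule; the names `JOBS`, `expand_days`, and `jobs_on` are hypothetical, and it assumes that a day range such as Sunday-Friday wraps around the week.

```python
# Weekday order used to expand ranges such as "Sunday-Friday".
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# (job name, method, day spec, start time, drive), copied from the table.
JOBS = [
    ("Udrive A-J Full", "Full", "Saturday", "18:00", "IBM 1"),
    ("Udrive K-Z Full", "Full", "Saturday", "18:05", "IBM 2"),
    ("Udrive A-J Inc.", "Incremental", "Sunday-Friday", "04:30", "IBM 1"),
    ("Udrive K-Z Inc.", "Incremental", "Sunday-Friday", "04:31", "IBM 2"),
    ("Profiles A-J Full", "Full", "Sunday", "23:00", "IBM 1"),
    ("Profiles K-Z Full", "Full", "Sunday", "23:00", "IBM 2"),
    ("Servers Full", "Full", "Friday", "16:05", "IBM 1"),
    ("Servers Incremental", "Incremental", "Sunday-Thursday", "10:00", "IBM 1"),
]


def expand_days(spec):
    """Expand 'Saturday' or 'Sunday-Friday' into a list of day names.

    Assumption: a range wraps around the week, so 'Sunday-Friday'
    means every day except Saturday.
    """
    if "-" not in spec:
        return [spec]
    first, last = spec.split("-")
    start, end = DAYS.index(first), DAYS.index(last)
    length = (end - start) % 7 + 1
    return [DAYS[(start + i) % 7] for i in range(length)]


def jobs_on(day):
    """Return (start time, job name) pairs for one weekday, earliest first.

    Start times are zero-padded HH:MM strings, so sorting them as text
    also sorts them chronologically.
    """
    return sorted(
        (start, name)
        for name, _method, spec, start, _drive in JOBS
        if day in expand_days(spec)
    )


for start, name in jobs_on("Friday"):
    print(start, name)
```

For Friday this lists the two Udrive incrementals followed by Servers Full; Servers Incremental does not appear because its range ends on Thursday.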

The IBM R3600 LTO Tape Library has 20 slots that hold 100 GB LTO tapes. The speed of a backup job varies with the number of files and directories it backs up.
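To put the slot count in perspective, here is a rough capacity check in Python using the 10/20/03 full-backup estimates from the table. It is a sketch under stated assumptions: 100 GB is treated as uncompressed capacity per tape, every slot is assumed to hold a data tape, and each job is assumed to start on a fresh tape. None of these is confirmed by the text, and `tapes_needed` is a hypothetical helper.

```python
import math

SLOTS = 20      # slots in the library, as stated above
TAPE_GB = 100   # capacity per tape, assumed uncompressed

# Estimated sizes (GB) of the full backup jobs as of 10/20/03.
FULL_JOBS_GB = {
    "Udrive A-J Full": 321.9,
    "Udrive K-Z Full": 318.7,
    "Profiles A-J Full": 154.1,
    "Profiles K-Z Full": 152.6,
    "Servers Full": 260.4,
}


def tapes_needed(size_gb, tape_gb=TAPE_GB):
    """Tapes required if a job starts on a fresh tape (an assumption)."""
    return math.ceil(size_gb / tape_gb)


library_gb = SLOTS * TAPE_GB                       # 2000 GB
total_full_gb = round(sum(FULL_JOBS_GB.values()), 1)  # 1207.7 GB
tapes = sum(tapes_needed(gb) for gb in FULL_JOBS_GB.values())  # 15 tapes

print(f"Library capacity: {library_gb} GB")
print(f"One set of full backups: {total_full_gb} GB on {tapes} tapes")
print(f"Slots left for incrementals: {SLOTS - tapes}")
```

Under these assumptions one complete set of full backups takes 15 of the 20 slots, which leaves little room for a second set in the library at the same time.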

The tape library has a web interface that we use for administration purposes.

The library is connected to THING1 via Fibre Channel.

We had to install several Backup Exec options for it to work correctly with our setup:

Option Name                Description/Reason for using
Library Expansion Option   Needed to use the multiple drives and autoloading features of the R3600.
Remote Agent for Windows   Installed on the remote computers to increase the speed and reliability of backups. (It also stops jobs from appearing as "failed" in the Backup Exec logs.)
SAN Shared Storage Option  Needed for the proper operation of the fibre channel controller.
Open File Option           Backs up files that are open. Previously, Backup Exec skipped them.

Future Considerations

There are still issues with the time it takes to back up the Profiles directory on the udrive. The reason is that Profiles consists of millions of small files (about 5 million) rather than a smaller number of larger ones (about 1.7 million in Udrive). One option we are looking into is the Intelligent Imaging option for Backup Exec. It backs up the metadata and then creates an image backup of the original data, so when the backup job runs, only the metadata and the image of the actual data are backed up. According to Veritas, this means no temp space is backed up, which makes the job faster. There is one drawback, however: according to Veritas, restores using this method take much longer than restores from regular backups.
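The effect of the small files shows up when the estimates from the backup job table earlier on this page are turned into throughput. The Python sketch below divides the 10/20/03 size of each full job by its estimated duration. The helper names `hours` and `gb_per_hour` are hypothetical, and the result is only as accurate as the estimates in the table.

```python
def hours(hhmm):
    """Convert an 'HH:MM' duration such as '24:14' to fractional hours."""
    h, m = hhmm.split(":")
    return int(h) + int(m) / 60


# (estimated GB, estimated time to complete) as of 10/20/03.
FULL_JOBS = {
    "Udrive A-J Full": (321.9, "21:08"),
    "Udrive K-Z Full": (318.7, "21:11"),
    "Profiles A-J Full": (154.1, "24:14"),
    "Profiles K-Z Full": (152.6, "21:22"),
}


def gb_per_hour(job):
    """Average throughput of one job in GB per hour."""
    size_gb, duration = FULL_JOBS[job]
    return size_gb / hours(duration)


for job in FULL_JOBS:
    print(f"{job:<18} {gb_per_hour(job):5.1f} GB/hour")
```

With these figures the Udrive jobs average roughly 15 GB per hour, while the Profiles jobs average between 6 and 7.5 GB per hour, less than half the rate for about half the data.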

 

Mirrors

To provide a quicker, more efficient restoration procedure for the udrive and the profiles, we have recently started using the large IBM Enterprise Storage Server (ESS) SAN. You can read about it here.

 

Clusters

Clustering is a large part of our systems. A cluster is two or more computers working together as one. This has many benefits, with high availability being the most important. With clustering we can update our servers without taking down the services they provide. Also, if one of the servers goes down, the service automatically fails over to the other server. The transition is seamless enough that the end user is usually not aware of it, and it gives us time to look at the problem.
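The failover behavior described above can be pictured with a toy model. The Python sketch below is purely illustrative and is not how the cluster software works internally: the `Cluster` class, its methods, and the node names NODE1 and NODE2 are all hypothetical. It models one node owning the services at a time, with ownership moving to a healthy partner when the active node goes down.

```python
class Cluster:
    """Toy two-node cluster: one node owns the services at a time."""

    def __init__(self, name, nodes):
        self.name = name
        self.healthy = {node: True for node in nodes}
        self.active = nodes[0]

    def mark_down(self, node):
        """Record a node failure; fail over if it was the active node."""
        self.healthy[node] = False
        if node == self.active:
            survivors = [n for n, ok in self.healthy.items() if ok]
            self.active = survivors[0] if survivors else None

    def mark_up(self, node):
        """Bring a node back; it takes over only if nothing is active."""
        self.healthy[node] = True
        if self.active is None:
            self.active = node

    def serve(self, request):
        """Handle a request on whichever node currently owns the service."""
        if self.active is None:
            raise RuntimeError(f"{self.name}: no healthy node")
        return f"{self.active} handled {request}"


cluster = Cluster("EXAMPLE", ["NODE1", "NODE2"])
print(cluster.serve("login"))   # NODE1 handled login
cluster.mark_down("NODE1")      # e.g. taken offline for an update
print(cluster.serve("login"))   # NODE2 handled login
cluster.mark_up("NODE1")        # NODE1 returns as the standby node
print(cluster.serve("login"))   # NODE2 handled login
```

The caller only ever talks to the cluster, never to a specific node, which is why an update or a failure on one machine does not interrupt the service.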

Because certain services are in demand 24 hours a day, we run a number of clusters:

Cluster Name  Machine Type             Services Provided                             Operating System
BOSS          (2) IBM xSeries 340      SQL, Exchange                                 W2KAS
HOPS          (2) Dell PowerEdge 4600  ProfilesUP 5, 6                               W2KAS
HAS           (2) IBM Netfinity 5600   Pals, ConManServer, UserReg, Web server, SQL  W2KAS
SUDS          (2) IBM xSeries 360      Udrive and ProfilesUP 3, 4 in WIN domain      W2KAS
THING         (2) Dell PowerEdge 2450  Tape Backup, CD Images                        W2003AS



© 2006, The Pennsylvania State University. All rights reserved.
This site maintained by the Classroom and Lab Computing group of Information Technology Services.

This page was last modified: 11/11/2003 9:49:31 AM.