Microsystems Pty Ltd
Established 1975
ACN 002 578 154

ABN 39 002 578 154

Unit 1/2 Parramatta Road, Granville NSW 2142 Australia

P. +61 2 9682 6111
F. +61 2 9682 3390
Freecall. 1800 634 054
 


Risk Management

Risk Management For Microsystems Data Backup

Microsystems adheres to the Australian/New Zealand Standard AS/NZS 4360:1999. This Standard provides a generic guide for establishing and implementing the risk management process, which involves the identification, analysis, evaluation, treatment and ongoing monitoring of risks.


Context


It is the standard policy of Microsystems to ensure that there are three copies of all data at any one time.

Data falls into two categories:

  • Microsystems Data

  • Client Data


Microsystems Data


Microsystems Data consists of all data related to the company itself. The two vital areas for Microsystems data backup are:

  • Project Data

  • Office Administration Data

Currently Microsystems data is backed up at the following locations:

  • Locally on site, on another server, in case of critical failure: two copies (one hot-swappable)

  • Off site at the Microsystems web server

Ensuring that three current copies always exist greatly reduces the risk of data loss should unexpected network failures occur.


Client Data


Client data consists of all images captured and all data relevant to those images. Each night this data is uploaded from the source machine simultaneously to our web server and to the local hot-swappable machine, hence the three copies. Any subsequent data changes made on the web server by clients are then replicated back to Microsystems servers.

This ensures that at all times there are three current copies of client data. By backing up this way we have two major advantages:

  • The backup is off-site, giving Microsystems added protection against a local disaster in either building (Microsystems premises and ISP premises) such as fire and theft.

  • The backup is on a RAID 5 Array, which ensures added protection, as data redundancy is present.

The process starts by making an FTP (File Transfer Protocol) connection to the web server. This is done by entering a specific address (which may be obtained upon written request to Microsystems) and logging in with the administrator password (or an alternative password with rights to write to the directory).

The administrator password (which is changed regularly) is known only to select people within Microsystems. No person outside of Microsystems is privy to this information.
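
The upload step described above could be sketched roughly as follows. This is only an illustration using Python's standard ftplib module, not the actual Microsystems procedure: the function names, the host, the account and the directory names are all hypothetical placeholders.

```python
from ftplib import FTP
from pathlib import Path


def upload_nightly(ftp, local_dir, remote_dir):
    """Upload every regular file in local_dir to remote_dir; return the names sent."""
    ftp.cwd(remote_dir)
    sent = []
    for path in sorted(Path(local_dir).iterdir()):
        if path.is_file():
            with path.open("rb") as fh:
                # STOR writes the file into the current remote directory.
                ftp.storbinary(f"STOR {path.name}", fh)
            sent.append(path.name)
    return sent


def connect_and_upload(host, user, password, local_dir, remote_dir):
    # host, user and password are placeholders; the real address is only
    # available on written request and the password is restricted.
    with FTP(host) as ftp:
        ftp.login(user=user, passwd=password)
        return upload_nightly(ftp, local_dir, remote_dir)
```

Keeping the transfer loop separate from the login means the loop can be exercised against a stand-in object without touching the real server.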


Microsystems - Backup Architecture

[Diagram: Microsystems backup architecture]

New data is loaded onto the Microsystems server and uploaded to the Microsystems server based at the ISP. The ISP server in turn updates the Microsystems server with the relevant changes, and that server is backed up to hard disk once a day and to CD-ROM once a week. As a result, in the unlikely event of a complete system failure, loss of client data is minimized.
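
As a small illustration of the daily and weekly cycle described above, the function below returns which backup targets are due on a given date. The function name and the choice of Sunday for the weekly CD-ROM backup are assumptions; the policy only states "once a week".

```python
from datetime import date


def backups_due(day, weekly_weekday=6):
    """Return the backup targets due on the given date.

    weekly_weekday uses date.weekday() numbering (Monday=0 ... Sunday=6).
    Sunday is an assumed default, not something the policy specifies.
    """
    targets = ["hard disk"]            # backed up once a day
    if day.weekday() == weekly_weekday:
        targets.append("CD-ROM")       # backed up once a week
    return targets
```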

By backing up this way, Microsystems have several major advantages:

  • The backup is off-site, giving Microsystems added protection against a local disaster in the building, such as fire or theft.

  • The backup is on a RAID 5 Array, giving added protection, as data redundancy is present.

  • Microsystems can switch servers within a reasonable time frame.

  • Security settings are carried over to each site.


Disaster Recovery

Microsystems has built in redundancy as part of its standard solutions architecture to protect and maximise the availability of data to its clients. In addition, Microsystems performs regular disaster recovery testing. As part of this redundancy approach, Microsystems employs:

  • Uninterruptible power supplies.

  • Redundancy with the ISP, communication links and servers.

  • RAID servers.

As part of its continuous desire to improve its service and data security, Microsystems plans to move its head office and data centre by mid-2002. The new site is being developed to house the primary client servers in a specially designed underground (and flood-proof) server bunker to provide additional physical security. To ensure data integrity between these systems, Microsystems validates the synchronization of data between servers via a weekly software process. In addition, both the primary Microsystems and ISP servers are hot-swappable RAID 5 and, where possible, Microsystems employs standard technologies and hardware for interoperability in the event of a disaster.
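
The policy does not say how the weekly synchronization check works. One common approach, sketched here purely as an assumption, is to compare checksums of the files held on two servers. The function names and the use of SHA-256 are illustrative choices, and both copies are assumed to be reachable as directory trees.

```python
import hashlib
from pathlib import Path


def checksums(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    result = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            result[path.relative_to(root).as_posix()] = digest
    return result


def compare_copies(primary, replica):
    """Report files missing from the replica, extra on it, or differing in content."""
    a, b = checksums(primary), checksums(replica)
    return {
        "missing": sorted(a.keys() - b.keys()),
        "extra": sorted(b.keys() - a.keys()),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

If all three lists come back empty, the two copies hold the same files with the same contents; anything else identifies exactly which files need to be re-synchronized.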


Identify Risks

What can happen?

Should an unexpected disaster occur, the major outcome could be serious data loss, affecting Microsystems' normal operations and affecting clients of Microsystems whose data is invaluable to their organisation.

How can it happen?

This can occur in various ways. A major fire, flood, theft or collapse of the building could see all data become irretrievable, hence the need to ensure that there is a current backup of data off-site.

Hardware failure is the most realistic disaster that could occur. A failure of one of Microsystems' file servers could result in complete loss of data, regardless of the type of failure (e.g. hard-drive crash, short circuit on motherboard). The on-site backups ensure that time to restore operations to normal is minimised.


Analyse Risks


What measures do we have to prevent occurrence?

As previously explained, Microsystems ensures that there are always three current backups of data. This prevents loss of data should a failure or disaster occur at any one time. Naturally, a disaster cannot be predicted or prevented, but implementing a disaster recovery plan minimises its impact on the company and clients.

What are the chances of this happening?

You can never predict when a disaster will occur. Although fire, theft and similar events are rare, they can and do happen. With three backups, Microsystems has ensured that the effect on clients is minimal.

Hardware failure is a different issue. Whilst the majority of failures can be prevented, wear and tear does occur. Consistently monitoring the current status of hardware within the organisation provides the opportunity to replace suspect hardware before a major failure occurs. Again, the three backups provide redundancy should an unexpected hardware failure occur.

What are the consequences?

Should a disaster/failure occur, not only is there the inconvenience to all clients but Microsystems could be liable for legal action from clients should the disruption affect their operations.

Other consequences that may affect Microsystems are a loss of business and a loss of confidence/trust by clients.

What is the level of Risk?

There are two issues to consider here: hardware and disaster prevention. From a hardware perspective, Microsystems is continually examining the latest technologies and implementing appropriate solutions, thus substantially reducing the risk of hardware failure.
For disaster prevention, the level of risk is harder to determine, as a disaster cannot be predicted.

However, measures have been taken to ensure that Microsystems minimises the impact of a disaster. These measures include ensuring that the data is kept behind a fireproof door, is not subject to flooding and meets all the necessary quality assurance requirements of ISO 9002:1994.


Risk Consequences

Procedure in event of failure

In the event of a hardware failure, the process is as follows:

  • Remove or replace the affected hardware and test whether the problem is resolved. If it is, normal operations should resume. In the case of a hard-drive crash, go directly to the next step.

  • Substitute failed hardware with temporary server.

  • Copy data to substitute server.

  • Test substitute server.

  • Ensure normal operations can be resumed.

  • Rectify affected hardware.

  • Restore repaired hardware and data and remove substitute server.


In the event of a disaster such as fire, the following process applies:

  • Organise replacement hardware/software.

  • Install required software.

  • Copy data to replacement hardware (server).

  • Test server.

  • Ensure normal operations can be resumed.


Identifying the problem

Obviously a disaster will destroy most if not all of the hardware and software held on-site at Microsystems. Thus, if this situation occurs, identifying the problem is irrelevant.

Identifying the problem in the case of a hardware failure is a lot more complex.

PC component failures fall into three main periods, chronologically:

  • Infancy: Many components fail very soon after they are put into service. How long this takes depends on the component; for example, processors sometimes fail as soon as they are first put into a system. Many other parts fail within a week or a month of being put into use. Failures within this period are caused by defects or poor design that make an item faulty from the start. These are called infant mortality failures and the failure rate in this period is relatively high.

  • Normal Operating Life: If a component does not fail within its infancy, it will generally tend to remain trouble-free over its operating lifetime. The failure rate during this period is typically quite low.

  • Wear out: After a component reaches a certain age, it enters the period where it begins to wear out, and failures start to increase. When this occurs is, of course, a matter of luck and also of how well you take care of your PC. For example, processors tend to last years longer if they are operated in a cool environment as opposed to a warm one. The period where failures start to increase is called the wear out phase of component life.

Using a process of elimination to determine the hardware fault follows a set order. This order is reflected in the table below:

Component          Infant Mortality Rate  Typical Time to Wear out (years)  Likelihood of Failure Before Wear out  Likelihood of Obsolescence Before Wear out
Power Supply       Low                    3-6                               Moderate                               Very Low
Motherboard        Moderate               4-7                               Low                                    High
Processor          Low                    7+                                Very Low                               Very High
System Memory      Moderate to High       7+                                Very Low                               High
Video Card         Low to Moderate        5-7                               Low                                    High
Monitor            Low to Moderate        5-7+                              Moderate to High                       Very Low
Hard Disk Drive    Moderate to High       3-5                               Moderate to High                       Moderate
Floppy Disk Drive  Low                    7+                                Low                                    Low
CD-ROM Drive       Moderate               3-5                               Moderate                               High
Modem              Low                    5-7+                              Low                                    High
Keyboard           Very Low               3-5                               Moderate                               Low
Mouse              Very Low               1-4                               Moderate to High                       Very Low

 

When the goal is to restore normal operations, the table allows certain components to be eliminated from the search. Components that do not affect normal operations are the mouse, keyboard, modem, CD-ROM drive, floppy disk drive, monitor and video card.
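
The elimination step can be expressed as a simple filter over the table. The sketch below is illustrative only: the variable and function names are hypothetical, the list follows the row order of the table above, and the excluded set is the one named in the preceding paragraph.

```python
# Components in the order they appear in the table above.
TABLE_ORDER = [
    "Power Supply", "Motherboard", "Processor", "System Memory",
    "Video Card", "Monitor", "Hard Disk Drive", "Floppy Disk Drive",
    "CD-ROM Drive", "Modem", "Keyboard", "Mouse",
]

# Components the text says do not affect normal operations.
NOT_REQUIRED = {
    "Mouse", "Keyboard", "Modem", "CD-ROM Drive",
    "Floppy Disk Drive", "Monitor", "Video Card",
}


def components_to_check(order=TABLE_ORDER, skip=NOT_REQUIRED):
    """Return the components worth checking, keeping the table order."""
    return [name for name in order if name not in skip]
```

With the defaults, the filter leaves five components: power supply, motherboard, processor, system memory and hard disk drive.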


Resolving the problem

Using the process of elimination as described above assists in resolving the problem.

The process of resolving a problem typically depends on the particular hardware involved. Listed below is a short description of what happens for each individual hardware component.

  • Power Supply: Some problems with power supplies can be repaired, but in practice they rarely are. The main reason is economics: power supplies are cheap, and they take only a few minutes to swap.

  • Motherboard: Motherboards are complicated multi-layer circuit boards and cannot usually be repaired. Some simple problems can be fixed by the manufacturer of the board; this usually means swapping some chip or other component on the board out in favour of a replacement, but this is not often done.

  • Processor: A failed processor cannot be repaired. It needs to be replaced. In the real world, an actual failure of a processor is extremely rare unless it is abused, typically by insufficient cooling over a long period of time.

  • System Memory: Memory chips cannot be repaired. Memory modules can be repaired by a company with the right equipment, by diagnosing which chip is flawed (assuming a failure of the memory and not the module circuit board) and replacing it with a good chip.

  • Hard Disk Drive: Hard disk problems have very few solutions that are available to anyone but the original manufacturer, or specialized data recovery firms.


The course of action taken depends on which component has failed. For future reference, an analysis of all failed components will be undertaken to see what preventative measures can be implemented.


What we do for the customer

In the event of failure/disaster, Microsystems will endeavour to advise all clients of the estimated downtime and will update clients on the operational status regularly until the problem is resolved.

If the failure/disaster relates to the web server, then client access to it will be affected. If the failure/disaster occurs on-site at Microsystems, clients will not be visibly affected but may experience delays in data updates.