Human error has been a contributing factor in major datacentre outages over the past three years, and organisations need to guard against operator burn-out to lower the chances of downtime, according to a survey from Uptime Institute, a provider of IT standards and certifications.
► Burn-out thought to be a factor in many major outages
► More need for training, automation and testing to reduce risk
In a survey by the firm, around 40% of datacentre operators admitted that they had experienced a major outage in the past three years in which human error had played a significant role, with 50% stating that it happened because workers did not follow the correct procedures. In a blog post, the organisation said that while thorough training, regular practice in equipment testing, and work experience can all help to reduce errors – particularly in an emergency when a prompt reaction is crucial – the importance of fatigue is often under-appreciated.
With 24/7 service availability required for datacentres, best practices from other industries often don’t translate. The firm said that companies operating datacentres need to consider a number of key factors, including:
- Complacency and ownership. Shift structure should promote sharing of knowledge, break monotony of routines and help develop a sense of inclusion through rotating shifts. Shift silos, such as staff having a fixed schedule, with some only working at weekends or nights, may create unhealthy attitudes resulting from complacency or a lack of team cohesion.
- Meeting staff lifestyle preferences. Despite data suggesting that long shifts are detrimental to performance, it is difficult for some operators to cut back hours. Uptime Institute often sees a staff preference for 12-hour shifts over several days, for the benefits of both additional overtime pay and extended blocks of time off work.
- Relief shifts. Consensus in the industry is that extending shifts to more than 12 hours is ultimately worse for the business than sending employees home. For many operators, however, extending shifts beyond 12 hours is unavoidable as a means of meeting staffing requirements. In practice, identifying individuals who can handle these extended shift lengths is not easy. It is not just very long shifts that carry the risks associated with fatigue. Staff being unable to rest sufficiently because they are covering the shifts of absent colleagues is another source of potential exhaustion, even if these shifts are not particularly long.
Uptime noted that sourcing appropriate, qualified individuals for a relief shift can be challenging, with employees sometimes having to return to work without sufficient rest. The cumulative impact of working shifts of more than 10 hours increases the risk of fatigue and other health issues. A balance needs to be struck and forward planning is important, especially for growing businesses.
Uptime recommends:
- Avoiding shift lengths of more than 12 hours. Staffing levels and schedules should be defined to minimise the occurrences of abnormally long shifts.
- Identifying shifts that are not appropriate as relief shifts. A system should be established to ensure well-rested coverage. There should be monitoring of overtime and rest periods between shifts to avoid calling in exhausted staff.
- Considering individual employee preferences while remaining mindful that shift workers often ignore potential risks to their own job performance and health when requesting their preferred schedule.
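As an illustration only, the monitoring of shift lengths and rest periods described above could be approximated with a simple roster check. The sketch below is not an Uptime Institute tool: the `Shift` record, the `find_fatigue_risks` function and the 11-hour minimum rest period are assumptions made for the example, while the 12-hour shift limit comes from the recommendation above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_SHIFT = timedelta(hours=12)  # limit taken from the recommendation above
MIN_REST = timedelta(hours=11)   # assumed value for illustration, not from Uptime


@dataclass(frozen=True)
class Shift:
    """Hypothetical roster entry: who worked, and when."""
    operator: str
    start: datetime
    end: datetime


def find_fatigue_risks(shifts, max_shift=MAX_SHIFT, min_rest=MIN_REST):
    """Return (operator, reason) pairs for overlong shifts and short rest gaps."""
    by_operator = {}
    for shift in shifts:
        by_operator.setdefault(shift.operator, []).append(shift)

    risks = []
    for operator, worked in by_operator.items():
        worked.sort(key=lambda s: s.start)  # check each operator's shifts in time order
        for i, shift in enumerate(worked):
            if shift.end - shift.start > max_shift:
                risks.append((operator, "shift longer than limit"))
            if i > 0 and shift.start - worked[i - 1].end < min_rest:
                risks.append((operator, "insufficient rest before shift"))
    return risks


roster = [
    Shift("alice", datetime(2024, 1, 1, 6), datetime(2024, 1, 1, 18)),  # 12 hours
    Shift("alice", datetime(2024, 1, 2, 0), datetime(2024, 1, 2, 8)),   # 6 hours' rest
    Shift("bob", datetime(2024, 1, 1, 18), datetime(2024, 1, 2, 8)),    # 14 hours
]
for operator, reason in find_fatigue_risks(roster):
    print(operator, reason)
```

A check of this kind only flags candidates for review. Deciding who is fit to take a relief shift would still depend on the staffing and planning judgments the recommendations describe.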