The true cost of downtime: Businesses risk losing over $200 million annually in revenue, and customer trust
Did you know that a single downtime event can sink your stock price by up to 9 per cent, taking nearly three months to recover? Or that your top talent is being pulled away from innovation to patch holes and conduct post-mortems?
On a personal level, 39 per cent of tech professionals are concerned about being held liable for these incidents, while another 38 per cent worry about how they could affect their performance reviews. These are among the findings of Splunk’s 2024 Hidden Cost of Downtime Report.
“Downtime can happen from anywhere. Splunk’s report identifies two main causes of downtime – 56 per cent result from security incidents, such as phishing attacks, while 44 per cent are due to application or infrastructure failures,” says Simon Davies, senior vice president and general manager of Splunk in the Asia-Pacific (APAC) region.
“Business leaders may be surprised to learn that human error remains the leading cause in both cases — difficult to detect and even harder to remediate.”
As more people depend on digital solutions every day — from online shopping to remote work — the ripple effects of unplanned downtime become ever more pronounced, he adds.
Below Davies answers questions about why businesses and their leaders must take downtime seriously and the essential tools they can use to maintain resilience.
Q: Why does downtime feel more frequent in today’s digital world?
A: Downtime appears more common today because of the increasing complexity of modern information technology (IT) infrastructures. As organisations modernise to meet growing customer and user demands, their systems evolve, leading to more interdependencies and potential points of failure.
For example, IT tasks may be run from public cloud servers as well as self-run data centres that are configured differently. They often connect to third parties as well, say, when an online store needs to verify a customer’s credit card for payment. These multiple links are common today.
They also mean that a single breakdown can cascade into wider-scale disruption. Think of an error in a data centre causing nationwide disruption of a bank’s payment services. An issue caused by a vendor updating everyone’s cyber-security software could bring down entire shopping malls and airports.
And this is just for existing systems and features. To remain competitive, businesses are pushing out new features to their digital apps and services all the time, which could add to the complexity and potential for new issues.
The rapid pace of digital transformation has also left many organisations with siloed tools spread across their hybrid, multi-cloud environments. The surge in data volumes, combined with data decentralisation across tools and clouds, isolates operations and development teams, making it challenging for them to obtain the insights needed to keep critical, customer-facing applications secure and running smoothly.
Q: How can underestimating the impact of downtime lead to long-term consequences for businesses?
A: The most obvious and immediate impact of unplanned downtime is the disruption it causes businesses and their customers. If a critical service is affected, it means users can have a vital link to the digital economy cut off for a significant amount of time. This can severely affect customer lifetime value and their trust in a company. For some businesses, a lengthy disruption could also attract stiff regulatory penalties.
What is less obvious is the impact that downtime has on other key parts of a business. Below the surface, damage can be more severe and recovery, even harder.
According to Splunk’s study, revenue can take a huge hit. Downtime affects more than just one department or cost category. Our report shows that downtime costs Global 2000 companies US$400 billion (S$523 billion) annually, which works out to US$200 million per company per year and roughly 9 per cent of their profits. Companies can also anticipate a 1 to 9 per cent fall in their stock price following a single downtime event.
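The per-company figure follows from simple division, which a few lines of Python can confirm. This is only a back-of-the-envelope check using the numbers quoted above. The variable names are illustrative, and the implied average profit is a derived estimate based on the "roughly 9 per cent" share, not a figure from the report.

```python
# Back-of-the-envelope check of the figures quoted above.
TOTAL_ANNUAL_COST_USD = 400e9  # US$400 billion across the Global 2000
COMPANY_COUNT = 2000           # companies in the Global 2000
PROFIT_SHARE = 0.09            # downtime cost as a share of profits (roughly 9 per cent)

# Average annual downtime cost per company.
cost_per_company = TOTAL_ANNUAL_COST_USD / COMPANY_COUNT

# Derived estimate (an assumption, not a number from the report):
# if the per-company cost is about 9 per cent of profits, this is the
# average annual profit that share would imply.
implied_profit = cost_per_company / PROFIT_SHARE

print(f"Cost per company: US${cost_per_company / 1e6:,.0f} million")
print(f"Implied average annual profit: US${implied_profit / 1e9:,.1f} billion")
```

The first line printed matches the US$200 million quoted in the report. The second is only as precise as the rounded 9 per cent figure allows.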
Downtime also diminishes productivity. Teams must shift from high-value work, such as launching new digital products and experiences, to focus on resolving customer issues, applying software patches and participating in post-mortems.
The road to recovery is long as well. It takes 60 days to recover one’s brand health, 75 days to recover the lost revenue and 79 days to get back to the pre-event stock price. Painful, yes, and it hurts your competitiveness too.
Q: How do a business and its leaders strengthen their systems against unplanned disruptions?
A: The most important thing is to think long term. Yes, downtime seems temporary, but its implications stretch out over time.
First, organisations need to draft a downtime plan. Disruptions happen to every business, so it is key to have a playbook for outages, to ensure everyone involved in resolving a disruption knows what they have to do.
Second, make sure you perform post-mortems. This helps to identify the root cause of a disruption rather than rely on a temporary fix. Use observability tools to gain visibility and eliminate silos that diminish your insights.
Resilient companies stand out by adopting generative artificial intelligence (AI) solutions at an advanced level. Their leaders embrace individual AI tools five times faster than non-leaders and adopt AI features integrated into existing tools four times faster.
However, adopting generative AI is just one part of their strategy. These leaders also invest in additional infrastructure capacity, cyber insurance, backups, and advanced cyber-security and observability tools.
Q: Does that mean being a resilience leader simply requires having a larger budget?
A: Not necessarily. These leaders invest smarter, not just more extensively.
By prioritising data management and tool consolidation, they achieve better cost control and implement more innovative security and observability strategies. This approach translates into fewer but smarter investments that yield improved outcomes, such as comprehensive visibility and cross-collaboration, thereby enabling a proactive approach to managing downtime.
Q: How can one cut through the complexity of today’s IT systems?
A: A key aspect of being more resilient is having visibility of your infrastructure, which can become overly complex over time, hard to manage and more susceptible to disruption.
What is needed is end-to-end visibility across hybrid environments, for instance if you’re running some parts of your infrastructure on the cloud and others in your own data centre.
This is a task that has been significantly bolstered by the integration of generative AI and machine learning (ML) tools.
AI-driven automation enables companies to optimise resource allocation, redirecting valuable human capital towards strategic growth initiatives. For example, the use of ML within their threat detection approaches enables security teams to automate the filtering of false positives, leading to more efficient and accurate identification of suspicious activity patterns.
Q: How can organisations leverage AI to enhance their operations?
A: Users should not view AI as a fully independent agent, but rather as a capable teammate. By integrating the technology with human experiences, organisations can achieve faster detection, investigation and response while maintaining control over how AI is applied to their data.
By investing in a unified solution and emerging capabilities in line with the evolving cybersecurity landscape and organisational boundaries, companies can stay ahead of the curve and adapt to new technologies as they emerge.
Q: Would AI help to ward off disruptions or make things worse?
A: It’s crucial to recognise that, like any powerful tool, AI can be employed for both constructive and harmful purposes. Our latest research shows that 45 per cent of respondents believe that AI will be a net win for cyber attackers, and a staggering 77 per cent say it expands the attack surface to a concerning degree.
Yet, while concern about the evolving cyber threat persists, it’s also important to acknowledge AI’s capacity to enhance productivity and bolster companies’ competitiveness.
Generative AI features embedded into existing tools — such as chat assistants — can help address downtime. These domain-specific assistants can improve productivity and elevate employees’ skill sets, benefiting your organisation over time.
Organisations can also improve proactive and collaborative downtime prevention by investing in AI- and ML-driven solutions for pattern recognition.
These solutions can detect potential issues, from a potential cyberattack to a configuration problem in a server, before they become more serious.
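To make the idea of early detection concrete, here is a minimal sketch in Python. It does not show Splunk's method or any ML model. Instead it uses a trailing-window z-score, a deliberately simple statistical stand-in for pattern recognition, to flag a reading that departs sharply from recent behaviour. The function name `find_anomalies`, the window size, the threshold and the sample latency readings are all hypothetical, chosen only for illustration.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, threshold=3.0):
    """Return the indices of readings that deviate from the mean of the
    preceding `window` readings by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: skip to avoid dividing by zero.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds; the eighth reading spikes.
latency_ms = [102, 98, 101, 99, 100, 103, 97, 250, 101, 99]
print(find_anomalies(latency_ms))  # prints [7]
```

A production system would work on far richer signals and learned baselines, but the principle is the same: compare new behaviour with what is normal and raise a flag before the deviation turns into an outage.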
Q: How can business leaders determine the right AI solution for their company?
A: Business leaders should start by setting clear goals and defining specific objectives for AI implementation. What business problems are you trying to solve? What outcomes do you hope to achieve?
Next, establish organisational guardrails and policies. Develop a framework for responsible AI use within your company. This should cover ethical considerations, data privacy, security protocols, and compliance with relevant regulations.
Also recognise that AI is a tool to augment human capabilities, not replace them entirely. Design AI solutions that facilitate collaboration between humans and machines, allowing for human oversight and intervention when needed.
Splunk’s AI strategy focuses on domain-specific customisation, human-in-the-loop decision-making, and an open, extensible platform. This offers a valuable framework for businesses to emulate. We believe that this approach will benefit organisations as they survey the cyber landscape ahead.
Visit Splunk to find out how to strengthen your business’s resilience and prevent costly disruptions.