Summary: Root cause analysis (RCA) is a systematic investigation method that identifies the fundamental underlying causes of workplace incidents, near misses, and safety failures rather than just addressing surface-level symptoms. The five-step RCA process — combined with techniques like the 5 Whys — enables safety teams to break the cycle of recurring incidents, implement lasting corrective actions, and transform safety data into meaningful risk reduction across the organization.
When workplace incidents recur despite repeated corrective actions, or when safety metrics plateau despite ongoing investment in training and controls, organizations are almost always treating symptoms rather than causes. The pattern is recognizable: a slip-and-fall prompts a housekeeping reminder, an equipment failure is fixed by replacing the part, a near miss generates a brief safety talk, and then the same event happens again three months later. Subpar safety performance is the predictable result of management systems that respond to surface-level causes without reaching the organizational and systemic failures that make incidents possible in the first place.
Root Cause Analysis (RCA) is the structured methodology that breaks this cycle — enabling EHS managers, safety directors, and operations leaders to identify and eliminate the fundamental causes of workplace incidents, quality failures, and compliance gaps rather than continuously managing their recurring symptoms.
In this comprehensive guide, we explore what Root Cause Analysis is, why it is essential for safety and compliance management, how to perform it effectively in five systematic steps, and how Certainty Software enables organizations to conduct and act on RCA findings at enterprise scale.
What is a Root Cause Analysis?
A Root Cause Analysis (RCA) is a structured, evidence-based methodology for identifying and eliminating the fundamental causes of a problem, incident, or failure rather than only addressing its immediate, visible symptoms. It involves systematically asking a series of investigative questions — using tools such as the 5 Whys, fishbone diagrams, fault tree analysis, and bow-tie analysis — to move from the surface-level event to the underlying process, system, or management failures that made that event possible. A root cause is the deepest identifiable causal factor that, if eliminated or permanently changed, would prevent the problem from recurring.
In workplace safety management, RCA is an OSHA-expected component of incident investigation and a requirement under ISO 45001:2018 Clause 10.2, which mandates that organizations investigate incidents and nonconformities, determine root causes, and implement corrective actions to prevent recurrence. Beyond regulatory compliance, RCA is equally applicable to quality escapes, process inefficiencies, customer complaints, and any other recurring organizational problem where the goal is permanent elimination rather than temporary suppression of symptoms.
Why Do Root Cause Analyses Matter?
Root Cause Analysis delivers measurable, strategic value across the full scope of organizational performance. Key benefits include:
Enhanced efficiency:
Organizations that address root causes rather than symptoms eliminate the recurring costs of rework, defects, repeat incidents, and the management time consumed by addressing the same problems repeatedly. Permanent corrective actions that resolve underlying process and system failures free operations teams to focus on value-creating activities rather than continuous firefighting. The cumulative efficiency gains from systematic RCA across an organization’s incident and quality management programs compound over time into measurable improvements in productivity, equipment reliability, and operational cost.
Risk mitigation:
From a safety and compliance perspective, RCA is one of the most powerful risk reduction tools available to EHS managers. By identifying the systemic conditions that allow incidents to occur (inadequate procedures, insufficient training, equipment design flaws, production pressure overriding safety controls), RCA enables organizations to implement controls that address risk at its source rather than at the point of consequence. Effective RCA programs are associated with lower Total Recordable Incident Rates (TRIR), reduced Days Away, Restricted, or Transferred (DART) rates, fewer repeat violations in OSHA inspections, and stronger performance in ISO 45001 certification and surveillance audits. OSHA’s Recommended Practices for Safety and Health Programs include incident investigation that identifies root causes as part of effective safety program management.
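For teams that want to track the rates mentioned above, the following is a minimal Python sketch of the standard OSHA incidence rate arithmetic: the number of cases multiplied by 200,000 (the hours worked by 100 full-time employees in a year), divided by the total hours actually worked. The function name and the sample figures are illustrative assumptions, not data from any real site.

```python
def incident_rate(case_count: int, hours_worked: float) -> float:
    """Cases per 100 full-time workers per year (200,000-hour base)."""
    if hours_worked <= 0:
        raise ValueError("hours_worked must be positive")
    return case_count * 200_000 / hours_worked


# Hypothetical site: 6 recordable cases, 4 of which involved days away,
# restriction, or transfer, over 480,000 total hours worked.
trir = incident_rate(6, 480_000)
dart = incident_rate(4, 480_000)
print(f"TRIR = {trir:.2f}, DART = {dart:.2f}")
```

Computing both rates with the same function keeps the base consistent, so a change in either figure reflects a change in cases or hours rather than a change in method.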
Overall business success:
Organizations that build RCA capability into their safety and quality management systems develop a durable competitive advantage: they get better over time at the systematic problems that constrain performance, rather than cycling repeatedly through the same failures. The discipline of evidence-based root cause identification — as opposed to assumption-based blame assignment — also builds the organizational trust and psychological safety that enable workers at all levels to report near misses, raise hazard concerns, and participate constructively in improvement initiatives. This continuous learning culture is the foundation of sustained safety performance improvement and is explicitly valued by the ISO 45001 framework through its requirements for worker participation, continual improvement, and management review.
5 Steps to Performing an Effective Root Cause Analysis
A rigorous Root Cause Analysis follows a structured five-step process that moves systematically from problem definition through to verified corrective action. Each step builds on the previous one. Skipping or abbreviating steps is a common cause of RCA failure and produces the shallow “corrective actions” that allow root causes to persist.
1. Define the Problem
A precisely defined problem statement is the non-negotiable foundation of effective Root Cause Analysis. Vague problem definitions — “we have a safety issue” or “quality is poor” — produce vague investigations that never locate the actual root cause. A well-formed problem statement is specific, measurable, time-bounded, and supported by objective data. To define the problem correctly, the investigation team must:
- State the problem in specific, quantifiable terms supported by documented evidence. For a safety incident: “On [date], at [location], a [type of event] resulted in [specific injury/damage] while [worker role] was performing [specific task].” For a quality issue: “Our customer satisfaction rating dropped 15% in Q3 2025, driven by a 40% increase in late deliveries from Facility B.” Precision at this step determines the accuracy of the entire investigation.
- Quantify the problem’s impact on workers, operations, regulatory standing, and business performance. For safety incidents, this includes injury severity, OSHA recordability classification, workers’ compensation costs, production disruption, and potential regulatory citation exposure. For quality or compliance issues, this includes financial loss, customer impact, and audit risk.
- Define measurable success criteria for the corrective action: what specific outcome — reduction in incident rate, elimination of the identified deficiency, verified process change — will confirm that the root cause has been effectively addressed? Setting this criterion before the investigation begins prevents the common failure mode of declaring success before root causes are actually eliminated.
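One way to enforce a precise problem statement is to capture it as a structured record and refuse to proceed while any field is blank. The sketch below is a minimal Python illustration of that idea. The `ProblemStatement` class, its field names, and the sample incident are all hypothetical and simply mirror the safety-incident template given above.

```python
from dataclasses import dataclass, fields


@dataclass
class ProblemStatement:
    """Hypothetical record mirroring the safety-incident template."""
    date: str
    location: str
    event_type: str
    outcome: str
    worker_role: str
    task: str
    success_criterion: str  # should be set before the investigation begins

    def missing_fields(self) -> list:
        # Names of any fields left blank, so vague statements are caught early.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        return (
            f"On {self.date}, at {self.location}, a {self.event_type} "
            f"resulted in {self.outcome} while {self.worker_role} "
            f"was performing {self.task}."
        )


# Illustrative values only; the success criterion has been left blank on purpose.
statement = ProblemStatement(
    date="2025-03-14",
    location="Warehouse 2, loading dock",
    event_type="slip-and-fall",
    outcome="a sprained wrist",
    worker_role="a forklift operator",
    task="a pre-shift walkaround",
    success_criterion="",
)
print(statement.render())
print("Missing:", statement.missing_fields())
```

Here the check flags the empty success criterion, which is the item most easily forgotten when an investigation starts under time pressure.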
2. Determine the Factors that Caused the Problem
The second step moves from the defined problem to a comprehensive mapping of the causal factors — the specific conditions, actions, and failures that produced or contributed to the event. This step requires systematic evidence gathering and structured analytical thinking. To identify causal factors effectively, the investigation team must:
- Gather comprehensive evidence from all available sources: physical inspection of the incident scene and equipment, photographs and video, maintenance and inspection records, training documentation, standard operating procedures, interviews with involved workers and witnesses, supervisory observations, and any relevant audit or inspection findings from preceding weeks or months. Evidence collection should occur as quickly as possible following the incident, before scene conditions change and memory degrades.
- Apply structured causal factor identification tools to organize and analyze the evidence: fishbone (Ishikawa) diagrams categorize contributing factors across the classic dimensions of People, Methods, Machines, Materials, Environment, and Management; fault tree analysis works backward from the event using Boolean logic to map all causal pathways; and timeline analysis reconstructs the sequence of events and decisions that preceded the incident. Use the tool or combination of tools best suited to the event type and complexity.
Looking for a quick way to start identifying the factors behind your most pressing problems? Check out our extensive free-to-download checklists to help get you started.
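To make the Boolean logic behind fault tree analysis concrete, here is a small Python sketch that evaluates a tree of AND and OR gates against a set of confirmed basic events. The `Gate` class, the `occurs` function, and the slip-and-fall tree are hypothetical examples for illustration, not a standard library or a prescribed tree layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", str]] = field(default_factory=list)


def occurs(node: Union[Gate, str], basic_events: Dict[str, bool]) -> bool:
    """Return True if the event at this node occurs given the basic events."""
    if isinstance(node, str):
        # Leaf: a basic event whose status comes from the evidence.
        return basic_events[node]
    results = [occurs(child, basic_events) for child in node.inputs]
    if node.kind == "AND":
        return all(results)
    if node.kind == "OR":
        return any(results)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree: a slip requires liquid on the floor AND no warning sign;
# liquid on the floor results from a pipe leak OR an uncleaned spill.
liquid = Gate("liquid on floor", "OR", ["pipe_leak", "uncleaned_spill"])
top = Gate("worker slips", "AND", [liquid, "no_warning_sign"])

evidence = {"pipe_leak": False, "uncleaned_spill": True, "no_warning_sign": True}
print(occurs(top, evidence))
```

Flipping one basic event at a time in `evidence` shows which conditions were necessary for the top event, which helps separate confirmed causal pathways from ones the evidence rules out.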
3. Identify Root Causes
Identifying the true root cause — as opposed to the proximate or immediate cause — is the most analytically demanding step in the RCA process. It requires the investigation team to move beyond the first plausible explanation and continue asking “Why?” until they reach a systemic or organizational failure that, if corrected, would prevent recurrence. To identify root causes accurately, the team must:
- Test each identified causal factor against available data to determine whether it is a confirmed contributor or a plausible-but-unverified hypothesis. Remove speculative or irrelevant factors from the analysis to keep the investigation focused on factors that evidence actually supports. The discipline of staying evidence-based — rather than settling on the most convenient explanation — is what distinguishes effective RCA from surface-level incident investigation.
- Apply the 5 Whys technique (described in detail below) to each confirmed contributing factor, asking “Why did this happen?” repeatedly until the answer points to a systemic failure rather than an individual action or equipment failure. True root causes in workplace safety incidents are almost always found at the management system level: inadequate or absent procedures, training that was never verified for effectiveness, inspection programs that missed the developing hazard, production pressure that normalized bypassing safety controls, or safety management systems that lacked the specificity to detect the preconditions that enabled the event.
4. Decide the Corrective Actions
Once root causes are confirmed through evidence-based analysis, the investigation team develops corrective actions that directly address those root causes — not just the surface-level event symptoms. Corrective action selection must account for feasibility, effectiveness, permanence, and the hierarchy of controls: elimination and engineering controls that remove or engineer out the hazard are preferred over administrative controls and PPE that require sustained human behavior to be effective.
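As a rough illustration of applying the hierarchy of controls, the Python sketch below ranks candidate corrective actions so that the most reliable control types come first. The ordering follows the commonly published hierarchy, which also includes substitution between elimination and engineering controls. The enum name and the candidate actions are hypothetical.

```python
from enum import IntEnum


class Control(IntEnum):
    # Lower value means higher preference in the hierarchy of controls.
    ELIMINATION = 1
    SUBSTITUTION = 2
    ENGINEERING = 3
    ADMINISTRATIVE = 4
    PPE = 5


# Hypothetical candidate actions for a wet-floor slip incident.
candidates = [
    ("Issue slip-resistant footwear", Control.PPE),
    ("Repair the leaking pipe joint", Control.ELIMINATION),
    ("Post wet-floor signage", Control.ADMINISTRATIVE),
    ("Install a floor drain at the dock", Control.ENGINEERING),
]

# sorted() is stable, so actions of the same type keep their original order.
ranked = sorted(candidates, key=lambda item: item[1])
for description, control in ranked:
    print(f"{control.name:<15} {description}")
```

A ranking like this is only a starting point: feasibility, cost, and permanence still have to be weighed for each action.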
Key questions to guide effective corrective action selection include:
- What corrective actions will directly eliminate or permanently reduce the root cause — not just respond to the immediate incident?
- What are the expected benefits of each proposed action, and what are the potential unintended consequences or new risks introduced?
- How will each corrective action be implemented — what specific steps, sequence, and resources are required?
- Who is the accountable owner for each action, with specific authority and resources to complete it by the defined deadline?
- What tools, budget, or cross-functional support will be required to implement the corrective action effectively?
- How will the effectiveness of each corrective action be verified — what specific metric, observation, or audit finding will confirm that the root cause has been addressed and recurrence prevented?
The output of this step is a formal corrective action plan: a documented list of specific actions, assigned owners, completion deadlines, and verification criteria, communicated to all relevant stakeholders and tracked through completion. Organizations that manage corrective actions in digital safety management platforms like Certainty can achieve higher closure rates and faster resolution times than those relying on email threads or manual tracking spreadsheets, because automated assignment, escalation, and reporting create accountability at every step of the process.
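The elements of a corrective action plan described above (action, owner, deadline, verification criterion) map naturally onto a simple record. The following Python sketch is a generic illustration under that assumption. It is not Certainty's data model or API, and every name and date in it is hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CorrectiveAction:
    description: str
    owner: str
    due: date
    verification: str
    closed_on: Optional[date] = None

    def is_overdue(self, today: date) -> bool:
        # An action is overdue only if it is still open past its deadline.
        return self.closed_on is None and today > self.due


def closure_rate(actions: List[CorrectiveAction]) -> float:
    """Fraction of actions that have been closed (0.0 for an empty plan)."""
    if not actions:
        return 0.0
    return sum(a.closed_on is not None for a in actions) / len(actions)


plan = [
    CorrectiveAction("Repair the leaking pipe joint", "Maintenance lead",
                     date(2025, 4, 1), "No leak found at weekly inspection",
                     closed_on=date(2025, 3, 28)),
    CorrectiveAction("Create a dock inspection schedule", "EHS manager",
                     date(2025, 4, 15), "Schedule audited monthly"),
]

today = date(2025, 4, 20)
overdue = [a.description for a in plan if a.is_overdue(today)]
print("Overdue:", overdue)
print("Closure rate:", closure_rate(plan))
```

Passing `today` in as an argument, rather than reading the system clock inside the method, keeps the overdue check easy to test and to rerun for past reporting dates.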
5. Review and Evaluate
The final step of the RCA process — and the one most frequently neglected — is verifying that implemented corrective actions have actually produced the intended outcome and that the identified root cause no longer poses the risk of recurrence. This verification step is what closes the loop between incident investigation and genuine safety improvement. Without it, organizations may implement technically correct corrective actions that fail in practice due to implementation gaps, worker non-adoption, or unanticipated barriers. Key evaluation questions include:
- Have the implemented corrective actions demonstrably addressed the confirmed root cause — is there objective evidence that the systemic failure has been corrected?
- What measurable outcomes have been produced by the corrective actions since implementation — changes in incident rates, inspection findings, near-miss report frequency, or audit results in the affected area?
- Did the corrective actions achieve the success criteria defined in Step 1, within the committed timeframe?
- Were there any unintended consequences from the corrective actions — new hazards introduced, workflow disruptions, or compliance implications?
- What lessons from this RCA can be applied proactively to other similar processes, equipment, or locations where the same root cause conditions might exist?
The goal of this step is to confirm effectiveness through measured outcomes, not implementation activity, and to document the findings for organizational learning. Sharing RCA outcomes, including both the root cause findings and the verified corrective actions, with relevant stakeholders across the organization enables systematic learning that extends safety improvements beyond the specific site or work group involved in the original incident. This continuous improvement feedback loop is required by ISO 45001 Clause 10.3 and recommended by OSHA’s Recommended Practices for Safety and Health Programs.
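Checking outcomes against the success criterion set in Step 1 can be as simple as comparing a before-and-after rate. The Python sketch below assumes the criterion was expressed as a minimum relative reduction in an incident rate. The function names, the 30% target, and the sample rates are illustrative assumptions.

```python
def relative_reduction(baseline_rate: float, observed_rate: float) -> float:
    """Fractional drop from the baseline rate (negative if the rate rose)."""
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    return (baseline_rate - observed_rate) / baseline_rate


def criterion_met(baseline_rate: float, observed_rate: float,
                  target_reduction: float) -> bool:
    """True when the observed drop is at least the target set in Step 1."""
    return relative_reduction(baseline_rate, observed_rate) >= target_reduction


# Hypothetical figures: incident rate of 4.0 before the corrective actions,
# 2.5 afterwards, with a success criterion of at least a 30% reduction.
print(relative_reduction(4.0, 2.5))
print(criterion_met(4.0, 2.5, target_reduction=0.30))
```

A check like this measures the outcome itself, so a corrective action that was fully implemented but did not move the rate would still be reported as not meeting its criterion.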
What are the 5 Whys of Root Cause Analysis?
The 5 Whys is one of the most widely used root cause identification techniques in workplace safety and quality management, valued for its simplicity, accessibility, and effectiveness in guiding investigators past surface-level causes to systemic failures. Attributed to Sakichi Toyoda, founder of Toyota Industries, and later adopted as part of the Toyota Production System, the technique is built on the principle that most problems have a chain of causes and effects, and that repeatedly asking “Why?” will trace that chain from the observable symptom to the underlying systemic failure that enabled it. The technique is most powerful when combined with physical evidence and direct observation, rather than applied as a purely theoretical exercise in a conference room. To apply it:
- Write down the specific, well-defined problem statement developed in Step 1 of the RCA process — the observable event with its specific context, time, and measurable impact.
- Ask “Why did this happen?” and record the answer, supported by evidence rather than assumption. The first answer is typically the immediate or proximate cause — the direct physical or behavioral event that preceded the problem.
- If the answer does not yet point to a systemic or organizational failure — if it is still describing what happened rather than why the system allowed it to happen — ask “Why?” again about that answer, and record the new response with supporting evidence.
- Continue iterating through “Why?” answers until the response identifies a failed or missing process, procedure, training element, management system control, or organizational decision — a root cause that, if corrected, would prevent the problem from recurring rather than merely responding to its latest occurrence.
- Develop specific, measurable corrective actions that directly address the identified root cause and assign them to accountable owners with defined deadlines and verification criteria.
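The iteration described in these steps can be sketched in a few lines of Python: record each answer with its evidence, and stop at the first answer that describes a systemic failure. The `Why` record, the `find_root_cause` function, and the wet-floor chain are hypothetical illustrations, and deciding whether an answer is "systemic" remains the investigator's judgment rather than something the code can determine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Why:
    answer: str
    evidence: str
    systemic: bool  # investigator's judgment: does this describe a system failure?


def find_root_cause(chain: List[Why]) -> Optional[Why]:
    """Return the first systemic answer, or None if the chain must go deeper."""
    for step in chain:
        if not step.evidence.strip():
            # Every answer must be supported by evidence, not assumption.
            raise ValueError(f"no evidence recorded for: {step.answer}")
        if step.systemic:
            return step
    return None


# Hypothetical chain for the problem "a worker slipped on a wet floor".
chain = [
    Why("Liquid had pooled on the walkway", "scene photographs", False),
    Why("The spill was not cleaned up", "witness interviews", False),
    Why("Nobody inspected the area that shift", "shift log", False),
    Why("No inspection and cleaning schedule exists for the area",
        "procedure review", True),
]

root = find_root_cause(chain)
print(root.answer if root else "Keep asking why")
```

Note that the chain here has four answers, not five: the number in the technique's name is a guideline, and the stopping condition is reaching a systemic failure.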

Image source: Toyota Industries
Improving Risk Management
Conducting a rigorous Root Cause Analysis is a data-intensive, multi-step process that places significant demands on investigators’ time, analytical tools, and organizational discipline. For enterprise-level organizations managing complex operations across multiple sites, the challenge is compounded: incidents occur simultaneously across different facilities, causal data is distributed across paper forms and disconnected systems, corrective actions are tracked inconsistently, and there is no unified view of whether root cause trends are improving or worsening across the enterprise.
Certainty Software addresses these challenges directly. The platform enables organizations to conduct standardized incident investigations from any device — including mobile, online and offline — capturing the structured evidence needed for rigorous RCA at the point of occurrence, before scene conditions change. Certainty’s configurable forms and workflows let EHS managers design investigation processes aligned with their specific RCA methodology — whether 5 Whys, fishbone, fault tree, or a hybrid approach — ensuring that every investigation follows a consistent, high-quality process regardless of which site or which investigator conducts it. Real-time analytics aggregate investigation data across the enterprise, surfacing root cause patterns — recurring equipment failures, systemic training gaps, procedural compliance weaknesses — that would remain invisible in siloed, site-level systems.
With Certainty, your organization can enhance the RCA and corrective action management process by:
- Reducing human errors and data inconsistencies in investigation documentation through standardized digital forms with required fields and logic branching
- Increasing the accuracy and completeness of incident investigation data by enabling mobile capture at the scene with photo and video attachment
- Simplifying causal factor analysis with structured investigation workflows that guide investigators through each RCA step systematically
- Accelerating evidence-based corrective action decision-making through real-time data access and enterprise-level trend analysis
- Automating corrective action assignment, escalation, and closure tracking — with automatic notifications to responsible owners and management when deadlines approach or are missed
- Improving accountability and demonstrating continuous improvement through audit-ready RCA and corrective action records that satisfy OSHA investigation requirements and ISO 45001 Clause 10.2 documentation obligations
To see these capabilities firsthand, schedule a demo today and discover how Certainty can transform your organization’s Root Cause Analysis and corrective action management.
Frequently Asked Questions (FAQs)
What is the difference between a root cause and a contributing cause?
A contributing cause is a factor that influenced or enabled an incident but, by itself, would not necessarily have caused the problem to occur or recur. A root cause is the deepest systemic or organizational failure that, if permanently eliminated, would prevent the problem from happening again. In practice: a wet floor (immediate cause) may have caused a slip — but the root cause was an absent or unenforced inspection and cleaning schedule that allowed liquids to accumulate undetected. Addressing only the immediate cause (cleaning up the spill) leaves the root cause (the absent inspection process) intact, ensuring recurrence. Effective RCA always aims to identify and address root causes, not just immediate or contributing causes.
Does OSHA require root cause analysis after workplace incidents?
OSHA does not prescribe a specific RCA methodology, but its Recommended Practices for Safety and Health Programs advise employers to investigate incidents and near misses to identify root causes, not just immediate causes. For recordable injuries and illnesses, OSHA’s Form 301 (Injury and Illness Incident Report) asks for the event that directly produced the injury and the objects or substances involved, which is the starting point for RCA, not the conclusion. For serious incidents (fatalities, hospitalizations, amputations), OSHA’s inspection procedures assess the adequacy of the employer’s incident investigation and whether corrective actions address underlying causes. Organizations subject to OSHA’s Process Safety Management standard (29 CFR 1910.119) are required to conduct formal incident investigations that identify the chain of events and contributing factors using documented procedures.
How does ISO 45001 require root cause analysis?
ISO 45001:2018 Clause 10.2 (Incident, nonconformity and corrective action) requires that organizations investigate incidents and nonconformities by: reviewing the incident, determining contributing factors and root causes, and evaluating the need for corrective actions to prevent recurrence. Clause 10.2 further requires that corrective actions be appropriate to the effects of the incidents or nonconformities encountered — meaning superficial responses to serious incidents will be identified as deficiencies during certification audits. Documented evidence of root cause investigation and corrective action effectiveness review is required to demonstrate conformance with Clause 10.2 during ISO 45001 certification and surveillance audits.
What tools are used in a Root Cause Analysis?
The most commonly used RCA tools in workplace safety management include: the 5 Whys technique (iterative causal questioning); fishbone / Ishikawa diagrams (categorized causal mapping across People, Methods, Machines, Materials, Environment, and Management dimensions); fault tree analysis (Boolean logic-based causal pathway mapping for complex technical incidents); bow-tie analysis (mapping both the pathways to an event and the consequences, with controls on each side); and timeline/sequence of events analysis (reconstructing the precise chronological sequence of decisions and actions preceding the incident). The appropriate tool depends on incident complexity, investigator skill, and organizational methodology. For most workplace safety incidents, the 5 Whys — applied rigorously with physical evidence — is sufficient to reach the systemic root cause.



