Tuesday, June 23, 2026

How to Read an Oracle AWR Report: A Step-by-Step Guide for DBAs

If you've been doing Oracle performance tuning for any length of time, you know that knowing how to read an Oracle AWR report is one of the most important skills in your toolkit. The Automatic Workload Repository (AWR) report is a snapshot-based diagnostic report that captures database statistics at regular intervals — and it contains just about everything you need to identify what's hurting your database. In this post, I'm going to walk through an AWR report from top to bottom the way I do it after 20 years of performance work, so you know exactly what to look at and in what order.

What Is the Oracle AWR Report?

The Oracle Automatic Workload Repository (AWR) collects, processes, and maintains performance statistics for problem detection and self-tuning purposes. It was introduced in Oracle 10g and requires Enterprise Edition with the Oracle Diagnostics Pack license. In multitenant environments (12.2 and later, including 19c), AWR data can be captured at both the CDB and PDB level, but the structure of a single-instance AWR report has been stable for years.

AWR snapshots are taken by default every 60 minutes and retained for 8 days. You can adjust this:

-- Check current AWR settings
SELECT snap_interval, retention
FROM   dba_hist_wr_control;

-- Change snapshot interval to 30 minutes, retention to 14 days
BEGIN
  DBMS_WORKLOAD_REPOSITORY.modify_snapshot_settings(
    retention => 20160,   -- 14 days in minutes
    interval  => 30       -- 30 minutes
  );
END;
/

Generating the AWR Report

Before you can analyze anything, you need to generate the report. The quickest way on the command line:

-- Find snapshot IDs for your window of interest
SELECT snap_id, begin_interval_time, end_interval_time
FROM   dba_hist_snapshot
ORDER BY snap_id DESC
FETCH FIRST 20 ROWS ONLY;

-- Run the interactive AWR report script from SQL*Plus for the instance
-- you are connected to; enter e.g. 1100 and 1105 when asked for snapshot IDs
@$ORACLE_HOME/rdbms/admin/awrrpt.sql

You'll be prompted for report type (text or HTML — always choose HTML for readability), the number of days to look back, and the begin/end snapshot IDs. For programmatic generation:

SELECT output
FROM   TABLE(
         DBMS_WORKLOAD_REPOSITORY.awr_report_html(
           l_dbid       => (SELECT dbid FROM v$database),
           l_inst_num   => 1,
           l_bid        => 1100,
           l_eid        => 1105
         )
       );
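
To keep the programmatic output as a file you can open in a browser, one option is to spool it from SQL*Plus. This is a minimal sketch: the file name is made up for illustration, the snapshot IDs are the same example values as above, and the LINESIZE is simply set wide enough that the HTML lines are not wrapped.

```sql
-- SQL*Plus settings so the HTML is written without headings or wrapping
SET PAGESIZE 0 LINESIZE 8000 TRIMSPOOL ON HEADING OFF FEEDBACK OFF VERIFY OFF

-- Hypothetical output file name; pick whatever fits your naming scheme
SPOOL awr_1100_1105.html

SELECT output
FROM   TABLE(
         DBMS_WORKLOAD_REPOSITORY.awr_report_html(
           l_dbid     => (SELECT dbid FROM v$database),
           l_inst_num => 1,
           l_bid      => 1100,
           l_eid      => 1105
         )
       );

SPOOL OFF
```

Run it as a script (for example with @) rather than pasting it line by line, so the SET options apply to the whole run.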

The AWR Report Header: Establishing Baseline Context

The first thing I look at on every Oracle AWR report analysis is the header. It tells you the scope and health of the snapshot window before you dig into anything else. Key fields:

  • DB Time: Total time all sessions spent doing database work during the interval. This is the denominator for everything that follows.
  • Elapsed Time: Wall-clock duration of the snapshot window.
  • DB CPU: Time spent on CPU (as opposed to waiting).
  • Avg Active Sessions (AAS): DB Time / Elapsed Time in seconds. If this number is consistently above your CPU count, you have a congestion problem.

Example header values that would concern me:

Snap Id      Snap Time         Sessions  Curs/Sess
---------    ---------------   --------  ---------
Begin: 1100  20-Jun-26 10:00:00   320      12.5
End:   1105  20-Jun-26 11:00:00   318      13.1

Elapsed:           60.05 (mins)
DB Time:          847.31 (mins)
DB CPU:           112.44 (mins)
Redo size:    4,521,034,240  bytes

Here, DB Time is 847 minutes over a 60-minute window, which works out to roughly 14 average active sessions (847/60). DB CPU is only 112 of those 847 minutes (about 13%), so in this example the sessions are mostly waiting rather than running on CPU, and the wait events are where the answer will be. Had those 14 sessions been CPU-bound on a 16-CPU server, the box would be nearly saturated. This ratio is the very first diagnostic signal you should internalize.
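
If you want the same AAS figure without opening a report, it can be derived from the AWR history views. The query below is a sketch under a few assumptions: a single instance (instance_number = 1), the DBID of the database you are connected to, the example snapshot IDs 1100 and 1105, and no instance restart inside the window.

```sql
-- DB time is cumulative (microseconds), so AAS = delta DB time / elapsed time
SELECT ROUND((e.value - b.value) / 1e6 / 60, 2)             AS db_time_mins,
       ROUND((e.value - b.value) / 1e6 / w.elapsed_secs, 2) AS avg_active_sessions
FROM   dba_hist_sys_time_model b
JOIN   dba_hist_sys_time_model e
       ON  e.dbid            = b.dbid
       AND e.instance_number = b.instance_number
       AND e.stat_name       = b.stat_name
CROSS  JOIN (
         -- Wall-clock length of the window, in seconds
         SELECT (CAST(MAX(end_interval_time) AS DATE)
               - CAST(MIN(end_interval_time) AS DATE)) * 86400 AS elapsed_secs
         FROM   dba_hist_snapshot
         WHERE  snap_id IN (1100, 1105)
         AND    instance_number = 1
         AND    dbid = (SELECT dbid FROM v$database)
       ) w
WHERE  b.stat_name       = 'DB time'
AND    b.snap_id         = 1100
AND    e.snap_id         = 1105
AND    b.instance_number = 1
AND    b.dbid            = (SELECT dbid FROM v$database);
```

Swapping 'DB time' for 'DB CPU' in the same query gives the CPU portion, which is how you tell a CPU-bound window from a wait-bound one.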

Load Profile: The Big Picture in Numbers

The Load Profile section gives you per-second and per-transaction rates for key workload metrics. I look at this to understand the type of workload:

  • Logical reads/sec (consistent gets + db block gets): High values with low physical reads → good buffer cache hit ratio. Extremely high values overall → possibly inefficient SQL doing too many block visits.
  • Physical reads/sec: Reads from disk. A spike here is worth investigating in Top SQL.
  • Parses/sec vs. Hard parses/sec: Hard parses are expensive. If hard parses are > 5-10% of total parses, you have a cursor-sharing or bind variable problem.
  • Executes/sec: The throughput rate.
  • Redo size/sec: Proxy for write-heaviness of the workload.

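A red flag I've seen many times: a system where parses/sec ≈ executes/sec. That almost always means the application parses a statement every time it runs it: cursors are not being cached or kept open, or the application is reconnecting constantly. If hard parses/sec is also close to parses/sec, missing bind variables are the usual cause. Every avoidable parse is wasted CPU that could be serving real work.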
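
The parse ratios described above can also be checked straight from AWR history. This is a rough sketch rather than the exact arithmetic the report uses; it assumes instance 1, the current DBID, the example snapshots 1100 and 1105, and no restart in between (the counters are cumulative since startup).

```sql
SELECT parses,
       hard_parses,
       executes,
       ROUND(100 * hard_parses / NULLIF(parses, 0), 1) AS hard_parse_pct,
       ROUND(parses / NULLIF(executes, 0), 2)          AS parses_per_execute
FROM  (
        -- End value minus begin value gives the activity inside the window
        SELECT SUM(CASE WHEN e.stat_name = 'parse count (total)'
                        THEN e.value - b.value END) AS parses,
               SUM(CASE WHEN e.stat_name = 'parse count (hard)'
                        THEN e.value - b.value END) AS hard_parses,
               SUM(CASE WHEN e.stat_name = 'execute count'
                        THEN e.value - b.value END) AS executes
        FROM   dba_hist_sysstat b
        JOIN   dba_hist_sysstat e
               ON  e.dbid            = b.dbid
               AND e.instance_number = b.instance_number
               AND e.stat_name       = b.stat_name
        WHERE  b.snap_id         = 1100
        AND    e.snap_id         = 1105
        AND    b.instance_number = 1
        AND    b.dbid            = (SELECT dbid FROM v$database)
        AND    b.stat_name IN ('parse count (total)',
                               'parse count (hard)',
                               'execute count')
      );
```

A parses_per_execute value close to 1 is the red flag from the paragraph above; a hard_parse_pct above the 5-10% range points at bind variables or cursor sharing.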

Instance Efficiency Percentages

This section is a quick sanity check. You want most of these close to 100%:

  • Buffer Hit %: Should be >99% for OLTP. Below 95% warrants buffer cache investigation.
  • Execute to Parse %: Should be high (80%+). Low values → application re-parsing too often.
  • Library Hit %: Shared pool efficiency. Low values mean you're wasting time re-loading cursors.
  • Parse CPU to Parse Elapsd %: If this is low, parses are spending more time waiting than on CPU — shared pool latch contention or hard parse overload.
  • Redo NoWait %: Should be near 100%. Low values mean redo log groups are too small or I/O is too slow for log writes.
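
Two of these percentages are easy to reproduce from the underlying counters, which helps when you want to trend them across many snapshots. The formulas below are the commonly documented ones (parse CPU over parse elapsed, and one minus redo log space requests over redo entries); treat the query as a sketch that assumes instance 1, the current DBID, and the example snapshots 1100 and 1105.

```sql
SELECT ROUND(100 * parse_cpu / NULLIF(parse_ela, 0), 2)              AS parse_cpu_to_elapsd_pct,
       ROUND(100 * (1 - redo_space_req / NULLIF(redo_entries, 0)), 2) AS redo_nowait_pct
FROM  (
        -- Counter deltas between the begin and end snapshots
        SELECT SUM(CASE WHEN e.stat_name = 'parse time cpu'
                        THEN e.value - b.value END) AS parse_cpu,
               SUM(CASE WHEN e.stat_name = 'parse time elapsed'
                        THEN e.value - b.value END) AS parse_ela,
               SUM(CASE WHEN e.stat_name = 'redo log space requests'
                        THEN e.value - b.value END) AS redo_space_req,
               SUM(CASE WHEN e.stat_name = 'redo entries'
                        THEN e.value - b.value END) AS redo_entries
        FROM   dba_hist_sysstat b
        JOIN   dba_hist_sysstat e
               ON  e.dbid            = b.dbid
               AND e.instance_number = b.instance_number
               AND e.stat_name       = b.stat_name
        WHERE  b.snap_id         = 1100
        AND    e.snap_id         = 1105
        AND    b.instance_number = 1
        AND    b.dbid            = (SELECT dbid FROM v$database)
        AND    b.stat_name IN ('parse time cpu', 'parse time elapsed',
                               'redo log space requests', 'redo entries')
      );
```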

Top 10 Foreground Wait Events: The Real Story

This is where I spend most of my time on an AWR report analysis. The Top Wait Events section tells you what Oracle sessions spent the most time waiting for. Do not conflate "foreground" (session waits) with "background" waits — we focus on foreground for user impact.

Common wait events and what they mean:

  • db file sequential read: Single-block I/O reads — usually index lookups. High values mean your indexes are doing a lot of work (which can be fine) or you have I/O subsystem latency. Check avg wait time. Under 1ms is excellent; above 5ms is a concern on SSD storage.
  • db file scattered read: Multi-block I/O — full table or index fast full scans. High values → full scans, possibly missing indexes.
  • log file sync: Sessions waiting for their redo to be flushed to disk on COMMIT. Consistently high → redo log groups are on slow storage, or the application is committing too frequently.
  • enq: TX - row lock contention: Sessions blocked waiting for row locks. Application-level locking issue — commits are not happening often enough, or there's a hot row.
  • library cache: mutex X: Contention in the shared pool — often too many hard parses or too many versions of the same cursor (literally thousands of child cursors due to bind-sensitive plans).
  • latch: cache buffers chains: Hot blocks in the buffer cache — a block being read or modified constantly by many sessions. Usually points to a hot index block (right-hand side of a monotonically increasing index) or a segment header.
  • CPU time (listed as DB CPU in 11g and later reports): Not really a wait; this is time spent on CPU. If it sits at the top of this list, you are CPU-bound. Check Top SQL for CPU-hungry statements.

Here's the mental model I use: calculate each wait event's percentage of total DB Time. If one event accounts for more than 20-25% of DB Time, it is the thing to fix first. Everything else is noise until that one is resolved.
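
The percentage-of-DB-Time calculation can be approximated from the history views when you want it for a window without generating the report. This is a sketch, not the report's exact logic: it assumes instance 1, the current DBID, the example snapshots 1100 and 1105, and it uses the foreground columns of dba_hist_system_event with idle events filtered out.

```sql
WITH ev AS (
  -- Foreground wait time per event inside the window, in seconds
  SELECT e.event_name,
         e.wait_class,
         (e.time_waited_micro_fg - NVL(b.time_waited_micro_fg, 0)) / 1e6 AS wait_secs
  FROM   dba_hist_system_event e
  LEFT   JOIN dba_hist_system_event b
         ON  b.dbid            = e.dbid
         AND b.instance_number = e.instance_number
         AND b.event_id        = e.event_id
         AND b.snap_id         = 1100
  WHERE  e.snap_id         = 1105
  AND    e.instance_number = 1
  AND    e.dbid            = (SELECT dbid FROM v$database)
  AND    e.wait_class     <> 'Idle'
),
dbt AS (
  -- Total DB time inside the same window, in seconds
  SELECT (e.value - b.value) / 1e6 AS db_time_secs
  FROM   dba_hist_sys_time_model b
  JOIN   dba_hist_sys_time_model e
         ON  e.dbid            = b.dbid
         AND e.instance_number = b.instance_number
         AND e.stat_name       = b.stat_name
  WHERE  b.stat_name       = 'DB time'
  AND    b.snap_id         = 1100
  AND    e.snap_id         = 1105
  AND    b.instance_number = 1
  AND    b.dbid            = (SELECT dbid FROM v$database)
)
SELECT ev.event_name,
       ev.wait_class,
       ROUND(ev.wait_secs)                                         AS wait_secs,
       ROUND(100 * ev.wait_secs / NULLIF(dbt.db_time_secs, 0), 1)  AS pct_db_time
FROM   ev
CROSS  JOIN dbt
ORDER  BY ev.wait_secs DESC
FETCH  FIRST 10 ROWS ONLY;
```

Any row with pct_db_time above roughly 20-25 is the candidate to fix first, per the rule of thumb above.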

Top SQL: Finding the Offenders

The AWR report includes several Top SQL subsections:

  • SQL ordered by Elapsed Time — the most important for overall throughput impact
  • SQL ordered by CPU Time — pure CPU consumers
  • SQL ordered by Gets — logical I/O heavy hitters
  • SQL ordered by Reads — physical I/O heavy hitters
  • SQL ordered by Executions — high-frequency statements
  • SQL ordered by Parse Calls — statements being re-parsed frequently
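
A rough equivalent of the "SQL ordered by Elapsed Time" subsection can be pulled from dba_hist_sqlstat. The sketch below assumes instance 1, the current DBID, and the example window between snapshots 1100 and 1105; because each row's delta columns cover the interval ending at that snapshot, the begin snapshot itself is excluded.

```sql
SELECT sql_id,
       SUM(executions_delta)                  AS execs,
       ROUND(SUM(elapsed_time_delta) / 1e6, 1) AS elapsed_secs,
       ROUND(SUM(cpu_time_delta) / 1e6, 1)     AS cpu_secs,
       SUM(buffer_gets_delta)                 AS gets,
       SUM(disk_reads_delta)                  AS reads
FROM   dba_hist_sqlstat
WHERE  snap_id         > 1100
AND    snap_id        <= 1105
AND    instance_number = 1
AND    dbid            = (SELECT dbid FROM v$database)
GROUP  BY sql_id
ORDER  BY SUM(elapsed_time_delta) DESC NULLS LAST
FETCH  FIRST 10 ROWS ONLY;
```

Changing the ORDER BY to the CPU, gets, or reads sum gives the other orderings from the list.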

For each statement you identify, grab the SQL_ID and examine it further:

-- Pull the full SQL text from AWR history
SELECT sql_text
FROM   dba_hist_sqltext
WHERE  sql_id = '&sql_id';

-- Review execution history and plan hash values
SELECT snap_id,
       executions_delta,
       elapsed_time_delta / NULLIF(executions_delta, 0) / 1e6 AS avg_elapsed_secs,
       rows_processed_delta / NULLIF(executions_delta, 0) AS avg_rows,
       plan_hash_value
FROM   dba_hist_sqlstat
WHERE  sql_id = '&sql_id'
ORDER  BY snap_id DESC;

-- Retrieve the execution plan(s) captured in AWR for this statement
-- (AWR does not keep row-source statistics, so ALLSTATS does not apply here)
SELECT * FROM TABLE(
  DBMS_XPLAN.display_awr('&sql_id', format => 'TYPICAL')
);

The plan_hash_value column is gold. If you see a statement that had one plan hash value for months and suddenly switched to a new one right before a performance incident, you've found your culprit. Pin the old plan using SQL Plan Baselines — see our guide on Oracle SQL Plan Baselines for Performance Stability.
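
To see a plan switch at a glance, it helps to summarize the history per plan rather than per snapshot. This is a small sketch over the same dba_hist_sqlstat view used above; &sql_id is the same substitution variable, and the averages are only as good as the snapshots that happened to capture the statement.

```sql
-- One row per plan: when it was first and last seen, and how it performed
SELECT plan_hash_value,
       MIN(snap_id)          AS first_snap,
       MAX(snap_id)          AS last_snap,
       SUM(executions_delta) AS execs,
       ROUND(SUM(elapsed_time_delta)
             / NULLIF(SUM(executions_delta), 0) / 1e6, 4) AS avg_elapsed_secs
FROM   dba_hist_sqlstat
WHERE  sql_id = '&sql_id'
GROUP  BY plan_hash_value
ORDER  BY MIN(snap_id);
```

Two rows where the newer plan has a much higher avg_elapsed_secs, and a first_snap that lines up with the incident, is the pattern described above.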

Memory Statistics: SGA and PGA

The SGA Breakdown and PGA sections confirm whether memory is sized correctly:

  • Buffer Cache Advisory: Shows estimated physical reads if the cache were larger or smaller. If the curve flattens before your current size, you're fine. If it's still steep at your current size, you need more buffer cache.
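  • PGA Memory Advisory: Shows the estimated PGA cache hit percentage and over-allocation count at different PGA_AGGREGATE_TARGET sizes. A nonzero Estd PGA Overalloc Count at your current size (size factor 1.00) means Oracle had to allocate more PGA than the target allows, and a low estimated cache hit percentage means sorts and hash joins are spilling to temp. Either way, you need more PGA.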
  • Shared Pool Advisory: Tracks library cache hit ratios at different shared pool sizes.

-- Confirm PGA pressure from AWR history.
-- dba_hist_pgastat stores one row per statistic (name/value pairs)
SELECT snap_id,
       name,
       value
FROM   dba_hist_pgastat
WHERE  name IN ('cache hit percentage', 'over allocation count')
ORDER  BY snap_id DESC, name;
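
The buffer cache curve described in the advisory bullets can also be read from the live advisory view. Note the assumption here: this queries v$db_cache_advice, which reflects the instance since startup rather than a specific AWR window, and it only looks at the DEFAULT pool for the database's standard block size.

```sql
-- Each row is a candidate cache size; size_factor = 1 is the current size
SELECT size_for_estimate         AS cache_mb,
       size_factor,
       estd_physical_read_factor,
       estd_physical_reads
FROM   v$db_cache_advice
WHERE  name          = 'DEFAULT'
AND    advice_status = 'ON'
AND    block_size    = (SELECT TO_NUMBER(value)
                        FROM   v$parameter
                        WHERE  name = 'db_block_size')
ORDER  BY size_for_estimate;
```

If estd_physical_read_factor keeps dropping noticeably for size factors above 1, the curve is still steep and a larger cache would likely reduce physical reads.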

Common Pitfalls When Reading AWR Reports

Over the years I've seen DBAs make the same mistakes on Oracle AWR report analysis. Avoid these:

  1. Comparing across different snapshot intervals. A 30-minute AWR snapshot and a 60-minute AWR snapshot will show wildly different absolute numbers. Always normalize to per-second rates or look at percentages of DB Time.
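  2. Treating idle waits as a problem. Events like SQL*Net message from client and jobq slave wait belong to the Idle wait class: the session is waiting for work, not doing any. The Top 10 section already leaves them out, but they do appear in the full wait event listings and in queries against the history views, so filter them out when computing the percentage of active work.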
  3. Fixing the second problem before fixing the first. The wait event at the top of the list is the one to fix. Don't go hunting for the third or fourth event while the number-one wait is still unresolved — fixing it may eliminate the downstream waits entirely.
  4. Tunnel vision on SQL without checking instance-level metrics. Sometimes the SQL looks fine but the storage is degraded, or a node is down in RAC and all connections are on one instance. The header and the Load Profile often tell this story before you even get to Top SQL.
  5. Generating AWR reports over windows that span a restart. Post-startup stats are not comparable to steady-state stats. Always check the header for instance start time and avoid snapshots that cross it.
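
The restart check in pitfall 5 is quick to script before you pick a window. A minimal sketch, assuming instance 1, the current DBID, and the example snapshots 1100 through 1105:

```sql
-- If the snapshots in the window disagree on startup_time, the window spans a restart
SELECT MIN(startup_time) AS first_startup,
       MAX(startup_time) AS last_startup,
       CASE
         WHEN MIN(startup_time) = MAX(startup_time) THEN 'OK'
         ELSE 'SPANS RESTART'
       END               AS window_check
FROM   dba_hist_snapshot
WHERE  snap_id BETWEEN 1100 AND 1105
AND    instance_number = 1
AND    dbid            = (SELECT dbid FROM v$database);
```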

A Quick Triage Workflow

When I open an AWR report under pressure — because the business is calling and production is slow — I use this 5-minute triage sequence:

  1. Header → AAS. Is AAS above CPU count? If yes, resource saturation. If no, latency problem.
  2. Top Wait Events → #1 event. What are sessions waiting for? This is the primary constraint.
  3. Load Profile → Hard Parses. Are hard parses unusually high? If yes, application or shared pool problem.
  4. Top SQL by Elapsed Time → Top 1-3 statements. Grab SQL_IDs, check execution plans.
  5. Instance Efficiency → Buffer Hit %, Redo NoWait %. Quick confirmation of cache and redo health.

In most cases, five minutes of this process narrows the problem to one of three buckets: bad SQL, resource saturation, or I/O subsystem degradation. Everything else is a follow-up investigation.

Conclusion

Knowing how to read an Oracle AWR report efficiently is what separates a reactive DBA from a proactive one. The report contains an enormous amount of data, but the sequence matters: start with the header to understand workload intensity, move to wait events to understand the primary constraint, then drill into Top SQL to find the specific statements driving that constraint. Combined with the advisory sections for memory and the Load Profile for workload characterization, you have a complete picture of database health in a single document.

If you found this Oracle AWR report tutorial useful, drop a comment below — I'd especially love to hear about specific wait events you're struggling to interpret. And if you want posts like this delivered to you twice a week, subscribe to oratab19c.blogspot.com. Next up: SQL Plan Baselines for locking in good execution plans and preventing unplanned regressions.
