JAS-MIN


Alright, buckle up, buttercup. JAS-MIN is here to dissect this mess of numbers you call a performance report. You want comprehensive? You got it. Don't worry, I won't hold back, because apparently, you want all the data I can find. Let's see what horrors lie within.


📈 Load Profile Summary - First Impressions from the Pretty Pictures

Well, aren't these just delightful visual aids. Let's dig into the actual story these pretty boxes are trying to tell us, rather than just admiring the squiggles.

Overall Load Profile Synopsis: Your database is a wait-bottlenecked beast. The consistently high Average Active Sessions with low CPU utilization, coupled with astronomical logical I/O, tells me your sessions are spending most of their time fighting over memory resources or waiting for Oracle to process already-cached blocks. This isn't about slow disks; it's about inefficient data access patterns and contention within the SGA. The high commit rate and block changes contribute significantly to the internal contention.


📉 DBCPU/DBTime Ratio Analysis: Confirmed Bottleneck

Your analysis confirms my initial suspicion from the load profile. The default ratio for finding peaks is 0.666, meaning CPU should account for at least two-thirds of the DB Time. Your actual ratios? Abysmally low, ranging from 0.01 to 0.19.

This isn't a "peak," darling, this is your database choking on waits for everything except CPU. When DB CPU is 15.00 and DB Time is 1196.10 (ratio 0.01 on 25-Jun-25 13:00:32, snap 1665), it means that, on average, about 1196 sessions were active, but only 15 of them were actually crunching numbers on the CPU. The rest were cooling their heels, waiting for I/O, locks, latches, or some other internal Oracle resource.

The lowest ratios, indicating the most severe non-CPU bottlenecks, consistently occur during your business hours (08:00-15:00 on weekdays):

The periods around 28-Jun-25 00:00:55 (snap 1724) to 05:00:58 (snap 1729), 29-Jun-25 00:00:44 (snap 1748) to 05:00:29 (snap 1753), and 05-Jul-25 00:00:02 (snap 1892) to 06:00:05 (snap 1898) have slightly higher ratios (up to 0.19), but still indicate significant non-CPU waits. This suggests your database is always waiting, just more so during peak hours.
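If you'd rather recompute these ratios yourself than trust my pretty boxes, a minimal sketch against the AWR time model looks like this. It assumes Diagnostics Pack licensing and a single-instance database (partition the LAG by instance_number otherwise):

  -- Per-snapshot DB CPU / DB Time ratio from AWR time model data.
  -- Values in DBA_HIST_SYS_TIME_MODEL are cumulative microseconds, so diff consecutive snaps.
  SELECT snap_id,
         ROUND(db_cpu_s, 2)                        AS db_cpu_s,
         ROUND(db_time_s, 2)                       AS db_time_s,
         ROUND(db_cpu_s / NULLIF(db_time_s, 0), 2) AS cpu_to_dbtime_ratio
  FROM (
    SELECT snap_id,
           (MAX(CASE WHEN stat_name = 'DB CPU'  THEN value END)
            - LAG(MAX(CASE WHEN stat_name = 'DB CPU'  THEN value END))
                OVER (ORDER BY snap_id)) / 1e6 AS db_cpu_s,
           (MAX(CASE WHEN stat_name = 'DB time' THEN value END)
            - LAG(MAX(CASE WHEN stat_name = 'DB time' THEN value END))
                OVER (ORDER BY snap_id)) / 1e6 AS db_time_s
    FROM   dba_hist_sys_time_model
    WHERE  stat_name IN ('DB CPU', 'DB time')
    GROUP  BY snap_id
  )
  WHERE  db_time_s > 0
  ORDER  BY cpu_to_dbtime_ratio;

Snapshots at the top of that list should line up with the windows flagged above.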


⏳ Heaviest Wait Events Impacting Database Performance

Let's break down what's got your database tied up in knots. The problem isn't CPU, it's waiting for stuff. A lot of stuff.

💔 Foreground Wait Events

These are the events that directly impact user session response times. They're the bane of any application user's existence.

🌌 Background Wait Events

These affect the overall health and performance of the database's internal operations.

Summary of Heaviest Wait Events:

Your database is primarily I/O and concurrency bound.


🔍 SQL IDs Requiring Further Performance Analysis

Let's name and shame some of these SQLs. The distinction between "short but many executions" and "inherently slow" is crucial.

SQL ID Summaries:


📊 Comparing AVG and STDDEV, and Correlated Events

A high STDDEV relative to AVG implies inconsistent performance. Let's look at the worst offenders:

Interpretation: The high STDDEV values, especially for enq: TX - row lock contention and direct path write temp, highlight that the problem isn't just consistent slowness but unpredictable spikes that likely cause significant user complaints. db file parallel write and log file parallel write are more consistently high but also show periods of increased struggle.
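If you want to verify any of these AVG/STDDEV pairs straight from AWR, here is a minimal sketch, assuming a single instance and using enq: TX - row lock contention as the example event (swap in any other event name):

  -- Per-snapshot wait-time deltas for one event, then AVG vs STDDEV of those deltas.
  -- DBA_HIST_SYSTEM_EVENT holds cumulative counters, hence the LAG().
  WITH deltas AS (
    SELECT snap_id,
           time_waited_micro
             - LAG(time_waited_micro) OVER (ORDER BY snap_id) AS waited_us
    FROM   dba_hist_system_event
    WHERE  event_name = 'enq: TX - row lock contention'
  )
  SELECT ROUND(AVG(waited_us)    / 1e6, 2) AS avg_wait_s_per_snap,
         ROUND(STDDEV(waited_us) / 1e6, 2) AS stddev_wait_s_per_snap
  FROM   deltas
  WHERE  waited_us >= 0;  -- discard negative deltas from instance restarts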


🚨 Anomaly Clusters (Median Absolute Deviation)

Your MAD parameters are threshold = 7 and window size = 10% (32 probes out of 327). This is a reasonable setup for flagging significant deviations from the median as "anomalies".
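For context, the rule behind these clusters is standard median-absolute-deviation outlier detection. The sketch below is purely illustrative, not JAS-MIN's actual implementation: it uses a hypothetical metric_probes(snap_id, value) staging table and one global median instead of the sliding 10% window, and flags any probe more than threshold x MAD away from the median:

  -- Illustrative MAD outlier rule: |value - median| > threshold * MAD.
  WITH med AS (
    SELECT MEDIAN(value) AS med_val FROM metric_probes
  ),
  mad AS (
    SELECT m.med_val,
           MEDIAN(ABS(p.value - m.med_val)) AS mad_val
    FROM   metric_probes p CROSS JOIN med m
    GROUP  BY m.med_val
  )
  SELECT p.snap_id, p.value
  FROM   metric_probes p CROSS JOIN mad a
  WHERE  ABS(p.value - a.med_val) > 7 * NULLIF(a.mad_val, 0)  -- threshold = 7, as configured
  ORDER  BY p.snap_id;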

The Anomaly Summary table is gold. It ties together problematic events, statistics, and even SQL IDs by specific dates and snap IDs.

The Periods with the Highest Number of Anomalies: Looking at the Count column in the Anomaly Summary, the most problematic periods (dates with the highest number of co-occurring anomalies) are:

These dates (especially June 29th and 30th, around business hours) clearly represent periods of severe, widespread performance degradation affecting multiple areas of the database.

Patterns of Occurring Latches, Statistics, and Wait Events:

Many clusters are interconnected, showing systemic problems:


📅 Most Problematic Dates for Performance

Based on the highest counts of co-occurring anomalies and the lowest DB CPU / DB Time ratios, the most problematic periods consistently fall on weekdays, especially from June 29th through July 4th, within typical business hours.

Specifically,

These periods are characterized by high enq: TX - row lock contention, massive db file sequential read events, and widespread dictionary cache (row cache mutex) and redo allocation (latch: redo allocation) contention, all contributing to the severely low DB CPU / DB Time ratio.
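To see which SQL sat in that row-lock queue and who was blocking it during those windows, the sampled ASH data in AWR can be sliced as below. A rough sketch, assuming Diagnostics Pack licensing; the time range is the June 29th-30th business-hours window called out above:

  -- SQL and blockers behind 'enq: TX - row lock contention' in the bad window.
  -- Each row in DBA_HIST_ACTIVE_SESS_HISTORY represents roughly 10 seconds of wait time.
  SELECT sql_id,
         blocking_session,
         blocking_session_serial#,
         COUNT(*) AS ash_samples
  FROM   dba_hist_active_sess_history
  WHERE  event = 'enq: TX - row lock contention'
  AND    sample_time BETWEEN TIMESTAMP '2025-06-29 08:00:00'
                         AND TIMESTAMP '2025-06-30 15:00:00'
  GROUP  BY sql_id, blocking_session, blocking_session_serial#
  ORDER  BY ash_samples DESC
  FETCH FIRST 20 ROWS ONLY;

If the anomaly clusters are right, dvtr3uwuqbppa should show up near the top.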


🧩 SQLs in the Same Anomaly Clusters as Heaviest Wait Events

Let's cross-reference the top wait events with the anomalous SQLs:


⛓️ Chained Rows in Statistical Anomalies

The statistic table fetch continued row shows up in several anomaly clusters, specifically:

Meaning: table fetch continued row counts row fetches that had to follow a pointer into another block, which happens with chained or migrated rows. Chained rows occur when a row is too large to fit in a single data block and must be stored across multiple blocks. Migrated rows occur when a row grows on update and no longer fits in its original block, so it is moved to a new block, leaving a "pointer" behind.

Impact: Both chained and migrated rows increase logical I/O (more block reads to retrieve a single row) and can degrade performance, especially for full table scans or queries accessing many rows. This aligns with your very high Logical Reads (MB)/s in the Load Profile.

Further AWR Sections to check: To find the probable segments responsible for chained/migrated rows, you should look into the "Segments by Logical Reads" or "Segments by Physical Reads" sections of the full AWR report. Specifically, you'd look for:

Action: For tables with significant chained/migrated rows, consider the following (a SQL sketch follows this list):

  1. ALTER TABLE MOVE: Rebuild the table (and its indexes) to defragment space and ensure rows fit in single blocks. This usually requires an outage or online redefinition.
  2. Adjust PCTFREE: Increase PCTFREE for tables with growing rows to leave more space in blocks for future updates, reducing migration.
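A minimal sketch of both steps, using a hypothetical app_owner.big_table purely for illustration (the chained_rows table is created by $ORACLE_HOME/rdbms/admin/utlchain.sql):

  -- 1. Count chained/migrated rows without disturbing optimizer statistics.
  ANALYZE TABLE app_owner.big_table LIST CHAINED ROWS INTO chained_rows;
  SELECT COUNT(*) AS chained_or_migrated
  FROM   chained_rows
  WHERE  table_name = 'BIG_TABLE';

  -- 2. Fix a confirmed offender (plain MOVE needs an outage and leaves indexes UNUSABLE).
  ALTER TABLE app_owner.big_table PCTFREE 20;  -- affects only blocks formatted from now on
  ALTER TABLE app_owner.big_table MOVE;        -- rewrites the rows, honoring the new PCTFREE
  ALTER INDEX app_owner.big_table_pk REBUILD;  -- repeat for every index on the table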

📁 IO STATS by Function Analysis

This section gives us a breakdown of I/O activity by internal Oracle functions.

Cross-reference Conclusion from IO Stats by Function: Your storage system seems reasonably fast for both sequential and random I/O at the individual operation level. The bottleneck is not primarily disk latency, but the volume of I/O requests and redo generated by the application workload. The database is churning through data and changes at an incredible rate, and while the I/O system is efficient, it's struggling to keep up with the sheer demand. This leads to queues and waits, despite fast individual operations. The log file parallel write is a prime example: many fast small writes still stack up.
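If you want to watch this breakdown live instead of waiting for the next snapshot, the same per-function figures are exposed, cumulatively since instance startup, in V$IOSTAT_FUNCTION; a quick sketch:

  -- Cumulative I/O volume and request counts per Oracle function since instance start.
  -- For per-snapshot deltas, diff DBA_HIST_IOSTAT_FUNCTION between snap_ids instead.
  SELECT function_name,
         small_read_reqs  + large_read_reqs            AS read_reqs,
         small_write_reqs + large_write_reqs           AS write_reqs,
         small_read_megabytes  + large_read_megabytes  AS read_mb,
         small_write_megabytes + large_write_megabytes AS write_mb,
         number_of_waits,
         wait_time                                     AS wait_time_ms
  FROM   v$iostat_function
  ORDER  BY small_read_megabytes + large_read_megabytes
          + small_write_megabytes + large_write_megabytes DESC;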


📋 TOP 10 Segments Analysis

This section helps pinpoint which database objects are at the heart of your problems.
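If the top 10 here isn't enough, the same segment-level counters live in AWR and can be ranked over this report's snapshot range (1665-1898); a sketch, assuming Diagnostics Pack and 12c+ FETCH FIRST syntax:

  -- Top segments by logical reads across the analysed snapshot range,
  -- together with the contention counters that matter for this workload.
  SELECT o.owner, o.object_name, o.object_type,
         SUM(s.logical_reads_delta)     AS logical_reads,
         SUM(s.buffer_busy_waits_delta) AS buffer_busy_waits,
         SUM(s.row_lock_waits_delta)    AS row_lock_waits
  FROM   dba_hist_seg_stat s
  JOIN   dba_hist_seg_stat_obj o
    ON   o.dbid     = s.dbid
   AND   o.ts#      = s.ts#
   AND   o.obj#     = s.obj#
   AND   o.dataobj# = s.dataobj#
  WHERE  s.snap_id BETWEEN 1665 AND 1898  -- snap range covered by this report
  GROUP  BY o.owner, o.object_name, o.object_type
  ORDER  BY logical_reads DESC
  FETCH FIRST 10 ROWS ONLY;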


💡 Suggestions for MAD Algorithm and MOS Notes

Suggested other window size or threshold for MAD algorithm:

Oracle MOS notes describing the problems:


🚀 Conclusion and Recommendations

Alright, here’s the bottom line, spelled out like it's for kindergarteners, but with more sarcasm: Your database isn't slow because it's thinking too hard. It's slow because it's waiting for everything.

The Core Problems:

  1. I/O & Redo Saturation: log file parallel write and db file parallel write are killing you. Your disk system isn't slow per operation, but it's overwhelmed by the sheer volume of changes (Redo MB/s, Phy Writes MB/s). Your application is a DML-generating machine.
  2. Concurrency Hotspots: enq: TX - row lock contention (driven heavily by dvtr3uwuqbppa), read by other session, and buffer busy waits indicate sessions are constantly bumping into each other on frequently accessed data blocks and records.
  3. Dictionary Cache Nightmare: row cache mutex waits, buffer busy waits on OBJ$, TAB$, COL$, and CDEF$, plus high Hard Parses/s, mean Oracle's internal metadata structures are a major bottleneck. This is exacerbated by RECYCLEBIN$ activity and potentially frequent DDL or grant changes.
  4. Inefficient Data Access: High db file sequential read (millions of small reads) and massive Logical Reads (MB)/s suggest queries are doing way too much work in memory, visiting unnecessary blocks, or suffering from chained rows.

Top Priority Action Plan (Because nobody wants to hear "it's complicated"):

  1. Conquer dvtr3uwuqbppa (INSERT) and enq: TX - row lock contention:
  2. Tame the Redo Monster:
  3. Address Dictionary Contention:
  4. Tune Logical I/O Hogs:
  5. PGA Tuning: Given the direct path write temp activity and PGA memory operation anomalies, re-evaluate PGA_AGGREGATE_TARGET (a quick advisory query follows this list). If queries are frequently spilling to disk, they might benefit from more PGA, or the queries themselves need tuning to reduce memory requirements.
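For item 5, Oracle's built-in advisory is the cheapest sanity check before touching the parameter; a minimal sketch:

  -- PGA target advisory: look for the size where ESTD_PGA_CACHE_HIT_PERCENTAGE flattens out
  -- and ESTD_OVERALLOC_COUNT drops to zero.
  SELECT ROUND(pga_target_for_estimate / 1024 / 1024) AS pga_target_mb,
         pga_target_factor,
         estd_pga_cache_hit_percentage,
         estd_overalloc_count
  FROM   v$pga_target_advice
  ORDER  BY pga_target_factor;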

Don't just stare at the pretty graphs. Get to work. The database isn't going to fix itself while you're busy making coffee.


Your source code: https://github.com/ora600pl/jas-min
Need more expert analysis? The good performance tuning experts are at ora-600.pl.

ORA-600