Domino effect in T10 database on g1rush Server

27/6/2024

Tomasz Chwasewicz

Table of contents

Performance problems rarely occur in isolation. Often, a single problem can trigger a cascade of related problems, each of which amplifies the overall impact on the system. This case study analyzes such a scenario in the T10 database on g1rush Server.

Identifying the initial problem: Temporary tablespace overflow

On June 21, 2024 at 13:04:32, DBPLUS PERFORMANCE MONITOR flagged a critical error: ORA-1652, indicating the inability to extend the temporary segment by 128 in the TEMPORARY_DATA tablespace in the T10 database on g1rush Server. This error occurred due to simultaneous sessions overwhelming the temporary space of tables, consuming more than 30 GB.

g1rush Server - Identyfikacja początkowego problemu

Highlighted in the chart, the spike in resource utilization immediately caught the company's attention. This overflow interfered with ongoing operations and indicated inefficient resource management.

Secondary problem: High CPU load

Further investigation revealed a persistent high load on the processor, even after reducing the number of active processes. The cyclical nature of this load, peaking during periods of high activity, suggested deeper problems with query optimization and resource allocation.

g1rush Server - Wysokie obciążenie procesora

The graph above shows busy time (CPU), system time (CPU), and user time (CPU) with a clear pattern of cyclic peaks. This persistent high load required us to delve into specific queries and processes that consume CPU resources.

Detailed query analysis: Execution time and plan changes

The next step was to analyze the specific queries contributing to the high CPU load. In particular, the query for the identifier 3445255751 showed significant changes in execution time and schedule. It was suspected that these changes were related to the increased execution time of another query, ID 904000402, due to the use of the shared function.

g1rush Server - Szczegółowa analiza zapytań: Czas wykonania i zmiany planu

The analysis showed that query 3445255751 experienced run-time fluctuations, which had an impact on overall performance. Comparing execution plans, we identified plan hash 232983816 as a potential cause of inefficiency.

This comparison highlighted the need to stabilize query performance by forcing a specific plan. The change in the implementation strategy suggested the need for consistent monitoring and timely adjustment of plans.

Network load

As we continued our investigation, we noticed significant network load, with data transfers as high as 52GB per hour. This indicated a significant flow of data, which may have exacerbated the observed performance problems.

Analysis of network traffic showed a large number of bytes received and sent by SQL*net, highlighting the impact of data transfer on overall performance.

Key Metrics for Reliable Database Replication

6/2/2025

Key Metrics for Reliable Database Replication

Domino effect in T10 database on g1rush Server

Identifying the initial problem: Temporary tablespace overflow

Secondary problem: High CPU load

Detailed query analysis: Execution time and plan changes

Network load

Related articles

Key Metrics for Reliable Database Replication

From .NET to Azure: How Did Microsoft Compete with Oracle Database Over the Years?

software Products

Services

Support

DBPlus