RAC-Related Wait Events and Their Causes
Wait events for Oracle RAC include the following categories:
1. Block-Related Wait Events
2. Message-Related Wait Events
3. Contention-Related Wait Events
4. Load-Related Wait Events
Block-Related Wait Events
The main wait events for block-related waits are:
gc current block 2-way
gc current block 3-way
gc cr block 2-way
gc cr block 3-way
*******************
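The requested data was not found in node1's cache, so a cross-node Cache Fusion transfer took place and the data was obtained from node2's cache.
Excessive waits on gc current block 2-way or gc current block 3-way are usually caused by either (a) an inefficient execution plan that leads to a large number of block accesses,
or (b) application affinity not being implemented. If object access can be localized, consider implementing application affinity.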
*********************
The block-related wait event statistics indicate that a block was received as the result of either a 2-way or a 3-way message. In the 2-way case the block was sent by the resource master, requiring 1 message and 1 block transfer. In the 3-way case the request was forwarded to a third node, which then sent the block, requiring 2 messages and 1 block transfer.
If the average wait times are acceptable and no interconnect or load issues can be diagnosed, then the accumulated time waited can usually be attributed to a few SQL statements which need to be tuned to minimize the number of blocks accessed.
The column CLUSTER_WAIT_TIME in V$SQLAREA represents the wait time incurred by individual SQL statements for global cache events and will identify the SQL which may need to be tuned.
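As a rough illustration of that tuning workflow, the following Python sketch ranks statements by CLUSTER_WAIT_TIME. It assumes the rows have already been fetched from V$SQLAREA by whatever Oracle client is in use. The query text, the `SqlStat` type, and the helper names are hypothetical, and the code assumes CLUSTER_WAIT_TIME and ELAPSED_TIME are reported in the same unit.

```python
from typing import Iterable, List, NamedTuple, Tuple

# Query text only (an assumption about how the rows might be fetched).
# This sketch does not open a database connection.
CLUSTER_WAIT_QUERY = """
SELECT sql_id, cluster_wait_time, elapsed_time
FROM   v$sqlarea
WHERE  cluster_wait_time > 0
"""


class SqlStat(NamedTuple):
    """One row of (sql_id, cluster_wait_time, elapsed_time)."""
    sql_id: str
    cluster_wait_time: int
    elapsed_time: int


def top_cluster_waiters(rows: Iterable[Tuple[str, int, int]],
                        limit: int = 10) -> List[SqlStat]:
    """Return the statements with the highest cluster wait time first."""
    stats = [SqlStat(*row) for row in rows]
    stats.sort(key=lambda s: s.cluster_wait_time, reverse=True)
    return stats[:limit]


def cluster_wait_share(stat: SqlStat) -> float:
    """Fraction of a statement's elapsed time spent in global cache waits."""
    if stat.elapsed_time <= 0:
        return 0.0
    return stat.cluster_wait_time / stat.elapsed_time
```

Statements that appear at the top of the list and also show a high cluster wait share are the likeliest candidates for reducing the number of blocks accessed.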
Message-Related Wait Events
The main wait events for message-related waits are:
gc current grant 2-way
gc cr grant 2-way
*********************
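If the requested block is not resident in any buffer cache, the requesting instance must ask the master for a grant and then read the data from physical disk, which incurs physical I/O. In that case the session encounters the gc cr grant 2-way and gc current grant 2-way wait events.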
*********************
The message-related wait event statistics indicate that no block was received because it was not cached in any instance. Instead a global grant was given, enabling the requesting instance to read the block from disk or modify it.
If the time consumed by these events is high, then it may be assumed that the frequently used SQL causes a lot of disk I/O (in the event of the cr grant) or that the workload inserts a lot of data and needs to find and format new blocks frequently (in the event of the current grant).
Contention-Related Wait Events
The main wait events for contention-related waits are:
gc current block busy
gc cr block busy
gc buffer busy acquire/release
******************
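These waits are generally caused by concurrent reads and writes. Sessions compete for the same resource, and a session must wait for another session to write its changes to the redo log before control is handed over. If one bout of contention has not finished and further contention piles on, an avalanche effect can occur and system performance drops sharply.
Busy events indicate that LMS performed additional work to handle concurrency-related issues.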
******************
The contention-related wait event statistics indicate that a block was received which was pinned by a session on another node, was deferred because a change had not yet been flushed to disk or because of high concurrency, and therefore could not be shipped immediately. A buffer may also be busy locally when a session has already initiated a cache fusion operation and is waiting for its completion when another session on the same node is trying to read or modify the same data. High service times for blocks exchanged in the global cache may exacerbate the contention, which can be caused by frequent concurrent read and write accesses to the same data.
Load-Related Wait Events
The main wait events for load-related waits are:
gc current block congested
gc cr block congested
***************
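If the LMS process does not process a request within 1 millisecond of receiving it, LMS marks the response to indicate that the block suffered a congestion-related wait.
Congestion-related waits have many possible causes. For example, the LMS process may be flooded with global cache requests, may be suffering CPU scheduling delays, or may have run into another form of resource exhaustion (such as memory).
Normally the LMS processes run at real-time CPU scheduling priority, so CPU scheduling delays should be minimal. A large number of these waits indicates a sudden spike in global cache requests that the LMS processes cannot handle quickly enough.
A memory shortage on the server can also cause the LMS processes to be paged, which hurts global cache performance.
Investigate why the LMS processes are unable to handle requests efficiently.
If the cause is insufficient hardware, add hardware resources. The most common approach is to add a node; upgrading the existing hardware is another option.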
**************
The load-related wait events indicate that a delay in processing has occurred in the GCS. This is usually caused by high load or CPU saturation, and would have to be solved by additional CPUs, load balancing, offloading processing to different times, or a new cluster node. For the events mentioned, the wait time encompasses the entire round trip from the time a session starts to wait after initiating a block request until the block arrives.
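
To summarize the four categories, here is a minimal Python sketch that maps the event names listed in this note to their category and totals wait time per category. The function names and the sample format (pairs of event name and time waited) are assumptions made for illustration. The lookup covers only the events mentioned here and is not a complete list of RAC wait events.

```python
from typing import Dict, Iterable, Tuple

BLOCK = "block-related"
MESSAGE = "message-related"
CONTENTION = "contention-related"
LOAD = "load-related"

# Only the events named in this note; anything else is reported as "other".
EVENT_CATEGORY: Dict[str, str] = {
    "gc current block 2-way": BLOCK,
    "gc current block 3-way": BLOCK,
    "gc cr block 2-way": BLOCK,
    "gc cr block 3-way": BLOCK,
    "gc current grant 2-way": MESSAGE,
    "gc cr grant 2-way": MESSAGE,
    "gc current block busy": CONTENTION,
    "gc cr block busy": CONTENTION,
    "gc buffer busy acquire": CONTENTION,
    "gc buffer busy release": CONTENTION,
    "gc current block congested": LOAD,
    "gc cr block congested": LOAD,
}


def categorize(event: str) -> str:
    """Return the category for a wait event name, or "other" if unknown."""
    return EVENT_CATEGORY.get(event.strip().lower(), "other")


def time_by_category(samples: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Sum the time waited per category for (event, time_waited) pairs."""
    totals: Dict[str, float] = {}
    for event, waited in samples:
        category = categorize(event)
        totals[category] = totals.get(category, 0.0) + waited
    return totals
```

Grouping the totals this way gives a quick indication of whether to look first at SQL tuning (block and message waits), hot data (contention waits), or capacity (load waits).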