This article introduces what the StrategyGetBuffer function does in PostgreSQL. Many people run into trouble with this topic in real-world cases, so let this walkthrough show you how to handle such situations. Read carefully, and you should come away having learned something!
BufferDesc
Shared descriptor (state) data for a single shared buffer.
/*
 * Flags for buffer descriptors
 *
 * Note: TAG_VALID essentially means that there is a buffer hashtable
 * entry associated with the buffer's tag.
 */
#define BM_LOCKED               (1U << 22)  /* buffer header is locked */
#define BM_DIRTY                (1U << 23)  /* data needs writing */
#define BM_VALID                (1U << 24)  /* data is valid */
#define BM_TAG_VALID            (1U << 25)  /* tag is assigned */
#define BM_IO_IN_PROGRESS       (1U << 26)  /* read or write in progress */
#define BM_IO_ERROR             (1U << 27)  /* previous I/O failed */
#define BM_JUST_DIRTIED         (1U << 28)  /* dirtied since write started */
#define BM_PIN_COUNT_WAITER     (1U << 29)  /* have waiter for sole pin */
#define BM_CHECKPOINT_NEEDED    (1U << 30)  /* must write for checkpoint */
#define BM_PERMANENT            (1U << 31)  /* permanent buffer (not unlogged,
                                             * or init fork) */

/*
 * BufferDesc -- shared descriptor/state data for a single shared buffer.
 *
 * Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
 * the tag, state or wait_backend_pid fields.  In general, buffer header lock
 * is a spinlock which is combined with flags, refcount and usagecount into
 * single atomic variable.  This layout allow us to do some operations in a
 * single atomic operation, without actually acquiring and releasing spinlock;
 * for instance, increase or decrease refcount.  buf_id field never changes
 * after initialization, so does not need locking.  freeNext is protected by
 * the buffer_strategy_lock not buffer header lock.  The LWLock can take care
 * of itself.  The buffer header lock is *not* used to control access to the
 * data in the buffer!
 *
 * It's assumed that nobody changes the state field while buffer header lock
 * is held.  Thus buffer header lock holder can do complex updates of the
 * state variable in single write, simultaneously with lock release (cleaning
 * BM_LOCKED flag).  On the other hand, updating of state without holding
 * buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
 * is not set.  Atomic increment/decrement, OR/AND etc. are not allowed.
 *
 * An exception is that if we have the buffer pinned, its tag can't change
 * underneath us, so we can examine the tag without locking the buffer header.
 * Also, in places we do one-time reads of the flags without bothering to
 * lock the buffer header; this is generally for situations where we don't
 * expect the flag bit being tested to be changing.
 *
 * We can't physically remove items from a disk page if another backend has
 * the buffer pinned.  Hence, a backend may need to wait for all other pins
 * to go away.  This is signaled by storing its own PID into
 * wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER.  At present,
 * there can be only one such waiter per buffer.
 *
 * We use this same struct for local buffer headers, but the locks are not
 * used and not all of the flag bits are useful either.  To avoid unnecessary
 * overhead, manipulations of the state field should be done without actual
 * atomic operations (i.e. only pg_atomic_read_u32() and
 * pg_atomic_unlocked_write_u32()).
 *
 * Be careful to avoid increasing the size of the struct when adding or
 * reordering members.  Keeping it below 64 bytes (the most common CPU
 * cache line size) is fairly important for performance.
 */
typedef struct BufferDesc
{
    BufferTag   tag;            /* ID of page contained in buffer */
    int         buf_id;         /* buffer's index number (from 0) */

    /* state of the tag, containing flags, refcount and usagecount */
    pg_atomic_uint32 state;

    int         wait_backend_pid;   /* backend PID of pin-count waiter */
    int         freeNext;       /* link in freelist chain */

    LWLock      content_lock;   /* to lock access to buffer contents */
} BufferDesc;
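For reference, the companion macros in src/include/storage/buf_internals.h (shown condensed, as of PostgreSQL 11, the version traced below) make the layout of the atomic state word concrete: the low 18 bits are the refcount, the next 4 bits the usagecount, and the top 10 bits the BM_* flags defined above.

/* Condensed from buf_internals.h (PostgreSQL 11) */
#define BUF_REFCOUNT_ONE        1
#define BUF_REFCOUNT_MASK       ((1U << 18) - 1)
#define BUF_USAGECOUNT_MASK     0x003C0000U
#define BUF_USAGECOUNT_ONE      (1U << 18)
#define BUF_USAGECOUNT_SHIFT    18
#define BUF_FLAG_MASK           0xFFC00000U

/* Get refcount and usagecount from buffer state */
#define BUF_STATE_GET_REFCOUNT(state)   ((state) & BUF_REFCOUNT_MASK)
#define BUF_STATE_GET_USAGECOUNT(state) (((state) & BUF_USAGECOUNT_MASK) >> BUF_USAGECOUNT_SHIFT)

As a sanity check, the buf_state value 4194304 printed in the gdb session further down is exactly BM_LOCKED (1U << 22): the buffer is returned with its header spinlock still held.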
BufferTag
A buffer tag identifies which disk block the buffer contains.
/*
 * Buffer tag identifies which disk block the buffer contains.
 *
 * Note: the BufferTag data must be sufficient to determine where to write the
 * block, without reference to pg_class or pg_tablespace entries.  It's
 * possible that the backend flushing the buffer doesn't even believe the
 * relation is visible yet (its xact may have started before the xact that
 * created the rel).  The storage manager must be able to cope anyway.
 *
 * Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
 * to be fixed to zero them, since this struct is used as a hash key.
 */
typedef struct buftag
{
    RelFileNode rnode;          /* physical relation identifier */
    ForkNumber  forkNum;
    BlockNumber blockNum;       /* blknum relative to begin of reln */
} BufferTag;
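The INIT_BUFFERTAG mentioned in that note, together with its companion macros, also lives in buf_internals.h (condensed here):

/* Condensed from buf_internals.h */
#define CLEAR_BUFFERTAG(a) \
( \
    (a).rnode.spcNode = InvalidOid, \
    (a).rnode.dbNode = InvalidOid, \
    (a).rnode.relNode = InvalidOid, \
    (a).forkNum = InvalidForkNumber, \
    (a).blockNum = InvalidBlockNumber \
)

#define INIT_BUFFERTAG(a,xx_rnode,xx_forkNum,xx_blockNum) \
( \
    (a).rnode = (xx_rnode), \
    (a).forkNum = (xx_forkNum), \
    (a).blockNum = (xx_blockNum) \
)

#define BUFFERTAGS_EQUAL(a,b) \
( \
    RelFileNodeEquals((a).rnode, (b).rnode) && \
    (a).blockNum == (b).blockNum && \
    (a).forkNum == (b).forkNum \
)

This explains the tag printed for buffer 134 in the gdb session below: an all-zero rnode, forkNum = InvalidForkNumber and blockNum = 4294967295 (InvalidBlockNumber) is simply a cleared tag; the buffer has never contained a page.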
SMgrRelation
smgr.c maintains a hash table of SMgrRelation objects, which are essentially cached file handles.
/*
 * smgr.c maintains a table of SMgrRelation objects, which are essentially
 * cached file handles.  An SMgrRelation is created (if not already present)
 * by smgropen(), and destroyed by smgrclose().  Note that neither of these
 * operations imply I/O, they just create or destroy a hashtable entry.
 * (But smgrclose() may release associated resources, such as OS-level file
 * descriptors.)
 *
 * An SMgrRelation may have an "owner", which is just a pointer to it from
 * somewhere else; smgr.c will clear this pointer if the SMgrRelation is
 * closed.  We use this to avoid dangling pointers from relcache to smgr
 * without having to make the smgr explicitly aware of relcache.  There
 * can't be more than one "owner" pointer per SMgrRelation, but that's
 * all we need.
 *
 * SMgrRelations that do not have an "owner" are considered to be transient,
 * and are deleted at end of transaction.
 */
typedef struct SMgrRelationData
{
    /* rnode is the hashtable lookup key, so it must be first! */
    RelFileNodeBackend smgr_rnode;  /* relation physical identifier */

    /* pointer to owning pointer, or NULL if none */
    struct SMgrRelationData **smgr_owner;

    /*
     * These next three fields are not actually used or manipulated by smgr,
     * except that they are reset to InvalidBlockNumber upon a cache flush
     * event (in particular, upon truncation of the relation).  Higher levels
     * store cached state here so that it will be reset when truncation
     * happens.  In all three cases, InvalidBlockNumber means "unknown".
     */
    BlockNumber smgr_targblock; /* current insertion target block */
    BlockNumber smgr_fsm_nblocks;   /* last known size of fsm fork */
    BlockNumber smgr_vm_nblocks;    /* last known size of vm fork */

    /* additional public fields may someday exist here */

    /*
     * Fields below here are intended to be private to smgr.c and its
     * submodules.  Do not touch them from elsewhere.
     */
    int         smgr_which;     /* storage manager selector */

    /*
     * for md.c; per-fork arrays of the number of open segments
     * (md_num_open_segs) and the segments themselves (md_seg_fds).
     */
    int         md_num_open_segs[MAX_FORKNUM + 1];
    struct _MdfdVec *md_seg_fds[MAX_FORKNUM + 1];

    /* if unowned, list link in list of all unowned SMgrRelations */
    struct SMgrRelationData *next_unowned_reln;
} SMgrRelationData;

typedef SMgrRelationData *SMgrRelation;
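As a usage illustration, here is a minimal sketch (not taken from the article's trace) of how a caller reads a block through the cached handle, assuming the PostgreSQL 11-era smgr API; read_one_block is a hypothetical helper, and rnode, blocknum and page are supplied by the caller:

/* Sketch only: smgropen() is just a hash lookup (no I/O, cheap to repeat);
 * smgrread() performs the actual file access. */
static void
read_one_block(RelFileNode rnode, BlockNumber blocknum, char *page)
{
    /* InvalidBackendId marks a regular (non-temp) relation */
    SMgrRelation reln = smgropen(rnode, InvalidBackendId);

    /* read one BLCKSZ-sized block of the main fork into page */
    smgrread(reln, MAIN_FORKNUM, blocknum, page);
}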
RelFileNodeBackend
Augmenting a relfilenode with a backend ID provides all the information needed to locate the physical storage.
/*
 * Augmenting a relfilenode with the backend ID provides all the information
 * we need to locate the physical storage.  The backend ID is InvalidBackendId
 * for regular relations (those accessible to more than one backend), or the
 * owning backend's ID for backend-local relations.  Backend-local relations
 * are always transient and removed in case of a database crash; they are
 * never WAL-logged or fsync'd.
 */
typedef struct RelFileNodeBackend
{
    RelFileNode node;
    BackendId   backend;
} RelFileNodeBackend;
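The "backend-local" test described above is a one-line macro in relfilenode.h:

/* From relfilenode.h: temp relations carry their owning backend's ID */
#define RelFileNodeBackendIsTemp(rnode) \
    ((rnode).backend != InvalidBackendId)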
StrategyControl
The shared freelist control information.
/*
 * The shared freelist control information.
 */
typedef struct
{
    /* Spinlock: protects the values below */
    slock_t     buffer_strategy_lock;

    /*
     * Clock sweep hand: index of next buffer to consider grabbing. Note that
     * this isn't a concrete buffer - we only ever increase the value. So, to
     * get an actual buffer, it needs to be used modulo NBuffers.
     */
    pg_atomic_uint32 nextVictimBuffer;

    int         firstFreeBuffer;    /* Head of list of unused buffers */
    int         lastFreeBuffer; /* Tail of list of unused buffers */

    /*
     * NOTE: lastFreeBuffer is undefined when firstFreeBuffer is -1 (that is,
     * when the list is empty)
     */

    /*
     * Statistics.  These counters should be wide enough that they can't
     * overflow during a single bgwriter cycle.
     */
    uint32      completePasses; /* Complete cycles of the clock sweep */
    pg_atomic_uint32 numBufferAllocs;   /* Buffers allocated since last reset */

    /*
     * Bgworker process to be notified upon activity or -1 if none. See
     * StrategyNotifyBgWriter.
     */
    int         bgwprocno;
} BufferStrategyControl;

/* Pointers to shared state */
static BufferStrategyControl *StrategyControl = NULL;
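The "modulo NBuffers" remark is implemented by ClockSweepTick() in freelist.c. The following is a simplified sketch; the real function additionally handles the wraparound carefully (trying to push nextVictimBuffer back down, and incrementing completePasses under buffer_strategy_lock) so that the bgwriter statistics stay accurate:

/* Simplified sketch of ClockSweepTick() in freelist.c */
static inline uint32
ClockSweepTick(void)
{
    uint32      victim;

    /* atomically advance the clock hand; no spinlock needed */
    victim = pg_atomic_fetch_add_u32(&StrategyControl->nextVictimBuffer, 1);

    if (victim >= (uint32) NBuffers)
        victim = victim % NBuffers; /* wraparound/completePasses handling elided */

    return victim;
}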
StrategyGetBuffer is called by the bufmgr, from BufferAlloc(), to obtain the next candidate buffer to use.
Its main processing logic is as follows:
1. Initialize the relevant variables.
2. If the strategy object is not NULL, try to fetch a buffer from the strategy's ring buffer; on success, return that buf.
3. If requested, wake the background writer (bgwriter): force a single read from shared memory, then set the latch based on that value.
4. Count the buffer allocation request, so that the bgwriter can estimate the rate of buffer consumption.
5. Check whether there are buffers on the freelist:
5.1 If there are, run the usability checks; on success, return that buf.
5.2 If there are none:
5.2.1 Pick a buffer with the clock-sweep algorithm and run the same checks; on success, return that buf.
5.2.2 If no buffer can be obtained after trycounter attempts, raise an error.
/*
 * StrategyGetBuffer
 *
 *  Called by the bufmgr to get the next candidate buffer to use in
 *  BufferAlloc(). The only hard requirement BufferAlloc() has is that
 *  the selected buffer must not currently be pinned by anyone.
 *
 *  strategy is a BufferAccessStrategy object, or NULL for default strategy.
 *
 *  To ensure that no one else can pin the buffer before we do, we must
 *  return the buffer with the buffer header spinlock still held.
 */
BufferDesc *
StrategyGetBuffer(BufferAccessStrategy strategy, uint32 *buf_state)
{
    BufferDesc *buf;
    int         bgwprocno;
    int         trycounter;
    uint32      local_buf_state;    /* to avoid repeated (de-)referencing */

    /*
     * If given a strategy object, see whether it can select a buffer. We
     * assume strategy objects don't need buffer_strategy_lock.
     */
    if (strategy != NULL)
    {
        buf = GetBufferFromRing(strategy, buf_state);
        if (buf != NULL)
            return buf;
    }

    /*
     * If asked, we need to waken the bgwriter. Since we don't want to rely on
     * a spinlock for this we force a read from shared memory once, and then
     * set the latch based on that value. We need to go through that length
     * because otherwise bgprocno might be reset while/after we check because
     * the compiler might just reread from memory.
     *
     * This can possibly set the latch of the wrong process if the bgwriter
     * dies in the wrong moment. But since PGPROC->procLatch is never
     * deallocated the worst consequence of that is that we set the latch of
     * some arbitrary process.
     */
    bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
    if (bgwprocno != -1)
    {
        /* reset bgwprocno first, before setting the latch */
        StrategyControl->bgwprocno = -1;

        /*
         * Not acquiring ProcArrayLock here which is slightly icky. It's
         * actually fine because procLatch isn't ever freed, so we just can
         * potentially set the wrong process' (or no process') latch.
         */
        SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
    }

    /*
     * We count buffer allocation requests so that the bgwriter can estimate
     * the rate of buffer consumption.  Note that buffers recycled by a
     * strategy object are intentionally not counted here.
     */
    pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);

    /*
     * First check, without acquiring the lock, whether there's buffers in the
     * freelist. Since we otherwise don't require the spinlock in every
     * StrategyGetBuffer() invocation, it'd be sad to acquire it here -
     * uselessly in most cases. That obviously leaves a race where a buffer is
     * put on the freelist but we don't see the store yet - but that's pretty
     * harmless, it'll just get used during the next buffer acquisition.
     *
     * If there's buffers on the freelist, acquire the spinlock to pop one
     * buffer of the freelist. Then check whether that buffer is usable and
     * repeat if not.
     *
     * Note that the freeNext fields are considered to be protected by the
     * buffer_strategy_lock not the individual buffer spinlocks, so it's OK to
     * manipulate them without holding the spinlock.
     */
    if (StrategyControl->firstFreeBuffer >= 0)
    {
        while (true)
        {
            /* Acquire the spinlock to remove element from the freelist */
            SpinLockAcquire(&StrategyControl->buffer_strategy_lock);

            if (StrategyControl->firstFreeBuffer < 0)
            {
                SpinLockRelease(&StrategyControl->buffer_strategy_lock);
                break;
            }

            buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
            Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);

            /* Unconditionally remove buffer from freelist */
            StrategyControl->firstFreeBuffer = buf->freeNext;
            buf->freeNext = FREENEXT_NOT_IN_LIST;

            /*
             * Release the lock so someone else can access the freelist while
             * we check out this buffer.
             */
            SpinLockRelease(&StrategyControl->buffer_strategy_lock);

            /*
             * If the buffer is pinned or has a nonzero usage_count, we cannot
             * use it; discard it and retry.  (This can only happen if VACUUM
             * put a valid buffer in the freelist and then someone else used
             * it before we got to it.  It's probably impossible altogether as
             * of 8.3, but we'd better check anyway.)
             */
            local_buf_state = LockBufHdr(buf);
            if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
                && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
            {
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                *buf_state = local_buf_state;
                return buf;
            }
            UnlockBufHdr(buf, local_buf_state);
        }
    }

    /* Nothing on the freelist, so run the "clock sweep" algorithm */
    /* (NBuffers is the global buffer count, declared in globals.c as
     * "int NBuffers = 1000;" and set from shared_buffers at startup) */
    trycounter = NBuffers;
    for (;;)
    {
        buf = GetBufferDescriptor(ClockSweepTick());

        /*
         * If the buffer is pinned or has a nonzero usage_count, we cannot use
         * it; decrement the usage_count (unless pinned) and keep scanning.
         */
        local_buf_state = LockBufHdr(buf);

        if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
        {
            if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
            {
                local_buf_state -= BUF_USAGECOUNT_ONE;
                trycounter = NBuffers;
            }
            else
            {
                /* Found a usable buffer */
                if (strategy != NULL)
                    AddBufferToRing(strategy, buf);
                *buf_state = local_buf_state;
                return buf;
            }
        }
        else if (--trycounter == 0)
        {
            /*
             * We've scanned all the buffers without making any state changes,
             * so all the buffers are pinned (or were when we looked at them).
             * We could hope that someone will free one eventually, but it's
             * probably better to fail than to risk getting stuck in an
             * infinite loop.
             */
            UnlockBufHdr(buf, local_buf_state);
            elog(ERROR, "no unpinned buffers available");
        }
        UnlockBufHdr(buf, local_buf_state);
    }
}
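Since the function returns with the buffer header spinlock still held (BM_LOCKED set in *buf_state), the caller must release it. Condensed from the call site in BufferAlloc() (bufmgr.c), which is also where the gdb session below ends up:

/* Condensed from BufferAlloc() in bufmgr.c */
buf = StrategyGetBuffer(strategy, &buf_state);

Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);

/* Must copy buffer flags while we still hold the spinlock */
oldFlags = buf_state & BUF_FLAG_MASK;

/* Pin the buffer and then release the buffer spinlock */
PinBuffer_Locked(buf);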
Test script; query a data table:
10:01:54 (xdb@[local]:5432)testdb=# select * from t1 limit 10;
Start gdb and set a breakpoint:
(gdb) 
Continuing.

Breakpoint 1, StrategyGetBuffer (strategy=0x0, buf_state=0x7ffcc97fb4ec) at freelist.c:212
212         if (strategy != NULL)
(gdb)
Input parameters:
strategy=NULL: the strategy object; NULL means the default strategy is used.
(gdb) p *buf_state
$1 = 0
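For contrast with the default strategy used here, the following sketch shows how a caller would supply a non-default strategy so that StrategyGetBuffer() takes the GetBufferFromRing() path instead; scan_with_ring is a hypothetical helper, and rel and nblocks come from the caller:

/* Sketch: a bulk scan that recycles buffers through a small private ring.
 * BAS_BULKREAD is the ring type used for large sequential scans; VACUUM
 * uses BAS_VACUUM. */
static void
scan_with_ring(Relation rel, BlockNumber nblocks)
{
    BufferAccessStrategy strategy = GetAccessStrategy(BAS_BULKREAD);
    BlockNumber blkno;

    for (blkno = 0; blkno < nblocks; blkno++)
    {
        Buffer      buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno,
                                             RBM_NORMAL, strategy);

        /* ... inspect the page ... */
        ReleaseBuffer(buf);
    }
    FreeAccessStrategy(strategy);
}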
1. Initialize the relevant variables.
2. If the strategy object is not NULL, fetch a buffer from the strategy's ring buffer and return it on success (skipped in this trace, since strategy is NULL).
3. If requested, wake the bgwriter: force a single read from shared memory, then set the latch based on that value.
(gdb) n
231         bgwprocno = INT_ACCESS_ONCE(StrategyControl->bgwprocno);
(gdb) 
232         if (bgwprocno != -1)
(gdb) 
235             StrategyControl->bgwprocno = -1;
(gdb) p bgwprocno
$2 = 112
(gdb) p StrategyControl
$3 = (BufferStrategyControl *) 0x7f8607b21700
(gdb) p *StrategyControl
$4 = {buffer_strategy_lock = 0 '\000', nextVictimBuffer = {value = 0}, firstFreeBuffer = 134, 
  lastFreeBuffer = 65535, completePasses = 0, numBufferAllocs = {value = 0}, bgwprocno = 112}
(gdb) n
242             SetLatch(&ProcGlobal->allProcs[bgwprocno].procLatch);
(gdb)
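"Force a read from shared memory once" in step 3 is exactly what the INT_ACCESS_ONCE macro at the top of freelist.c does:

/* From freelist.c: the volatile cast forces a single memory access,
 * so the compiler cannot re-read bgwprocno after the check */
#define INT_ACCESS_ONCE(var)    ((int)(*((volatile int *)&(var))))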
4. Count the buffer allocation request, so that the bgwriter can estimate the rate of buffer consumption.
(gdb) 
250         pg_atomic_fetch_add_u32(&StrategyControl->numBufferAllocs, 1);
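The consumer of this counter is the bgwriter: BgBufferSync() in bufmgr.c periodically calls StrategySyncStart() in freelist.c, which returns the current clock-hand position and hands back (and resets) the allocation count. Roughly, on the bgwriter side:

/* Sketch of the bgwriter side: read clock-sweep position and counters */
uint32      recent_alloc;
uint32      complete_passes;
int         strategy_buf_id;

strategy_buf_id = StrategySyncStart(&complete_passes, &recent_alloc);
/* recent_alloc now holds numBufferAllocs since the previous call */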
5. Check whether there are buffers on the freelist.
(gdb) 
268         if (StrategyControl->firstFreeBuffer >= 0)
5.1 There are, so run the usability checks; on success, return that buf.
(gdb) n
273             SpinLockAcquire(&StrategyControl->buffer_strategy_lock);
(gdb) 
275             if (StrategyControl->firstFreeBuffer < 0)
(gdb) 
281             buf = GetBufferDescriptor(StrategyControl->firstFreeBuffer);
(gdb) 
282             Assert(buf->freeNext != FREENEXT_NOT_IN_LIST);
(gdb) p *buf
$5 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, 
    blockNum = 4294967295}, buf_id = 134, state = {value = 0}, wait_backend_pid = 0, freeNext = 135, 
  content_lock = {tranche = 54, state = {value = 536870912}, waiters = {head = 2147483647, 
      tail = 2147483647}}}
(gdb) n
285             StrategyControl->firstFreeBuffer = buf->freeNext;
(gdb) 
286             buf->freeNext = FREENEXT_NOT_IN_LIST;
(gdb) 
292             SpinLockRelease(&StrategyControl->buffer_strategy_lock);
(gdb) 
301             local_buf_state = LockBufHdr(buf);
(gdb) 
302             if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
(gdb) 
303                 && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
(gdb) 
305                 if (strategy != NULL)
(gdb) 
307                 *buf_state = local_buf_state;
(gdb) 
308                 return buf;
(gdb) p *buf_state
$6 = 4194304
(gdb) p *buf
$7 = {tag = {rnode = {spcNode = 0, dbNode = 0, relNode = 0}, forkNum = InvalidForkNumber, 
    blockNum = 4294967295}, buf_id = 134, state = {value = 4194304}, wait_backend_pid = 0, 
  freeNext = -2, content_lock = {tranche = 54, state = {value = 536870912}, waiters = {
      head = 2147483647, tail = 2147483647}}}
(gdb)
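For completeness, the inverse operation is StrategyFreeBuffer(), condensed from freelist.c. Buffers end up on the freelist either at startup, when all of them are linked in order (which is why buffer 134's freeNext was 135 above), or through this function, e.g. from InvalidateBuffer() when a relation is dropped:

/* Condensed from freelist.c: push a buffer onto the head of the freelist */
void
StrategyFreeBuffer(BufferDesc *buf)
{
    SpinLockAcquire(&StrategyControl->buffer_strategy_lock);

    /*
     * It is possible that we are told to put something in the freelist that
     * is already in it; don't screw up the list if so.
     */
    if (buf->freeNext == FREENEXT_NOT_IN_LIST)
    {
        buf->freeNext = StrategyControl->firstFreeBuffer;
        if (buf->freeNext < 0)
            StrategyControl->lastFreeBuffer = buf->buf_id;
        StrategyControl->firstFreeBuffer = buf->buf_id;
    }

    SpinLockRelease(&StrategyControl->buffer_strategy_lock);
}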
The result is returned, and control goes back to BufferAlloc:
(gdb) n
358     }
(gdb) 
BufferAlloc (smgr=0x22a38a0, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=0, 
    strategy=0x0, foundPtr=0x7ffcc97fb5c3) at bufmgr.c:1073
1073        Assert(BUF_STATE_GET_REFCOUNT(buf_state) == 0);
(gdb)
That is all for "What does the StrategyGetBuffer function do in PostgreSQL?"; thanks for reading. If you want to learn more industry-related knowledge, you can follow the 億速云 (Yisu Cloud) website, where the editor will keep publishing more high-quality practical articles!