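This article analyzes the HTAB data structure in PostgreSQL. It first lists the structure definitions from the source code, then gives a test script, and finally follows a gdb session that traces how a hash value is mapped to a bucket.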
/*
 * Top control structure for a hashtable --- in a shared table, each backend
 * has its own copy (OK since no fields change at runtime)
 */
struct HTAB
{
    HASHHDR    *hctl;           /* => shared control information */
    HASHSEGMENT *dir;           /* directory of segment starts */
    HashValueFunc hash;         /* hash function */
    HashCompareFunc match;      /* key comparison function */
    HashCopyFunc keycopy;       /* key copying function */
    HashAllocFunc alloc;        /* memory allocator */
    MemoryContext hcxt;         /* memory context if default allocator used */
    char       *tabname;        /* table name (for error messages) */
    bool        isshared;       /* true if table is in shared memory */
    bool        isfixed;        /* if true, don't enlarge */

    /* freezing a shared table isn't allowed, so we can keep state here */
    bool        frozen;         /* true = no more inserts allowed */

    /* We keep local copies of these fixed values to reduce contention */
    Size        keysize;        /* hash key length in bytes */
    long        ssize;          /* segment size --- must be power of 2 */
    int         sshift;         /* segment shift = log2(ssize) */
};

/*
 * Header structure for a hash table --- contains all changeable info
 *
 * In a shared-memory hash table, the HASHHDR is in shared memory, while
 * each backend has a local HTAB struct.  For a non-shared table, there isn't
 * any functional difference between HASHHDR and HTAB, but we separate them
 * anyway to share code between shared and non-shared tables.
 */
struct HASHHDR
{
    /*
     * The freelist can become a point of contention in high-concurrency hash
     * tables, so we use an array of freelists, each with its own mutex and
     * nentries count, instead of just a single one.  Although the freelists
     * normally operate independently, we will scavenge entries from freelists
     * other than a hashcode's default freelist when necessary.
     *
     * If the hash table is not partitioned, only freeList[0] is used and its
     * spinlock is not used at all; callers' locking is assumed sufficient.
     */
    /* Number of freelists to be used for a partitioned hash table. */
    /* #define NUM_FREELISTS 32 */
    FreeListData freeList[NUM_FREELISTS];

    /* These fields can change, but not in a partitioned table */
    /* Also, dsize can't change in a shared table, even if unpartitioned */
    long        dsize;          /* directory size */
    long        nsegs;          /* number of allocated segments (<= dsize) */
    uint32      max_bucket;     /* ID of maximum bucket in use */
    uint32      high_mask;      /* mask to modulo into entire table */
    uint32      low_mask;       /* mask to modulo into lower half of table */

    /* These fields are fixed at hashtable creation */
    Size        keysize;        /* hash key length in bytes */
    Size        entrysize;      /* total user element size in bytes */
    long        num_partitions; /* # partitions (must be power of 2), or 0 */
    long        ffactor;        /* target fill factor */
    long        max_dsize;      /* 'dsize' limit if directory is fixed size */
    long        ssize;          /* segment size --- must be power of 2 */
    int         sshift;         /* segment shift = log2(ssize) */
    int         nelem_alloc;    /* number of entries to allocate at once */

#ifdef HASH_STATISTICS
    /*
     * Count statistics here.  NB: stats code doesn't bother with mutex, so
     * counts could be corrupted a bit in a partitioned table.
     */
    long        accesses;
    long        collisions;
#endif
};

/*
 * Per-freelist data.
 *
 * In a partitioned hash table, each freelist is associated with a specific
 * set of hashcodes, as determined by the FREELIST_IDX() macro below.
 * nentries tracks the number of live hashtable entries having those hashcodes
 * (NOT the number of entries in the freelist, as you might expect).
 *
 * The coverage of a freelist might be more or less than one partition, so it
 * needs its own lock rather than relying on caller locking.  Relying on that
 * wouldn't work even if the coverage was the same, because of the occasional
 * need to "borrow" entries from another freelist; see get_hash_entry().
 *
 * Using an array of FreeListData instead of separate arrays of mutexes,
 * nentries and freeLists helps to reduce sharing of cache lines between
 * different mutexes.
 */
typedef struct
{
    slock_t     mutex;          /* spinlock for this freelist */
    long        nentries;       /* number of entries in associated buckets */
    HASHELEMENT *freeList;      /* chain of free elements */
} FreeListData;

/*
 * HASHELEMENT is the private part of a hashtable entry.  The caller's data
 * follows the HASHELEMENT structure (on a MAXALIGN'd boundary).  The hash key
 * is expected to be at the start of the caller's hash entry data structure.
 */
typedef struct HASHELEMENT
{
    struct HASHELEMENT *link;   /* link to next entry in same bucket */
    uint32      hashvalue;      /* hash function result for this entry */
} HASHELEMENT;

/* Hash table header struct is an opaque type known only within dynahash.c */
typedef struct HASHHDR HASHHDR;

/* Hash table control struct is an opaque type known only within dynahash.c */
typedef struct HTAB HTAB;

/* Parameter data structure for hash_create */
/* Only those fields indicated by hash_flags need be set */
typedef struct HASHCTL
{
    long        num_partitions; /* # partitions (must be power of 2) */
    long        ssize;          /* segment size */
    long        dsize;          /* (initial) directory size */
    long        max_dsize;      /* limit to dsize if dir size is limited */
    long        ffactor;        /* fill factor */
    Size        keysize;        /* hash key length in bytes */
    Size        entrysize;      /* total user element size in bytes */
    /* the function pointers below have the same meaning as in HTAB above */
    HashValueFunc hash;         /* hash function */
    HashCompareFunc match;      /* key comparison function */
    HashCopyFunc keycopy;       /* key copying function */
    HashAllocFunc alloc;        /* memory allocator */
    MemoryContext hcxt;         /* memory context to use for allocations */
    HASHHDR    *hctl;           /* location of header in shared mem */
} HASHCTL;

/* A hash bucket is a linked list of HASHELEMENTs */
typedef HASHELEMENT *HASHBUCKET;

/* A hash segment is an array of bucket headers */
typedef HASHBUCKET *HASHSEGMENT;

/*
 * Hash functions must have this signature.
 */
typedef uint32 (*HashValueFunc) (const void *key, Size keysize);

/*
 * Key comparison functions must have this signature.  Comparison functions
 * return zero for match, nonzero for no match.  (The comparison function
 * definition is designed to allow memcmp() and strncmp() to be used directly
 * as key comparison functions.)
 */
typedef int (*HashCompareFunc) (const void *key1, const void *key2,
                                Size keysize);

/*
 * Key copying functions must have this signature.  The return value is not
 * used.  (The definition is set up to allow memcpy() and strlcpy() to be
 * used directly.)
 */
typedef void *(*HashCopyFunc) (void *dest, const void *src, Size keysize);

/*
 * Space allocation function for a hashtable --- designed to match malloc().
 * Note: there is no free function API; can't destroy a hashtable unless you
 * use the default allocator.
 */
typedef void *(*HashAllocFunc) (Size request);
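The three pointer levels declared above (directory, segment, bucket chain) are easier to see in a small standalone model. The following is only a sketch under stated assumptions, not PostgreSQL code: every `Toy`-prefixed name is hypothetical, and the key is embedded directly in the element for brevity, whereas dynahash places the caller's entry after the HASHELEMENT header on a MAXALIGN'd boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_KEYSIZE 16

/* Hypothetical stand-in for HASHELEMENT, with the key embedded for brevity. */
typedef struct ToyElement
{
    struct ToyElement *link;    /* next entry in the same bucket */
    uint32_t    hashvalue;      /* cached hash of the key */
    char        key[TOY_KEYSIZE];
} ToyElement;

typedef ToyElement *ToyBucket;  /* a bucket is the head of a chain */
typedef ToyBucket *ToySegment;  /* a segment is an array of bucket heads */

/*
 * Find the entry for key in the given bucket, or return NULL.
 * dir is the directory (an array of segments); ssize must equal 1 << sshift.
 * The key comparison follows the HashCompareFunc convention: zero means match.
 */
static ToyElement *
toy_lookup(ToySegment *dir, int sshift, long ssize, uint32_t bucket,
           uint32_t hashvalue, const void *key, size_t keysize)
{
    ToySegment  segp;
    ToyElement *cur;

    assert(ssize == (1L << sshift));
    assert(keysize <= TOY_KEYSIZE);

    segp = dir[bucket >> sshift];       /* directory slot -> segment */
    if (segp == NULL)
        return NULL;                    /* segment not allocated */

    /* segment slot -> chain head, then follow the link pointers */
    for (cur = segp[bucket & (ssize - 1)]; cur != NULL; cur = cur->link)
    {
        /* compare the cheap cached hash first, the full key second */
        if (cur->hashvalue == hashvalue &&
            memcmp(cur->key, key, keysize) == 0)
            return cur;
    }
    return NULL;
}
```

Comparing the cached `hashvalue` before the key mirrors the purpose of that field in HASHELEMENT: most non-matching chain entries are rejected without touching the key bytes.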
The structure is shown in the figure below:
Test script
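-- Generate DROP statements for leftover tables, close the file, then run it
\pset footer off
\pset tuples_only on
\o /tmp/drop.sql
SELECT 'drop table if exists tbl' || id || ' ;' as "--" FROM generate_series(1, 20000) AS id;
\o
\i /tmp/drop.sql

-- Generate CREATE statements, then run them inside one open transaction
-- so that the locks on the new tables stay held
\pset footer off
\pset tuples_only on
\o /tmp/create.sql
SELECT 'CREATE TABLE tbl' || id || ' (id int);' as "--" FROM generate_series(1, 10000) AS id;
\o /tmp/ret.txt
begin;
\i /tmp/create.sql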
Trace analysis
...
HASHSEGMENT *dir --> HASHELEMENT ***dir;
dir --> HASHELEMENT ***

(gdb) p *hctl
$1 = {freeList = {
    {mutex = 0 '\000', nentries = 312, freeList = 0x7fd906ab84c0},
    {mutex = 0 '\000', nentries = 298, freeList = 0x7fd907097c40},
    {mutex = 0 '\000', nentries = 292, freeList = 0x7fd906ac2520},
    {mutex = 0 '\000', nentries = 321, freeList = 0x7fd906ac8120},
    {mutex = 0 '\000', nentries = 341, freeList = 0x7fd907229980},
    {mutex = 0 '\000', nentries = 334, freeList = 0x7fd906ad3f08},
    {mutex = 0 '\000', nentries = 316, freeList = 0x7fd906ad6fb8},
    {mutex = 0 '\000', nentries = 299, freeList = 0x7fd906ade550},
    {mutex = 0 '\000', nentries = 328, freeList = 0x7fd906ae1600},
    {mutex = 0 '\000', nentries = 328, freeList = 0x7fd906ae62e8},
    {mutex = 0 '\000', nentries = 308, freeList = 0x7fd906aeb660},
    {mutex = 0 '\000', nentries = 327, freeList = 0x7fd90706f338},
    {mutex = 0 '\000', nentries = 346, freeList = 0x7fd906af6bc0},
    {mutex = 0 '\000', nentries = 323, freeList = 0x7fd907237bc0},
    {mutex = 0 '\000', nentries = 304, freeList = 0x7fd9071ddb40},
    {mutex = 0 '\000', nentries = 311, freeList = 0x7fd906b06238},
    {mutex = 0 '\000', nentries = 292, freeList = 0x7fd90707b620},
    {mutex = 0 '\000', nentries = 303, freeList = 0x7fd90723dd20},
    {mutex = 0 '\000', nentries = 302, freeList = 0x7fd906b137e0},
    {mutex = 0 '\000', nentries = 307, freeList = 0x7fd9070873c8},
    {mutex = 0 '\000', nentries = 314, freeList = 0x7fd90723bb68},
    {mutex = 0 '\000', nentries = 279, freeList = 0x7fd906b22678},
    {mutex = 0 '\000', nentries = 297, freeList = 0x7fd907073e08},
    {mutex = 0 '\000', nentries = 309, freeList = 0x7fd90721f888},
    {mutex = 0 '\000', nentries = 317, freeList = 0x7fd906b33880},
    {mutex = 0 '\000', nentries = 283, freeList = 0x7fd907086168},
    {mutex = 0 '\000', nentries = 331, freeList = 0x7fd906b3d838},
    {mutex = 0 '\000', nentries = 330, freeList = 0x7fd906b41f38},
    {mutex = 0 '\000', nentries = 313, freeList = 0x7fd906b46440},
    {mutex = 0 '\000', nentries = 304, freeList = 0x7fd906b4b5c0},
    {mutex = 0 '\000', nentries = 310, freeList = 0x7fd90720ed80},
    {mutex = 0 '\000', nentries = 323, freeList = 0x7fd906b575a0}},
  dsize = 256, nsegs = 16, max_bucket = 4095, high_mask = 8191, low_mask = 4095,
  keysize = 16, entrysize = 152, num_partitions = 16, ffactor = 1, max_dsize = 256,
  ssize = 256, sshift = 8, nelem_alloc = 48}
(gdb) p *hashp
$2 = {hctl = 0x7fd906aae980, dir = 0x7fd906aaecd8, hash = 0xa79ac6 <tag_hash>,
  match = 0x47cb70 <memcmp@plt>, keycopy = 0x47d0a0 <memcpy@plt>,
  alloc = 0x8c3419 <ShmemAllocNoError>, hcxt = 0x0, tabname = 0x160f1d0 "LOCK hash",
  isshared = true, isfixed = false, frozen = false, keysize = 16, ssize = 256, sshift = 8}
(gdb) p *hashp->dir
$3 = (HASHSEGMENT) 0x7fd906aaf500
(gdb) p hashp->dir
$4 = (HASHSEGMENT *) 0x7fd906aaecd8
(gdb) p **hashp->dir
$5 = (HASHBUCKET) 0x7fd907212dd0
(gdb) p ***hashp->dir
$6 = {link = 0x7fd9071a7b90, hashvalue = 1748602880}
(gdb) n
949       if (action == HASH_ENTER || action == HASH_ENTER_NULL)
(gdb)
956         if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
(gdb)
965       bucket = calc_bucket(hctl, hashvalue);   --> compute the hash bucket
(gdb)
967       segment_num = bucket >> hashp->sshift;   --> bucket number shifted right by 8 bits gives the segment number
(gdb)
968       segment_ndx = MOD(bucket, hashp->ssize); --> bucket number modulo ssize gives the offset within the segment
(gdb)
970       segp = hashp->dir[segment_num];          --> fetch the segment (HASHELEMENT **)
(gdb)
972       if (segp == NULL)
(gdb) p bucket
$7 = 2072
(gdb) p segment_num
$8 = 8
(gdb) p segment_ndx
$9 = 24
(gdb) p segp
$10 = (HASHSEGMENT) 0x7fd906ab3500
(gdb)
(gdb) n
975       prevBucketPtr = &segp[segment_ndx];      --> HASHELEMENT **
(gdb)
976       currBucket = *prevBucketPtr;             --> HASHELEMENT *
(gdb)
981       match = hashp->match;       /* save one fetch in inner loop */
(gdb) p prevBucketPtr
$12 = (HASHBUCKET *) 0x7fd906ab35c0
(gdb) p currBucket
$13 = (HASHBUCKET) 0x7fd90714da68
(gdb)
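The arithmetic in the trace can be replayed outside the debugger. The sketch below is a standalone approximation rather than the dynahash.c source: `ToyHeader`, `toy_calc_bucket` and `toy_locate` are hypothetical names, and the struct carries only the header fields the computation reads. The masking logic is assumed to follow `calc_bucket()` (mask with `high_mask`, fall back to `low_mask` when the result exceeds `max_bucket`), and `MOD(x, y)` is assumed to be `x & (y - 1)`, which is valid because `ssize` is a power of 2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical subset of HASHHDR: only what the bucket computation needs. */
typedef struct
{
    uint32_t    max_bucket;     /* ID of maximum bucket in use */
    uint32_t    high_mask;      /* mask to modulo into entire table */
    uint32_t    low_mask;       /* mask to modulo into lower half of table */
    long        ssize;          /* segment size, a power of 2 */
    int         sshift;         /* log2(ssize) */
} ToyHeader;

/* Map a hash value to a bucket number in the range 0 .. max_bucket. */
static uint32_t
toy_calc_bucket(const ToyHeader *h, uint32_t hashvalue)
{
    uint32_t    bucket = hashvalue & h->high_mask;

    /* bucket not yet in use: fold it into the lower half of the table */
    if (bucket > h->max_bucket)
        bucket &= h->low_mask;
    return bucket;
}

/* Split a bucket number into a directory index and a slot in that segment. */
static void
toy_locate(const ToyHeader *h, uint32_t bucket,
           long *segment_num, long *segment_ndx)
{
    assert(h->ssize == (1L << h->sshift));
    *segment_num = (long) (bucket >> h->sshift);
    *segment_ndx = (long) (bucket & (uint32_t) (h->ssize - 1));
}
```

Fed with the header values printed in `$1` (`max_bucket = 4095`, `high_mask = 8191`, `low_mask = 4095`, `ssize = 256`, `sshift = 8`), bucket 2072 splits into segment 8 and slot 24, matching `$8` and `$9`. The hash value 1748602880 from `$6` masks to 4096, which exceeds `max_bucket` and therefore folds to bucket 0, consistent with that element being reached through `***hashp->dir`. The pointer values agree as well: with 8-byte pointers, `&segp[24]` is `0x7fd906ab3500 + 24 * 8 = 0x7fd906ab35c0`, the value shown in `$12`.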
That concludes this analysis of the HTAB data structure in PostgreSQL. How the structure behaves in a particular workload is best confirmed by repeating the trace yourself.