您好,登錄后才能下訂單哦!
本節簡單介紹了PostgreSQL緩存管理(Buffer Manager)中的實現函數ReadBuffer_common->BufferAlloc->BufTableHashCode,該函數根據BufferTag計算Hash Code。
BufferDesc
共享緩沖區的共享描述符(狀態)數據
/*
* Flags for buffer descriptors
* buffer描述器標記
*
* Note: TAG_VALID essentially means that there is a buffer hashtable
* entry associated with the buffer's tag.
* 注意:TAG_VALID本質上意味著有一個與緩沖區的標記相關聯的緩沖區散列表條目。
*/
//buffer header鎖定
#define BM_LOCKED (1U << 22) /* buffer header is locked */
//數據需要寫入(標記為DIRTY)
#define BM_DIRTY (1U << 23) /* data needs writing */
//數據是有效的
#define BM_VALID (1U << 24) /* data is valid */
//已分配buffer tag
#define BM_TAG_VALID (1U << 25) /* tag is assigned */
//正在R/W
#define BM_IO_IN_PROGRESS (1U << 26) /* read or write in progress */
//上一個I/O出現錯誤
#define BM_IO_ERROR (1U << 27) /* previous I/O failed */
//開始寫則變DIRTY
#define BM_JUST_DIRTIED (1U << 28) /* dirtied since write started */
//存在等待sole pin的其他進程
#define BM_PIN_COUNT_WAITER (1U << 29) /* have waiter for sole pin */
//checkpoint發生,必須刷到磁盤上
#define BM_CHECKPOINT_NEEDED (1U << 30) /* must write for checkpoint */
//持久化buffer(不是unlogged或者初始化fork)
#define BM_PERMANENT (1U << 31) /* permanent buffer (not unlogged,
* or init fork) */
/*
* BufferDesc -- shared descriptor/state data for a single shared buffer.
* BufferDesc -- 共享緩沖區的共享描述符(狀態)數據
*
* Note: Buffer header lock (BM_LOCKED flag) must be held to examine or change
* the tag, state or wait_backend_pid fields. In general, buffer header lock
* is a spinlock which is combined with flags, refcount and usagecount into
* single atomic variable. This layout allow us to do some operations in a
* single atomic operation, without actually acquiring and releasing spinlock;
* for instance, increase or decrease refcount. buf_id field never changes
* after initialization, so does not need locking. freeNext is protected by
* the buffer_strategy_lock not buffer header lock. The LWLock can take care
* of itself. The buffer header lock is *not* used to control access to the
* data in the buffer!
* 注意:必須持有Buffer header鎖(BM_LOCKED標記)才能檢查或修改tag/state/wait_backend_pid字段.
* 通常來說,buffer header lock是spinlock,它與標記位/參考計數/使用計數組合到單個原子變量中.
* 這個布局設計允許我們執行原子操作,而不需要實際獲得或者釋放spinlock(比如,增加或者減少參考計數).
* buf_id字段在初始化后不會出現變化,因此不需要鎖定.
* freeNext通過buffer_strategy_lock鎖而不是buffer header lock保護.
* LWLock可以很好的處理自己的狀態.
* 務請注意的是:buffer header lock不用于控制buffer中的數據訪問!
*
* It's assumed that nobody changes the state field while buffer header lock
* is held. Thus buffer header lock holder can do complex updates of the
* state variable in single write, simultaneously with lock release (cleaning
* BM_LOCKED flag). On the other hand, updating of state without holding
* buffer header lock is restricted to CAS, which insure that BM_LOCKED flag
* is not set. Atomic increment/decrement, OR/AND etc. are not allowed.
* 假定在持有buffer header lock的情況下,沒有人改變狀態字段.
* 持有buffer header lock的進程可以執行在單個寫操作中執行復雜的狀態變量更新,
* 同步的釋放鎖(清除BM_LOCKED標記).
* 換句話說,如果沒有持有buffer header lock的狀態更新,會受限于CAS,
* 這種情況下確保BM_LOCKED沒有被設置.
* 比如原子的增加/減少(AND/OR)等操作是不允許的.
*
* An exception is that if we have the buffer pinned, its tag can't change
* underneath us, so we can examine the tag without locking the buffer header.
* Also, in places we do one-time reads of the flags without bothering to
* lock the buffer header; this is generally for situations where we don't
* expect the flag bit being tested to be changing.
* 一種例外情況是如果我們已有buffer pinned,該buffer的tag不能改變(在本進程之下),
* 因此不需要鎖定buffer header就可以檢查tag了.
* 同時,在執行一次性的flags讀取時不需要鎖定buffer header.
* 這種情況通常用于我們不希望正在測試的flag bit將被改變.
*
* We can't physically remove items from a disk page if another backend has
* the buffer pinned. Hence, a backend may need to wait for all other pins
* to go away. This is signaled by storing its own PID into
* wait_backend_pid and setting flag bit BM_PIN_COUNT_WAITER. At present,
* there can be only one such waiter per buffer.
* 如果其他進程有buffer pinned,那么進程不能物理的從磁盤頁面中刪除items.
* 因此,后臺進程需要等待其他pins清除.這可以通過存儲它自己的PID到wait_backend_pid中,
* 并設置標記位BM_PIN_COUNT_WAITER.
* 目前,每個緩沖區只能由一個等待進程.
*
* We use this same struct for local buffer headers, but the locks are not
* used and not all of the flag bits are useful either. To avoid unnecessary
* overhead, manipulations of the state field should be done without actual
* atomic operations (i.e. only pg_atomic_read_u32() and
* pg_atomic_unlocked_write_u32()).
* 本地緩沖頭部使用同樣的結構,但并不需要使用locks,而且并不是所有的標記位都使用.
* 為了避免不必要的負載,狀態域的維護不需要實際的原子操作
* (比如只有pg_atomic_read_u32() and pg_atomic_unlocked_write_u32())
*
* Be careful to avoid increasing the size of the struct when adding or
* reordering members. Keeping it below 64 bytes (the most common CPU
* cache line size) is fairly important for performance.
* 在增加或者記錄成員變量時,小心避免增加結構體的大小.
* 保持結構體大小在64字節內(通常的CPU緩存線大小)對于性能是非常重要的.
*/
typedef struct BufferDesc
{
//buffer tag
BufferTag tag; /* ID of page contained in buffer */
//buffer索引編號(0開始),指向相應的buffer pool slot
int buf_id; /* buffer's index number (from 0) */
/* state of the tag, containing flags, refcount and usagecount */
//tag狀態,包括flags/refcount和usagecount
pg_atomic_uint32 state;
//pin-count等待進程ID
int wait_backend_pid; /* backend PID of pin-count waiter */
//空閑鏈表鏈中下一個空閑的buffer
int freeNext; /* link in freelist chain */
//緩沖區內容鎖
LWLock content_lock; /* to lock access to buffer contents */
} BufferDesc;
BufferTag
Buffer tag標記了buffer存儲的是磁盤中哪個block
/*
* Buffer tag identifies which disk block the buffer contains.
* Buffer tag標記了buffer存儲的是磁盤中哪個block
*
* Note: the BufferTag data must be sufficient to determine where to write the
* block, without reference to pg_class or pg_tablespace entries. It's
* possible that the backend flushing the buffer doesn't even believe the
* relation is visible yet (its xact may have started before the xact that
* created the rel). The storage manager must be able to cope anyway.
* 注意:BufferTag必須足以確定如何寫block而不需要參照pg_class或者pg_tablespace數據字典信息.
* 有可能后臺進程在刷新緩沖區的時候深圳不相信關系是可見的(事務可能在創建rel的事務之前).
* 存儲管理器必須可以處理這些事情.
*
* Note: if there's any pad bytes in the struct, INIT_BUFFERTAG will have
* to be fixed to zero them, since this struct is used as a hash key.
* 注意:如果在結構體中有填充的字節,INIT_BUFFERTAG必須將它們固定為零,因為這個結構體用作散列鍵.
*/
typedef struct buftag
{
//物理relation標識符
RelFileNode rnode; /* physical relation identifier */
ForkNumber forkNum;
//相對于relation起始的塊號
BlockNumber blockNum; /* blknum relative to begin of reln */
} BufferTag;
HTAB
哈希表的頂層控制結構.
/*
* Top control structure for a hashtable --- in a shared table, each backend
* has its own copy (OK since no fields change at runtime)
* 哈希表的頂層控制結構.
* 在這個共享哈希表中,每一個后臺進程都有自己的拷貝
* (之所以沒有問題是因為fork出來后,在運行期沒有字段會變化)
*/
struct HTAB
{
//指向共享的控制信息
HASHHDR *hctl; /* => shared control information */
//段開始目錄
HASHSEGMENT *dir; /* directory of segment starts */
//哈希函數
HashValueFunc hash; /* hash function */
//哈希鍵比較函數
HashCompareFunc match; /* key comparison function */
//哈希鍵拷貝函數
HashCopyFunc keycopy; /* key copying function */
//內存分配器
HashAllocFunc alloc; /* memory allocator */
//內存上下文
MemoryContext hcxt; /* memory context if default allocator used */
//表名(用于錯誤信息)
char *tabname; /* table name (for error messages) */
//如在共享內存中,則為T
bool isshared; /* true if table is in shared memory */
//如為T,則固定大小不能擴展
bool isfixed; /* if true, don't enlarge */
/* freezing a shared table isn't allowed, so we can keep state here */
//不允許凍結共享表,因此這里會保存相關狀態
bool frozen; /* true = no more inserts allowed */
/* We keep local copies of these fixed values to reduce contention */
//保存這些固定值的本地拷貝,以減少沖突
//哈希鍵長度(以字節為單位)
Size keysize; /* hash key length in bytes */
//段大小,必須為2的冪
long ssize; /* segment size --- must be power of 2 */
//段偏移,ssize的對數
int sshift; /* segment shift = log2(ssize) */
};
/*
* Header structure for a hash table --- contains all changeable info
* 哈希表的頭部結構 -- 存儲所有可變信息
*
* In a shared-memory hash table, the HASHHDR is in shared memory, while
* each backend has a local HTAB struct. For a non-shared table, there isn't
* any functional difference between HASHHDR and HTAB, but we separate them
* anyway to share code between shared and non-shared tables.
* 在共享內存哈希表中,HASHHDR位于共享內存中,每一個后臺進程都有一個本地HTAB結構.
* 對于非共享哈希表,HASHHDR和HTAB沒有任何功能性的不同,
* 但無論如何,我們還是把它們區分為共享和非共享表.
*/
struct HASHHDR
{
/*
* The freelist can become a point of contention in high-concurrency hash
* tables, so we use an array of freelists, each with its own mutex and
* nentries count, instead of just a single one. Although the freelists
* normally operate independently, we will scavenge entries from freelists
* other than a hashcode's default freelist when necessary.
* 在高并發的哈希表中,空閑鏈表會成為競爭熱點,因此我們使用空閑鏈表數組,
* 數組中的每一個元素都有自己的mutex和條目統計,而不是使用一個.
*
* If the hash table is not partitioned, only freeList[0] is used and its
* spinlock is not used at all; callers' locking is assumed sufficient.
* 如果哈希表沒有分區,那么只有freelist[0]元素是有用的,自旋鎖沒有任何用處;
* 調用者鎖定被認為已足夠OK.
*/
FreeListData freeList[NUM_FREELISTS];
/* These fields can change, but not in a partitioned table */
//這些域字段可以改變,但不適用于分區表
/* Also, dsize can't change in a shared table, even if unpartitioned */
//同時,就算是非分區表,共享表的dsize也不能改變
//目錄大小
long dsize; /* directory size */
//已分配的段大小(<= dbsize)
long nsegs; /* number of allocated segments (<= dsize) */
//正在使用的最大桶ID
uint32 max_bucket; /* ID of maximum bucket in use */
//進入整個哈希表的模掩碼
uint32 high_mask; /* mask to modulo into entire table */
//進入低于半個哈希表的模掩碼
uint32 low_mask; /* mask to modulo into lower half of table */
/* These fields are fixed at hashtable creation */
//下面這些字段在哈希表創建時已固定
//哈希鍵大小(以字節為單位)
Size keysize; /* hash key length in bytes */
//所有用戶元素大小(以字節為單位)
Size entrysize; /* total user element size in bytes */
//分區個數(2的冪),或者為0
long num_partitions; /* # partitions (must be power of 2), or 0 */
//目標的填充因子
long ffactor; /* target fill factor */
//如目錄是固定大小,則該值為dsize的上限值
long max_dsize; /* 'dsize' limit if directory is fixed size */
//段大小,必須是2的冪
long ssize; /* segment size --- must be power of 2 */
//端偏移,ssize的對數
int sshift; /* segment shift = log2(ssize) */
//一次性分配的條目個數
int nelem_alloc; /* number of entries to allocate at once */
#ifdef HASH_STATISTICS
/*
* Count statistics here. NB: stats code doesn't bother with mutex, so
* counts could be corrupted a bit in a partitioned table.
* 統計信息.
* 注意:統計相關的代碼不會影響mutex,因此對于分區表,統計可能有一點點問題
*/
long accesses;
long collisions;
#endif
};
/*
* HASHELEMENT is the private part of a hashtable entry. The caller's data
* follows the HASHELEMENT structure (on a MAXALIGN'd boundary). The hash key
* is expected to be at the start of the caller's hash entry data structure.
* HASHELEMENT是哈希表條目的私有部分.
* 調用者的數據按照HASHELEMENT結構組織(位于MAXALIGN的邊界).
* 哈希鍵應位于調用者hash條目數據結構的開始位置.
*/
typedef struct HASHELEMENT
{
//鏈接到相同桶中的下一個條目
struct HASHELEMENT *link; /* link to next entry in same bucket */
//該條目的哈希函數結果
uint32 hashvalue; /* hash function result for this entry */
} HASHELEMENT;
/* Hash table header struct is an opaque type known only within dynahash.c */
//哈希表頭部結構,非透明類型,用于dynahash.c
typedef struct HASHHDR HASHHDR;
/* Hash table control struct is an opaque type known only within dynahash.c */
//哈希表控制結構,非透明類型,用于dynahash.c
typedef struct HTAB HTAB;
/* Parameter data structure for hash_create */
//hash_create使用的參數數據結構
/* Only those fields indicated by hash_flags need be set */
//根據hash_flags標記設置相應的字段
typedef struct HASHCTL
{
//分區個數(必須是2的冪)
long num_partitions; /* # partitions (must be power of 2) */
//段大小
long ssize; /* segment size */
//初始化目錄大小
long dsize; /* (initial) directory size */
//dsize上限
long max_dsize; /* limit to dsize if dir size is limited */
//填充因子
long ffactor; /* fill factor */
//哈希鍵大小(字節為單位)
Size keysize; /* hash key length in bytes */
//參見上述數據結構注釋
Size entrysize; /* total user element size in bytes */
//
HashValueFunc hash; /* hash function */
HashCompareFunc match; /* key comparison function */
HashCopyFunc keycopy; /* key copying function */
HashAllocFunc alloc; /* memory allocator */
MemoryContext hcxt; /* memory context to use for allocations */
//共享內存中的哈希頭部結構地址
HASHHDR *hctl; /* location of header in shared mem */
} HASHCTL;
/* A hash bucket is a linked list of HASHELEMENTs */
//哈希桶是HASHELEMENTs鏈表
typedef HASHELEMENT *HASHBUCKET;
/* A hash segment is an array of bucket headers */
//hash segment是桶數組
typedef HASHBUCKET *HASHSEGMENT;
/*
* Hash functions must have this signature.
* Hash函數必須有它自己的標識
*/
typedef uint32 (*HashValueFunc) (const void *key, Size keysize);
/*
* Key comparison functions must have this signature. Comparison functions
* return zero for match, nonzero for no match. (The comparison function
* definition is designed to allow memcmp() and strncmp() to be used directly
* as key comparison functions.)
* 哈希鍵對比函數必須有自己的標識.
* 如匹配則對比函數返回0,不匹配返回非0.
* (對比函數定義被設計為允許在對比鍵值時可直接使用memcmp()和strncmp())
*/
typedef int (*HashCompareFunc) (const void *key1, const void *key2,
Size keysize);
/*
* Key copying functions must have this signature. The return value is not
* used. (The definition is set up to allow memcpy() and strlcpy() to be
* used directly.)
* 鍵拷貝函數必須有自己的標識.
* 返回值無用.
*/
typedef void *(*HashCopyFunc) (void *dest, const void *src, Size keysize);
/*
* Space allocation function for a hashtable --- designed to match malloc().
* Note: there is no free function API; can't destroy a hashtable unless you
* use the default allocator.
* 哈希表的恐懼分配函數 -- 被設計為與malloc()函數匹配.
* 注意:這里沒有釋放函數API;不能銷毀哈希表,除非使用默認的分配器.
*/
typedef void *(*HashAllocFunc) (Size request);
FreeListData
在一個分區哈希表中,每一個空閑鏈表與特定的hashcodes集合相關,通過下面的FREELIST_IDX()宏進行定義.
nentries跟蹤有這些hashcodes的仍存活的hashtable條目個數.
/*
* Per-freelist data.
* 空閑鏈表數據.
*
* In a partitioned hash table, each freelist is associated with a specific
* set of hashcodes, as determined by the FREELIST_IDX() macro below.
* nentries tracks the number of live hashtable entries having those hashcodes
* (NOT the number of entries in the freelist, as you might expect).
* 在一個分區哈希表中,每一個空閑鏈表與特定的hashcodes集合相關,通過下面的FREELIST_IDX()宏進行定義.
* nentries跟蹤有這些hashcodes的仍存活的hashtable條目個數.
* (注意不要搞錯,不是空閑的條目個數)
*
* The coverage of a freelist might be more or less than one partition, so it
* needs its own lock rather than relying on caller locking. Relying on that
* wouldn't work even if the coverage was the same, because of the occasional
* need to "borrow" entries from another freelist; see get_hash_entry().
* 空閑鏈表的覆蓋范圍可能比一個分區多或少,因此需要自己的鎖而不能僅僅依賴調用者的鎖.
* 依賴調用者鎖在覆蓋面一樣的情況下也不會起效,因為偶爾需要從另一個自由列表“借用”條目,詳細參見get_hash_entry()
*
* Using an array of FreeListData instead of separate arrays of mutexes,
* nentries and freeLists helps to reduce sharing of cache lines between
* different mutexes.
* 使用FreeListData數組而不是一個獨立的mutexes,nentries和freelists數組有助于減少不同mutexes之間的緩存線共享.
*/
typedef struct
{
//該空閑鏈表的自旋鎖
slock_t mutex; /* spinlock for this freelist */
//相關桶中的條目個數
long nentries; /* number of entries in associated buckets */
//空閑元素鏈
HASHELEMENT *freeList; /* chain of free elements */
} FreeListData;
BufferLookupEnt
/* entry for buffer lookup hashtable */
//檢索hash表的條目
typedef struct
{
//磁盤page的tag
BufferTag key; /* Tag of a disk page */
//相關聯的buffer ID
int id; /* Associated buffer ID */
} BufferLookupEnt;
BufTableHashCode
BufTableHashCode函數根據BufferTag計算Hash Code,主要的邏輯在函數hash_any中實現
/*
* BufTableHashCode
* Compute the hash code associated with a BufferTag
* 根據BufferTag計算Hash Code
*
* This must be passed to the lookup/insert/delete routines along with the
* tag. We do it like this because the callers need to know the hash code
* in order to determine which buffer partition to lock, and we don't want
* to do the hash computation twice (hash_any is a bit slow).
* 該函數的返回值需要作為參數傳遞給與tag相關的檢索/插入/刪除處理過程.
* 之所以這樣處理是因為調用者需要知道Hash Code以便確定那個buffer partition需要鎖定,
* 而且我們不希望多次計算hash(hash_any有一點點慢).
*/
uint32
BufTableHashCode(BufferTag *tagPtr)
{
return get_hash_value(SharedBufHash, (void *) tagPtr);
}
/*
* get_hash_value -- exported routine to calculate a key's hash value
* get_hash_value -- exported過程,用于計算key的hash值
*
* We export this because for partitioned tables, callers need to compute
* the partition number (from the low-order bits of the hash value) before
* searching.
* 之所以export這個過程是因為分區表,調用者需要在搜索前計算分區編號(根據hash值的lower-order bits)
*/
uint32
get_hash_value(HTAB *hashp, const void *keyPtr)
{
return hashp->hash(keyPtr, hashp->keysize);
}
/*
* tag_hash: hash function for fixed-size tag values
* tag_hash:固定tag大小的hash函數
*/
uint32
tag_hash(const void *key, Size keysize)
{
return DatumGetUInt32(hash_any((const unsigned char *) key,
(int) keysize));
}
/*
* DatumGetUInt32
* Returns 32-bit unsigned integer value of a datum.
* DatumGetUInt32返回datum的32位無符號整型值
*/
#define DatumGetUInt32(X) ((uint32) (X))
hash_any
hash_any函數hash一個可變長鍵值到一個32位值
/*
* hash_any() -- hash a variable-length key into a 32-bit value
* k : the key (the unaligned variable-length array of bytes)
* len : the length of the key, counting by bytes
* hash_any() -- hash一個可變長鍵值到一個32位值.
* k : the key(未對齊的可變長字節數組)
*
* Returns a uint32 value. Every bit of the key affects every bit of
* the return value. Every 1-bit and 2-bit delta achieves avalanche.
* About 6*len+35 instructions. The best hash table sizes are powers
* of 2. There is no need to do mod a prime (mod is sooo slow!).
* If you need less than 32 bits, use a bitmask.
* 返回無符號32位整型值.
* key的每一位都會影響返回值的每一位.每一個1位和2位增量都會產生雪崩.
* 大概有6*len+35個指令.最好的hash表大小是2的冪.不需要進行很慢的mod操作.
* 如果需要少于32bits的值,那使用bitmask.
*
* This procedure must never throw elog(ERROR); the ResourceOwner code
* relies on this not to fail.
* 這個過程永遠都不要拋出elog(ERROR);依賴此函數的ResourceOwner代碼永遠都不會出現異常.
*
* Note: we could easily change this function to return a 64-bit hash value
* by using the final values of both b and c. b is perhaps a little less
* well mixed than c, however.
* 注意:不能輕易的改變該函數,通過使用b和c的最后值來返回64-bit的hash值.b的混合度可能沒有c好
*
*/
Datum
hash_any(register const unsigned char *k, register int keylen)
{
register uint32 a,
b,
c,
len;
/* Set up the internal state */
//設置內部狀態,初始化a/b/c
len = keylen;
a = b = c = 0x9e3779b9 + len + 3923095;
/* If the source pointer is word-aligned, we use word-wide fetches */
//如果源指針是字對齊的,那么我們使用字寬提取
if (((uintptr_t) k & UINT32_ALIGN_MASK) == 0)
{
//源數據是對齊的
/* Code path for aligned source data */
register const uint32 *ka = (const uint32 *) k;
/* handle most of the key */
while (len >= 12)
{
a += ka[0];
b += ka[1];
c += ka[2];
mix(a, b, c);
ka += 3;
len -= 12;
}
/* handle the last 11 bytes */
//處理后面11個字節
k = (const unsigned char *) ka;
#ifdef WORDS_BIGENDIAN//大碼端
switch (len)
{
case 11:
c += ((uint32) k[10] << 8);
/* fall through */
case 10:
c += ((uint32) k[9] << 16);
/* fall through */
case 9:
c += ((uint32) k[8] << 24);
/* fall through */
case 8:
/* the lowest byte of c is reserved for the length */
b += ka[1];
a += ka[0];
break;
case 7:
b += ((uint32) k[6] << 8);
/* fall through */
case 6:
b += ((uint32) k[5] << 16);
/* fall through */
case 5:
b += ((uint32) k[4] << 24);
/* fall through */
case 4:
a += ka[0];
break;
case 3:
a += ((uint32) k[2] << 8);
/* fall through */
case 2:
a += ((uint32) k[1] << 16);
/* fall through */
case 1:
a += ((uint32) k[0] << 24);
/* case 0: nothing left to add */
}
#else /* 小碼端; !WORDS_BIGENDIAN */
switch (len)
{
case 11:
c += ((uint32) k[10] << 24);
/* fall through */
case 10:
c += ((uint32) k[9] << 16);
/* fall through */
case 9:
c += ((uint32) k[8] << 8);
/* fall through */
case 8:
/* the lowest byte of c is reserved for the length */
b += ka[1];
a += ka[0];
break;
case 7:
b += ((uint32) k[6] << 16);
/* fall through */
case 6:
b += ((uint32) k[5] << 8);
/* fall through */
case 5:
b += k[4];
/* fall through */
case 4:
a += ka[0];
break;
case 3:
a += ((uint32) k[2] << 16);
/* fall through */
case 2:
a += ((uint32) k[1] << 8);
/* fall through */
case 1:
a += k[0];
/* case 0: nothing left to add */
}
#endif /* WORDS_BIGENDIAN */
}
else//---------- 非字對齊
{
/* Code path for non-aligned source data */
/* handle most of the key */
while (len >= 12)
{
#ifdef WORDS_BIGENDIAN
a += (k[3] + ((uint32) k[2] << 8) + ((uint32) k[1] << 16) + ((uint32) k[0] << 24));
b += (k[7] + ((uint32) k[6] << 8) + ((uint32) k[5] << 16) + ((uint32) k[4] << 24));
c += (k[11] + ((uint32) k[10] << 8) + ((uint32) k[9] << 16) + ((uint32) k[8] << 24));
#else /* !WORDS_BIGENDIAN */
a += (k[0] + ((uint32) k[1] << 8) + ((uint32) k[2] << 16) + ((uint32) k[3] << 24));
b += (k[4] + ((uint32) k[5] << 8) + ((uint32) k[6] << 16) + ((uint32) k[7] << 24));
c += (k[8] + ((uint32) k[9] << 8) + ((uint32) k[10] << 16) + ((uint32) k[11] << 24));
#endif /* WORDS_BIGENDIAN */
mix(a, b, c);
k += 12;
len -= 12;
}
/* handle the last 11 bytes */
#ifdef WORDS_BIGENDIAN
switch (len)
{
case 11:
c += ((uint32) k[10] << 8);
/* fall through */
case 10:
c += ((uint32) k[9] << 16);
/* fall through */
case 9:
c += ((uint32) k[8] << 24);
/* fall through */
case 8:
/* the lowest byte of c is reserved for the length */
b += k[7];
/* fall through */
case 7:
b += ((uint32) k[6] << 8);
/* fall through */
case 6:
b += ((uint32) k[5] << 16);
/* fall through */
case 5:
b += ((uint32) k[4] << 24);
/* fall through */
case 4:
a += k[3];
/* fall through */
case 3:
a += ((uint32) k[2] << 8);
/* fall through */
case 2:
a += ((uint32) k[1] << 16);
/* fall through */
case 1:
a += ((uint32) k[0] << 24);
/* case 0: nothing left to add */
}
#else /* !WORDS_BIGENDIAN */
switch (len)
{
case 11:
c += ((uint32) k[10] << 24);
/* fall through */
case 10:
c += ((uint32) k[9] << 16);
/* fall through */
case 9:
c += ((uint32) k[8] << 8);
/* fall through */
case 8:
/* the lowest byte of c is reserved for the length */
b += ((uint32) k[7] << 24);
/* fall through */
case 7:
b += ((uint32) k[6] << 16);
/* fall through */
case 6:
b += ((uint32) k[5] << 8);
/* fall through */
case 5:
b += k[4];
/* fall through */
case 4:
a += ((uint32) k[3] << 24);
/* fall through */
case 3:
a += ((uint32) k[2] << 16);
/* fall through */
case 2:
a += ((uint32) k[1] << 8);
/* fall through */
case 1:
a += k[0];
/* case 0: nothing left to add */
}
#endif /* WORDS_BIGENDIAN */
}
final(a, b, c);
/* report the result */
return UInt32GetDatum(c);
}
/*----------
* mix -- mix 3 32-bit values reversibly.
*
* This is reversible, so any information in (a,b,c) before mix() is
* still in (a,b,c) after mix().
*
* If four pairs of (a,b,c) inputs are run through mix(), or through
* mix() in reverse, there are at least 32 bits of the output that
* are sometimes the same for one pair and different for another pair.
* This was tested for:
* * pairs that differed by one bit, by two bits, in any combination
* of top bits of (a,b,c), or in any combination of bottom bits of
* (a,b,c).
* * "differ" is defined as +, -, ^, or ~^. For + and -, I transformed
* the output delta to a Gray code (a^(a>>1)) so a string of 1's (as
* is commonly produced by subtraction) look like a single 1-bit
* difference.
* * the base values were pseudorandom, all zero but one bit set, or
* all zero plus a counter that starts at zero.
*
* This does not achieve avalanche. There are input bits of (a,b,c)
* that fail to affect some output bits of (a,b,c), especially of a. The
* most thoroughly mixed value is c, but it doesn't really even achieve
* avalanche in c.
*
* This allows some parallelism. Read-after-writes are good at doubling
* the number of bits affected, so the goal of mixing pulls in the opposite
* direction from the goal of parallelism. I did what I could. Rotates
* seem to cost as much as shifts on every machine I could lay my hands on,
* and rotates are much kinder to the top and bottom bits, so I used rotates.
*----------
*/
#define mix(a,b,c) \
{ \
a -= c; a ^= rot(c, 4); c += b; \
b -= a; b ^= rot(a, 6); a += c; \
c -= b; c ^= rot(b, 8); b += a; \
a -= c; a ^= rot(c,16); c += b; \
b -= a; b ^= rot(a,19); a += c; \
c -= b; c ^= rot(b, 4); b += a; \
}
/*
* UInt32GetDatum
* Returns datum representation for a 32-bit unsigned integer.
*/
#define UInt32GetDatum(X) ((Datum) (X))
N/A
PG Source Code
Buffer Manager
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。