您好,登錄后才能下訂單哦!
本篇內容介紹了“PostgreSQL統計信息中相關的數據字典表有哪些”的有關知識,在實際案例的操作過程中,不少人都會遇到這樣的困境,接下來就讓小編帶領大家學習一下如何處理這些情況吧!希望大家仔細閱讀,能夠學有所成!
在pg_class數據字典表中,存儲了Relation(包括Index)的兩個統計信息:頁面占用總數relpages和元組總數reltuples,在優化器執行物理優化時用于估算訪問路徑的啟動成本和總成本.
測試腳本如下,在t_grxx表中插入10w數據,創建2個索引:
drop table if exists t_grxx; create table t_grxx(dwbh varchar(10),grbh varchar(10),xm varchar(20),xb varchar(10),nl int); insert into t_grxx(dwbh,grbh,xm,xb,nl) select generate_series(1,100000)/10||'',generate_series(1,100000),'XM'||generate_series(1,100000), (case when (floor(random()*2)=0) then '男' else '女' end),floor(random() * 100 + 1)::int; create index idx_t_grxx_grbh on t_grxx(grbh); create index idx_t_grxx_dwbh on t_grxx(dwbh);
基礎關系t_grxx中的統計信息
testdb=# select relpages,reltuples from pg_class where relname = 't_grxx'; relpages | reltuples ----------+----------- 726 | 100000 --> 頁面數726,元組數10,000 (1 row)
索引idx_t_grxx_grbh的統計信息
testdb=# select relpages,reltuples from pg_class where relname = 'idx_t_grxx_grbh'; relpages | reltuples ----------+----------- 276 | 100000 --> 頁面數276,元組數10,000 (1 row)
pg_statistic是PG中存儲統計信息的主要數據字典表.通過命令ANALYZE生成統計數據,這些統計數據在計劃階段提供給優化器使用,是成本估算的基礎.
pg_statistic的表結構如下:
testdb=# \d pg_statistic Table "pg_catalog.pg_statistic" Column | Type | Collation | Nullable | Default -------------+----------+-----------+----------+--------- starelid | oid | | not null | staattnum | smallint | | not null | stainherit | boolean | | not null | stanullfrac | real | | not null | stawidth | integer | | not null | stadistinct | real | | not null | stakind1 | smallint | | not null | stakind2 | smallint | | not null | stakind3 | smallint | | not null | stakind4 | smallint | | not null | stakind5 | smallint | | not null | staop1 | oid | | not null | staop2 | oid | | not null | staop3 | oid | | not null | staop4 | oid | | not null | staop5 | oid | | not null | stanumbers1 | real[] | | | stanumbers2 | real[] | | | stanumbers3 | real[] | | | stanumbers4 | real[] | | | stanumbers5 | real[] | | | stavalues1 | anyarray | | | stavalues2 | anyarray | | | stavalues3 | anyarray | | | stavalues4 | anyarray | | | stavalues5 | anyarray | | | Indexes: "pg_statistic_relid_att_inh_index" UNIQUE, btree (starelid, staattnum, stainherit)
其中:
starelid:數據列所屬的關系或索引Oid
staattnum:數據列編號
stainherit:是否繼承表(分區表)
stanullfrac:NULL值所占的比例
stawidth:平均行大小,以字節為單位
stadistinct:非NULL值的唯一值信息.>0,表示多少個唯一值;=0,表示未知;<0,根據重復值出現次數求得:stadistinct = (-1)x(1-stanullfrac)/times(重復次數)
stakindN:在N號slot槽中存儲的是哪種類型的統計信息
staopN:在N號槽中所使用的操作符,如"=","<",">"等
stanumbersN:在N號槽中存儲的數值類型統計信息
stavaluesN*:在N號槽中存儲的數據值(使用anyarray類型)
統計信息類型,在src/include/catalog/pg_statistic.h文件中有相關解釋,定義如下:
#define STATISTIC_KIND_MCV 1 #define STATISTIC_KIND_HISTOGRAM 2 #define STATISTIC_KIND_CORRELATION 3 #define STATISTIC_KIND_MCELEM 4 #define STATISTIC_KIND_DECHIST 5 #define STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM 6 #define STATISTIC_KIND_BOUNDS_HISTOGRAM 7
MCV=most common values,最常見值,即高頻值
HISTOGRAM=HISTOGRAM,數據分布直方圖
CORRELATION=CORRELATION,相關系數,該列未排序的數據分布與排序后的數據分布的相關性
MCELEM=most common elements,與MCV類似,存儲的是最常見非NULL值
DECHIST=distinct elements count histogram,描述數組類型列的每行中不同值的分布。
RANGE_LENGTH_HISTOGRAM=length histogram,以行為單位的距離類型列的長度分布。
BOUNDS_HISTOGRAM=與HISTOGRAM類似,面向range-type column
仍以t_grxx表為例說明該表的統計信息
表結構:
testdb=# \d t_grxx Table "public.t_grxx" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- dwbh | character varying(10) | | | grbh | character varying(10) | | | xm | character varying(20) | | | xb | character varying(10) | | | nl | integer | | | Indexes: "idx_t_grxx_dwbh" btree (dwbh) "idx_t_grxx_grbh" btree (grbh)
數據列信息:
testdb=# select attrelid,attname,attnum from pg_attribute where attrelid = 16742 order by attnum; attrelid | attname | attnum ----------+----------+-------- 16742 | tableoid | -7 16742 | cmax | -6 16742 | xmax | -5 16742 | cmin | -4 16742 | xmin | -3 16742 | ctid | -1 16742 | dwbh | 1 16742 | grbh | 2 16742 | xm | 3 16742 | xb | 4 16742 | nl | 5 (11 rows)
dwbh/grbh/xm/xb/nl這5列的編號分別是1/2/3/4/5.
查詢常規的統計信息
testdb=# select starelid,staattnum,stainherit,stanullfrac,stawidth,stadistinct from pg_statistic where starelid = 16742; starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct ----------+-----------+------------+-------------+----------+------------- 16742 | 1 | f | 0 | 4 | 9984 16742 | 2 | f | 0 | 5 | -1 16742 | 3 | f | 0 | 7 | -1 16742 | 4 | f | 0 | 4 | 2 16742 | 5 | f | 0 | 4 | 100 (5 rows)
stanullfrac為0表示沒有NULL值
stawidth表示各列的平均行大小,比如dwbh平均行大小為4個字節
stadistinct表示唯一值信息,比如dwbh,10000行中有9984個唯一值,而xb只有2個,grbh和xm則不存在重復值.
查詢其他統計信息
testdb=# select starelid,staattnum,stakind1,stakind2,stakind3,stakind4,stakind5 from pg_statistic where starelid = 16742 order by staattnum; starelid | staattnum | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 ----------+-----------+----------+----------+----------+----------+---------- 16742 | 1 | 1 | 2 | 3 | 0 | 0 16742 | 2 | 2 | 3 | 0 | 0 | 0 16742 | 3 | 2 | 3 | 0 | 0 | 0 16742 | 4 | 1 | 3 | 0 | 0 | 0 16742 | 5 | 1 | 3 | 0 | 0 | 0 (5 rows)
第1列(dwbh),在第1/2/3個槽中分別存儲了類型為第1/2/3種類型的統計信息,即STATISTIC_KIND_MCV/STATISTIC_KIND_HISTOGRAM/STATISTIC_KIND_CORRELATION,其他槽沒有內容
第2列(grbh),在第1/2個槽中分別存儲了類型為第2/3種類型的統計信息,即STATISTIC_KIND_HISTOGRAM/STATISTIC_KIND_CORRELATION,其他槽沒有內容
第5列(nl),在第1/2個槽中分別存儲了類型為第1/3種類型的統計信息,即STATISTIC_KIND_MCV/STATISTIC_KIND_CORRELATION,其他槽沒有內容
第一列統計信息
testdb=# \x Expanded display is on. testdb=# select staattnum,stakind1,staop1,stanumbers1,stavalues1, testdb-# stakind2,staop2,stanumbers2,stavalues2, testdb-# stakind3,staop3,stanumbers3,stavalues3 testdb-# from pg_statistic testdb-# where starelid = 16742 testdb-# and staattnum = 1; -[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- staattnum | 1 -->dwbh stakind1 | 1 -->STATISTIC_KIND_MCV,高頻值統計信息 staop1 | 98 -->"=",查詢pg_operator,select * from pg_operator where oid=98; stanumbers1 | {0.0003} -->高頻值比例,即3/10000 stavalues1 | {24} -->表示'24'這個值 stakind2 | 2 -->STATISTIC_KIND_HISTOGRAM,直方圖 staop2 | 664 -->"<",左閉右開的區間 stanumbers2 | -->無 stavalues2 | {0,1084,1172,1262,1356,1441,1533,1622,1703,1790,1885,198,207,2165,2259,2351,2447,2530,262,2708,2799,2895,2985,3087,3172,3262,3359,3459,3541,3629,3716,3805,3900,3995,4078,417,4257,4345,4432,4515,461,47,479,489,4985,5069,5154,5244,533,542,5510,5596,5686,5786,587,5963,6053,6145,6232,6323,6412,650,6599,6686,6779,6868,6957,7038,7127,7218,7305,7403,7495,7588,7679,7767,7857,7942,8037,8125,8209,8307,8392,8481,8575,8664,8755,8844,8935,9022,9115,9202,9296,9378,9464,9561,965,9731,982,9906,9999} -->剔除高頻值后的數據分布,假定平均分布,0為MIN值,9999為MAX值 -->[0,1084),[1084,1172),...[9906,9999) stakind3 | 3 -->STATISTIC_KIND_CORRELATION,相關系數 staop3 | 664 -->"<" stanumbers3 | {0.817163}-->未排序和已排序的數據分布,有81.7%的相關性,值越高,順序掃描結果越近似于排序 stavalues3 | -->無
第二列統計信息
testdb=# \x Expanded display is on. testdb=# select staattnum,stakind1,staop1,stanumbers1,stavalues1, stakind2,staop2,stanumbers2,stavalues2, stakind3,staop3,stanumbers3,stavalues3 from pg_statistic where starelid = 16742 and staattnum = 2; -[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- staattnum | 2 -->grbh stakind1 | 2 -->STATISTIC_KIND_HISTOGRAM,直方圖 staop1 | 664 -->"<" stanumbers1 | -->NULL stavalues1 | {1,1088,11753,12659,1358,14448,15354,16253,17071,17939,18872,19821,20730,21676,22630,23540,24479,25302,26198,27069,27974,2894,29836,30851,31725,32619,33587,34584,35427,36280,37164,3806,39000,39942,40786,41695,42571,43452,44330,4516,46102,46997,47900,48899,49852,50697,5157,52452,533,54204,5512,55963,5686,57860,5871,59633,60544,61468,62331,63252,64134,65016,65993,66870,67798,68687,69568,70391,71285,72205,73066,74045,74962,75881,7679,77675,78573,79434,8038,81276,82125,83078,83939,84822,85764,86654,87557,88450,89352,90249,91171,92030,92979,93797,94665,95628,96510,97328,98212,99074,99998} -->參照第一列 stakind2 | 3 -->STATISTIC_KIND_CORRELATION,相關系數 staop2 | 664 -->"<" stanumbers2 | {0.816172}-->未排序和已排序的數據分布 stavalues2 | -->NULL stakind3 | 0 -->第3個槽無統計信息 staop3 | 0 stanumbers3 | stavalues3 |
第五列統計信息
testdb=# select staattnum,stakind1,staop1,stanumbers1,stavalues1, testdb-# stakind2,staop2,stanumbers2,stavalues2, testdb-# stakind3,staop3,stanumbers3,stavalues3 testdb-# from pg_statistic testdb-# where starelid = 16742 testdb-# and staattnum = 5; -[ RECORD 1 ]------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- staattnum | 5 -->nl stakind1 | 1 -->STATISTIC_KIND_MCV,高頻值統計信息 staop1 | 96 -->"=" stanumbers1 | {0.0117667,0.0117333,0.0113,0.0112667,0.0112333,0.0110333,0.0109333,0.0109333,0.0109333,0.0108333,0.0108,0.0108,0.0107667,0.0107667,0.0107333,0.0107333,0.0107333,0.0107,0.0106333,0.0106333,0.0106333,0.0105667,0.0105333,0.0105333,0.0105,0.0104667,0.0104667,0.0104333,0.0104333,0.0103667,0.0103,0.0102667,0.0102333,0.0102333,0.0102333,0.0102333,0.0102,0.0102,0.0102,0.0101667,0.0101667,0.0101333,0.0101,0.0101,0.0100667,0.0100333,0.0100333,0.01,0.00996667,0.00996667,0.00996667,0.00996667,0.00993333,0.00993333,0.00993333,0.00993333,0.0099,0.00986667,0.00983333,0.00983333,0.0098,0.0098,0.0098,0.00973333,0.0097,0.00966667,0.00966667,0.00963333,0.00963333,0.00963333,0.00963333,0.0096,0.0096,0.00956667,0.00953333,0.0095,0.0095,0.0095,0.0095,0.00943333,0.0094,0.00936667,0.00936667,0.00933333,0.00933333,0.0093,0.00916667,0.00916667,0.00916667,0.00913333,0.0091,0.0091,0.0091,0.00906667,0.009,0.00896667,0.00893333,0.00876667,0.00876667,0.0079} -->每個高頻值的出現頻率 stavalues1 | {73,40,7,51,15,8,14,74,100,53,77,89,33,42,26,29,78,11,16,38,58,18,27,95,49,34,92,65,87,30,32,28,23,62,82,96,1,50,60,36,70,52,9,37,84,54,61,5,59,72,80,93,35,66,79,81,55,68,88,98,13,47,56,22,24,39,43,3,21,75,85,25,44,71,86,4,20,83,97,91,99,63,94,17,46,57,48,67,69,90,2,6,45,19,10,64,76,31,41,12} -->高頻值 stakind2 | 3 -->STATISTIC_KIND_CORRELATION,相關系數 staop2 | 97 -->"<",整型比較,664是字符型比較 stanumbers2 | {0.00562935} -->0.5%的相關性,相關性較低 stavalues2 | -->NULL stakind3 | 0 -->無相關信息 staop3 | 0 stanumbers3 | stavalues3 |
pg_statistic_ext數據字典表用于存儲多列統計信息,需使用CREATE STATISTICS命令創建統計信息,在執行ANALYZE命令時統計.
仍以t_grxx為例,統計dwbh和grbh這兩列的信息
testdb=# create statistics sta_t_grxx_dwbh_grbh on dwbh,grbh from t_grxx; CREATE STATISTICS testdb=# analyze t_grxx(dwbh,grbh); ANALYZE
查詢多列統計信息
testdb=# \x Expanded display is on. testdb=# select * from pg_statistic_ext; -[ RECORD 1 ]---+----------------------------------------- stxrelid | 16742 -->數據表Oid stxname | sta_t_grxx_dwbh_grbh -->統計信息名稱 stxnamespace | 2200 -->表空間 stxowner | 10 -->統計信息的Owner stxkeys | 1 2 -->列編號,1 2表示dwbh grbh stxkind | {d,f} -->STATS_EXT_NDISTINCT/STATS_EXT_DEPENDENCIES stxndistinct | {"1, 2": 100000} -->STATS_EXT_NDISTINCT統計信息 stxdependencies | {"1 => 2": 0.040900, "2 => 1": 1.000000} -->STATS_EXT_DEPENDENCIES統計信息
“PostgreSQL統計信息中相關的數據字典表有哪些”的內容就介紹到這里了,感謝大家的閱讀。如果想了解更多行業相關的知識可以關注億速云網站,小編將為大家輸出更多高質量的實用文章!
免責聲明:本站發布的內容(圖片、視頻和文字)以原創、轉載和分享為主,文章觀點不代表本網站立場,如果涉及侵權請聯系站長郵箱:is@yisu.com進行舉報,并提供相關證據,一經查實,將立刻刪除涉嫌侵權內容。