PostgreSQL游標(biāo)與索引選擇實(shí)例詳細(xì)介紹

更新時間：2022年09月15日 16:34:32 作者：foucus、

這篇文章主要介紹了PostgreSQL游標(biāo)與索引選擇優(yōu)化案例，文中通過示例代碼介紹的非常詳細(xì)，對大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價值，需要的朋友們下面隨著小編來一起學(xué)習(xí)吧

之前有寫過一個案例，order by limit因?yàn)閿?shù)據(jù)分布不均而選擇了錯誤的索引，這是由于優(yōu)化器沒法判斷數(shù)據(jù)的分布關(guān)系，默認(rèn)認(rèn)為數(shù)據(jù)分布是均勻的所導(dǎo)致的。

而除了limit，當(dāng)我們在使用游標(biāo)時也要注意有可能會出現(xiàn)類似的情況。而往往這類在存儲過程中的SQL我們更難發(fā)現(xiàn)其選擇了錯誤的執(zhí)行計(jì)劃，所以需要注意。

1、建測試表

bill=# create table tbl (id int, c1 int, c2 int, c3 int, c4 int);
CREATE TABLE

2、寫入一批隨機(jī)數(shù)據(jù)，ID從1到1000萬。

bill=# insert into tbl select generate_series(1,10000000), random()*100, random()*100, random()*100, random()*100;  
INSERT 0 10000000

3、寫入另一批100萬條數(shù)據(jù)，c1,c2 與前面1000萬的值不一樣。

bill=# insert into tbl select generate_series(10000001,11000000), 200,200,200,200;  
INSERT 0 1000000

4、創(chuàng)建兩個索引，也就是本文需要重點(diǎn)關(guān)注的，到底走哪個索引更劃算

bill=# create index idx_tbl_1 on tbl(id);  
CREATE INDEX  
bill=# create index idx_tbl_2 on tbl(c1,c2,c3,c4);  
CREATE INDEX

5、收集統(tǒng)計(jì)信息

bill=# vacuum analyze tbl;  
VACUUM

6、查看下面SQL的執(zhí)行計(jì)劃，走了正確的索引

bill=# explain select * from tbl where c1=200 and c2=200 order by id;
                                     QUERY PLAN
-------------------------------------------------------------------------------------
 Sort  (cost=72109.20..72344.16 rows=93984 width=20)
   Sort Key: id
   ->  Bitmap Heap Scan on tbl  (cost=1392.77..60811.81 rows=93984 width=20)
         Recheck Cond: ((c1 = 200) AND (c2 = 200))
         ->  Bitmap Index Scan on idx_tbl_2  (cost=0.00..1369.28 rows=93984 width=0)
               Index Cond: ((c1 = 200) AND (c2 = 200))
(6 rows)

7、而當(dāng)我們在游標(biāo)中使用該SQL時，會發(fā)現(xiàn)執(zhí)行計(jì)劃出現(xiàn)了偏差

bill=# begin;
BEGIN
bill=*# explain declare tt cursor for select * from tbl where c1=200 and c2=200 order by id;
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Index Scan using idx_tbl_1 on tbl  (cost=0.43..329277.60 rows=93984 width=20)
   Filter: ((c1 = 200) AND (c2 = 200))
(2 rows)

為什么會出現(xiàn)這種情況呢，這其實(shí)是因?yàn)槭褂糜螛?biāo)的SQL會根據(jù)cursor_tuple_fraction參數(shù)進(jìn)行自動優(yōu)化，而該參數(shù)默認(rèn)是0.1，表示只檢索前10%的行進(jìn)行預(yù)估，這就和limit有點(diǎn)異曲同工的味道了。

因?yàn)閷τ谶@張表，優(yōu)化器認(rèn)為數(shù)據(jù)是均勻分布的，而實(shí)際上，數(shù)據(jù)分布是不均勻的，c1=200 and c2=200的記錄在表的末端。當(dāng)我們在游標(biāo)中只檢索了前10%的行，所以會得到一個錯誤的執(zhí)行計(jì)劃。

具體的細(xì)節(jié)我們可以在parsenodes.h和planner.c中看到：

當(dāng)使用cursor或者SPI_PREPARE_CURSOR函數(shù)時，會設(shè)置CURSOR_OPT_FAST_PLAN標(biāo)志位，然后就會根據(jù)cursor_tuple_fraction參數(shù)對SQL進(jìn)行自動優(yōu)化，所以對于一些數(shù)據(jù)分布不均的情況，可能就會

導(dǎo)致選擇了錯誤的執(zhí)行計(jì)劃。
	/* Determine what fraction of the plan is likely to be scanned */
	if (cursorOptions & CURSOR_OPT_FAST_PLAN)
	{
		/*
		 * We have no real idea how many tuples the user will ultimately FETCH
		 * from a cursor, but it is often the case that he doesn't want 'em
		 * all, or would prefer a fast-start plan anyway so that he can
		 * process some of the tuples sooner.  Use a GUC parameter to decide
		 * what fraction to optimize for.
		 */
		tuple_fraction = cursor_tuple_fraction;
		/*
		 * We document cursor_tuple_fraction as simply being a fraction, which
		 * means the edge cases 0 and 1 have to be treated specially here.  We
		 * convert 1 to 0 ("all the tuples") and 0 to a very small fraction.
		 */
		if (tuple_fraction >= 1.0)
			tuple_fraction = 0.0;
		else if (tuple_fraction <= 0.0)
			tuple_fraction = 1e-10;
	}
	else
	{
		/* Default assumption is we need all the tuples */
		tuple_fraction = 0.0;
	}

到此這篇關(guān)于PostgreSQL游標(biāo)與索引選擇實(shí)例詳細(xì)介紹的文章就介紹到這了,更多相關(guān)PostgreSQL游標(biāo)與索引選擇內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章: