文章詳情頁

MySQL的join buffer原理

瀏覽：58日期：2023-10-03 15:15:24

一、MySQL的join buffer

在MySQL對(duì)于join操作的處理過程中，join buffer是一個(gè)重要的概念，也是MySQL對(duì)于table join的一個(gè)重要的優(yōu)化手段。雖然這個(gè)概念實(shí)現(xiàn)并不復(fù)雜，但是這個(gè)是實(shí)現(xiàn)MySQL join連接優(yōu)化的一個(gè)重要方法，在'暴力'連接的時(shí)候可以極大提高join查詢的效率。

關(guān)于這個(gè)概念的權(quán)威說明當(dāng)然是來自MySQL文檔中對(duì)于這個(gè)概念的說明，說明的文字不多，但是言簡(jiǎn)意賅，說明了這個(gè)優(yōu)化的主要實(shí)現(xiàn)思想：Assume you have the following join:

Table name Typet1 ranget2 reft3 ALLThe join is then done as follows: - While rows in t1 matching range - Read through all rows in t2 according to reference key - Store used fields from t1, t2 in cache - If cache is full - Read through all rows in t3 - Compare t3 row against all t1, t2 combinations in cache- If row satisfies join condition, send it to client - Empty cache - Read through all rows in t3 - Compare t3 row against all stored t1, t2 combinations in cache - If row satisfies join condition, send it to client二、join buffer cache存儲(chǔ)空間的分配

下面函數(shù)中table_count表示的就是所有join table中在該table之前的非const table數(shù)量，因?yàn)檫@個(gè)table要緩存自己之前所有table中的每條記錄中'需讀取'(tables[i].table->read_set置位)。

其中兩重循環(huán)每次執(zhí)行都是復(fù)制下需要緩存的field的描述結(jié)構(gòu)(及其對(duì)應(yīng)的數(shù)據(jù)源)，或者說，二重循環(huán)只是為了賦值和保存元數(shù)據(jù)，而最后的cache->buff=(uchar*) my_malloc(size,MYF(0))才是真正的分配滿足條件的記錄內(nèi)容。

static intjoin_init_cache(THD *thd,JOIN_TAB *tables,uint table_count){…… for (i=0 ; i < table_count ; i++) { bool have_bit_fields= FALSE; uint null_fields=0,used_fields; Field **f_ptr,*field; MY_BITMAP *read_set= tables[i].table->read_set; for (f_ptr=tables[i].table->field,used_fields=tables[i].used_fields ; used_fields ; f_ptr++) { field= *f_ptr; if (bitmap_is_set(read_set, field->field_index)) {used_fields--;length+=field->fill_cache_field(copy);…… } } cache->length=length+blobs*sizeof(char*); cache->blobs=blobs; *blob_ptr=0; /* End sequentel */ size=max(thd->variables.join_buff_size, cache->length); if (!(cache->buff=(uchar*) my_malloc(size,MYF(0)))) DBUG_RETURN(1); /* Don’t use cache */ /* purecov: inspected */ cache->end=cache->buff+size; reset_cache_write(cache); DBUG_RETURN(0);}三、普通的多表查詢實(shí)現(xiàn)

這個(gè)'普通'當(dāng)然也可以理解為'樸素'、'直觀'的意思，也是大部分情況下的執(zhí)行流程。普通查詢其實(shí)就是對(duì)于對(duì)于各個(gè)表格進(jìn)行遞歸調(diào)用，和矩陣的乘法一樣一樣的，這個(gè)對(duì)應(yīng)非常直觀，也非常通用。

而這個(gè)常規(guī)的查詢動(dòng)作就是通過sub_select函數(shù)來實(shí)現(xiàn)，這個(gè)函數(shù)本質(zhì)性上是執(zhí)行

tsecer_select(){for (r = first ; r != end ; r = next){if(sofartest()){nexttable.tsecer_select()}}}

其中的sofartest()表示'使用所有當(dāng)前已讀取表格可以進(jìn)行的判斷'，也就是where中下推的表達(dá)式。例如 select * from a, b where a.a > 10 and b.b + a.a = 10，在a表讀取之后，其實(shí)已經(jīng)可以執(zhí)行 a.a > 10的判斷。當(dāng)然這個(gè)是一個(gè)甚至算不上偽代碼的描述方法，而真正的代碼對(duì)應(yīng)為：

enum_nested_loop_statesub_select(JOIN *join,JOIN_TAB *join_tab,bool end_of_records){…… error= (*join_tab->read_first_record)(join_tab); rc= evaluate_join_record(join, join_tab, error);…… while (rc == NESTED_LOOP_OK) { error= info->read_record(info); rc= evaluate_join_record(join, join_tab, error); }…… return rc;}static enum_nested_loop_stateevaluate_join_record(JOIN *join, JOIN_TAB *join_tab, int error){…… if (select_cond) { select_cond_result= test(select_cond->val_int()); /* check for errors evaluating the condition */ if (join->thd->is_error()) return NESTED_LOOP_ERROR; }…… if (found) { enum enum_nested_loop_state rc; /* A match from join_tab is found for the current partial join. */ rc= (*join_tab->next_select)(join, join_tab+1, 0); if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS)return rc; if (join->return_tab < join_tab)return NESTED_LOOP_OK; /*Test if this was a SELECT DISTINCT query on a table thatwas not in the field list; In this case we can abort ifwe found a row, as no new rows can be added to the result. */ if (not_used_in_distinct && found_records != join->found_records)return NESTED_LOOP_NO_MORE_ROWS; }……}

這里可以看到，這個(gè)地方是一個(gè)遞歸，用來產(chǎn)生一個(gè)笛卡爾叉乘集合，從程序?qū)崿F(xiàn)和數(shù)學(xué)表達(dá)上看都非常簡(jiǎn)潔可愛。在MySQL的實(shí)現(xiàn)中，tsecer_select函數(shù)中的for循環(huán)大致相當(dāng)sub_select中的while循環(huán)，而tsecer_select函數(shù)中循環(huán)體內(nèi)的內(nèi)容被放在了evaluate_join_record函數(shù)中，其中的sofartest對(duì)應(yīng)evaluate_join_record::test(select_cond->val_int())；tsecer_select中的nexttable.tsecer_select()語句對(duì)應(yīng)evaluate_join_record::(*join_tab->next_select)(join, join_tab+1, 0)。

四、join buffer的select實(shí)現(xiàn)

當(dāng)使用join buffer cache時(shí)，next_select函數(shù)指向sub_select_cache

enum_nested_loop_statesub_select_cache(JOIN *join,JOIN_TAB *join_tab,bool end_of_records){ enum_nested_loop_state rc; if (end_of_records) { rc= flush_cached_records(join,join_tab,FALSE); if (rc == NESTED_LOOP_OK || rc == NESTED_LOOP_NO_MORE_ROWS) rc= sub_select(join,join_tab,end_of_records); return rc; } if (join->thd->killed) // If aborted by user { join->thd->send_kill_message(); return NESTED_LOOP_KILLED; /* purecov: inspected */ } if (join_tab->use_quick != 2 || test_if_quick_select(join_tab) <= 0) { if (!store_record_in_cache(&join_tab->cache)) return NESTED_LOOP_OK; // There is more room in cache return flush_cached_records(join,join_tab,FALSE); } rc= flush_cached_records(join, join_tab, TRUE); if (rc == NESTED_LOOP_OK || rc == NESTED_LOOP_NO_MORE_ROWS) rc= sub_select(join, join_tab, end_of_records); return rc;}

結(jié)合MySQL文檔中的說明，這里的代碼意義就比較明顯。開始對(duì)于end_of_records的判斷對(duì)應(yīng)的就是

if (!store_record_in_cache(&join_tab->cache)) return NESTED_LOOP_OK; // There is more room in cache return flush_cached_records(join,join_tab,FALSE);

對(duì)應(yīng)

- Store used fields from t1, t2 in cache - If cache is full

其中store_record_in_cache函數(shù)會(huì)判斷cache是否已滿，如果cache可以放入更多的緩存，則把之前table的組合記錄存儲(chǔ)在cache中，并返回NESTED_LOOP_OK。注意：這個(gè)地方可以說是整個(gè)cache優(yōu)化的關(guān)鍵，因?yàn)檫@里并沒有啟動(dòng)對(duì)于table的掃描。反過來說，如果cache數(shù)據(jù)已經(jīng)滿了，則調(diào)用flush_cached_records函數(shù)來進(jìn)行下面的流程

- Read through all rows in t3 - Compare t3 row against all t1, t2 combinations in cache- If row satisfies join condition, send it to client - Empty cache

這個(gè)流程的特殊之處在于遍歷的驅(qū)動(dòng)是通過對(duì)于table的每一條記錄來和cache中所有t1、t2組合來進(jìn)行比較，來判斷是否滿足下推where條件(If row satisfies join condition)，則執(zhí)行join_tab->next_select函數(shù)(send it to client)。

static enum_nested_loop_stateflush_cached_records(JOIN *join,JOIN_TAB *join_tab,bool skip_last){…… info= &join_tab->read_record; do {//遍歷t3表格所有記錄……for (i=(join_tab->cache.records- (skip_last ? 1 : 0)) ; i-- > 0 ;){//遍歷cache中所有t1、t2記錄組合 read_cached_record(join_tab); skip_record= FALSE; if (select && select->skip_record(join->thd, &skip_record)) {// reset_cache_write(&join_tab->cache); return NESTED_LOOP_ERROR; } if (!skip_record) {//滿足下推的where條件//執(zhí)行下一個(gè)table的遍歷 rc= (join_tab->next_select)(join,join_tab+1,0); if (rc != NESTED_LOOP_OK && rc != NESTED_LOOP_NO_MORE_ROWS) { reset_cache_write(&join_tab->cache); return rc; } }…… } while (!(error=info->read_record(info)));五、舉例來說明下這個(gè)流程

這個(gè)實(shí)現(xiàn)的核心思想并不復(fù)雜，結(jié)合具體的例子來看就更加的簡(jiǎn)單直觀。舉個(gè)例子，其中使用兩個(gè)簡(jiǎn)單的table，其中分別存儲(chǔ)一個(gè)x，和y的值，我們希望通過一個(gè)join操作來計(jì)算這兩個(gè)表格中所有的滿足 x

x + y

y == 5 * 5,也就是我們最常見的'勾三股四弦五'這樣的經(jīng)典勾股數(shù)數(shù)值。

mysql> create table harry (x int);Query OK, 0 rows affected (0.03 sec) mysql> insert harry values (1),(2),(3),(4),(5);Query OK, 5 rows affected (0.00 sec)Records: 5 Duplicates: 0 Warnings: 0 mysql> create table tsecer (y int); Query OK, 0 rows affected (0.01 sec) mysql> insert tsecer values (1),(2),(3),(4),(5); Query OK, 5 rows affected (0.00 sec)Records: 5 Duplicates: 0 Warnings: 0 mysql> explain select * from harry, tsecer where x * x + y * y = 5 * 5;+----+-------------+--------+------+---------------+------+---------+------+------+--------------------------------+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+----+-------------+--------+------+---------------+------+---------+------+------+--------------------------------+| 1 | SIMPLE | harry | ALL | NULL | NULL | NULL | NULL | 5 ||| 1 | SIMPLE | tsecer | ALL | NULL | NULL | NULL | NULL | 5 | Using where; Using join buffer |+----+-------------+--------+------+---------------+------+---------+------+------+--------------------------------+2 rows in set (0.00 sec) mysql>1、不使用joinbuffer

在不使用join buffer的情況下，對(duì)于harry表的每個(gè)x值，對(duì)應(yīng)的tsecer表都要進(jìn)行一次全表掃描，之后使用這個(gè)x和y的組合判斷是否滿足x

x + y

y == 5 * 5這條件。由于x總共有5個(gè)值，所以tsecer需要全表掃描的次數(shù)就是5次。

2、使用joinbuffer

對(duì)于x的每個(gè)值，tsecer表在執(zhí)行的時(shí)候先是把這個(gè)值緩存到j(luò)oinbuffer中，如果buffer緩沖內(nèi)容非空，那么把此時(shí)的x的值存儲(chǔ)在buffer中后直接返回；當(dāng)join buffer滿或者是最后一條記錄的時(shí)候，此時(shí)開始啟動(dòng)對(duì)于tsecer表的掃描，對(duì)于tsecer表中讀取的每一個(gè)記錄，結(jié)合前面緩存的每一個(gè)記錄，看是否滿足自己判斷條件。對(duì)于我們看到的例子，這個(gè)地方harry表的5個(gè)值都在緩存中，在tsecer表的掃描過程中，對(duì)于從tsecer中讀取的每一條記錄，結(jié)合緩存中的“每一條”緩存，判斷這個(gè)組合結(jié)果是否滿足條件，如果任意一個(gè)組很滿足，那么就繼續(xù)next_select。在這個(gè)使用buffer的例子中，可以看到這個(gè)地方只是對(duì)于tsecer表進(jìn)行了一次掃描，而通常來說，數(shù)據(jù)庫的掃描代碼是最高的(因?yàn)橐婕暗酱疟P讀取)，這樣使用buffer的方式將tsecer表的掃描降低為1次，所以這個(gè)效率提高很多，特別是在涉及到的多個(gè)table，并且/或者每個(gè)table中的記錄數(shù)量都很多的情況下。

3、cache可以優(yōu)化的原因

本質(zhì)上說，這個(gè)效率提高的原因在于提高了從table中獲得的每條記錄的“利用率”，在使用直觀掃描方式時(shí)，table的全表掃描只是和一個(gè)組合進(jìn)行匹配，而使用buffer之后則是和cache中的所有組合進(jìn)行匹配。

以上就是MySQL的join buffer原理的詳細(xì)內(nèi)容，更多關(guān)于MySQL join buffer的資料請(qǐng)關(guān)注好吧啦網(wǎng)其它相關(guān)文章！

上一條：MySQL創(chuàng)建高性能索引的全步驟下一條：Mysql服務(wù)添加 iptables防火墻策略的方案

相關(guān)文章：

1. Mybatis傳入List實(shí)現(xiàn)批量更新的示例代碼2. Mysql入門系列：MYSQL表達(dá)式求值和MYSQL類型轉(zhuǎn)換3. MySQL基礎(chǔ)教程9 —— 函數(shù)之日期和時(shí)間函數(shù)4. 使用MySqldump命令導(dǎo)出數(shù)據(jù)時(shí)的注意5. Windwos下MySQL 64位壓縮包的安裝方法學(xué)習(xí)記錄6. 不要忽視Oracle 10g STATSPACK新功能7. ACCESS轉(zhuǎn)SQL數(shù)據(jù)庫相關(guān)的幾個(gè)技能8. Mysql 實(shí)現(xiàn)字段拼接的三個(gè)函數(shù)9. Oracle數(shù)據(jù)庫PL/SQL過程調(diào)試的輸出方法10. SQL Server 2005 Reporting Services的一個(gè)bug及其解決方法

排行榜

					
					Mysql入門系列：MYSQL表達(dá)式求值和MYSQL類型轉(zhuǎn)換
使用MySqldump命令導(dǎo)出數(shù)據(jù)時(shí)的注意
Mybatis傳入List實(shí)現(xiàn)批量更新的示例代碼
ACCESS轉(zhuǎn)SQL數(shù)據(jù)庫相關(guān)的幾個(gè)技能
Windwos下MySQL 64位壓縮包的安裝方法學(xué)習(xí)記錄
SQL Server 2005 Reporting Services的一個(gè)bug及其解決方法
MySQL基礎(chǔ)教程9 —— 函數(shù)之日期和時(shí)間函數(shù)
Oracle數(shù)據(jù)庫PL/SQL過程調(diào)試的輸出方法
不要忽視Oracle 10g STATSPACK新功能
Mysql 實(shí)現(xiàn)字段拼接的三個(gè)函數(shù)
Mysql入門系列：建立MYSQL客戶機(jī)程序的一般過程
				

熱門標(biāo)簽

亚洲精品久久久中文字幕-亚洲精品久久片久久-亚洲精品久久青草-亚洲精品久久婷婷爱久久婷婷-亚洲精品久久午夜香蕉

MySQL的join buffer原理