[MongoDB] 03: MongoDB Indexes and Sharding Basics

Published: 2020-06-30 21:04:50  Source: web  Views: 12263  Author: xiexiaojun  Category: Databases

I. MongoDB Configuration

The settings in the MongoDB configuration file /etc/mongodb.conf are in fact just mongod startup options (the same approach memcached takes).

[root@Node7 ~]# mongod --help
Allowed options:

General options:
  -h [ --help ]               show this usage information
  --version                   show version information
  -f [ --config ] arg         configuration file specifying additional options
  -v [ --verbose ]            be more verbose (include multiple times for more 
                              verbosity e.g. -vvvvv)
  --quiet                     quieter output
  --port arg                  specify port number - 27017 by default
  --bind_ip arg               comma separated list of ip addresses to listen on
                              - all local ips by default
  --maxConns arg              max number of simultaneous connections - 20000 by
                              default
  --logpath arg               log file to send write to instead of stdout - has
                              to be a file, not directory
  --logappend                 append to logpath instead of over-writing
  --pidfilepath arg           full path to pidfile (if not set, no pidfile is 
                              created)
  --keyFile arg               private key for cluster authentication
  --setParameter arg          Set a configurable parameter
  --nounixsocket              disable listening on unix sockets
  --unixSocketPrefix arg      alternative directory for UNIX domain sockets 
                              (defaults to /tmp)
  --fork                      fork server process
  --syslog                    log to system's syslog facility instead of file 
                              or stdout
  --auth                      run with security
  --cpu                       periodically show cpu and iowait utilization
  --dbpath arg                directory for datafiles - defaults to /data/db/
  --diaglog arg               0=off 1=W 2=R 3=both 7=W+some reads
  --directoryperdb            each database will be stored in a separate 
                              directory
  --ipv6                      enable IPv6 support (disabled by default)
  --journal                   enable journaling
  --journalCommitInterval arg how often to group/batch commit (ms)
  --journalOptions arg        journal diagnostic options
  --jsonp                     allow JSONP access via http (has security 
                              implications)
  --noauth                    run without security
  --nohttpinterface           disable http interface
  --nojournal                 disable journaling (journaling is on by default 
                              for 64 bit)
  --noprealloc                disable data file preallocation - will often hurt
                              performance
  --noscripting               disable scripting engine
  --notablescan               do not allow table scans
  --nssize arg (=16)          .ns file size (in MB) for new databases
  --profile arg               0=off 1=slow, 2=all
  --quota                     limits each database to a certain number of files
                              (8 default)
  --quotaFiles arg            number of files allowed per db, requires --quota
  --repair                    run repair on all dbs
  --repairpath arg            root directory for repair files - defaults to 
                              dbpath
  --rest                      turn on simple rest api
  --shutdown                  kill a running server (for init scripts)
  --slowms arg (=100)         value of slow for profile and console log
  --smallfiles                use a smaller default file size
  --syncdelay arg (=60)       seconds between disk syncs (0=never, but not 
                              recommended)
  --sysinfo                   print some diagnostic system information
  --upgrade                   upgrade db if needed

Replication options:
  --oplogSize arg       size to use (in MB) for replication op log. default is 
                        5% of disk space (i.e. large is good)

Master/slave options (old; use replica sets instead):
  --master              master mode
  --slave               slave mode
  --source arg          when slave: specify master as <server:port>
  --only arg            when slave: specify a single database to replicate
  --slavedelay arg      specify delay (in seconds) to be used when applying 
                        master ops to slave
  --autoresync          automatically resync if slave data is stale

Replica set options:
  --replSet arg           arg is <setname>[/<optionalseedhostlist>]
  --replIndexPrefetch arg specify index prefetching behavior (if secondary) 
                          [none|_id_only|all]

Sharding options:
  --configsvr           declare this is a config db of a cluster; default port 
                        27019; default dir /data/configdb
  --shardsvr            declare this is a shard db of a cluster; default port 
                        27018

SSL options:
  --sslOnNormalPorts              use ssl on configured ports
  --sslPEMKeyFile arg             PEM file for ssl
  --sslPEMKeyPassword arg         PEM file password
  --sslCAFile arg                 Certificate Authority file for SSL
  --sslCRLFile arg                Certificate Revocation List file for SSL
  --sslWeakCertificateValidation  allow client to connect without presenting a 
                                  certificate
  --sslFIPSMode                   activate FIPS 140-2 mode at startup

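Commonly used configuration parameters:

fork={true|false}: whether mongod runs in the background

bind_ip=IP: the address to listen on

port=PORT: the port to listen on; the default is 27017

maxConns=N: the maximum number of concurrent connections

logpath=/PATH/TO/SOME_FILE: the log file to write to (it must be a file, not a directory); syslog=true sends the log to the system's syslog facility instead

httpinterface=true: whether to enable the web monitoring interface, which listens on the mongod port + 1000

journal: whether to enable journaling; on by default on 64-bit builds

slowms arg (=100): the slow-query threshold in ms; a query that runs longer than this counts as slow; the default is 100 ms

repair: run this after an unclean shutdown to repair the data files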
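Since every entry in the configuration file corresponds to a mongod startup option, the mapping can be sketched in a few lines of JavaScript. This is a toy parser for illustration only, not the one mongod uses; the helper name configToArgs, the sample file contents, and the log path are assumptions made for this example.

```javascript
// Convert legacy "key=value" config lines into the equivalent mongod flags.
// Toy parser for illustration; it is not the parser mongod itself uses.
function configToArgs(configText) {
  const args = [];
  for (const rawLine of configText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    const key = (eq === -1 ? line : line.slice(0, eq)).trim();
    const value = eq === -1 ? "true" : line.slice(eq + 1).trim();
    if (value === "true") {
      args.push("--" + key);         // boolean switch that is turned on
    } else if (value !== "false") {
      args.push("--" + key, value);  // option that carries an argument
    }
    // "key=false" adds nothing: the switch is simply left off.
  }
  return args;
}

// Hypothetical contents of /etc/mongodb.conf.
const sampleConfig = [
  "# sample settings",
  "fork=true",
  "bind_ip=127.0.0.1",
  "port=27017",
  "logpath=/var/log/mongodb/mongod.log",
  "logappend=true",
  "slowms=100",
].join("\n");

console.log(["mongod", ...configToArgs(sampleConfig)].join(" "));
```

Run under Node.js, this prints the command line that would start mongod with the same settings as the sample file.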

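II. Indexes

An index can usually speed up queries dramatically. Without one, MongoDB has to scan every document in the collection and pick out those that match the query. Such a full collection scan is very inefficient, especially on large data sets, where a query can take tens of seconds or even minutes, which is fatal for the performance of a website.

An index is a special data structure that is kept in a form that is easy to traverse; it holds the values of one or more fields in sorted order.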
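The difference between scanning a collection and using a sorted structure can be sketched in plain JavaScript. This is a simplified model, not how MongoDB implements its indexes: the "index" here is just a sorted array of [key, position] pairs searched by binary search, and all function names are made up for the example.

```javascript
// Build 100 documents shaped like the ones used later in this article.
const docs = [];
for (let i = 1; i <= 100; i++) {
  docs.push({ name: "student" + i, age: i % 100 });
}

// Without an index: examine every document (a full collection scan).
function collectionScan(collection, field, value) {
  let examined = 0;
  const matches = [];
  for (const doc of collection) {
    examined++;
    if (doc[field] === value) matches.push(doc);
  }
  return { matches, examined };
}

// A toy index: [key, position] pairs kept sorted by key.
function buildIndex(collection, field) {
  return collection
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// With the index: binary-search for the first matching key, then walk forward.
function indexLookup(collection, index, value) {
  let lo = 0;
  let hi = index.length;
  let examined = 0;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    examined++;
    if (index[mid][0] < value) lo = mid + 1;
    else hi = mid;
  }
  const matches = [];
  for (let i = lo; i < index.length && index[i][0] === value; i++) {
    examined++;
    matches.push(collection[index[i][1]]);
  }
  return { matches, examined };
}

const ageIndex = buildIndex(docs, "age");
console.log("scan examined: ", collectionScan(docs, "age", 90).examined);
console.log("index examined:", indexLookup(docs, ageIndex, 90).examined);
```

The scan touches all 100 documents, while the sorted lookup touches only a handful of index entries; the gap widens as the collection grows.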

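1. Index types

Index types in general: B+ tree, hash, spatial, and full-text.

Indexes supported by MongoDB:

Single-field indexes and compound (multi-field) indexes

Multikey index: an index on a field whose value is an array, with one index entry per array element

Geospatial index: location-based lookups

Text index: the equivalent of a full-text index

Hashed index: exact-match lookups; not suitable for range queries

2. Managing indexes

Create: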

db.mycoll.ensureIndex(keypattern[,options])

From the shell's built-in help:

db.mycoll.ensureIndex(keypattern[,options]) - options is an object with these possible fields: name, unique, dropDups

db.COLLECTION_NAME.ensureIndex({KEY:1})

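In this syntax, KEY is the field to index; 1 builds the index in ascending order and -1 in descending order. ensureIndex() can also build an index over several fields (what relational databases call a composite index): db.col.ensureIndex({"title":1,"description":-1}). In MongoDB 3.0 and later, ensureIndex() is deprecated in favor of createIndex(), which takes the same arguments.

ensureIndex() accepts the following optional parameters: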

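Parameter	Type	Description
background	Boolean	Building an index blocks other database operations; set background to true to build the index in the background instead. Default: false.
unique	Boolean	Whether the index is unique. Set to true to create a unique index. Default: false.
name	string	The name of the index. If omitted, MongoDB generates a name by concatenating the indexed field names and their sort orders.
dropDups	Boolean	Whether to delete duplicate records when building a unique index. Set to true to delete them. Default: false. (Removed in MongoDB 3.0.)
sparse	Boolean	Do not index documents that lack the indexed field. Use with care: with sparse set to true, queries that use the index will not return documents missing that field. Default: false.
expireAfterSeconds	integer	A value in seconds that sets a TTL, controlling how long documents in the collection are kept.
v	index version	The index version number. The default depends on the mongod version running when the index is created.
weights	document	For text indexes: a weight from 1 to 99,999 that sets how much the field counts toward the score relative to the other indexed fields.
default_language	string	For text indexes: the language that determines the stop-word list and the stemming and tokenizing rules. Default: english.
language_override	string	For text indexes: the name of the field in the document that holds a language overriding the default. Default: language.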
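The naming rule described for the name option (field names joined with their sort orders) can be reproduced with a small helper. This is a sketch of the rule as stated above, with defaultIndexName as a made-up name; the built-in _id index is a special case and is always called _id_.

```javascript
// Derive the name MongoDB generates when no "name" option is given:
// each field is followed by its sort order, and the parts are joined with "_".
function defaultIndexName(keyPattern) {
  return Object.entries(keyPattern)
    .map(([field, direction]) => field + "_" + direction)
    .join("_");
}

console.log(defaultIndexName({ name: 1 }));                    // name_1
console.log(defaultIndexName({ title: 1, description: -1 }));  // title_1_description_-1
```

Knowing the generated name matters because dropping a single index is done by name.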

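List:

db.mycoll.getIndexes()

Drop:

db.mycoll.dropIndexes()    drops all indexes on the collection except the default _id index

db.mycoll.dropIndex("index")    drops the named index

db.mycoll.reIndex()    rebuilds all indexes on the collection

Example: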

> db.students.find()
> for (i=1;i<=100;i++) db.students.insert({name:"student"+i, age:(i%100)})   # insert 100 documents with a for loop
> db.students.find().count()
100

> db.students.find()
{ "_id" : ObjectId("58d613021e8383d30814f846"), "name" : "student1", "age" : 1 }
{ "_id" : ObjectId("58d613021e8383d30814f847"), "name" : "student2", "age" : 2 }
{ "_id" : ObjectId("58d613021e8383d30814f848"), "name" : "student3", "age" : 3 }
{ "_id" : ObjectId("58d613021e8383d30814f849"), "name" : "student4", "age" : 4 }
{ "_id" : ObjectId("58d613021e8383d30814f84a"), "name" : "student5", "age" : 5 }
{ "_id" : ObjectId("58d613021e8383d30814f84b"), "name" : "student6", "age" : 6 }
{ "_id" : ObjectId("58d613021e8383d30814f84c"), "name" : "student7", "age" : 7 }
{ "_id" : ObjectId("58d613021e8383d30814f84d"), "name" : "student8", "age" : 8 }
{ "_id" : ObjectId("58d613021e8383d30814f84e"), "name" : "student9", "age" : 9 }
{ "_id" : ObjectId("58d613021e8383d30814f84f"), "name" : "student10", "age" : 10 }
{ "_id" : ObjectId("58d613021e8383d30814f850"), "name" : "student11", "age" : 11 }
{ "_id" : ObjectId("58d613021e8383d30814f851"), "name" : "student12", "age" : 12 }
{ "_id" : ObjectId("58d613021e8383d30814f852"), "name" : "student13", "age" : 13 }
{ "_id" : ObjectId("58d613021e8383d30814f853"), "name" : "student14", "age" : 14 }
{ "_id" : ObjectId("58d613021e8383d30814f854"), "name" : "student15", "age" : 15 }
{ "_id" : ObjectId("58d613021e8383d30814f855"), "name" : "student16", "age" : 16 }
{ "_id" : ObjectId("58d613021e8383d30814f856"), "name" : "student17", "age" : 17 }
{ "_id" : ObjectId("58d613021e8383d30814f857"), "name" : "student18", "age" : 18 }
{ "_id" : ObjectId("58d613021e8383d30814f858"), "name" : "student19", "age" : 19 }
{ "_id" : ObjectId("58d613021e8383d30814f859"), "name" : "student20", "age" : 20 }
Type "it" for more      # only the first 20 are shown; type it to see more

> db.students.ensureIndex({name:1})   # build an index on the name key; 1 means ascending, -1 descending
> show collections
students
system.indexes
t1

> db.students.getIndexes()
[
	{                               # the default index
		"v" : 1,              
		"name" : "_id_",
		"key" : {
			"_id" : 1
		},
		"ns" : "students.students"    # database.collection
	},
	{
		"v" : 1,
		"name" : "name_1",      # auto-generated index name
		"key" : {
			"name" : 1      # the index created on the name key
		},
		"ns" : "students.students"  
	}
]

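> db.students.dropIndexes("name_1")      # in this shell version dropIndexes() drops every non-_id index (see msg); to drop only this one, use db.students.dropIndex("name_1")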
{
	"nIndexesWas" : 2,
	"msg" : "non-_id indexes dropped for collection",
	"ok" : 1
}
> db.students.getIndexes()
[
	{
		"v" : 1,
		"name" : "_id_",
		"key" : {
			"_id" : 1
		},
		"ns" : "students.students"
	}
]
> db.students.dropIndexes()        # the default _id index cannot be dropped
{
	"nIndexesWas" : 1,
	"msg" : "non-_id indexes dropped for collection",
	"ok" : 1
}
> db.students.getIndexes()
[
	{
		"v" : 1,
		"name" : "_id_",
		"key" : {
			"_id" : 1
		},
		"ns" : "students.students"
	}
]

> db.students.find({age:"90"}).explain()       # show how the query is executed; "90" is a string while the stored ages are numbers, so n is 0
{
	"cursor" : "BtreeCursor t1",
	"isMultiKey" : false,
	"n" : 0,
	"nscannedObjects" : 0,
	"nscanned" : 0,
	"nscannedObjectsAllPlans" : 0,
	"nscannedAllPlans" : 0,
	"scanAndOrder" : false,
	"indexOnly" : false,
	"nYields" : 0,
	"nChunkSkips" : 0,
	"millis" : 17,
	"indexBounds" : {               # bounds of the index that was used
		"age" : [
			[
				"90",
				"90"
			]
		]
	},
	"server" : "Node7:27017"
}

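III. MongoDB Sharding

1. Introduction to sharding

As the business grows and the data set gets larger, CPU, memory, and I/O become bottlenecks, and MongoDB has to be scaled out.

Adding more MongoDB nodes as replicas only spreads the read load, not the write load; to spread writes, the data set has to be sharded.

MongoDB supports sharding natively.

MySQL, by contrast, relies on external sharding solutions (frameworks), which call for a senior DBA (5+ years of experience):

Gizzard, HiveDB, MySQL Proxy + HSCALE, Hibernate Shards, Pyshards

2. Roles in a sharded cluster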

[Figure: roles in a sharded cluster; image not preserved]

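mongos: the router

It acts as a proxy and routes client requests to the appropriate shards for execution; it stores no data itself and does not run queries against data of its own.

config server: the metadata server. More than one is needed here as well (three in production), but in this MongoDB version they do not form a replica set; mongos keeps the copies consistent itself. Other distributed systems hand this kind of coordination to a tool such as ZooKeeper, which MongoDB does not require.

It stores the index of the data sets held on the shard servers, that is, the mapping of which chunks live on which shard.

shard: a data node, that is, a mongod instance (or a replica set, since sh.addShard also accepts setname/server:port)

zookeeper:

Commonly used to coordinate the central nodes of a distributed system; it provides an election mechanism for choosing a primary node, and ZooKeeper can itself be deployed as a distributed service.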

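3. Sharding methods

Sharding is done per collection.

To keep the data balanced across the shard nodes, each collection is cut into chunks of a fixed size, which are then assigned to the shard nodes one by one.

Range-based sharding:

range: the index used must be an ordered index that supports sorting, such as a B-tree index

chunks are distributed evenly according to the index

List-based sharding:

list: a discrete approach in which the values are enumerated in a list

Hash-based sharding:

hash: the key is hashed and taken modulo the number of shard servers, so documents are scattered and hot data is spread out (MongoDB's own hashed shard keys split the range of hash values into chunks rather than taking a modulo)

Which method to use depends on your own business model.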
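The trade-off between range-based and hash-based placement can be illustrated with a toy model in JavaScript. It follows the simplified description above (hash modulo the number of shards), not MongoDB's actual chunk mechanics; toyHash, the split points, and the key range are all assumptions chosen for the example.

```javascript
// Range-based: shard i owns keys below splitPoints[i]; the last shard owns the rest.
function rangeShard(key, splitPoints) {
  let i = 0;
  while (i < splitPoints.length && key >= splitPoints[i]) i++;
  return i;
}

// Hash-based: hash the key, then take it modulo the number of shards.
// toyHash is a hypothetical stand-in, not MongoDB's hash function.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0;
  }
  return h;
}

function hashShard(key, shardCount) {
  return toyHash(key) % shardCount;
}

// Insert monotonically increasing keys, a typical hot-spot write pattern.
const splitPoints = [1000, 2000]; // 3 shards: below 1000, 1000 to 1999, 2000 and up
const rangeCounts = [0, 0, 0];
const hashCounts = [0, 0, 0];
for (let key = 2000; key < 2300; key++) {
  rangeCounts[rangeShard(key, splitPoints)]++;
  hashCounts[hashShard(key, 3)]++;
}

console.log("range:", rangeCounts); // every write lands on the last shard
console.log("hash: ", hashCounts);  // writes are spread over all shards
```

With ever-increasing keys, range placement sends all new writes to one shard but keeps neighboring keys together for range reads; hashing spreads the writes at the cost of scattering range reads across shards.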

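Principle for choosing how to shard:

Scatter writes, concentrate reads: spread writes across the shards while letting each read be served by as few shards as possible.

sh.enableSharding("testdb")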

IV. Hands-on example

1. Architecture

[Figure: architecture diagram; image not preserved]

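2. Configuration steps

1) Configure the config server node first

Set configsvr=true in its configuration; it does not need to join a replica set and listens on TCP port 27019.

2) mongos

Simply start mongos with --configdb=172.16.100.16:27019 to point it at the config server; mongos listens on TCP port 27017 and acts as the proxy.

Options used to start mongos: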

mongos --configdb=172.16.100.16:27019 --fork --logpath=/var/log/mongodb/mongos.log

3) Add the shard nodes from the mongos node

Help for the shard-related commands:

testSet:PRIMARY> sh.help()
	sh.addShard( host )                       server:port OR setname/server:port
	                       # add a shard node; the host may be given as a replica set name
	sh.enableSharding(dbname)                 enables sharding on the database dbname
	                       # choose the database on which sharding is enabled
	sh.shardCollection(fullName,key,unique)   shards the collection
	                       # choose the collection to shard
	sh.splitFind(fullName,find)               splits the chunk that find is in at the median
	sh.splitAt(fullName,middle)               splits the chunk that middle is in at middle
	sh.moveChunk(fullName,find,to)            move the chunk where 'find' is to 'to' (name of shard)
	sh.setBalancerState( <bool on or not> )   turns the balancer on or off true=on, false=off
	sh.getBalancerState()                     return true if enabled
	sh.isBalancerRunning()                    return true if the balancer has work in progress on any mongos
	sh.addShardTag(shard,tag)                 adds the tag to the shard
	sh.removeShardTag(shard,tag)              removes the tag from the shard
	sh.addTagRange(fullName,min,max,tag)      tags the specified range of the given collection
	sh.status()                               prints a general overview of the cluster
	                       # show the sharding status; "primary" is the data node that holds the collections that are not sharded (small collections are not worth sharding)

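When sharding a collection, state explicitly which key it is sharded on:

sh.shardCollection(fullName,key,unique)

fullName: the full name, including both database and collection, in the form database.collection

Example: sh.shardCollection("testdb.students",{"age":1})

This shards the students collection in the testdb database on an ascending index on the age field; the data in testdb.students is then distributed automatically across the shard nodes.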
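The order of the steps (add the shards, enable sharding on the database, then shard the collection) can be captured in one helper. This is a sketch in modern JavaScript, run here as a dry run under Node.js with a recording stand-in for the shell's sh object; the function name setUpSharding and the shard addresses are hypothetical.

```javascript
// Apply the sharding steps in order. `sh` is expected to offer addShard,
// enableSharding and shardCollection, as the mongo shell's sh helper does.
function setUpSharding(sh, shardHosts, dbName, collName, shardKey) {
  // 1. Register every shard (server:port or setname/server:port).
  for (const host of shardHosts) {
    sh.addShard(host);
  }
  // 2. Allow collections in this database to be sharded.
  sh.enableSharding(dbName);
  // 3. Shard the collection on the given key; fullName is "database.collection".
  const fullName = dbName + "." + collName;
  sh.shardCollection(fullName, shardKey);
  return fullName;
}

// Dry run with a recording stand-in, so the call order can be inspected
// without a cluster. The shard addresses below are hypothetical.
const calls = [];
const fakeSh = {
  addShard: (host) => calls.push(["addShard", host]),
  enableSharding: (db) => calls.push(["enableSharding", db]),
  shardCollection: (ns, key) => calls.push(["shardCollection", ns, key]),
};

setUpSharding(
  fakeSh,
  ["172.16.100.17:27017", "172.16.100.18:27017"],
  "testdb",
  "students",
  { age: 1 }
);
calls.forEach((call) => console.log(JSON.stringify(call)));
```

Passing the helper object in as a parameter keeps the sequence testable; against a real cluster the same three calls would be issued from a shell connected to mongos.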

use admin

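db.runCommand("listShards") # list the shard nodes

db.printShardingStatus() is equivalent to sh.status()

sh.isBalancerRunning() # check whether the balancer is currently running; it runs automatically only when balancing is needed

sh.getBalancerState() # check whether balancing is enabled

sh.moveChunk(fullName,find,to) # move a chunk manually; not recommended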
