1. Backup Preparation
1.1. Backup Environment
1.1.1. OS Environment
Source data server: cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)
Target database server: cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core)
1.1.2. Software Environment
Source data server version: MongoDB server version: 3.2.17
Target data server version: MongoDB server version: 3.6.7
1.2. Backup Tools
- mongodump/mongorestore
- mongoexport/mongoimport
- Third-party tools such as Studio 3T
1.3. Backup Scenarios
1. Standalone
2. Replica set
3. Sharded cluster
2. Performing the Backup
2.1. Database Notes
For ease of demonstration, each scenario below is illustrated with a concrete example.
2.2. Standalone Backup and Restore
A standalone backup itself covers several cases:
- Backing up all databases
- Backing up a single database
- Exporting all of a database's documents
2.2.1. Backing Up All Databases
- Requirement:
Back up all of the databases on the MongoDB instance on server 172.168.1.111, then transfer them to server 172.168.1.209 for restore.
- Source database server information:
admin database user: username1
admin database user password: password1
MongoDB authentication enabled
Port: 20000
Databases on the instance:
db1
db2
db3
db4
db5
- Run the backup
mongodump -h 172.168.1.111 --port 20000 -u username1 -p password1 -v -j 16 -o /backup/ --authenticationDatabase admin
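mongodump writes one subdirectory per database under the -o directory, with a .bson data file and a .metadata.json file for each collection. A quick sanity check of the dump layout (paths taken from the command above):
ls /backup/
# expect one directory per database: db1 db2 db3 db4 db5
ls /backup/db1/
# expect <collection>.bson and <collection>.metadata.json pairs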
2.2.2. Restoring All Databases
- Target database server information:
admin database user: username2
admin database user password: password2
MongoDB authentication enabled
Port: 20001
- Transfer the data to the target server
rsync -avz --progress /backup 172.168.1.209:/
- Run the restore
mongorestore -h 172.168.1.209 --port 20001 --drop /backup -u username2 -p password2 --authenticationDatabase admin
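To confirm the restore, the databases on the target can be listed non-interactively; a minimal check reusing the credentials above:
mongo 172.168.1.209:20001/admin -u username2 -p password2 --authenticationDatabase admin --quiet --eval "db.adminCommand('listDatabases').databases.forEach(function(d){ print(d.name) })"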
2.2.3. Backing Up a Single Database
Using db1 as an example:
mongodump -h 172.168.1.111:20000 -u username1 -p password1 -d db1 -v -j 16 -o /backup --authenticationDatabase admin
2.2.4. Restoring a Single Database
mongorestore -h 172.168.1.209:20001 -u username2 -p password2 -d db1 --dir /backup/db1 --authenticationDatabase admin
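mongorestore with -d can also restore the dump into a database with a different name, which helps with the cross-database migration scenario mentioned in the next section; a hedged sketch (db1_copy is a hypothetical target name):
mongorestore -h 172.168.1.209:20001 -u username2 -p password2 -d db1_copy --dir /backup/db1 --authenticationDatabase admin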
2.2.5. Exporting a Database's Documents
This approach suits migrating data between different databases, or migrating a database whose data volume is very large.
Again using db1 as the example; this time a script exports the collections in bulk:
cat collections.txt
Comparison_DRP002726
Comparison_ERP016957
Comparison_SRP002082
Comparison_SRP002176
INDEL_Chr01
INDEL_Chr02
INDEL_Chr03
INDEL_Chr04
SNP_Chr01
SNP_Chr02
SNP_Chr03
SNP_Chr04
cat mongoexport.sh
#!/bin/bash
# Read the collection list into an array, then export each collection as JSON.
collections=($(cat collections.txt))
for collection in "${collections[@]}"
do
    mongoexport -h 172.168.1.111:20000 -u username1 -p password1 -d db1 -c "${collection}" -o "/backup/${collection}.json" --authenticationDatabase db1
done
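An equivalent, slightly more defensive variant streams the list with a while read loop instead of word-splitting it into an array; a sketch under the same assumptions:
#!/bin/bash
# Same export as above, reading collections.txt line by line.
while read -r collection; do
    mongoexport -h 172.168.1.111:20000 -u username1 -p password1 -d db1 -c "${collection}" -o "/backup/${collection}.json" --authenticationDatabase db1
done < collections.txt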
2.2.6. Importing the Documents
Transfer the exported documents to the target server and run the import script:
cat mongoimport.sh
#!/bin/bash
# Read the collection list into an array, then import each JSON file.
collections=($(cat collections.txt))
for collection in "${collections[@]}"
do
    mongoimport -h 172.168.1.209:20001 -u username2 -p password2 -d db1 -c "${collection}" --file "/backup/${collection}.json" --authenticationDatabase db1
done
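After the import it is worth comparing document counts between source and target; a minimal verification sketch reusing the connection details above:
#!/bin/bash
# Print per-collection document counts on source and target side by side.
while read -r collection; do
    src=$(mongo 172.168.1.111:20000/db1 -u username1 -p password1 --authenticationDatabase db1 --quiet --eval "db.${collection}.count()")
    dst=$(mongo 172.168.1.209:20001/db1 -u username2 -p password2 --authenticationDatabase db1 --quiet --eval "db.${collection}.count()")
    echo "${collection}: source=${src} target=${dst}"
done < collections.txt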
2.2.7. Rebuilding Indexes
Exported and imported documents do not carry indexes, so indexes must be rebuilt on the target collections according to the source collections' index information. Continuing with db1, whose collections were listed above, the index information for those collections is given below.
2.2.7.1. Creating a Single Index
Using Comparison_DRP002726 as an example, inspect its indexes:
db.Comparison_DRP002726.getIndexes()
[
{
"ns" : "db1.Comparison_DRP002726",
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_"
},
{
"ns" : "db1.Comparison_DRP002726",
"v" : 1,
"unique" : true,
"key" : {
"gene" : 1,
"diffs" : 1,
"diffs.name" : 1
},
"name" : "gene_1_diffs_1_diffs.name_1"
}
]
The collection has only two indexes: _id_ is generated automatically, while gene_1_diffs_1_diffs.name_1 must be created by hand. For multiple collections that share the same index structure, the index can be created in bulk:
cat collections1.txt
Comparison_DRP002726
Comparison_ERP016957
Comparison_SRP002082
Comparison_SRP002176
cat createindex.sh
#!/bin/bash
# Create the same unique compound index on every collection in the list.
collections=($(cat collections1.txt))
for collection in "${collections[@]}"
do
    docker exec db_mongo mongo db1 -u username2 -p password2 --authenticationDatabase <authDatabase> --eval "db.${collection}.ensureIndex({\"gene\":1, \"diffs\":1, \"diffs.name\":1},{\"unique\": true})"
done
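A quick check that the unique index now exists on every collection (same placeholder authentication database as above):
for collection in $(cat collections1.txt); do
    docker exec db_mongo mongo db1 -u username2 -p password2 --authenticationDatabase <authDatabase> --quiet --eval "print('${collection}: ' + db.${collection}.getIndexes().length + ' indexes')"
done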
2.2.7.2. Creating Multiple Indexes
Sometimes a collection carries several indexes.
Using INDEL_Chr01 as an example:
db.INDEL_Chr01.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "db1.INDEL_Chr01"
},
{
"v" : 1,
"key" : {
"pos" : 1
},
"name" : "idx_INDEL_Chr01_pos",
"ns" : "db1.INDEL_Chr01",
"background" : true
},
{
"v" : 1,
"key" : {
"gene" : 1
},
"name" : "idx_INDEL_Chr01_gene",
"ns" : "db1.INDEL_Chr01",
"background" : true
},
{
"v" : 1,
"key" : {
"consequencetype" : 1
},
"name" : "idx_INDEL_Chr01_consequencetype",
"ns" : "db1.INDEL_Chr01",
"background" : true
}
]
The output shows three indexes that need to be created manually; nested for loops handle the bulk creation:
cat collections2.txt
INDEL_Chr01
INDEL_Chr02
INDEL_Chr03
INDEL_Chr04
cat index.txt
pos
gene
consequencetype
cat createindex2.sh
#!/bin/bash
# For each collection, create every index named in index.txt as a named,
# background build. Note: ensureIndex() takes two arguments, so the index
# name and the background flag must share a single options document.
collections=($(cat collections2.txt))
indexes=($(cat index.txt))
for collection in "${collections[@]}"
do
    for index in "${indexes[@]}"
    do
        docker exec db_mongo mongo db1 -u username2 -p password2 --eval "db.${collection}.ensureIndex({\"${index}\":1},{\"name\":\"idx_${collection}_${index}\",\"background\":true})"
    done
done
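To verify, print the index names on one of the collections; a minimal check:
docker exec db_mongo mongo db1 -u username2 -p password2 --quiet --eval "db.INDEL_Chr01.getIndexes().forEach(function(i){ print(i.name) })"
# expect: _id_, idx_INDEL_Chr01_pos, idx_INDEL_Chr01_gene, idx_INDEL_Chr01_consequencetype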
2.3. Sharded Cluster Backup and Restore
Approach:
- Insert data
- Back up the data
- Restore the data
2.3.1. Preparing the Data
For how to build the sharded cluster, refer to the earlier post on that topic; all of the operations below use that cluster as the example.
2.3.1.1. Log In to mongos
mongo 118.31.244.127:27121/admin -u tengwang -p 123456 --authenticationDatabase admin
2.3.1.2. Check the Current Cluster State
sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5b97bf3a46d83924b1b0818c")
}
shards:
{ "_id" : "shard1", "host" : "shard1/118.31.244.127:27118,118.31.244.177:27118", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/118.31.244.177:27119,118.31.244.21:27119", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/118.31.244.127:27120,118.31.244.21:27120", "state" : 1 }
active mongoses:
"3.6.7" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: no // balancer not enabled
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "linuxwt", "primary" : "shard3", "partitioned" : false表示主分片是shard3,集群未开启数据库linuxwt的分片功能
为了方便测试,插入一条数据
use linuxwt
db.students.insert({uid:1,name:'hukey',age:23})
2.3.1.3.开启均衡器和分片功能
sh.startBalancer(),显示
{
"ok" : 1,
"operationTime" : Timestamp(1536818478, 5),
"$clusterTime" : {
"clusterTime" : Timestamp(1536818478, 5),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
use linuxwt
sh.enableSharding('linuxwt')
db.students.ensureIndex({uid:1})
sh.shardCollection('linuxwt.students',{uid:1}) returns:
{
"collectionsharded" : "linuxwt.students",
"collectionUUID" : BinData(4,"+IKBHvAfRCe5Sn9YbyQ2kw=="),
"ok" : 1,
"operationTime" : Timestamp(1536818843, 11),
"$clusterTime" : {
"clusterTime" : Timestamp(1536818843, 11),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
Check the cluster state again:
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5b97bf3a46d83924b1b0818c")
}
shards:
{ "_id" : "shard1", "host" : "shard1/118.31.244.127:27118,118.31.244.177:27118", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/118.31.244.177:27119,118.31.244.21:27119", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/118.31.244.127:27120,118.31.244.21:27120", "state" : 1 }
active mongoses:
"3.6.7" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "linuxwt", "primary" : "shard3", "partitioned" : true } //数据库分片功能开启成功
linuxwt.students
shard key: { "uid" : 1 } // the shard key
unique: false
balancing: true
chunks:
shard3 1
{ "uid" : { "$minKey" : 1 } } -->> { "uid" : { "$maxKey" : 1 } } on : shard3 Timestamp(1, 0) //集合分片成功
2.3.1.4. Changing the Chunk Size
Chunks are where the data lives; once a chunk exceeds the configured size, the balancer migrates chunks to less-loaded shards. To exercise the cluster's sharding during this test, set the chunk size to a small value, 4 MB here:
use config
db.settings.find()
db.settings.save({ _id:"chunksize", value: 4 })
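A quick non-interactive check that the new chunk size took effect, reusing the mongos connection details from above:
mongo 118.31.244.127:27121/config -u tengwang -p 123456 --authenticationDatabase admin --quiet --eval 'printjson(db.settings.findOne({ _id: "chunksize" }))'
# expect: { "_id" : "chunksize", "value" : 4 }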
2.3.1.5. Inserting Data
for (var i=2;i<300000;i++) db.students.insert({uid:i,name:"hukey",age:"23"})
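The one-document-at-a-time loop above is slow over the network; a hedged alternative sketch batches the writes with insertMany (available in the 3.6 shell):
mongo 118.31.244.127:27121/linuxwt -u tengwang -p 123456 --authenticationDatabase admin <<'EOF'
// Insert the same test documents in batches of 1000.
var batch = [];
for (var i = 2; i < 300000; i++) {
    batch.push({ uid: i, name: "hukey", age: "23" });
    if (batch.length === 1000) { db.students.insertMany(batch); batch = []; }
}
if (batch.length > 0) db.students.insertMany(batch);
EOF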
After the inserts complete, log in and check the state with sh.status():
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5b97bf3a46d83924b1b0818c")
}
shards:
{ "_id" : "shard1", "host" : "shard1/118.31.244.127:27118,118.31.244.177:27118", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/118.31.244.177:27119,118.31.244.21:27119", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/118.31.244.127:27120,118.31.244.21:27120", "state" : 1 }
active mongoses:
"3.6.7" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 1
Last reported error: Couldn't get a connection within the time limit
Time of Reported error: Fri Sep 14 2018 18:20:04 GMT+0800 (CST)
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "linuxwt", "primary" : "shard3", "partitioned" : true }
linuxwt.students
shard key: { "uid" : 1 }
unique: false
balancing: true
chunks:
shard1 3
shard2 3
shard3 3
{ "uid" : { "$minKey" : 1 } } -->> { "uid" : 2 } on : shard2 Timestamp(5, 1)
{ "uid" : 2 } -->> { "uid" : 65538 } on : shard1 Timestamp(6, 1)
{ "uid" : 65538 } -->> { "uid" : 98306 } on : shard3 Timestamp(4, 1)
{ "uid" : 98306 } -->> { "uid" : 131909 } on : shard3 Timestamp(3, 3)
{ "uid" : 131909 } -->> { "uid" : 164677 } on : shard2 Timestamp(4, 2)
{ "uid" : 164677 } -->> { "uid" : 202691 } on : shard2 Timestamp(4, 3)
{ "uid" : 202691 } -->> { "uid" : 235459 } on : shard1 Timestamp(5, 2)
{ "uid" : 235459 } -->> { "uid" : 273473 } on : shard1 Timestamp(5, 3)
{ "uid" : 273473 } -->> { "uid" : { "$maxKey" : 1 } } on : shard3 Timestamp(6, 0)
The output above shows that the collection was sharded across all three shards successfully.
2.3.1.6. Inspecting the Data on Each Shard Primary
mongos> show dbs
admin 0.000GB
config 0.002GB
linuxwt 0.015GB
crs:PRIMARY> show dbs
admin 0.000GB
config 0.001GB
local 0.014GB
shard1:PRIMARY> show dbs
admin 0.000GB
config 0.000GB
linuxwt 0.007GB
local 0.012GB
shard2 and shard3 look the same. Note that the data volumes on shard1-3 add up to exactly the amount of data seen when logged in through mongos.
2.3.2. Full Backup
A full backup dumps every database in the MongoDB instance. Because the nodes being backed up here belong to replica sets, it is best to pass the --oplog option, so that when the data is restored into a new replica set no extra time is spent re-syncing it. The detailed steps follow.
2.3.2.1. Stop the Balancer
The balancer may migrate data at any moment, so before backing up the shards it must be stopped: log in to mongos and run sh.stopBalancer().
2.3.2.2. Back Up the Data from All Secondaries
Run db.fsyncLock() to lock the secondaries of the shard1-3 replica sets and of the config server replica set, then dump each one.
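A hedged sketch for issuing the locks on all four secondaries from a shell, with hosts and ports taken from the dump commands below:
for hp in 118.31.244.177:27117 118.31.244.127:27118 118.31.244.177:27119 118.31.244.21:27120; do
    mongo "${hp}/admin" -u tengwang -p 123456 --authenticationDatabase admin --eval "db.fsyncLock()"
done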
mongodump -h 118.31.244.177 --port 27117 -u tengwang -p 123456 --oplog -o /backup/config_backup/ --authenticationDatabase admin
mongodump -h 118.31.244.127 --port 27118 -u tengwang -p 123456 --oplog -o /backup/shard1_backup/ --authenticationDatabase admin
mongodump -h 118.31.244.177 --port 27119 -u tengwang -p 123456 --oplog -o /backup/shard2_backup/ --authenticationDatabase admin
mongodump -h 118.31.244.21 --port 27120 -u tengwang -p 123456 --oplog -o /backup/shard3_backup/ --authenticationDatabase admin
2.3.2.3. Restart the Balancer
Release the locks with db.fsyncUnlock() on each locked node:
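Mirroring the lock loop above, the unlocks can be scripted as well; a minimal sketch:
for hp in 118.31.244.177:27117 118.31.244.127:27118 118.31.244.177:27119 118.31.244.21:27120; do
    mongo "${hp}/admin" -u tengwang -p 123456 --authenticationDatabase admin --eval "db.fsyncUnlock()"
done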
Then re-enable the balancer from mongos:
sh.startBalancer()
2.3.3. Restore Test
2.3.3.1. Delete the Data
Connect to mongos:
show dbs
use linuxwt
Drop the students collection from the database:
db.students.drop()
Restart all processes:
docker-compose down
docker-compose up -d
Log in to mongos again:
show dbs
use linuxwt
show collections
The students collection has been dropped successfully.
2.3.3.2. Restore the Database
Restore the data to the shard1-3 primaries:
mongorestore -h 118.31.244.177 --port 27118 --drop --oplogReplay /backup/shard1_backup/ -u tengwang -p 123456 --authenticationDatabase admin
mongorestore -h 118.31.244.21 --port 27119 --drop --oplogReplay /backup/shard2_backup/ -u tengwang -p 123456 --authenticationDatabase admin
mongorestore -h 118.31.244.127 --port 27120 --drop --oplogReplay /backup/shard3_backup/ -u tengwang -p 123456 --authenticationDatabase admin
Restore the config server primary's data:
mongorestore -h 118.31.244.127 --port 27117 --drop --oplogReplay /backup/config_backup/ -u tengwang -p 123456 --authenticationDatabase admin
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5b97bf3a46d83924b1b0818c")
}
shards:
{ "_id" : "shard1", "host" : "shard1/118.31.244.127:27118,118.31.244.177:27118", "state" : 1 }
{ "_id" : "shard2", "host" : "shard2/118.31.244.177:27119,118.31.244.21:27119", "state" : 1 }
{ "_id" : "shard3", "host" : "shard3/118.31.244.127:27120,118.31.244.21:27120", "state" : 1 }
active mongoses:
"3.6.7" : 3
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 1
Last reported error: Couldn't get a connection within the time limit
Time of Reported error: Fri Sep 14 2018 18:20:04 GMT+0800 (CST)
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "linuxwt", "primary" : "shard2", "partitioned" : true }
linuxwt.students
shard key: { "uid" : 1 }
unique: false
balancing: true
chunks:
shard1 3
shard2 3
shard3 3
{ "uid" : { "$minKey" : 1 } } -->> { "uid" : 2 } on : shard2 Timestamp(5, 1)
{ "uid" : 2 } -->> { "uid" : 65538 } on : shard1 Timestamp(6, 1)
{ "uid" : 65538 } -->> { "uid" : 98306 } on : shard3 Timestamp(4, 1)
{ "uid" : 98306 } -->> { "uid" : 131909 } on : shard3 Timestamp(3, 3)
{ "uid" : 131909 } -->> { "uid" : 164677 } on : shard2 Timestamp(4, 2)
{ "uid" : 164677 } -->> { "uid" : 202691 } on : shard2 Timestamp(4, 3)
{ "uid" : 202691 } -->> { "uid" : 235459 } on : shard1 Timestamp(5, 2)
{ "uid" : 235459 } -->> { "uid" : 273473 } on : shard1 Timestamp(5, 3)
{ "uid" : 273473 } -->> { "uid" : { "$maxKey" : 1 } } on : shard3 Timestamp(6, 0)
Note that all of the restores above were performed while the services were running.
3. Issues
3.1. Issue 1
The sharded cluster backup used a full backup with the --oplog option (only a full backup may use this option). To back up just one database, such as linuxwt, refer to the single-database backup described earlier.
3.2. Issue 2
Why is there no dedicated replica set backup section? It is already covered by the sharded cluster backup, since each shard is itself a replica set.
3.3. Issue 3
Backing up the config server reports an error:
Failed: error counting admin.system.keys: not authorized on admin to execute command { count: "system.keys", query: {}, $readPreference: { mode: "secondaryPreferred" }, $db: "admin" }
This error occurs because the backup user has not been granted permission to run commands against admin.system.keys. The fix is to grant the __system role:
db.grantRolesToUser("root", [ { role: "__system", db: "admin" } ])
3.4. Issue 4
A secondary does not serve reads of database contents by default; to query one, run:
db.getMongo().setSlaveOk()
3.5. Issue 5
When restoring a replica set or a sharded cluster, it is best to build a brand-new replica set or sharded cluster to restore into.