Docker Troubleshooting Notes

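Issue 1
Symptom: the Docker daemon normally has to run as root, and by default a regular user has no permission to talk to it, so docker commands fail with the following error: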

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.35/images/create?fromSrc=-&message=&repo=ubuntu-16.04&tag=: dial unix /var/run/docker.sock: connect: permission denied

Solution:

  • As root, give the regular user administrator (sudo) rights by adding an entry for that user at the following place in /etc/sudoers:
## Allow root to run any commands anywhere 
root    ALL=(ALL)       ALL
tengwang  ALL=(ALL)     ALL
  • Switch to the regular user and add that user to the docker group:
sudo groupadd docker  
sudo gpasswd -a ${USER} docker   
sudo systemctl restart docker
newgrp - docker
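
To check whether the group change has taken effect in the current shell, something like the following sketch can help. has_group is a hypothetical helper name, not a Docker command; it only tests whether a group name appears as a whole word in a space-separated list such as the output of id -nG.

```shell
# has_group GROUP "LIST": succeed if GROUP is one of the
# space-separated names in LIST (hypothetical helper).
has_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example use on a real host (left as a comment, it needs Docker):
# if has_group docker "$(id -nG)"; then
#     docker info
# else
#     echo "docker group not active in this shell yet"
# fi
```

If the check fails right after gpasswd, the current session still has the old group list; logging out and back in, or running newgrp as above, refreshes it.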

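This issue suggested a way to run containers more safely:
First, start the containers with docker as usual.
Then create a regular user and grant that user the permissions described above.
Finally, disable remote login for root, so that only the regular user can log in to start and stop containers. Keep in mind that membership in the docker group is effectively root-level access to the host, so this restricts who can log in rather than what that user is able to change.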

Issue 2
Symptom:

ERROR: for nginx_linuxwt Cannot start service nginx_linuxwt: driver failed programming external connectivity on endpoint nginx_linuxwt (3cdf1c11cf27c33a33be85ce92e9bc73e0946a3b74c3e409d1c7e3019a81eab3): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 80 -j DNAT --to-destination 172.17.0.3:80 ! -i docker0: iptables: No chain/target/match by that name (exit status 1))

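Solution: restart the Docker daemon so that it recreates its DOCKER chain in iptables. The chain is typically lost when the firewall (for example firewalld) is restarted or its rules are flushed after Docker has started.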
systemctl restart docker
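
To confirm this diagnosis before restarting, one can look for the DOCKER chain in the nat table. The sketch below assumes that iptables -t nat -S is available and run as root; has_docker_chain is a hypothetical helper that reads that listing on standard input.

```shell
# has_docker_chain: read "iptables -t nat -S" output on stdin and
# succeed if a user-defined chain named DOCKER is declared there.
has_docker_chain() {
    grep -qx -- '-N DOCKER'
}

# Example use on a real host (left as a comment, it needs root):
# if ! iptables -t nat -S | has_docker_chain; then
#     systemctl restart docker
# fi
```

The -x option makes grep match whole lines only, so other chains whose names merely start with DOCKER are not mistaken for it.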

Issue 3
Symptom:

Installing docker-compose fails with "ImportError: 'module' object has no attribute 'check_specifier'".

Solution:

easy_install --version   # check the setuptools version, then upgrade it to 30.1.0
pip install --upgrade setuptools==30.1.0

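Issue 4
Symptom:

When running Splunk from the official Docker image and mapping the container directory /opt/splunk out to the host, only some of the directories and files show up; the etc and var directories are empty.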

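Solution:
The etc and var directories are persisted as volumes, so they are actually mapped to locations inside Docker's own storage area; docker inspect shows the mount points. I have not found a good fix for this yet. In the end I wrote my own Dockerfile to build the image and used bind mounts for all persistent data.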
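
The mount points can be listed with a format template instead of reading the full docker inspect output. This is only a sketch: list_mounts and only_volumes are hypothetical helper names, and the container name splunk in the usage comment is an assumption.

```shell
# list_mounts CONTAINER: print one line per mount as
# "type source destination", using docker inspect's Go template.
list_mounts() {
    docker inspect -f '{{range .Mounts}}{{println .Type .Source .Destination}}{{end}}' "$1"
}

# only_volumes: filter list_mounts output on stdin and print the
# container paths that are backed by Docker volumes (not binds).
only_volumes() {
    awk '$1 == "volume" { print $3 }'
}

# Example use on a real host (container name is an assumption):
# list_mounts splunk | only_volumes
```

Paths printed by only_volumes are the ones that will not appear under a bind mount of the parent directory.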

Issue 4b
Installing docker-compose fails with the following error:

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Downloading https://files.pythonhosted.org/packages/b3/25/e605574f24948a8a53b497744e93f061eb1dbe7c44b6465fc1c172d591aa/PyNaCl-1.3.0-cp27-cp27mu-manylinux1_x86_64.whl (762kB)
|█████████████████▋ | 419kB 2.2kB/s eta 0:02:35
ERROR: Exception:
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/pip/_internal/cli/base_command.py", line 178, in main
status = self.run(options, args)
File "/usr/lib/python2.7/site-packages/pip/_internal/commands/install.py", line 352, in run
resolver.resolve(requirement_set)
File "/usr/lib/python2.7/site-packages/pip/_internal/resolve.py", line 131, in resolve
self._resolve_one(requirement_set, req)
File "/usr/lib/python2.7/site-packages/pip/_internal/resolve.py", line 294, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "/usr/lib/python2.7/site-packages/pip/_internal/resolve.py", line 242, in _get_abstract_dist_for
self.require_hashes
File "/usr/lib/python2.7/site-packages/pip/_internal/operations/prepare.py", line 347, in prepare_linked_requirement
progress_bar=self.progress_bar
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 886, in unpack_url
progress_bar=progress_bar
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 746, in unpack_http_url
progress_bar)
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 954, in _download_http_url
_download_url(resp, link, content_file, hashes, progress_bar)
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 683, in _download_url
hashes.check_against_chunks(downloaded_chunks)
File "/usr/lib/python2.7/site-packages/pip/_internal/utils/hashes.py", line 62, in check_against_chunks
for chunk in chunks:
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 651, in written_chunks
for chunk in chunks:
File "/usr/lib/python2.7/site-packages/pip/_internal/utils/ui.py", line 156, in iter
for x in it:
File "/usr/lib/python2.7/site-packages/pip/_internal/download.py", line 640, in resp_read
decode_content=False):
File "/usr/lib/python2.7/site-packages/pip/_vendor/urllib3/response.py", line 494, in stream
data = self.read(amt=amt, decode_content=decode_content)
File "/usr/lib/python2.7/site-packages/pip/_vendor/urllib3/response.py", line 459, in read
raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
File "/usr/lib64/python2.7/contextlib.py", line 35, in __exit__
self.gen.throw(type, value, traceback)
File "/usr/lib/python2.7/site-packages/pip/_vendor/urllib3/response.py", line 374, in _error_catcher
raise ReadTimeoutError(self._pool, None, 'Read timed out.')
ReadTimeoutError: HTTPSConnectionPool(host='files.pythonhosted.org', port=443): Read timed out.

Solution: the download probably timed out because of a slow network connection, so point pip directly at a mirror index:
pip install -i https://pypi.douban.com/simple docker-compose
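
Since the failure is a read timeout, retrying the install a few times is another option. The retry function below is a generic sketch and not part of pip; RETRY_DELAY is a hypothetical variable for the pause between attempts.

```shell
# retry N CMD...: run CMD up to N times, stopping at the first success.
# RETRY_DELAY (seconds, default 1) is the pause between attempts.
retry() {
    attempts=$1
    shift
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "${RETRY_DELAY:-1}"
    done
    return 0
}

# Example use (left as a comment, it needs network access):
# retry 3 pip install --timeout 100 -i https://pypi.douban.com/simple docker-compose
```

The --timeout option raises pip's socket timeout, which may already be enough on a slow link without any retries.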

Issue 5
Starting a MySQL service with docker-compose fails. The relevant docker-compose.yml is as follows:

mysql_linuxwt:
  restart: always
  image: 10.8.8.13:5000/mysql:v1
  container_name: mysql_linuxwt
  volumes:
    - /etc/localtime:/etc/localtime
    - /etc/timezone:/etc/timezone
    - $PWD/mysql:/var/lib/mysql
    - $PWD/mysqld.cnf:/etc/mysql/mysql.conf.d/mysqld.cnf
    - $PWD/mysql.log:/var/log/mysql/general.log
    - $PWD/error.log:/var/log/mysql/error.log
  ports:
    - 3306:3306
  environment:
    MYSQL_ROOT_PASSWORD: password

chown: cannot read directory '/var/lib/mysql/': Permission denied

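This is related to SELinux, which is active when Docker runs on CentOS 7; turning SELinux off resolves the failure.
Disable it temporarily (this switches SELinux to permissive mode until the next reboot):
setenforce 0
Starting the container then fails with a different error: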

[ERROR] Could not open file '/var/log/mysql/error.log' for error logging: Permission denied

This happens because the log files are mapped in from the host, so they need to be given suitable permissions:
chmod 777 *.log
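
A related pitfall: if a bind-mounted host file does not exist when the container starts, Docker creates a directory with that name instead, so the log files should exist beforehand. The sketch below creates them first; prepare_logs is a hypothetical helper, and it assumes that read/write permission for everyone (666) is enough for log files, which is narrower than the 777 used above.

```shell
# prepare_logs DIR: create the two host-side log files that the
# compose file bind-mounts, and make them writable by any user so
# the mysql user inside the container can open them.
prepare_logs() {
    dir=$1
    for f in mysql.log error.log; do
        touch "$dir/$f" || return 1
        chmod 666 "$dir/$f" || return 1
    done
}

# Example use, run from the directory holding docker-compose.yml:
# prepare_logs "$PWD" && docker-compose up -d
```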

Issue 6:
docker build fails while building an image:

Message from syslogd@i1234567890 at Mar 23 22:45:52 ...
kernel:unregister_netdevice: waiting for lo to become free. Usage count = 1

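This is related to the combination of Docker Engine version and kernel (containers share the host kernel, so that is the one that matters); in my case the Docker Engine version was too new, which caused the problem.

Issue 7:
Starting the containers with docker-compose reports: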

/usr/lib/python2.7/site-packages/requests/__init__.py:80: RequestsDependencyWarning: urllib3 (1.22) or chardet (2.2.1) doesn't match a supported version!
RequestsDependencyWarning)

Solution:
pip uninstall urllib3
pip uninstall chardet
pip install requests

Issue 8:
In a swarm cluster, a service deployed from the manager node fails to start. The command docker service ps --no-trunc servicename shows this error:

e5y8g443x1ajj0jovyrhcioqj 151_prometheus.1 prom/prometheus:latest@sha256:bfad037f95e5e34d595502aa02cac6467b7eadc4b08a601d150844003051fb1b node151 Shutdown Rejected less than a second ago "invalid mount config for type "bind": bind source path does not exist"

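The bind source path has to exist on the node where the task runs. In my case the fix was to create on the manager node the same directory structure as on the target node; the image itself does not need to be built on the manager node.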

Issue 9:
A gitlab service deployed in swarm would not start. The log shown by systemctl status -l docker contains these errors:

level=error msg="Not continuing with pull after error: context canceled"
Jun 26 12:14:54 node150 dockerd[974]: time="2020-06-26T12:14:54.542270318+08:00" level=warning msg="failed to deactivate service binding for container 150_gitlab.1.nhcg0qojnsqm1tq70enycgx2e" error="No such container: 150_gitlab.1.nhcg0qojnsqm1tq70enycgx2e" module=node/agent node.id=osm84hfjpi604vcxid4d8j4xo

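The messages show that the image pull failed, so the service could not be bound to the container that was about to be created.
Solution:
Check the image carefully. In this setup swarm could not pull the image online, so the image has to be built locally on the node first and the service created afterwards.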

Issue 10:
Creating a service with docker stack reports:

services.nginx.port.0 is needed a number or string

Solution:
Change the file format version declared at the top of docker-compose.yml, preferably to 3.4.
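
As an illustration, the sketch below writes a minimal compose file with the version set to 3.4. It assumes the error came from using the long port syntax, which as far as I know needs file format 3.2 or later; write_compose, the nginx image and the port numbers are all hypothetical and not taken from the failing file.

```shell
# write_compose FILE: write a minimal example compose file that
# declares format version 3.4 and uses the long port syntax.
write_compose() {
    cat > "$1" <<'EOF'
version: "3.4"
services:
  nginx:
    image: nginx
    ports:
      - target: 80
        published: 80
        protocol: tcp
EOF
}

# Example use (stack name "web" is an assumption):
# write_compose docker-compose.yml
# docker stack deploy -c docker-compose.yml web
```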