Linux System Troubleshooting Notes

1. Preface

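When working with Linux you will often run into failures such as network, firewall, or kernel problems. Diagnosing them usually means reading the relevant Linux logs for error messages, using those messages to pinpoint the specific fault, and then searching for the key phrases on Google to find a fix. Starting today, these notes record the failures encountered along the way. As the collection grows, fewer errors will be new, and this accumulated experience should make day-to-day work more efficient.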

2. Getting Started

2.1. Network Failures

Failure 1:
Starting the network with systemctl start network fails with:

Job for network.service failed. See 'systemctl status network.service' and 'journalctl -xe' for details.

Following that hint did not reveal the cause, however, so we check the system log instead.
Running tail -n 100 /var/log/messages shows the error:

Mar 11 10:18:18 localhost systemd: Starting LSB: Bring up/down networking...
Mar 11 10:18:19 localhost network: Bringing up loopback interface:  [  OK  ]
Mar 11 10:18:19 localhost NetworkManager[9366]: <info>  [1552270699.1055] audit: op="connection-activate" uuid="fd005f04-41ac-4401-bd27-decd3492dd84" name="ens33" result="fail" reason="No suitable device found for this connection."
Mar 11 10:18:19 localhost network: Bringing up interface ens33:  Error: Connection activation failed: No suitable device found for this connection.
Mar 11 10:18:19 localhost network: [FAILED]
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost network: RTNETLINK answers: File exists
Mar 11 10:18:19 localhost systemd: network.service: control process exited, code=exited status=1
Mar 11 10:18:19 localhost systemd: Failed to start LSB: Bring up/down networking.
Mar 11 10:18:19 localhost systemd: Unit network.service entered failed state.
Mar 11 10:18:19 localhost systemd: network.service failed.
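A log excerpt like the one above is easier to read once the repeated lines are collapsed. The following is a minimal sketch, assuming a syslog-style file such as /var/log/messages; the function name network_failures and the keyword list are illustrative choices, not part of any tool.

```shell
#!/bin/sh
# network_failures [LOG_FILE]
# Show failure-related lines from the network service, NetworkManager and
# systemd within the last 100 log lines, collapsing adjacent repeats.
network_failures() {
    log_file=${1:-/var/log/messages}
    tail -n 100 "$log_file" |
        # keep only lines written by the services of interest
        grep -E ' (network|NetworkManager(\[[0-9]+\])?|systemd): ' |
        # keep only lines that look like a failure (keywords are a guess)
        grep -Ei 'fail|error|RTNETLINK' |
        # prefix each distinct run of identical lines with its count
        uniq -c
}

# Usage (reading /var/log/messages normally requires root):
#   network_failures /var/log/messages
```

With the log above, the nine identical RTNETLINK lines would show up as a single line prefixed with the count 9.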

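How to fix this error depends on the situation:
1. If the system was cloned, the MAC address needs to be changed.
Temporary change:
ifdown ens33
ifconfig ens33 hw ether <MAC address>
ifup ens33
Permanent change:
Edit /etc/sysconfig/network-scripts/ifcfg-ens33 so that HWADDR matches the MAC address shown in /sys/class/net/ens33/address. The sysfs file itself is read-only and only reports the current address, so editing it does not persist anything.
2. If the MAC address is not the cause:
systemctl stop NetworkManager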
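To decide which of the two cases applies, it can help to compare the MAC address recorded in the interface configuration with the one the kernel reports. This is a minimal sketch, assuming a CentOS 7 style ifcfg file that may contain an HWADDR= line; the function name check_hwaddr is hypothetical, and both paths are passed as arguments so the check can also be tried on copies of the files.

```shell
#!/bin/sh
# check_hwaddr CFG_FILE ADDR_FILE
# Compare the HWADDR stored in an ifcfg file with the MAC the kernel reports.
check_hwaddr() {
    cfg_file=$1    # e.g. /etc/sysconfig/network-scripts/ifcfg-ens33
    addr_file=$2   # e.g. /sys/class/net/ens33/address

    # Take the value after "HWADDR=", drop optional quotes, lowercase it.
    cfg_mac=$(sed -n 's/^HWADDR=//p' "$cfg_file" | tr -d "\"'" | tr 'A-F' 'a-f' | head -n 1)
    real_mac=$(tr 'A-F' 'a-f' < "$addr_file" | head -n 1)

    if [ -z "$cfg_mac" ]; then
        echo "no HWADDR set"
    elif [ "$cfg_mac" = "$real_mac" ]; then
        echo "match"
    else
        echo "mismatch: config=$cfg_mac actual=$real_mac"
    fi
}

# Usage:
#   check_hwaddr /etc/sysconfig/network-scripts/ifcfg-ens33 /sys/class/net/ens33/address
```

A "mismatch" result points to the cloned-system case; "match" or "no HWADDR set" suggests looking at NetworkManager instead.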

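Failure 2:
Forgotten root password
Steps:
At boot, press e on the kernel entry in the boot menu, append init=/bin/sh after UTF-8 on the kernel line, and press Ctrl+X to boot into a single-user shell. Then run:
mount -o remount,rw /
passwd
touch /.autorelabel
exec /sbin/init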

2.2. Software Installation Failures

Failure 1: an error occurs while installing Docker

Downloading packages:
Running transaction check
ERROR You need to update rpm to handle:
rpmlib(SetVersions) is needed by libltdl7-2.4.6-alt1.x86_64
RPM needs to be updated
You could try running: rpm -Va --nofiles --nodigest
Your transaction was saved, rerun it with:
yum load-transaction /tmp/yum_save_tx.2019-12-25.22-14.h_0_dM.yumtx

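Solution:
rpm -i --force --nodeps libltdl7-2.4.6-alt1.x86_64.rpm
This forces the installation, but it has a drawback: if you build a local yum repository, the package cannot be installed with yum directly, so each time it has to be copied to the target server and force-installed there.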
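Before forcing the install, one way to see why yum refuses the package is to compare the rpmlib(...) features the package requires with those the installed rpm supports. The sketch below assumes the two lists were saved to text files first, for example with rpm -qp --requires on the package and rpm --showrc on the host; the function name missing_rpmlib and the file names are hypothetical. The alt1 release tag may also indicate that the package was built for a different distribution, which would explain an unsupported feature.

```shell
#!/bin/sh
# missing_rpmlib REQUIRED_FILE SUPPORTED_FILE
# Print every rpmlib(...) feature named in REQUIRED_FILE that does not
# appear anywhere in SUPPORTED_FILE.
missing_rpmlib() {
    # In a basic regular expression, unescaped parentheses are literal.
    grep -o 'rpmlib([A-Za-z0-9_]*)' "$1" | sort -u |
    while read -r feature; do
        # -F: treat the feature name as a fixed string, not a pattern
        grep -qF "$feature" "$2" || echo "$feature"
    done
}

# Usage (file names are placeholders):
#   rpm -qp --requires libltdl7-2.4.6-alt1.x86_64.rpm > required.txt
#   rpm --showrc > supported.txt
#   missing_rpmlib required.txt supported.txt
```

If the function prints rpmlib(SetVersions), the local rpm cannot handle the package as built, and a package from the distribution's own repositories is likely the safer choice.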

Failure 2:

Error: Package: docker-ce-18.03.1.ce-1.el7.centos.x86_64 (local)
Requires: libltdl.so.7()(64bit)
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest

Solution:
yum -y install libtool-ltdl*

2.3. System Failures

-bash: /etc/fstab: Read-only file system

mount -o remount,rw /
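Before remounting, it may be worth confirming that the root filesystem really is mounted read-only. This is a minimal sketch that reads a mounts table in the /proc/mounts format; the function name is_read_only is hypothetical, and the optional second argument exists only so the check can be run against a saved copy.

```shell
#!/bin/sh
# is_read_only MOUNT_POINT [MOUNTS_FILE]
# Succeed (exit status 0) if MOUNT_POINT is mounted with the "ro" option.
is_read_only() {
    mount_point=$1
    mounts_file=${2:-/proc/mounts}
    # Field 2 is the mount point, field 4 the option list.
    # When a mount point is listed more than once, the last entry wins.
    opts=$(awk -v mp="$mount_point" '$2 == mp { o = $4 } END { print o }' "$mounts_file")
    case ",$opts," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Usage:
#   if is_read_only /; then mount -o remount,rw /; fi
```

If the kernel remounted the filesystem read-only on its own, checking dmesg for disk or filesystem errors first is a reasonable precaution, since remounting does not address the underlying cause.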

2.4. Remote Login Failures

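Error:
When we connect to a remote server over ssh, information about the remote host is written to the local known_hosts file, and the corresponding information is also recorded once passwordless login has been set up. If the target server's disk is later reinitialized, or passwordless login is revoked, remote login may stop working. In that case, clear the contents of the local ~/.ssh/known_hosts file, or at least the entry for that host.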
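Emptying the whole file also discards the keys of every other host, so removing only the stale entry is usually enough. OpenSSH provides ssh-keygen -R <host> for this, and it also handles hashed entries. The function below is only a sketch of the same idea for a plain, unhashed known_hosts file; the name remove_known_host is hypothetical, and entries written as [host]:port or prefixed with a marker are not handled.

```shell
#!/bin/sh
# remove_known_host HOST [FILE]
# Drop the entries for HOST from an unhashed known_hosts file,
# keeping a copy of the original as FILE.bak.
remove_known_host() {
    host=$1
    file=${2:-$HOME/.ssh/known_hosts}
    cp "$file" "$file.bak" || return 1
    # Field 1 is a comma-separated list of host names and addresses;
    # skip the line when HOST is one of them, print it otherwise.
    awk -v h="$host" '{
        n = split($1, names, ",")
        for (i = 1; i <= n; i++) if (names[i] == h) next
        print
    }' "$file.bak" > "$file"
}

# Usage (the address is a placeholder):
#   remove_known_host 192.168.1.10
```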

2.5. yum Failures

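Error: No Presto metadata available for local
Here I had built a local repository that is served through nginx. After running yum clean all once, installing packages started to fail with this error.
Solution: restart nginx, rebuild the private repository with createrepo, then run yum makecache.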
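The three steps can be wrapped in a small helper so they always run in the same order. This is a sketch under stated assumptions: the repository directory is passed as an argument (/srv/repo in the usage line is a made-up path), nginx runs as a systemd service, and the function name rebuild_local_repo and the DRY_RUN switch are illustrative additions.

```shell
#!/bin/sh
# _run CMD [ARGS...]
# Run a command, or only print it when DRY_RUN=1 is set.
_run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# rebuild_local_repo REPO_DIR
# Restart nginx, regenerate the repository metadata, refresh the yum cache.
# Stops at the first step that fails.
rebuild_local_repo() {
    repo_dir=$1
    _run systemctl restart nginx &&
    _run createrepo "$repo_dir" &&
    _run yum makecache
}

# Usage (run as root; preview the commands first with DRY_RUN=1):
#   DRY_RUN=1 rebuild_local_repo /srv/repo
#   rebuild_local_repo /srv/repo
```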