Troubleshooting a 502 Error


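At noon I got a report that the admin backend was returning 502 errors. I logged in to the server and saw that both the nginx and php-fpm processes were present. Checking the load, I found CPU usage spiking intermittently to 99%, caused by a php artisan script that was running. I opened the /etc/cront file and removed the scheduled task. The load came down, but the 502 persisted.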

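A 502 is a gateway error: nginx could not get a valid response from its upstream, so the problem must be somewhere in the nginx -> CGI -> PHP chain. Start with the nginx status.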

service nginx status

Redirecting to /bin/systemctl status nginx.service

● nginx.service - nginx - high performance web server

Loaded: loaded (/usr/lib/systemd/system/nginx.service; enabled; vendor preset: disabled)

Active: active (running) since 六 2019-06-15 00:28:19 CST; 9 months 2 days ago

Docs: http://nginx.org/en/docs/

Process: 23962 ExecReload=/bin/kill -s HUP $MAINPID (code=exited, status=0/SUCCESS)

Main PID: 32501 (nginx)

Tasks: 3

Memory: 50.9M

CGroup: /system.slice/nginx.service

├─23973 nginx: worker process

├─23974 nginx: worker process

└─32501 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf

6月 15 00:28:19 sixfoot systemd[1]: Starting nginx - high performance web server...

6月 15 00:28:19 sixfoot nginx[32497]: nginx: the configuration file /usr/local/nginx/conf/nginx.conf syntax is ok

6月 15 00:28:19 sixfoot nginx[32497]: nginx: configuration file /usr/local/nginx/conf/nginx.conf test is successful

6月 15 00:28:19 sixfoot systemd[1]: Started nginx - high performance web server.

3月 17 14:39:21 sixfoot systemd[1]: Reloading nginx - high performance web server.

3月 17 14:39:21 sixfoot systemd[1]: Reloaded nginx - high performance web server.

The Active state is fine, and the processes are there.

ps -ef | grep nginx

root 18639 18585 0 3月11 ? 00:00:00 nginx: master process nginx -g daemon off;

101 18911 18639 0 3月11 ? 00:00:00 nginx: worker process

www 23973 32501 0 14:09 ? 00:00:01 nginx: worker process

www 23974 32501 0 14:09 ? 00:00:01 nginx: worker process

root 27240 17800 0 14:10 pts/1 00:00:00 grep --color nginx

root 32501 1 0 2019 ? 00:00:01 nginx: master process /usr/local/nginx/sbin/nginx -c /usr/local/nginx/conf/nginx.conf

nginx looks fine for now.

Next, check the php-fpm processes and status.

ps -ef | grep  php

33 17946 18725 0 3月12 ? 00:00:02 php-fpm: pool www

root 18725 18692 0 3月11 ? 00:00:23 php-fpm: master process (/usr/local/etc/php-fpm.conf)

www 22857 22854 0 14:09 ? 00:00:00 /bin/sh -c /usr/local/php/bin/php /www/web/sixfoot-server/artisan schedule:run >> /dev/null 2>&1

www 22859 22857 87 14:09 ? 00:00:02 /usr/local/php/bin/php /www/web/sixfoot-server/artisan schedule:run

root 22866 17800 0 14:09 pts/1 00:00:00 grep --color php

33 23628 18725 0 3月12 ? 00:00:01 php-fpm: pool www

33 25973 18725 0 3月11 ? 00:00:03 php-fpm: pool www

There are processes, but their number does not match the current request volume, which is suspicious.
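To make the count check less of an eyeball exercise, the `ps -ef` output can be grouped by parent PID. This is a minimal sketch, assuming the standard `ps -ef` column layout (UID, PID, PPID, ...). `count_fpm_workers` is a hypothetical helper name, not part of the original setup.

```shell
# Count php-fpm pool workers per master process.
# Reads `ps -ef` output on stdin; column 3 is the parent PID (PPID).
count_fpm_workers() {
  awk '/php-fpm: pool/ { n[$3]++ } END { for (p in n) print p, n[p] }' | sort
}

# Usage on a live host:
#   ps -ef | count_fpm_workers
```

Fed the listing above, this prints `18725 3`: only three workers, all under master 18725.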

service php-fpm status

Redirecting to /bin/systemctl status php-fpm.service

● php-fpm.service - The PHP FastCGI Process Manager

Loaded: loaded (/usr/lib/systemd/system/php-fpm.service; enabled; vendor preset: disabled)

Active: inactive (dead) since 二 2020-03-17 14:07:00 CST; 10min ago

Docs: http://php.net/docs.php

Process: 26247 ExecReload=/bin/kill -USR2 $MAINPID (code=exited, status=0/SUCCESS)

Process: 17080 ExecStart=/usr/local/php/sbin/php-fpm --nodaemonize --fpm-config /usr/local/php/etc/php-fpm.conf (code=exited, status=0/SUCCESS)

Main PID: 17080 (code=exited, status=0/SUCCESS)

3月 17 14:06:54 abc systemd[1]: Stopped The PHP FastCGI Process Manager.

3月 17 14:06:54 abc systemd[1]: Started The PHP FastCGI Process Manager.

3月 17 14:06:54 abc systemd[1]: Unit php-fpm.service cannot be reloaded because it is inactive.

The Active state here is inactive, which means php-fpm is down.

Bring php-fpm back up right away:

systemctl start php-fpm

Check the status again:

systemctl status php-fpm.service

● php-fpm.service - The PHP FastCGI Process Manager

Loaded: loaded (/usr/lib/systemd/system/php-fpm.service; enabled; vendor preset: disabled)

Active: active (running) since 二 2020-03-17 14:11:06 CST; 1min ago

Docs: http://php.net/docs.php

Process: 26247 ExecReload=/bin/kill -USR2 $MAINPID (code=exited, status=0/SUCCESS)

Main PID: 24223 (php-fpm)

Tasks: 31

Memory: 511.2M

CGroup: /system.slice/php-fpm.service

├─24223 php-fpm: master process (/usr/local/php/etc/php-fpm.conf)

├─24235 php-fpm: pool www

├─24236 php-fpm: pool www

├─24237 php-fpm: pool www

├─24238 php-fpm: pool www

├─24239 php-fpm: pool www

├─24240 php-fpm: pool www

├─24241 php-fpm: pool www

├─24242 php-fpm: pool www

├─24243 php-fpm: pool www

├─24244 php-fpm: pool www

├─24245 php-fpm: pool www

├─24246 php-fpm: pool www

├─24247 php-fpm: pool www

├─24248 php-fpm: pool www

├─24249 php-fpm: pool www

├─24250 php-fpm: pool www

├─24251 php-fpm: pool www

├─24252 php-fpm: pool www

├─24253 php-fpm: pool www

├─24254 php-fpm: pool www

├─24255 php-fpm: pool www

├─24256 php-fpm: pool www

├─24257 php-fpm: pool www

├─24258 php-fpm: pool www

├─24259 php-fpm: pool www

├─24260 php-fpm: pool www

├─24261 php-fpm: pool www

├─24262 php-fpm: pool www

├─24263 php-fpm: pool www

└─24264 php-fpm: pool www

3月 17 14:11:06 abc systemd[1]: Started The PHP FastCGI Process Manager.

I sent requests again, and everything works.
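The check-then-start sequence above could be wrapped in a small helper, for example to run from a monitoring job. This is only a sketch: `ensure_active` and the `SYSTEMCTL` override variable are hypothetical names introduced here, and it assumes the unit is managed by systemd as in the output above. It relies on `systemctl is-active --quiet`, which returns a non-zero exit status when the unit is not active.

```shell
# Command used to talk to systemd; overridable so the helper can be dry-run.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Start a unit only if systemd reports it as not active.
ensure_active() {
  unit="$1"
  if "$SYSTEMCTL" is-active --quiet "$unit"; then
    echo "$unit is active"
  else
    echo "$unit is not active, starting it"
    "$SYSTEMCTL" start "$unit"
  fi
}

# Usage (as root):
#   ensure_active php-fpm
```

Restarting only hides the symptom, so the cause still has to be found in the logs.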

Looking for the cause: find the error log setting in the php-fpm configuration file.

vi /usr/local/php/etc/php-fpm.conf

error_log = log/php-fpm.log

log_level = warning

I found php-fpm.log under /usr/local/php/var/log and looked inside:

[17-Mar-2020 14:06:14] WARNING: [pool www] child 5964 exited on signal 9 (SIGKILL) after 3869.429674 seconds from start

[17-Mar-2020 14:06:14] WARNING: [pool www] child 6105 exited on signal 9 (SIGKILL) after 3839.418549 seconds from start

[17-Mar-2020 14:06:14] WARNING: [pool www] child 6108 exited on signal 9 (SIGKILL) after 3837.358373 seconds from start

[17-Mar-2020 14:06:14] WARNING: [pool www] child 7185 exited on signal 9 (SIGKILL) after 3531.743171 seconds from start

[17-Mar-2020 14:06:29] WARNING: [pool www] child 7212 exited on signal 9 (SIGKILL) after 3539.952668 seconds from start

[17-Mar-2020 14:06:34] ERROR: fork() failed: Cannot allocate memory (12)

[17-Mar-2020 14:06:47] ERROR: fork() failed: Cannot allocate memory (12)

[17-Mar-2020 14:06:48] ERROR: fork() failed: Cannot allocate memory (12)

[17-Mar-2020 14:06:49] WARNING: [pool www] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 15 idle, and 47 total children

[17-Mar-2020 14:06:59] ERROR: fork() failed: Cannot allocate memory (12)

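The log shows many child processes being forcibly killed, followed by three failed fork() calls for new children. The cause is insufficient memory.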

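The second-to-last entry says the pool seems busy (you may need to increase pm.start_servers or pm.min/max_spare_servers): it is spawning 8 children, there are 15 idle processes with no work, and 47 children in total.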
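When the log is long, counting each kind of entry gives a quicker picture than scrolling. This is a rough sketch that matches the exact message texts shown above. `summarize_fpm_log` is a hypothetical helper name, and the log path in the usage comment is the one from this server.

```shell
# Count the event types seen in a php-fpm error log.
summarize_fpm_log() {
  log_file="$1"
  # grep -c prints the number of matching lines; -F matches fixed strings.
  # `|| true` keeps a zero-match result from aborting under `set -e`.
  killed=$(grep -cF 'exited on signal 9 (SIGKILL)' "$log_file" || true)
  fork_failed=$(grep -cF 'fork() failed: Cannot allocate memory' "$log_file" || true)
  busy=$(grep -cF 'seems busy' "$log_file" || true)
  echo "sigkill=$killed fork_failed=$fork_failed busy=$busy"
}

# Usage:
#   summarize_fpm_log /usr/local/php/var/log/php-fpm.log
```

If the SIGKILLs came from the kernel's OOM killer, which this log alone does not confirm, the kernel messages (`dmesg`) would be the place to check.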

Here are the process-count settings in php-fpm.conf:

pm = dynamic

pm.max_children = 50

pm.start_servers = 30

pm.min_spare_servers = 20

pm.max_spare_servers = 50

pm.max_requests = 2048

These all look fine.
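Given the fork() failures, one way to sanity-check whether these limits fit in RAM is to multiply the average worker memory by pm.max_children. The sketch below is an estimate only: `avg_rss_kb` and `pool_footprint_mb` are hypothetical helper names, and it assumes the workers show up under the process name `php-fpm` so that `ps -C php-fpm -o rss=` lists their resident memory in KB.

```shell
# Average of the numbers on stdin (one RSS value in KB per line).
avg_rss_kb() {
  awk '{ sum += $1; n++ } END { if (n > 0) printf "%d\n", sum / n; else print 0 }'
}

# Worst-case pool footprint in MB: average worker size times pm.max_children.
pool_footprint_mb() {
  avg_kb="$1"
  max_children="$2"
  echo $(( avg_kb * max_children / 1024 ))
}

# Usage on a live host:
#   avg=$(ps -C php-fpm -o rss= | avg_rss_kb)
#   pool_footprint_mb "$avg" 50    # 50 = pm.max_children from the config above
```

For reference, the status output after the restart shows 511.2M for 30 fresh workers, roughly 17 MB each. Workers usually grow as they serve requests, so a figure measured under load is the more useful input.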

 

Source: utcz.com/z/514459.html
