URGENT: Experienced System Administrator needed

IT & Networking > Network & System Administration. Posted 1 year ago

Hourly Job

Hours to be determined
Less than 1 month
$$$

Expert Level

I am willing to pay higher rates for the most experienced freelancers

Details

We are urgently hiring an experienced System Administrator to help us troubleshoot a serious issue with one of our main web servers, which has recently been failing due to heavy load.

This is a cloud server hosted at DigitalOcean with 8 CPUs, a 160GB SSD disk and 16GB of RAM, running nginx 1.4.2 with PHP 5.5.3 (served via php-fpm) and MySQL 5.6.

Although traffic has increased drastically in the past week, the CPU is still over 95% idle and more than 1GB of RAM is free at any given time. However, the php-fpm FastCGI server stops responding at random intervals during peak hours, and a manual 'service php-fpm restart' is needed to bring it back online.

We have a hard time identifying whether the bottleneck is MySQL, php-fpm or nginx.

When the issue occurs, the following errors are recorded in the error logs:

PHP-FPM:
========
WARNING: [pool www] seems busy

PHP:
====
PHP Warning:  Error while sending QUERY packet. PID=22771 in ********

Nginx:
=====
EITHER

[error] 9240#0: *581395 recv() failed (104: Connection reset by peer) while reading response header from upstream, client: *******, server: *******, request: "GET /l.php?id=124 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "******", referrer: "****"

OR

[error] 9242#0: *582048 connect() failed (111: Connection refused) while connecting to upstream, client: *******, server: ******, request: "GET /l.php?id=50 HTTP/1.1", upstream: "fastcgi://127.0.0.1:9000", host: "*******", referrer: "*******"
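The two nginx errors above differ in a useful way: errno 104 (connection reset) is reported while reading the response header from an upstream connection that was already established, whereas errno 111 (connection refused) is reported while connecting to the upstream. Counting how often each one occurs, and in what order, may help narrow down where the failure starts. The following is only a minimal Python sketch under that assumption; the function names and the regular expression are illustrative and are not part of our current setup.

```python
import re
from collections import Counter

# Matches the failing call, errno and phase in nginx upstream errors, e.g.
# "recv() failed (104: Connection reset by peer) while reading response
#  header from upstream"
UPSTREAM_ERROR = re.compile(
    r"(?P<call>recv|connect)\(\) failed \((?P<errno>\d+): (?P<text>[^)]+)\) "
    r"while (?P<phase>[a-z ]+?) (?:from|to) upstream"
)


def classify(line):
    """Return (errno, phase) for an nginx upstream error line, or None."""
    match = UPSTREAM_ERROR.search(line)
    if match is None:
        return None
    return int(match.group("errno")), match.group("phase")


def summarize(lines):
    """Count upstream errors grouped by (errno, phase)."""
    counts = Counter()
    for line in lines:
        key = classify(line)
        if key is not None:
            counts[key] += 1
    return counts
```

Feeding the lines of the nginx error log to summarize() gives one counter entry per error type, which can then be compared against the timestamps of the PHP-FPM "seems busy" warnings.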

--------

Our observations show that at some point MySQL stops responding to the requests sent to it by PHP-FPM. From then on, the server spawns more and more PHP-FPM children until the pm.max_children limit is reached and PHP-FPM itself stops responding. This happens within seconds and takes the entire system down.
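A back-of-the-envelope estimate shows why this cascade can complete in seconds even though CPU and RAM look idle. By Little's law, the number of busy workers is roughly the request rate multiplied by the time each request takes; if MySQL stalls completely, every new request pins one more worker until the pool is full. The Python sketch below uses purely illustrative numbers: the peak-to-average ratio, the healthy latency and the pm.max_children value are assumptions, not our actual configuration.

```python
def busy_workers(requests_per_second, seconds_per_request):
    """Little's law: average requests in flight = arrival rate * time in system."""
    return requests_per_second * seconds_per_request


def seconds_until_pool_full(requests_per_second, max_children, busy_now=0.0):
    """Time for a fully stalled backend to occupy every remaining worker.

    Assumes no request completes during the stall, so each new arrival
    pins one more worker.
    """
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    free = max(max_children - busy_now, 0)
    return free / requests_per_second


# Illustrative numbers only. Apart from the rough "couple of million
# requests per day", none of these values come from the actual server.
AVERAGE_RPS = 2_000_000 / 86_400   # about 23 requests/s averaged over a day
PEAK_RPS = 3 * AVERAGE_RPS         # assumed peak-to-average ratio
MAX_CHILDREN = 200                 # hypothetical pm.max_children
NORMAL_LATENCY = 0.05              # assumed 50 ms per request when healthy

if __name__ == "__main__":
    normal = busy_workers(PEAK_RPS, NORMAL_LATENCY)
    stall = seconds_until_pool_full(PEAK_RPS, MAX_CHILDREN, busy_now=normal)
    print(f"busy workers when healthy: {normal:.1f}")
    print(f"seconds to exhaust the pool during a stall: {stall:.1f}")
```

Under these assumed numbers only a handful of workers are busy in normal operation, yet a complete MySQL stall would exhaust the pool in a few seconds, which is consistent with what we observe.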

We are looking for a very experienced person who can start working on this issue ASAP. If you don't have experience troubleshooting servers that process a couple of million requests per day, please do not apply.

The selected candidate will receive full cooperation from our technical rep, as well as a walkthrough of how the system is currently set up and what troubleshooting attempts have been made so far.

We are looking forward to hearing from you.

Client Activity on this Job

Last Viewed: 1 year ago

Proposals: 20

Hired: 1


About the Client

(5.00) 19 reviews

Bulgaria
Sofia 12:24 AM

28 Jobs Posted
75% Hire Rate, 1 Open Job

Over $50,000 Total Spent
29 Hires, 6 Active

$23.01/hr Avg Hourly Rate Paid
6,395 Hours

Member Since Dec 25, 2012