Incident documentation/2018-02-06 Phabricator

Summary

Phabricator stopped working on Feb 6 2018.

Users saw either our standard error page from the caching layer or the message "Failed to `proc_open()`: proc_open() expects parameter 2 to be array, unknown given".

The issue was fixed about 6 minutes later by restarting httpd.

Timeline

  • 16:51 <+icinga-wm> PROBLEM - https://phabricator.wikimedia.org on phab1001 is CRITICAL: HTTP CRITICAL: HTTP/1.1 503 Backend fetch failed - 2426 bytes in 2.329 second response time
  • 16:56 < elukey> !log restart httpd on phab1001
  • 16:57 <+icinga-wm> RECOVERY - https://phabricator.wikimedia.org on phab1001 is OK: HTTP OK: HTTP/1.1 200 OK - 31921 bytes in 0.225 second response time

Conclusions

  • A bug in PHP 5.6, triggered by Phabricator, causes a deadlock in malloc, as shown by this stack trace.
  • The deadlock causes Apache to slowly leak worker processes until the pool is filled or some other system resource is exhausted.
  • Until we can work around the bug or upgrade to a newer version of PHP, we will have to periodically restart Apache on the Phabricator servers.

Actionables

Explicit next steps to prevent this from happening again as much as possible, with Phabricator tasks linked for every step.

  • Set up a cron job to periodically kill deadlocked Apache processes. phab:T187790
  • Upgrade Phabricator to PHP 7.1 as soon as it is practical to do so. phab:T160714
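
A minimal sketch of what the cron job in the first actionable might run. This is not the script that was deployed. It assumes that worker age is an acceptable stand-in for "deadlocked", which the report does not state; a real check would more likely inspect the Apache scoreboard or the process stack. The process name `apache2`, the six-hour threshold, and all function names are hypothetical. It also assumes a procps-style `ps` that supports the `etimes` column.

```python
import os
import signal
import subprocess

# Hypothetical values for illustration; neither comes from the incident report.
PROCESS_NAME = "apache2"
MAX_AGE_SECONDS = 6 * 60 * 60


def parse_ps(output):
    """Parse the output of `ps -eo pid,ppid,etimes,comm` into a list of dicts."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # ignore malformed lines
        pid, ppid, etimes, comm = parts
        rows.append(
            {"pid": int(pid), "ppid": int(ppid), "age": int(etimes), "comm": comm.strip()}
        )
    return rows


def select_stale_workers(rows, name=PROCESS_NAME, max_age=MAX_AGE_SECONDS):
    """Return PIDs of workers named `name` that are older than `max_age` seconds.

    A worker is a process whose parent has the same name, so the parent
    server process itself is never selected.
    """
    server_pids = {r["pid"] for r in rows if r["comm"] == name}
    return [
        r["pid"]
        for r in rows
        if r["comm"] == name and r["ppid"] in server_pids and r["age"] > max_age
    ]


def main(dry_run=True):
    """List stale workers; send SIGKILL only when dry_run is False."""
    output = subprocess.run(
        ["ps", "-eo", "pid,ppid,etimes,comm"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    for pid in select_stale_workers(parse_ps(output)):
        if dry_run:
            print(f"would kill {pid}")
        else:
            # Assumption: a process stuck in malloc will not act on SIGTERM.
            os.kill(pid, signal.SIGKILL)
```

Calling `main()` only prints the candidate PIDs; `main(dry_run=False)` sends the signal. Separating selection from killing makes the heuristic easy to test against captured `ps` output before it is trusted on a production host.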