Mailman/Monitoring

Revision as of 14:37, 30 April 2021 by Ladsgroup

We monitor a few key components:

  • Processes
    • mailman_ctl: Is the control process in the list of running processes?
      • If not, check logs in /var/log/mailman and/or try restarting mailman.service.
    • mailman_qrunner: Is the queue runner process in the list of running processes?
      • If not, check logs in /var/log/mailman and/or try restarting mailman.service.
  • HTTP
    • mailman archives: Are the archives for wikimedia-l reachable?
      • If not, check apache.service.
    • mailman list info: Is the subscribe page for wikimedia-l reachable?
      • If not, check apache.service.
  • Queues
    • mailman_queue_size: Is there a backlog in the bounces, in, and virgin queues?
      • If so, check:
        • dashboard
        • logs in /var/log/mailman
        • you may have to look at what is in /var/lib/mailman/qfiles/(bounces|in|virgin) to see what is not sending.
    • Mailman outbound queue hours until empty: How long will it take to drain the out queue?
      • If it's going to take a long time, check:
        • Date and time of day. Historically, the out queue is longest on the first of the month, when it takes 15+ hours to drain, and at 08:00 UTC each day, when it takes 1+ hours.
        • dashboard
        • logs in /var/log/mailman and /var/log/exim4
        • you may have to look at the messages themselves in /var/lib/mailman/qfiles/out to see what is not sending.
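
The two queue checks above can be approximated by hand when investigating an alert. The following is a minimal Python sketch, not the actual monitoring check: it assumes that a backlog can be measured by counting regular files in each queue directory under /var/lib/mailman/qfiles, and that the time to drain is roughly the queue length divided by an observed send rate. The function names, the threshold of 42, and the send rate are hypothetical values chosen for illustration; the real alert thresholds are not stated on this page.

```python
from pathlib import Path

# Path taken from the list above; the queue names are the ones the
# mailman_queue_size check is described as watching.
QFILES_ROOT = Path("/var/lib/mailman/qfiles")
MONITORED_QUEUES = ("bounces", "in", "virgin")


def queue_sizes(root=QFILES_ROOT, queues=MONITORED_QUEUES):
    """Return {queue_name: number of regular files} for each queue directory.

    Assumption: one file per queued message. A missing directory is
    reported as 0 instead of raising an error.
    """
    sizes = {}
    for name in queues:
        qdir = Path(root) / name
        if qdir.is_dir():
            sizes[name] = sum(1 for entry in qdir.iterdir() if entry.is_file())
        else:
            sizes[name] = 0
    return sizes


def backlogged(sizes, threshold=42):
    """Return the sorted names of queues larger than a hypothetical threshold."""
    return sorted(name for name, count in sizes.items() if count > threshold)


def hours_until_empty(queue_len, sent_per_hour):
    """Naive drain estimate: remaining messages divided by the send rate.

    Returns infinity when nothing is being sent and the queue is not empty.
    """
    if queue_len == 0:
        return 0.0
    if sent_per_hour <= 0:
        return float("inf")
    return queue_len / sent_per_hour


if __name__ == "__main__":
    sizes = queue_sizes()
    print("queue sizes:", sizes)
    print("backlogged:", backlogged(sizes))
```

As a usage example, hours_until_empty(300, 200) returns 1.5, meaning 300 queued messages at an observed 200 messages per hour would take about an hour and a half to drain. The send rate itself has to come from somewhere else, such as the dashboard or the logs in /var/log/exim4.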