On a production server, several things should be monitored automatically, both from inside and from outside, with automatic alarms that actually reach someone who feels responsible.

Local monitoring recommendations

We suggest monitoring:

  • Free hard disk space

  • Free physical and virtual RAM (but note: virtual RAM is a reserve for peak load, not a real resource)

  • CPU load (not only CPU utilization, but also the overall system load, which reflects I/O and context switches; on Linux, consider monitoring /proc/loadavg)
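The local checks listed above could be sketched roughly as follows. This is a minimal illustration, not a complete monitoring agent: the thresholds and function names are hypothetical, and the memory check assumes a Linux kernel that reports MemAvailable in /proc/meminfo.

```python
import os
import shutil

# Hypothetical thresholds; tune them for the server in question.
MIN_FREE_DISK_FRACTION = 0.10
MIN_AVAILABLE_RAM_FRACTION = 0.10
MAX_LOAD_PER_CPU = 2.0


def parse_meminfo(text):
    """Parse the content of /proc/meminfo into a dict of integer values."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def check_disk(path="/"):
    """Return (ok, free_fraction) for the file system containing path."""
    usage = shutil.disk_usage(path)
    free_fraction = usage.free / usage.total
    return free_fraction >= MIN_FREE_DISK_FRACTION, free_fraction


def check_memory(meminfo_text):
    """Return (ok, available_fraction) based on MemAvailable and MemTotal."""
    info = parse_meminfo(meminfo_text)
    available_fraction = info["MemAvailable"] / info["MemTotal"]
    return available_fraction >= MIN_AVAILABLE_RAM_FRACTION, available_fraction


def check_load(load_1min, cpu_count):
    """Return (ok, load_per_cpu) for the 1-minute load average."""
    load_per_cpu = load_1min / cpu_count
    return load_per_cpu <= MAX_LOAD_PER_CPU, load_per_cpu


def run_local_checks():
    """Run all checks on the local (Linux) machine and return the results."""
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    # os.getloadavg() returns the same three averages as /proc/loadavg.
    load_1min = os.getloadavg()[0]
    return {
        "disk": check_disk("/"),
        "memory": check_memory(meminfo_text),
        "load": check_load(load_1min, os.cpu_count() or 1),
    }
```

A scheduled job could call run_local_checks() and raise an alarm for every entry whose first element is False. Dividing the load average by the number of CPUs is one common rule of thumb, not a precise overload criterion.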

If you suspect overload, try very hard to distinguish between the different aspects of distributed computation, and work through the whole list of possible bottlenecks, down to network usage and disk I/O.

Remote monitoring recommendations

We suggest monitoring:

  • Basic network connectivity (ping with timing)

  • Application connectivity (HTTP(S) requests, checking the response time and some minimal piece of content)
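The application check above might look roughly like the following sketch, which uses only the Python standard library. The function names, the expected text and the time limits are illustrative assumptions; the basic ping check is not covered here.

```python
import time
import urllib.request


def evaluate_response(status, body, elapsed_seconds, expected_text, max_seconds):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if status != 200:
        problems.append(f"unexpected HTTP status {status}")
    if expected_text not in body:
        problems.append("expected content not found")
    if elapsed_seconds > max_seconds:
        problems.append(f"slow response: {elapsed_seconds:.2f}s")
    return problems


def check_url(url, expected_text, max_seconds=2.0, timeout=10.0):
    """Fetch url, time the request and look for expected_text in the body."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except OSError as exc:
        # URLError, HTTPError (4xx/5xx) and socket timeouts derive from OSError.
        return [f"request failed: {exc}"]
    elapsed = time.monotonic() - start
    return evaluate_response(status, body, elapsed, expected_text, max_seconds)
```

Run from a machine outside the production network, a call such as check_url("https://example.org/login", "Login") returns an empty list when everything is fine and a list of problem descriptions otherwise. The URL and the expected text here are placeholders.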

Maintenance

Somebody should watch the watchers.

Every now and then check:

  • Is the monitoring still running? If in doubt, deliberately stop or interrupt something, at a point in time when you won't ruin someone's day.

  • Would alarms reach anyone? If in doubt, send test messages.

  • Is there activity at all? Idle servers may be idle because the clients can't connect.
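One possible way to automate part of the first check is a heartbeat file: the monitoring job touches a file after every run, and an independent watchdog raises an alarm when that file becomes stale. The sketch below assumes this setup; the function names and the maximum age are hypothetical.

```python
import os
import time

# Hypothetical limit: the monitoring job is expected to run at least this often.
MAX_HEARTBEAT_AGE_SECONDS = 15 * 60


def touch_heartbeat(path):
    """Called by the monitoring job after each completed run."""
    with open(path, "w") as f:
        f.write(str(time.time()))


def heartbeat_is_fresh(path, max_age=MAX_HEARTBEAT_AGE_SECONDS, now=None):
    """Return True if the heartbeat file exists and was modified recently."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable file counts as "monitoring not running".
        return False
    return (now - modified) <= max_age
```

The watchdog should run on a different machine or under a different scheduler than the monitoring itself, otherwise both can fail together. It does not replace the occasional manual test.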
