This PM2 module is:
- Monitoring events (like app exit, restart etc.)
- Monitoring app exceptions
- Monitoring PMX metrics of your apps and sending alerts when a value hits its threshold
- Sending mail notifications
With rich config options you can fine-tune monitoring rules.
pm2 install pm2-health
After installation, run pm2 conf to configure the module. Alternatively, edit the module_conf.json file directly (in the PM2 home folder).
"pm2-health": {
    "smtp": {
        "host": "your-smtp-host",
        "port": 587,
        "from": "your-from-mail", // if not set, user will be used
        "user": "your-smtp-user", // auth
        "password": "your-smtp-password", // auth
        "secure": false,
        "disabled": false,
        "clientHostName": "your-machine.com" // optional; forces the client host name (FQDN) used in the SMTP HELO/EHLO greeting. If not set, Nodemailer asks the OS for the host name and uses 127.0.0.1 if it is not an FQDN.
    },
    "mailTo": "mail1,mail2"
}
If any of the required properties is not defined, pm2-health will shut down. Check the error logs for details.
- smtp - SMTP server configuration. If your SMTP server doesn't require auth, leave smtp.user empty
- mailTo - comma-separated list of notification recipients
- replyTo - reply-to address (optional)
- events - list of events to monitor (optional). If not set, all events will be monitored.
  Manually triggered events will not send a notification.
- exceptions - if true, app exceptions will be monitored (optional)
- messages - if true, custom messages from apps will be monitored (optional). See Custom messages
- messageExcludeExps - array of regular expressions used to exclude messages (optional). See Filtering custom messages
- metric - object describing PMX metrics to be monitored (optional). See Metrics monitoring
- metricIntervalS - how often PMX metrics will be tested, in seconds (optional). If not set, 60 seconds is used
- aliveTimeoutS - alive watchdog timeout interval in seconds. If not set, the watchdog function is off. See Process alive watchdog
- addLogs - if true, app logs will be added as a mail attachment (optional)
- appsExcluded - array of app names to exclude from monitoring (optional)
- appsIncluded - array of app names to include; if set, appsExcluded is ignored (optional)
- webConfig - if set, some of the config settings can be downloaded from the given URL (optional). See Web config
- debugLogEnabled - if true, the debug log is enabled; false by default (optional)
- batchPeriodM - enables message batching and sets the batching period (optional). See Message batching
- batchMaxMessages - max. messages in a batch (optional). See Message batching
pm2-health can monitor any PMX metrics defined in your apps.
To configure alerting rules, set up the metric section in the module config file.
"metric": {
"metric name": {
"target": 0,
"op": ">",
"ifChanged": true,
"noNotify": true,
"noHistory": true,
"exclude": false
},
"metric 2": {
...
}
}
- metric name - name of a metric defined in one of your apps
- target - target numeric value
- op - operator used to compare the metric value with the target. Can be one of: <, >, =, <=, >=, !=
- ifChanged - if true, an alert will trigger only if the current metric value differs from the last recorded value (optional)
- noNotify - if true, no alerts will be sent (optional)
- noHistory - if true, metric value history won't be stored (optional)
- exclude - if true, the metric will be completely excluded from monitoring (optional)
- direct - if true, the metric value won't be converted to a number (optional)
By default, cpu and memory metrics are added.
Learn how to define PMX metrics in your apps here: http://pm2.keymetrics.io/docs/usage/process-metrics/
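To make the rule semantics concrete, here is a minimal sketch of how a rule with target, op, ifChanged, exclude and direct could be evaluated. It is not pm2-health's actual implementation: the shouldAlert helper and the comparators table are hypothetical names, and the sketch assumes that an alert fires when the comparison "value op target" is true.

```javascript
// Hypothetical helpers for illustration; not part of pm2-health.
const comparators = {
  "<": (value, target) => value < target,
  ">": (value, target) => value > target,
  "=": (value, target) => value === target,
  "<=": (value, target) => value <= target,
  ">=": (value, target) => value >= target,
  "!=": (value, target) => value !== target,
};

// Returns true when the rule would raise an alert for the given value.
// Assumption: the alert fires when "value op target" holds.
function shouldAlert(rule, rawValue, lastValue) {
  if (rule.exclude) return false; // metric is not monitored at all
  const compare = comparators[rule.op];
  if (!compare) throw new Error(`Unknown operator: ${rule.op}`);
  // Unless "direct" is set, the value is converted to a number first.
  const value = rule.direct ? rawValue : Number(rawValue);
  // With "ifChanged", an unchanged value never triggers an alert.
  if (rule.ifChanged && value === lastValue) return false;
  return compare(value, rule.target);
}
```

With the sample rule above (target 0, op ">"), a value of 5 would alert, while the same value seen twice in a row would not when ifChanged is true.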
On top of standard PM2 events, you can monitor custom messages sent from your apps.
To send a message from your app, use:
process.send({
type: "process:msg",
data: {
...
}
});
- type - must be process:msg
- data - object containing additional data (optional).
You can exclude some messages based on their data content:
- Add a regular expression to the list in the messageExcludeExps config property
- data (converted to a JSON string) will be tested against all expressions in the list
- If any test is positive, the message will be excluded
Example:
You wish to monitor slow operations in your app, so you send custom messages like so:
function slow(operation, duration) {
process.send({ type: "process:msg", data: { operation, duration }});
}
You know that backup and restore operations are always slow and wish to exclude them, but still get other slow operations.
Set config to:
"messageExcludeExps": [
"\"operation\": \"(backup|restore)\""
]
Remember to escape the regex string as a JSON string.
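The filtering steps can be sketched as follows. This is an illustration, not pm2-health's code: isExcluded is a hypothetical helper, and the sketch assumes data is serialized with a plain JSON.stringify call. Whether the serialized string has whitespace after the colon depends on how it is produced, so the pattern here uses \s* to match either form.

```javascript
// Hypothetical helper for illustration; not part of pm2-health.
// Assumption: message data is serialized with JSON.stringify before testing.
function isExcluded(data, excludeExps) {
  const json = JSON.stringify(data);
  // The message is excluded as soon as any expression matches.
  return excludeExps.some((exp) => new RegExp(exp).test(json));
}

// \s* tolerates both '"operation":"backup"' and '"operation": "backup"'.
const excludeExps = ['"operation":\\s*"(backup|restore)"'];

const backupExcluded = isExcluded({ operation: "backup", duration: 1200 }, excludeExps);
const queryExcluded = isExcluded({ operation: "query", duration: 900 }, excludeExps);
```

Testing your expressions this way against a sample message before putting them into the config helps catch escaping mistakes.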
Alive watchdog (added in 1.9.0) can observe alive messages from processes.
To use this functionality, your process has to periodically send a process:msg signal like this:
process.send({
type: "process:msg",
data: "alive"
});
In addition, the config parameter aliveTimeoutS must be set. If no alive message is received within aliveTimeoutS (seconds), an alert will be sent.
aliveTimeoutS must be greater than the interval at which your process sends the alive signal; otherwise an alert would fire between two consecutive signals.
After the first alert, the test is repeated every 10 minutes, 6 consecutive times, after which alerting stops on the assumption that the process is permanently closed.
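A periodic sender for the alive message could look like the sketch below. startHeartbeat and sendAlive are hypothetical helper names, and the 30-second interval is only an illustrative value; pick it together with your aliveTimeoutS setting. The sketch relies on process.send, which Node only provides when the process was started with an IPC channel.

```javascript
// Illustrative value only; choose it together with aliveTimeoutS.
const ALIVE_INTERVAL_MS = 30 * 1000;

// Sends one alive message in the format shown above.
function sendAlive(send) {
  send({ type: "process:msg", data: "alive" });
}

// Hypothetical helper: starts a periodic heartbeat and returns the timer,
// or null when no IPC channel is available (process.send is undefined).
function startHeartbeat(
  intervalMs,
  send = process.send ? process.send.bind(process) : null
) {
  if (typeof send !== "function") return null;
  const timer = setInterval(() => sendAlive(send), intervalMs);
  // Do not keep the process running only because of the heartbeat.
  timer.unref();
  return timer;
}

startHeartbeat(ALIVE_INTERVAL_MS);
```

Keeping the returned timer lets you call clearInterval on it during a graceful shutdown.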
Web config (added in 1.7) allows you to fetch some of the config settings from a web URL.
Sample config:
{
"webConfig": {
"url": "url of JSON file",
"auth": {
"user": "...",
"password": "..."
},
"fetchIntervalM": 10
}
}
The URL must return UTF-8 JSON with config properties.
Only the following properties can be used: events, metric, exceptions, messages, messageExcludeExps, appsExcluded, metricIntervalS, addLogs, batchPeriodM, batchMaxMessages
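The effect of this allow-list can be sketched as a merge that copies only the permitted keys from the fetched JSON. This is an assumption about the behavior, not pm2-health's code: applyWebConfig is a hypothetical name, and the sketch assumes the remote JSON carries the properties at its top level.

```javascript
// Properties that may be overridden by web config (list taken from above).
const WEB_CONFIG_KEYS = [
  "events", "metric", "exceptions", "messages", "messageExcludeExps",
  "appsExcluded", "metricIntervalS", "addLogs", "batchPeriodM", "batchMaxMessages",
];

// Hypothetical helper: returns a new config where only allowed keys
// from remoteConfig replace the local values; everything else is ignored.
function applyWebConfig(localConfig, remoteConfig) {
  const merged = { ...localConfig };
  for (const key of WEB_CONFIG_KEYS) {
    if (Object.prototype.hasOwnProperty.call(remoteConfig, key)) {
      merged[key] = remoteConfig[key];
    }
  }
  return merged;
}

const effective = applyWebConfig(
  { mailTo: "mail1,mail2", addLogs: false },
  { addLogs: true, metricIntervalS: 30, mailTo: "ignored@example.com" }
);
```

In this sketch mailTo keeps its local value because it is not in the list, while addLogs and metricIntervalS come from the remote file.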
Message batching (added in 1.11) merges multiple messages collected over a period of time and sends them as a single message.
This can be used to limit the number of messages sent (prevent spam).
To enable it, set the following properties in the config section:
- batchPeriodM - period (in minutes) to batch; if not set, batching is not enabled
- batchMaxMessages - max. number of messages in a batch (optional)
The batch message will be sent after batchPeriodM elapses or when the number of collected messages exceeds batchMaxMessages.
Priority messages (such as exceptions) are not batched and are sent immediately.
It's advised to set batchMaxMessages to prevent huge messages.
Batching settings can be changed by web config.
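The batching rules described above can be sketched roughly like this. MessageBatcher and its deliver callback are hypothetical names used for illustration; this is not how pm2-health is actually implemented.

```javascript
// Hypothetical sketch of the batching rules; not pm2-health's implementation.
class MessageBatcher {
  constructor({ batchPeriodM, batchMaxMessages, deliver }) {
    this.periodMs = (batchPeriodM || 0) * 60 * 1000; // 0 means batching is off
    this.maxMessages = batchMaxMessages || 0; // 0 means no size limit
    this.deliver = deliver; // called with an array of messages
    this.pending = [];
    this.timer = null;
  }

  add(message, priority = false) {
    // Priority messages, or all messages when batching is off, go out at once.
    if (priority || this.periodMs === 0) {
      this.deliver([message]);
      return;
    }
    this.pending.push(message);
    // Flush early once more than batchMaxMessages have been collected.
    if (this.maxMessages > 0 && this.pending.length > this.maxMessages) {
      this.flush();
      return;
    }
    // Otherwise make sure a flush is scheduled for the end of the period.
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.periodMs);
      this.timer.unref();
    }
  }

  flush() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.pending.length === 0) return;
    const batch = this.pending;
    this.pending = [];
    this.deliver(batch);
  }
}
```

The timer starts with the first batched message rather than running continuously, so an idle app produces no empty batches.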
To hold mail notifications: pm2 trigger pm2-health hold 30
Notifications will resume automatically after 30 minutes.
To unhold immediately: pm2 trigger pm2-health unheld
All monitoring processes continue; only mail notifications are held.
Mail uses HTML format. To adjust the template, you can edit Template.html
<!-- body --> will be replaced with the actual message body.
<!-- timeStamp --> will be replaced with the event timestamp (UTC).
A pm2-health update will overwrite your Template.html, so keep a backup 😊
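To preview a customized template locally, the placeholder substitution can be imitated with a few lines. renderTemplate is a hypothetical helper, and formatting the timestamp as an ISO 8601 UTC string is an assumption; the module's actual format may differ.

```javascript
// Hypothetical helper for previewing a template; not part of pm2-health.
// Assumption: the timestamp is rendered as an ISO 8601 UTC string.
function renderTemplate(template, body, date) {
  // Function replacers avoid special handling of "$" sequences in the body.
  return template
    .replace("<!-- body -->", () => body)
    .replace("<!-- timeStamp -->", () => date.toISOString());
}

const preview = renderTemplate(
  "<p><!-- body --></p><i><!-- timeStamp --></i>",
  "App exited",
  new Date(0)
);
```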
To send test mail: pm2 trigger pm2-health mail
pm2-health is written using TypeScript 2.6.1+ with the es2017 target.
es2017 is supported by Node 8+. If you need to use an earlier version, build the solution using the es5 or es6 target.
The solution includes VS Code settings for build and debug.