The old answer was off topic due to a misunderstanding. Keeping it for reference:
There are in fact multiple tools that can do what you want.
For example:
- Logstash (which you already know)
- Graylog
- Prometheus
Every one of them requires you to define triggers in some way that determine when you get notified. Diving into this for multiple tools is way too much for this platform, though.
There are multiple major areas that one would need to consider while building a really helpful monitoring and alerting system.
Gathering/Monitoring/Aggregation of:
- Availability of systems (hardware, software, services)
- Errors during operation of those systems (logs, correct responses)
- Changes over time (metrics of system parameters, e.g. disk space and load, response times of services, rollout of newer versions)
Then one would need to define thresholds for alerting:
- Host/Service Up/Down
- Process Running
- Load over x.xx,x.xx,x.xx
- Disk space under x.xx
- Data growing rate bigger than x.xx MB/day
- http 500 responses > x/second
- etc
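
To make the idea of alert levels a bit more concrete, here is a minimal, tool-agnostic sketch in Python. It is not how Logstash, Graylog, or Prometheus define triggers; the `Rule` class, the metric names (`load_1m`, `disk_free_ratio`, `http_500_per_sec`) and the threshold values are all made up for illustration. Each tool has its own rule syntax, but the underlying pattern is roughly the same: collect metrics, compare them against thresholds, and notify on a breach.

```python
import os
import shutil
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Rule:
    name: str                          # human-readable alert name
    metric: str                        # key to look up in the metrics dict
    breached: Callable[[float], bool]  # returns True when the alert should fire


# Hypothetical thresholds; the right values depend entirely on your systems.
RULES = [
    Rule("load too high", "load_1m", lambda v: v > 4.0),
    Rule("disk space low", "disk_free_ratio", lambda v: v < 0.10),
    Rule("too many HTTP 500s", "http_500_per_sec", lambda v: v > 5.0),
]


def collect_local_metrics(path: str = "/") -> Dict[str, float]:
    """Gather a few metrics from the local host (os.getloadavg is Unix only)."""
    usage = shutil.disk_usage(path)
    load_1m, _, _ = os.getloadavg()
    return {
        "load_1m": load_1m,
        "disk_free_ratio": usage.free / usage.total,
    }


def evaluate(metrics: Dict[str, float], rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules whose threshold is breached.

    Rules whose metric is missing from the dict are skipped.
    """
    return [
        rule.name
        for rule in rules
        if rule.metric in metrics and rule.breached(metrics[rule.metric])
    ]


if __name__ == "__main__":
    # Made-up sample values instead of live data, so the result is predictable.
    sample = {"load_1m": 5.2, "disk_free_ratio": 0.05, "http_500_per_sec": 0.3}
    print(evaluate(sample))  # ['load too high', 'disk space low']
```

In a real setup you would replace the `print` with whatever notification channel you use, and most likely let one of the tools above do the collecting and evaluating for you.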