TrustRadius: an HG Insights company

Prometheus

Score 8.3 out of 10

106 Reviews and Ratings

What is Prometheus?

Prometheus is an open source service monitoring system and time series database.

Prometheus in Practice: Powerful, Flexible, but Not Plug-and-Play

Use Cases and Deployment Scope

We use Prometheus to scrape metrics from our Linux servers, Kubernetes clusters, Docker containers, and cloud infrastructure. Our development teams instrument custom applications using Prometheus client libraries (Python, Go, Java) to expose application-specific metrics such as request latency, error rates, and queue lengths. Prometheus also works in tandem with Alertmanager to send alerts to our on-call engineers via Slack, PagerDuty, and email.
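
As a rough illustration of what "exposing application-specific metrics" means, the sketch below renders a few samples in the Prometheus text exposition format that a scrape reads. It uses only the Python standard library rather than an official client library, and the metric names (`app_queue_length`, `app_request_errors_total`) and label values are hypothetical, not taken from this review.

```python
from typing import Iterable, Mapping, Tuple


def _escape_label_value(value: str) -> str:
    # Backslash, double quote and newline must be escaped inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name: str, labels: Mapping[str, str], value: float) -> str:
    """Render one sample line, e.g. app_queue_length{queue="email"} 7."""
    if not labels:
        return f"{name} {value}"
    body = ",".join(
        f'{key}="{_escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{body}}} {value}"


def render_metric(
    name: str,
    help_text: str,
    metric_type: str,
    samples: Iterable[Tuple[Mapping[str, str], float]],
) -> str:
    """Render the HELP and TYPE comment lines followed by the metric's samples."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    lines += [format_sample(name, labels, value) for labels, value in samples]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical application metrics: a gauge and a counter.
    print(render_metric("app_queue_length", "Jobs waiting in the queue.", "gauge",
                        [({"queue": "email"}, 7)]), end="")
    print(render_metric("app_request_errors_total", "Failed requests.", "counter",
                        [({"method": "GET"}, 3), ({"method": "POST"}, 1)]), end="")
```

In practice a client library would maintain these values and serve them over HTTP; the sketch only shows the shape of the scraped output.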

Pros

  • Alerting
  • Faster Incident Response
  • Monitoring

Cons

  • Long-term storage limitations
  • Lack of native built-in auth

Return on Investment

  • Improved System Reliability and Uptime
  • Faster Incident Response and MTTR
  • Lack of Built-In Security or Multi-Tenancy

Alternatives Considered

Elasticsearch, Redis Software, Cortex and Zabbix

Other Software Used

Elasticsearch, Datadog, Zabbix

Prometheus for Quantifiable Data Collection

Use Cases and Deployment Scope

We use Prometheus as the metric collection source for our systems' many software services. Each service has its own Prometheus instance, created to collect specific metrics. The metrics are displayed and manipulated in Grafana using the Prometheus data source. Prometheus supports these Grafana dashboards to aid system monitoring, performance troubleshooting, and statistics.
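
A dashboard panel backed by a Prometheus data source ultimately issues queries against the Prometheus HTTP API. The sketch below shows, under stated assumptions, how an instant query URL for the `/api/v1/query` endpoint can be built and how the JSON reply for a vector result can be read. The base URL `http://localhost:9090` and the sample payload are illustrative assumptions, and no network call is made.

```python
import json
from typing import Dict, Tuple
from urllib.parse import urlencode

LabelSet = Tuple[Tuple[str, str], ...]


def build_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against a Prometheus server."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_instant_vector(payload: str) -> Dict[LabelSet, float]:
    """Map each series' sorted label set to its current value."""
    doc = json.loads(payload)
    if doc.get("status") != "success":
        raise ValueError("query failed: " + str(doc.get("error")))
    values: Dict[LabelSet, float] = {}
    for series in doc["data"]["result"]:
        labels = tuple(sorted(series["metric"].items()))
        _timestamp, raw_value = series["value"]  # the value arrives as a string
        values[labels] = float(raw_value)
    return values


if __name__ == "__main__":
    # Assumed server address; "up" is the built-in per-target health series.
    print(build_query_url("http://localhost:9090", "up"))
    # Illustrative response body, shaped like an instant-vector reply.
    sample = json.dumps({
        "status": "success",
        "data": {"resultType": "vector", "result": [
            {"metric": {"__name__": "up", "job": "api"},
             "value": [1700000000.0, "1"]},
        ]},
    })
    print(parse_instant_vector(sample))
```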

Pros

  • Store data metrics
  • Support code queries
  • Provide data source to any Grafana dashboard

Cons

  • Could provide categorized metric lookup
  • Could suggest common queries
  • Could include metric descriptions

Most Important Features

  • Universal Integration
  • Shareable data sources
  • Best for quantifiable metrics

Return on Investment

  • Customers want to see metrics-based products, which Prometheus directly supports
  • Most developers use Prometheus for service management anyway
  • Provides great value at every product level for internal and external use

Alternatives Considered

Grafana Loki, Grafana and Prometheus

Other Software Used

Grafana, Grafana Loki, Prometheus

A closer look at monitoring and alerting using Prometheus

Use Cases and Deployment Scope

We primarily use Prometheus for metrics and alerting.

We use Prometheus to monitor the HTTP endpoints of our services and provide real-time metrics, such as how many 2xx responses a particular endpoint returned in the last 5 minutes out of its total responses, and how many 5xx responses it returned over the same period.

This helps us keep watch on error rates. We also have alert rules defined in Prometheus; for example, if 5xx responses exceed 80% of all responses in the last 5 minutes, an alert fires.
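
The arithmetic behind such a rule can be sketched as follows. Prometheus counters are cumulative, so the share of 5xx responses in a window is computed from the increase between two snapshots. The snapshot values are made up for illustration, and the PromQL in the comment assumes a counter named `http_requests_total` with a `status` label, neither of which is named in this review.

```python
from typing import Dict

# In Prometheus itself the condition would be a PromQL expression along the lines of
#   sum(rate(http_requests_total{status=~"5.."}[5m]))
#     / sum(rate(http_requests_total[5m])) > 0.8
# (metric and label names are assumptions). The code below mirrors that arithmetic.


def window_increase(start: Dict[int, int], end: Dict[int, int]) -> Dict[int, int]:
    """Per-status-code increase between two cumulative counter snapshots."""
    return {code: end[code] - start.get(code, 0) for code in end}


def error_ratio(start: Dict[int, int], end: Dict[int, int]) -> float:
    """Fraction of responses in the window that were 5xx."""
    increase = window_increase(start, end)
    total = sum(increase.values())
    if total == 0:
        return 0.0  # no traffic in the window, so nothing to alert on
    errors = sum(n for code, n in increase.items() if 500 <= code <= 599)
    return errors / total


def should_alert(start: Dict[int, int], end: Dict[int, int],
                 threshold: float = 0.8) -> bool:
    return error_ratio(start, end) > threshold


if __name__ == "__main__":
    five_minutes_ago = {200: 1000, 500: 10}      # hypothetical counter values
    now = {200: 1010, 500: 100, 503: 10}
    print(error_ratio(five_minutes_ago, now), should_alert(five_minutes_ago, now))
```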

It solves the problem of finding out when something goes wrong in production so that we can respond and fix it immediately.

It also provides us with useful metrics regarding the performance of our endpoints.

This ensures higher uptime of the services and fewer issues reported by the clients.

Pros

  • Providing real-time metrics for HTTP endpoints
  • Setting up rules for firing alerts when things go wrong
  • Firing alerts when a certain threshold is reached for 5xx responses on a particular endpoint

Cons

  • Currently, the Prometheus user interface is not very intuitive and has room for improvement.
  • Prometheus could provide more detail about the errors behind 5xx responses. Currently it only reports that a particular endpoint returned a given number of 5xx responses over a given number of minutes.
  • It tells us that something is wrong in the system, but we still have to find what is wrong and fix it. Prometheus does not provide more context on what exactly is wrong.
  • Creating rules in Prometheus and then validating that they are correct and working as expected can be daunting. It would be easier if Prometheus provided more testing support.
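
One way to reason about whether a threshold rule behaves as intended is to replay a synthetic series through a simplified model of the rule. The sketch below is such a model, not Prometheus's own evaluator: it assumes one value per evaluation cycle and treats the rule's hold duration as a whole number of cycles (the hypothetical `for_evaluations` parameter).

```python
from typing import Iterable, List


def alert_states(values: Iterable[float], threshold: float,
                 for_evaluations: int) -> List[str]:
    """Return 'inactive', 'pending' or 'firing' for each evaluation cycle.

    The alert is pending while the condition has held for fewer than
    `for_evaluations` cycles since it first became true, and firing after that.
    Any cycle where the condition is false resets it to inactive.
    """
    states: List[str] = []
    streak = 0  # consecutive cycles in which the condition was true
    for value in values:
        if value > threshold:
            streak += 1
            states.append("firing" if streak > for_evaluations else "pending")
        else:
            streak = 0
            states.append("inactive")
    return states


if __name__ == "__main__":
    # Synthetic 5xx ratios, one per evaluation cycle.
    ratios = [0.9, 0.9, 0.5, 0.9, 0.9, 0.9]
    print(alert_states(ratios, threshold=0.8, for_evaluations=2))
```

Replaying a short spike and a sustained breach through the model makes it easier to see whether the chosen threshold and hold duration would page on the right incidents.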

Return on Investment

  • Prometheus helps us monitor our error volumes and alerts us when they go beyond a threshold, so that we can detect issues early and fix them before users report them.
  • It helps us reduce our time to detect and adhere to our uptime commitment of 99.9%.
  • Prometheus lets us set up different rules for different endpoints, so that we can make sure critical functionality is not affected and act if an anomaly is detected in, say, the order submission endpoint.
  • Prometheus would fire an alert in such a scenario.
  • We've been able to achieve a Mean Time To Detect (MTTD) of 15 minutes because of Prometheus-based alerts.
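
To put the two figures above in perspective, the sketch below works out how much downtime a 99.9% commitment allows and how an MTTD is averaged. The 30-day period and the incident timestamps are assumptions chosen for illustration; the review does not state how either figure was measured.

```python
from typing import List, Tuple


def downtime_budget_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Minutes of downtime allowed over the period at the given availability."""
    period_minutes = period_days * 24 * 60
    return period_minutes * (100 - availability_percent) / 100


def mean_time_to_detect(incidents: List[Tuple[float, float]]) -> float:
    """Average of (detected - started), with both times given in minutes."""
    delays = [detected - started for started, detected in incidents]
    return sum(delays) / len(delays)


if __name__ == "__main__":
    # 30 days is 43,200 minutes; 0.1% of that is roughly 43.2 minutes.
    print(round(downtime_budget_minutes(99.9), 1))
    # Hypothetical incidents as (started, detected) minute offsets.
    print(mean_time_to_detect([(0, 10), (100, 120), (200, 215)]))
```

Under these assumptions, a single incident detected after 15 minutes already uses about a third of the monthly budget before any repair work starts.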

Other Software Used

Grafana

Prometheus in OnPrem apps

Use Cases and Deployment Scope

We currently use Prometheus for on-prem applications to monitor baseline metrics, which include CPU, memory, and disk space. Since these applications will either stay on-prem or move to the cloud at a later stage, Prometheus proved to be a cost-effective way to monitor them.
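
As a hedged sketch of how a baseline figure such as disk usage can be derived from scraped host metrics, the code below parses a few lines of exposition-format text and computes the percentage used. It assumes the hosts expose metrics via node_exporter, which this review does not state; `node_filesystem_size_bytes` and `node_filesystem_avail_bytes` are node_exporter metric names, while the sample values are invented.

```python
import re
from typing import Dict, Tuple

# name, optional {labels}, then the value; any trailing timestamp is ignored.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)


def parse_samples(text: str) -> Dict[Tuple[str, str], float]:
    """Map (metric name, raw label string) to the sample value."""
    samples: Dict[Tuple[str, str], float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match:
            key = (match.group("name"), match.group("labels") or "")
            samples[key] = float(match.group("value"))
    return samples


def percent_used(total: float, available: float) -> float:
    return 100.0 * (total - available) / total


if __name__ == "__main__":
    scraped = (
        "# TYPE node_filesystem_size_bytes gauge\n"
        'node_filesystem_size_bytes{mountpoint="/"} 200\n'
        'node_filesystem_avail_bytes{mountpoint="/"} 50\n'
    )
    s = parse_samples(scraped)
    root = 'mountpoint="/"'
    print(percent_used(s[("node_filesystem_size_bytes", root)],
                       s[("node_filesystem_avail_bytes", root)]))
```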

Pros

  • Monitoring
  • Alerting
  • Tool to onboard

Cons

  • Should offer a way to add servers from a single tool
  • We have three different tools for onboarding, alerting, and dashboards, which should be consolidated into one.

Return on Investment

  • It helps us reduce user impact, as any observability tool does

Alternatives Considered

Datadog, Catchpoint, Kibana and OpenText SiteScope

Other Software Used

Datadog, Catchpoint, Kibana, OpenTelemetry

The response capacity of this program is incredible.

Use Cases and Deployment Scope

Prometheus has supported us for a long time with multiple solutions and resources for our data. Since we installed it, we operate with more professionalism and security, because the software gives us the right tools for our needs in the digital world.

Pros

  • The response capacity of this program is incredible, and the preparation and organization of my data are reflected in my work and the good results.
  • Prometheus has been in charge of giving more professional support to our operation and, of course, to the response to all our external commercial relationships.

Cons

  • One of the worst aspects of this program, which you should know about before purchasing it, is its price compared with its competitors, since it may be the most expensive.
  • Another feature that does not benefit this program at all is its interface, which is a bit haphazard and gets in the way of a better workflow.

Most Important Features

  • My company's most relevant data can be shared continuously with authorized personnel; this program keeps us united and organized.
  • Without hesitation, I have had the most outstanding growth and performance in my job since Prometheus gives me a pleasant command of all my work projects.

Return on Investment

  • Thanks to the support we receive from Prometheus, we can say that business processes and scenarios are properly handled by the infrastructure and base with which it was created.
  • There are many things we can do with our master data, from sharing it to archiving it in a safe and reliable workspace.

Other Software Used

KACE Cloud Mobile Device Manager (MDM), Kalido MDM, Oracle Utilities Meter Data Management (MDM)