Logstash is part of the ELK stack in the private Telcocloud datacenter I used at Nokia. The Telco datacenter infrastructure has a lot of logs to be analyzed, and Logstash acts as a log shipper on the OpenStack-based infrastructure. It helped collect logs from various sources and then processed them into the required format to be displayed in Kibana for analysis and in Grafana for graph visualization.
Pros
Parses unstructured log data into searchable fields
Wide integration with almost any data source and backend
Powerful search across fields, including fields extracted from unstructured log data
Supports various formats like JSON, CSV, XML, key-value, etc
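As a rough illustration of what "unstructured log data into searchable fields" means in practice, here is a minimal Python sketch. It is not Logstash code: in a real pipeline this work would normally be done by Logstash filters. The log line layout, the field names, and the `_parsefailure` tag are all assumptions made up for the example.

```python
import json
import re

# Hypothetical layout for a syslog-style line: "<timestamp> <host> <service>: <message>".
# A real Logstash pipeline would usually express this in its own filter config.
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\S+)\s+(?P<host>\S+)\s+(?P<service>[\w.-]+):\s+(?P<message>.*)"
)
# key=value pairs embedded in the free-text message.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Turn one raw log line into a dict of searchable fields."""
    match = LINE_PATTERN.match(line)
    if match is None:
        # Keep unparsed lines instead of dropping them, and tag the failure.
        return {"message": line, "tags": ["_parsefailure"]}
    event = match.groupdict()
    # Promote key=value pairs to top-level fields without overwriting
    # the fields already captured by the line pattern.
    for key, value in KV_PATTERN.findall(event["message"]):
        event.setdefault(key, value)
    return event


sample = "2024-05-01T10:15:00Z compute-01 nova-compute: instance=abc123 state=ERROR retries=3"
print(json.dumps(parse_line(sample), indent=2))
```

Once a line is broken up like this, a field such as `state` or `host` can be filtered on directly instead of searching the raw text.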
Cons
It is heavy, i.e., resource-intensive, as of now. Overhead needs to be reduced to lower CPU/RAM consumption
Needs to be more Kubernetes-friendly. It should support auto-scaling and K8s observability
Initial configuration is still complex. A seamless config procedure is still required
Likelihood to Recommend
Logstash is a good choice for a full-fledged VM-based or bare-metal server data center because there is no resource crunch in a production-grade data center. It collects multiple inputs like logs, metrics, and events from diverse sources (files, syslog, Beats, Kafka, cloud services). It only loses some points on K8s-based infrastructure, where Logstash still has room to improve its support for K8s observability and auto-scaling.
We investigated (and use) Logstash as part of our ELK (ElasticSearch, Logstash and Kibana) stack. We input data from websites that post daily updates. This data is read into Logstash from RSS feeds, transformed in Logstash, then sent to ElasticSearch / Kibana for visualization and reporting. Our problem was that we needed to scrape large amounts of unstructured data from multiple sites that gave users the ability to post information. This information was free form. To make matters more interesting, these sites did not have an open API to query the data directly, so writing a simple cURL bot would not suffice. Therefore, we turned to ELK, based on previous projects where we had employed it successfully. So the primary Return On Investment (ROI) is the fact that we had a fast, tightly cohesive (but loosely coupled) stack of software that we knew how to configure and integrate well with our Laboratory.
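The read, transform, and send flow described above can be sketched in plain Python. This is only an illustration of the idea, not the reviewer's actual pipeline: the RSS snippet, the field names, and the `posts` index name are invented, and the output is newline-delimited JSON in the action-line plus document-line shape that Elasticsearch's bulk API accepts.

```python
import json
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up RSS snippet standing in for a real feed.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example site</title>
    <item>
      <title>  First post </title>
      <link>https://example.com/posts/1</link>
      <pubDate>Wed, 01 May 2024 10:15:00 GMT</pubDate>
      <description>Free-form text written by a user.</description>
    </item>
    <item>
      <title>Second post</title>
      <link>https://example.com/posts/2</link>
      <pubDate>Thu, 02 May 2024 08:00:00 GMT</pubDate>
      <description>More free-form text.</description>
    </item>
  </channel>
</rss>"""


def rss_to_documents(rss_text):
    """Read: parse the feed. Transform: normalise each item into a flat dict."""
    root = ET.fromstring(rss_text)
    docs = []
    for item in root.iter("item"):
        # RSS dates use the RFC 822 style; convert them to ISO 8601.
        published = parsedate_to_datetime(item.findtext("pubDate"))
        docs.append({
            "title": item.findtext("title", default="").strip(),
            "link": item.findtext("link", default=""),
            "published": published.isoformat(),
            "body": item.findtext("description", default="").strip(),
        })
    return docs


def to_bulk_ndjson(docs, index_name):
    """Send (preparation only): one action line plus one document line per item."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


documents = rss_to_documents(SAMPLE_RSS)
print(to_bulk_ndjson(documents, "posts"))
```

The sketch stops at building the payload. Actually posting it to a cluster is left out, since the endpoint and credentials depend on the deployment.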
Pros
Modern: most admin, server and/or DevOps-centric software worth its salt will have the ability to configure its services and features from a small webpage and REST API. Logstash is no exception
Speed: Logstash configuration is just a reload away. While you CAN use the GUI (see point above), editing the configuration files directly is also a great option. Our configuration files are hosted on an internal repository, so once we make a change we can track it as we do a reload, and the changes are reflected in Logstash almost immediately (dependent on the data source's speed and flow of data)
Configuration: Logstash is very simple to configure, and fulfills our desire to keep configuration files in a plaintext format.
Open source friendly: Logstash is open source, and built with open source tools
Cons
Memory: Logstash is a HOG if you are deploying it on commodity (i.e. cheap and old) hardware. You will need at least 2 GB just for Logstash, so don't expect to run your entire ELK stack on one AMD Athlon machine.
Overlap: Logstash fills in an area of the ELK stack that makes the most sense: as a log file transformer / shipper. However, if you start breaking that stack up with the addition of other components, you start seeing where features of Logstash may be implemented or solved in the additional components much more easily (or better, or to a higher degree of resolution)
More Overlap: Since my team employs Syslog-ng extensively, Logstash can sometimes get in the way (and this may be a problem for DevOps stacks overall). You can configure Syslog-ng to record certain information from a source, filter that data, and even export that data in a particular format. Logstash will pick that data up, and then parse it. However, if you don't keep your Syslog-ng configuration files and your Logstash configuration files in sync, your results will not be what you expected, and this will translate into (sometimes) hours or days of work hunting down a line item in a configuration file.
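One way to catch the kind of drift described in the last point is a small check that runs whenever either configuration changes. The Python sketch below is hypothetical throughout: it assumes the upstream side emits lines shaped like `<timestamp> <host> <program>: <message>`, and it stands in a regular expression for whatever pattern the downstream parser really uses. It does not read actual Syslog-ng or Logstash configuration files.

```python
import re

# Hypothetical: the fields the upstream output format is assumed to emit, in order.
UPSTREAM_FIELDS = ["timestamp", "host", "program", "message"]

# Hypothetical: the pattern the downstream parser is assumed to apply.
DOWNSTREAM_PATTERN = re.compile(
    r"(?P<timestamp>\S+) (?P<host>\S+) (?P<program>[^:\s]+): (?P<message>.*)"
)

# A sample line written in the upstream format, kept next to the configs.
SAMPLE_LINE = "2024-05-01T10:15:00+00:00 web-01 nginx: GET /index.html 200"


def find_drift(upstream_fields, pattern, sample_line):
    """Return a list of human-readable problems; an empty list means in sync."""
    problems = []
    expected = set(upstream_fields)
    captured = set(pattern.groupindex)  # names of the pattern's capture groups
    for name in sorted(expected - captured):
        problems.append(f"upstream emits '{name}' but the parser never captures it")
    for name in sorted(captured - expected):
        problems.append(f"parser expects '{name}' but upstream does not emit it")
    if pattern.fullmatch(sample_line) is None:
        problems.append("sample line does not match the parser pattern")
    return problems


for problem in find_drift(UPSTREAM_FIELDS, DOWNSTREAM_PATTERN, SAMPLE_LINE):
    print("DRIFT:", problem)
```

Run as part of a commit hook or CI job on the configuration repository, a check along these lines would flag a mismatch at change time rather than after the data has already been indexed.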
Likelihood to Recommend
Perfect for projects where ElasticSearch makes sense: if you decide to employ ES in a project, then you will almost inevitably use LogStash, and you should anyway. Such projects would include: 1. Data Science (reading, recording or measuring web-based Analytics, Metrics) 2. Web Scraping (which was one of our earlier projects involving LogStash) 3. Syslog-ng Management: While I did point out that it can be a bit of an electric boo-ga-loo finding an errant configuration item, it is still worth it to implement Syslog-ng management via LogStash: being able to fine-tune your log messages and then pipe them to other destinations, depending on the data being read in, is incredibly powerful, and I would say is an exemplar of what modern Computer Science looks like: less specialization in mathematics, and more specialization in storing and recording data (i.e. less Engineering, and more Design).
Verified User
Engineer in Corporate (Computer Software company, 1-10 employees)