TrustRadius Insights for IBM StreamSets are summaries of user sentiment data from TrustRadius reviews and, when necessary, third-party data sources.
Business Problems Solved
Users have found StreamSets to be a versatile and user-friendly platform that solves a variety of data integration challenges. One key use case is the ability to easily develop on-premises and deploy to the cloud, helping users control their cloud budget efficiently. The platform has also been praised for its seamless integration with Apache Kafka and Apache NiFi, simplifying the process of connecting these tools with a data lake.
StreamSets has proven valuable in handling real-time data consumption, filtering, tagging, and monitoring of systems, as well as anomaly detection based on traffic patterns. Users have utilized the platform for data movement, migration, and ingestion, reducing downtime and simplifying the process. Additionally, StreamSets has been widely used for data extraction from various source systems, including IoT devices, enabling users to gain insights from previously inaccessible data sources.
The tool's ability to handle different data formats elegantly and save time compared to hand-coded ETL tools has been appreciated by users. It has been effectively used for solving big data ETL problems, offering fast transfer, support for various sources and destinations, and prompt support. StreamSets has also been utilized in AI/ML tasks such as building transformations for knowledge graphs.
Overall, StreamSets has proven reliable and efficient in handling data ingestion from various sources, meeting the needs of users across industries and providing flexibility in designing pipelines with minimal coding.
I used IBM StreamSets for data analysis. It is a brilliant tool for monitoring data for analysis, and it provides pie charts and graphs in an easily readable format, which lets even a person who is not well trained but knows enough read them efficiently and accurately. The charts and graphs give thorough information about the data without missing any key points.
Pros
Graphs and charts are designed well
Data summation is amazing
Easy to read and understand the summed up information
Cons
Where the person's skill set in data analysis is not that of an expert.
Data monitoring and analysis.
Customer data for better customer acquisition
Likelihood to Recommend
I've used this tool personally to look at how many videos a user watches, the categories and subcategories that the user is interested in, the professions of the people who are most interested in a particular kind of video, and which institutes and researchers are more inclined towards which categories and subcategories.
P.S. I work in a company which has a catalogue of scientific research and educational videos.
Verified User
Engineer in Engineering (Computer Software company, 201-500 employees)
We use IBM StreamSets for batch loading of data sets between disparate applications into a data estate so we can query the data to find patterns. We also use IBM StreamSets to handle our continuously streaming data requirements. We went with IBM StreamSets over the competition because of its unique (patented) architecture.
Pros
Real-Time Data Ingestion.
Streaming Pipelines at Scale.
Handling Data Drift and Schema Changes.
Flexibility Across Hybrid/Multi-Cloud/On-Prem Environments.
Cons
Performance handling Large Data Volumes.
Debugging, Error Logging, and Observability.
Connector/Integration Coverage.
Likelihood to Recommend
Because real-world sources often change (new fields get added, formats get tweaked, etc.), StreamSets helps detect and adapt to those "schema drifts" or changes automatically, or with minimal manual intervention. That makes pipelines more resilient and significantly reduces the maintenance burden. Therefore, data sets with constantly changing sources/formats are great for StreamSets.
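To make the idea of schema drift concrete, here is a minimal Python sketch of detecting and adapting to a changed record. It is not StreamSets code and does not use any StreamSets API; the function names `detect_drift` and `adapt_schema`, the field names, and the sample record are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def detect_drift(known_fields: Dict[str, type], record: Dict[str, Any]) -> Dict[str, List[str]]:
    """Compare one incoming record against the fields seen so far (hypothetical helper)."""
    # Fields present in the record but not in the known schema.
    added = [name for name in record if name not in known_fields]
    # Fields the schema expects but the record does not carry.
    missing = [name for name in known_fields if name not in record]
    # Fields whose value no longer matches the expected type.
    retyped = [
        name
        for name, value in record.items()
        if name in known_fields
        and value is not None
        and not isinstance(value, known_fields[name])
    ]
    return {"added": added, "missing": missing, "retyped": retyped}


def adapt_schema(known_fields: Dict[str, type], record: Dict[str, Any]) -> Dict[str, type]:
    """Return a copy of the schema widened with any new fields (hypothetical helper)."""
    updated = dict(known_fields)
    for name, value in record.items():
        if name not in updated and value is not None:
            updated[name] = type(value)
    return updated


# Assumed example: the source starts sending an extra "duration_s" field.
schema = {"id": int, "title": str}
record = {"id": 7, "title": "Intro", "duration_s": 312}

drift = detect_drift(schema, record)
schema = adapt_schema(schema, record)
```

In this sketch a new field is accepted automatically, while a type change is only reported, so a person can decide how to handle it.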
Verified User
Director in Corporate (Computer Software company, 501-1000 employees)
I use IBM StreamSets for AI-related tasks including continuously training models, as well as for real-time data streaming, handling schema changes, and simply scaling pipelines.
Pros
it connects to many data sources and helps catch issues early with built-in alerts and monitoring tools
it supports real-time and batch processing, handles data drift well, and makes pipeline debugging easier with the updated UI
Cons
it lags when handling large amounts of data
the error logs were sometimes difficult to interpret
support took time to respond when I needed urgent help
Likelihood to Recommend
StreamSets is for teams looking for a fast and low-code way to build ETL pipelines. It’s especially useful if you’re dealing with real-time data or complex source systems.