Cloud data simplified: a great all-in-one tool for all your data analytics needs
Use Cases and Deployment Scope
Previously we were using Azure SQL Database with its JSON capabilities and various Azure serverless services to manage our data, but as our data kept growing, time and cost were becoming limiting factors.
Pros
- Build, schedule and monitor complex data pipelines (Azure Data Factory component)
- Access your data lake using the familiar T-SQL syntax and TDS-enabled tools (SSMS, ADS, ...). This is especially useful for business people who are used to a specific workflow.
- Support a wide range of data transformation tools, from low-code (DataFlows) to full-code (Spark), all integrated in a single central orchestrator (similar to Azure Data Factory)
- Provide all these services as a single, very convenient package, without needing to know all the underlying configuration beforehand
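To make the T-SQL point above concrete, here is a minimal sketch of the kind of query a serverless SQL pool accepts for reading Parquet files straight from the lake. It only builds the query text; the helper name `build_openrowset_query`, the storage account, the container, and the folder are all hypothetical, and the resulting string would still need to be run from a TDS client such as SSMS or ADS.

```python
def build_openrowset_query(storage_url: str, top: int = 10) -> str:
    """Return a T-SQL query that reads Parquet files through OPENROWSET.

    Hypothetical helper for illustration; it only formats text and does
    not connect to any service.
    """
    if top <= 0:
        raise ValueError("top must be a positive integer")
    # Double any single quotes so the URL stays a single T-SQL string literal.
    safe_url = storage_url.replace("'", "''")
    return (
        f"SELECT TOP {top} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS [result];"
    )


# Hypothetical storage account, container, and folder names.
query = build_openrowset_query(
    "https://examplelake.dfs.core.windows.net/sales/orders/*.parquet", top=5
)
print(query)
```

The printed statement can be pasted into any of the TDS-enabled tools mentioned above, which is what makes the workflow feel familiar to people coming from SQL Server.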
Cons
- There's no support for Synapse Serverless objects (e.g., views) in SSDT, Microsoft's VCS-friendly approach to schema deployments. SSDT is available for almost all other SQL Server and Azure SQL products, including Synapse Dedicated SQL Pools.
- There are lots of ways to accomplish the same task, and the only way to find out which one is best suited for a given scenario is trial and error. Also, some scenarios (e.g., efficient management of late arrivals) don't have a clear solution path.
- I think it would be cool to have a tighter integration of the product with the Azure Data Studio client, beyond just connecting to SQL Serverless or Dedicated Pools. For example, PySpark development and debugging would be much easier if done from ADS.
Likelihood to Recommend
I think this product is not suited for smaller, simpler workloads (where an Azure SQL Database and a Data Factory could be enough) or very large scenarios, where it may be better to build custom infrastructure.
