Azure Databricks: A Data Consultant's Dream
Use Cases and Deployment Scope
As a Big Data consultant, Azure Databricks is my favorite tool in the house!
One of the biggest problems with data consulting is the plethora of programming languages it involves: SQL, Scala, R, Python, Java, etc.
That is exactly where Azure Databricks excels! It supports SQL, Scala, R, and Python in a single notebook, with comparable performance across them! Combine that with a visually pleasing UI, features that cover the entire data lifecycle, and an architecture that gets the best out of Spark, and you have one of the best data tools in your hands!
Pros
- Data Processing and Transformations based on Spark
- Delta Lakehouse when combined with external cloud storage
- Governance using Unity Catalog to unify IAM
- Delta Live Tables, although relatively new, has great potential thanks to its visual view of a pipeline.
Cons
- The new UI is a bit clunky compared to the old UI. It also adds new elements to the sidebar that are not relevant to the workspace. This could be improved.
- Delta Live Tables, although powerful, has a lot of room for improvement, including error debugging and support for newer features.
- Concurrent requests against Delta Lake tables need more optimisation.
Likelihood to Recommend
Suppose you have multiple data sources and you want to bring the data into one place, transform it, and shape it into a data model. Azure Databricks is perfectly suited to this. Use Spark JDBC or any external cloud-based tool (ADF, AWS Glue) to bring the data into cloud storage. From there, Azure Databricks can handle everything. It can ingest the data into a three-layer architecture built on Delta Lake tables. The first layer, the raw layer, holds the data as-is from the source. The enrich layer cleans and filters the data at the individual table level. The gold layer is the final layer and holds the data model; it acts as the serving layer for BI. For BI needs, if you need simple dashboards, you can use Azure Databricks BI to create them with a simple click! For complex dashboards, just like any SQL database, you can connect it to any external BI tool with a simple JDBC string.
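To make the three-layer idea concrete, here is a minimal plain-Python sketch of the raw, enrich, and gold flow. It is only a stand-in: in Azure Databricks each layer would be a Delta Lake table and the transformations would run as Spark jobs. The sample orders, the field names, and the `enrich` and `gold` helpers are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical "raw" layer: rows exactly as they arrived from the source,
# including a duplicate, a missing amount, and untrimmed text.
RAW_ORDERS = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "bob", "amount": "4.00"},
    {"order_id": "2", "customer": "bob", "amount": "4.00"},  # duplicate
    {"order_id": "3", "customer": "alice", "amount": None},  # bad record
    {"order_id": "4", "customer": "Bob", "amount": "6.00"},
]


def enrich(raw_rows):
    """Enrich layer: clean a single table (filter, deduplicate, fix types)."""
    seen_ids = set()
    cleaned = []
    for row in raw_rows:
        if row["amount"] is None:
            continue  # drop rows that cannot be used downstream
        if row["order_id"] in seen_ids:
            continue  # keep only the first copy of each order
        seen_ids.add(row["order_id"])
        cleaned.append(
            {
                "order_id": int(row["order_id"]),
                "customer": row["customer"].strip().lower(),
                "amount": float(row["amount"]),
            }
        )
    return cleaned


def gold(enriched_rows):
    """Gold layer: build a small serving model (total spend per customer)."""
    totals = defaultdict(float)
    for row in enriched_rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


enriched_orders = enrich(RAW_ORDERS)
customer_spend = gold(enriched_orders)
print(customer_spend)
```

The point of the split is that the raw rows are never modified, so the enrich and gold steps can be rerun or changed later without going back to the source systems.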