Machine Learning: Databricks builds an ML platform on the data lake

Databricks is now expanding its lakehouse architecture, which combines the advantages of data lakes and data warehouses to give companies a central data and analytics platform, into a machine learning platform that spans teams and organizations. At the Data + AI Summit, the company announced the official launch of Databricks Machine Learning, which is meant to make it easier for data engineers, data scientists and product owners to work together on ML projects.

End-to-end MLOps platform

With MLflow, Databricks had already launched an open source project for lifecycle management of machine learning projects. It is now managed under the umbrella of the Linux Foundation, alongside Apache Spark, Delta Lake, Koalas and the newly presented Delta Sharing. Databricks Machine Learning is meant to go one step further and bring together the entire process, from the data architecture including pipelines (data engineering) through model training (data science) to delivering the applications built on it (data products).

The aim is a central, collaborative platform for data teams in companies that bundles the tools they need, from data preparation through experimentation to production operation. The platform also supports teams with two new features: Databricks AutoML and Databricks Feature Store. AutoML is meant to largely automate many of the steps data scientists have to complete during ML model development and training, without the models becoming a black box, Databricks promises. Data scientists should retain control over exactly how a model works, be able to customize it and validate it against unseen datasets. Thanks to the integration with MLflow, all important parameters, metrics and ML models should be trackable at any time.
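
To illustrate what tracking parameters, metrics and models per run means in practice, here is a minimal sketch in plain Python. It is not the MLflow API and not Databricks code: `Run`, `RunTracker`, `log_run` and `best_run` are hypothetical names invented for this example, and the accuracy values are made up.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Run:
    """One training run: its parameters, metrics and an optional model reference."""
    name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)
    model_ref: Optional[str] = None


class RunTracker:
    """Hypothetical in-memory tracker; a real system would persist runs."""

    def __init__(self):
        self.runs = []

    def log_run(self, name, params, metrics, model_ref=None):
        # Copy the dicts so later changes by the caller do not alter the record.
        run = Run(name, dict(params), dict(metrics), model_ref)
        self.runs.append(run)
        return run

    def best_run(self, metric, higher_is_better=True):
        # Only runs that actually recorded the metric can be compared.
        candidates = [r for r in self.runs if metric in r.metrics]
        if not candidates:
            raise ValueError(f"no run recorded the metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda r: r.metrics[metric])


tracker = RunTracker()
tracker.log_run("baseline", {"max_depth": 3}, {"accuracy": 0.81})
tracker.log_run("deeper", {"max_depth": 8}, {"accuracy": 0.86},
                model_ref="models/deeper.pkl")

best = tracker.best_run("accuracy")
print(best.name, best.params, best.model_ref)
```

Because every run keeps its parameters next to its metrics, an automatically generated model can still be traced back to the settings that produced it, which is the point of the "no black box" promise.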

An overview of the new Databricks Machine Learning platform.

Keeping track of and managing features

The Feature Store takes on the role of a single point of truth for the features that exist in the organization or company. Through the store, data teams can see how features are built and where they are already in use, including the data sources used to compute them. The Feature Store not only supports data teams with data lineage but also helps to avoid phenomena such as online/offline skew, which can show up as differing model performance between real-time and batch applications.
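
The idea behind avoiding online/offline skew can be sketched in a few lines of plain Python: each feature is defined once, together with its data source, and both the batch path and the real-time path compute it from that single definition. This is only an illustration under assumed names (`FeatureDef`, `REGISTRY`, `batch_features`, `online_features` and the `orders` source are all hypothetical), not the Databricks Feature Store API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass(frozen=True)
class FeatureDef:
    """A feature definition: its name, its data source (lineage) and its logic."""
    name: str
    source: str
    compute: Callable[[dict, date], int]


def days_since_last_order(row: dict, today: date) -> int:
    return (today - row["last_order"]).days


# Single registry shared by training (batch) and serving (real time).
REGISTRY = [
    FeatureDef("days_since_last_order", "orders", days_since_last_order),
]


def batch_features(rows, today):
    """Offline path: compute all registered features for many rows."""
    return [{f.name: f.compute(row, today) for f in REGISTRY} for row in rows]


def online_features(row, today):
    """Online path: compute the same features for a single request."""
    return {f.name: f.compute(row, today) for f in REGISTRY}


rows = [
    {"customer": "a", "last_order": date(2021, 5, 1)},
    {"customer": "b", "last_order": date(2021, 5, 20)},
]
today = date(2021, 5, 27)

offline = batch_features(rows, today)
online = online_features(rows[0], today)
print(offline[0] == online)  # both paths agree for the same input
```

If the batch job and the serving code each carried their own copy of `days_since_last_order`, the two copies could drift apart over time. Sharing one definition removes that source of skew, and the `source` field records where each feature comes from.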

More information about Databricks Machine Learning, Databricks AutoML and the Databricks Feature Store is summarized in the blog post accompanying the official announcement at the Data + AI Summit. The ML platform is initially available as a public preview for the provider's customers.
