From Analytics to AI: Evolving a Data Platform to Power Machine Learning at Scale
As AI becomes a core pillar of product development at Datadog, our Analytics Data Platform department has had to evolve to support it. Originally built for traditional analytics, business intelligence, and ETL pipelines, our platform now also powers the work of AI engineers.
These engineers, newer first-class citizens among the data practitioners we support, build and deploy machine learning models across many parts of the Datadog product suite.
In this talk, I’ll share how we adapted our platform to meet the needs of AI engineering at scale. We did this without compromising on self-service, governance, or reliability.
Along the way, we introduced support for new storage formats like Iceberg and distributed processing engines like Ray.
From self-serve data intake to flexible exploration workflows with robust metadata and observability, we’ve reimagined the platform experience for iterative model development in a complex, multi-source environment.
I’ll walk through how AI engineers at Datadog can discover, explore, trust, and use a wide range of data to build ML models.
That includes observability telemetry from our products, as well as cloud and business datasets. I’ll show how they use our unified catalog to access data across our Iceberg-based Lakehouse, internal time series systems, and in-house storage powering core products like Logs.
Whether you’re modernizing an analytics platform or building new capabilities for AI teams, this session will offer concrete lessons on building scalable, reliable, and self-serve infrastructure for data and AI.