Why modernise your data platform — and why now?
Most organisations run their analytics on infrastructure designed for a different era: on-premises data warehouses, legacy ETL pipelines, and BI tools that were state of the art a decade ago. The problem is not that these systems don't work. It's that they can't support what organisations now need to do with data.
Modern AI and analytics capabilities — real-time processing, machine learning at scale, generative AI integration, self-service analytics for non-technical users — require a different foundation. A legacy warehouse was not designed to serve a large language model. A batch ETL pipeline cannot support real-time decision-making. The platform becomes the constraint.
Staying on legacy infrastructure is not free. The costs are hidden but real: slower time-to-insight, higher maintenance overhead, inability to adopt new AI capabilities, and data quality issues that compound over time. Every year of delay makes the eventual migration harder and more expensive.
The good news: cloud-native data platforms have matured significantly. Tools like Databricks, Snowflake, and the major cloud providers' data stacks offer capabilities that would have been unimaginable — or unaffordable — five years ago. The barrier is no longer technology. It's change management, stakeholder alignment, and migration execution.
- Legacy platforms are not just a technical debt problem — they're a strategic constraint that limits your AI and analytics ambitions.
- The cost of inaction is real and growing. Every year of delay compounds the eventual migration complexity.
- The main barriers to modernisation are organisational, not technical — plan accordingly.
The modern data platform landscape
The terminology in this space can be overwhelming. Here's a practical map of the key components of a modern data platform and what each one does.
| Layer | What it does | Key tools |
|---|---|---|
| Ingestion | Moves data from source systems into the platform — batch or real-time | Fivetran, Airbyte, Kafka, Azure Data Factory |
| Storage | Stores raw and processed data at scale, cost-effectively | Azure Data Lake, S3, Google Cloud Storage, Delta Lake |
| Processing | Transforms, cleans, and enriches data for analysis | Databricks, Spark, dbt, Azure Synapse |
| Serving | Makes processed data available to analytics and AI tools | Snowflake, BigQuery, Redshift, Databricks SQL |
| Analytics & BI | Enables exploration, dashboarding, and self-service reporting | Power BI, Tableau, Looker, Metabase |
| AI & ML | Trains models, runs inference, serves predictions | Databricks MLflow, Azure ML, Vertex AI, SageMaker |
| Governance & catalogue | Tracks data lineage, ownership, quality, and access | Unity Catalog, Purview, Collibra, Alation |
You don't need all of these from day one. The right architecture depends on your organisation's size, maturity, and use cases. The most common mistake is buying a comprehensive platform before you understand what you actually need to build on it.
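To make the layers concrete, here is a minimal sketch of the flow from storage through processing into a serving table, written in PySpark against Delta Lake. The paths, table names, and columns are illustrative assumptions, not a reference implementation:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders-pipeline").getOrCreate()

# Ingestion/storage: raw files landed in the lake by an ingestion tool
# (the bucket and path are hypothetical).
raw = spark.read.json("s3://lake/raw/orders/")

# Processing: deduplicate, fix types, and apply basic quality filters.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .filter(F.col("amount") > 0)
)

# Serving: a governed Delta table that BI and ML tools can query.
clean.write.format("delta").mode("overwrite").saveAsTable("analytics.orders")
```

In practice each step would live in a scheduled job or a dbt model rather than a single script, but the shape of the flow through the layers is the same.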
- Understand each layer before selecting tools — the layers are more important than the specific products.
- Start with the layers that are most broken or most constraining. You don't need to modernise everything at once.
- Governance and catalogue are often left to last — this is a mistake. Build them in from the start or you'll regret it.
The migration playbook — phase by phase
No migration goes exactly to plan. But there is a difference between a migration that is well designed and one that isn't. The principles below have held across every programme I've led and supported.
- Run old and new in parallel during migration; never cut over before the new environment is validated (see the reconciliation sketch after this list).
- Migrate incrementally by priority, not all at once. Each successful migration builds confidence and momentum.
- Adoption is the true measure of success. A platform nobody uses is a failed migration, regardless of technical quality.
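To make the parallel-running point concrete, here is a minimal reconciliation sketch: compare the legacy table against its migrated counterpart on cheap aggregate metrics before trusting the cutover. The JDBC connection details, table names, and columns are illustrative assumptions:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("parallel-run-check").getOrCreate()

# Legacy warehouse read over JDBC (URL and table are hypothetical;
# the appropriate JDBC driver must be on the classpath).
legacy = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://legacy-dwh:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .load()
)

# The same data as migrated to the new platform.
migrated = spark.table("analytics.orders")

def profile(df):
    # Cheap reconciliation metrics; extend with per-column checksums as needed.
    return df.agg(
        F.count(F.lit(1)).alias("row_count"),
        F.countDistinct("order_id").alias("distinct_orders"),
        F.sum("amount").alias("amount_total"),
    ).first().asDict()

old, new = profile(legacy), profile(migrated)
for metric in old:
    status = "OK" if old[metric] == new[metric] else "MISMATCH"
    print(f"{metric}: legacy={old[metric]} new={new[metric]} [{status}]")
```

Aggregate checks like these won't catch every discrepancy (row-level diffs on a sample are the natural next step), but they surface gross failures early and cheaply, while the old system is still there to fall back on.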
Stakeholder alignment — the make-or-break factor
I've seen technically excellent migrations fail because of stakeholder misalignment — and I've seen imperfect migrations succeed because the people side was handled well. The technical work is the easier half.
The stakeholders you need to align — and what each one cares about:
**Executive leadership.** They care about cost, risk, and strategic value. Frame the migration in terms of business outcomes: what decisions will be faster, what capabilities will be unlocked, what the cost of inaction is. Avoid technical detail at this level.
**Data consumers.** They care about continuity: will their reports still work? Will their data still be there? Involve them early, communicate the migration timeline clearly, and never let their data disappear without warning. This group will make or break adoption.
**IT, security, and compliance.** They care about security, compliance, and operational stability. Engage them in architecture decisions early, not as a blocker but as a partner. Their requirements are legitimate and will surface eventually; better early than late.
**The data engineering team.** They are doing the work. They care about technical quality, realistic timelines, and not being set up to fail. Protect them from scope creep, give them clear priorities, and create space for them to do the work properly.
- Map your stakeholders before the migration starts and understand what each group cares about.
- Never let data consumers discover that their reports have broken — proactive communication prevents most adoption problems.
- The data engineering team is your execution engine. Protect their time and give them clear, stable priorities.
The five pitfalls that sink migrations
These are not hypothetical. I have seen every one of them derail a real migration.
- Run a data quality assessment before finalising your timeline; quality debt is the most common source of delays (see the profiling sketch after this list).
- Incremental migration with parallel running is slower but far more reliable than big-bang cutover.
- Build cost monitoring in from day one — cloud cost management is a discipline, not an afterthought.
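As a starting point for that data quality assessment, a simple profiling pass over the source tables is usually enough to size the quality debt before you commit to a timeline. A minimal PySpark sketch, where the table names and business keys are illustrative assumptions:

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("dq-assessment").getOrCreate()

# Source tables and their business keys (hypothetical names).
tables = {"legacy.orders": "order_id", "legacy.customers": "customer_id"}

for name, key in tables.items():
    df = spark.table(name)
    total = df.count()
    duplicate_keys = total - df.dropDuplicates([key]).count()
    # Null counts for every column in a single pass over the data.
    nulls = df.select(
        [F.sum(F.col(c).isNull().cast("int")).alias(c) for c in df.columns]
    ).first().asDict()
    worst = max(nulls, key=lambda c: nulls[c] or 0)
    print(
        f"{name}: rows={total}, duplicate {key}s={duplicate_keys}, "
        f"most nulls: {worst} ({nulls[worst]})"
    )
```

Even a crude profile like this turns "the data is probably fine" into numbers you can plan against.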
Real examples from the field
Supporting the UN's migration of its data warehouse to Databricks was a masterclass in the importance of patience and stakeholder management in a complex, multi-stakeholder environment. The technical migration itself was straightforward — the challenge was ensuring that dozens of teams across the organisation could continue to access their data during and after the transition, without disruption to critical reporting. The parallel running phase was longer than planned, but it was the right decision. Every team had time to validate their data in the new environment before the old system was decommissioned. No surprises, no broken reports, no lost trust.
Across insurance transformation programmes, the data platform modernisation challenge was not just technical but regulatory: insurance data is heavily regulated, and every architectural decision needed to be defensible from a compliance perspective. The approach was to design the governance and security model first, then build the technical architecture around it, not the other way around. This added time upfront but eliminated the expensive rework that typically comes from retrofitting compliance onto a platform that wasn't designed for it.
At datalitiks, the advantage was starting with a blank slate — no legacy to migrate, no existing users to manage. The lesson from this experience is that the decisions you make in the first 90 days of a data platform's life are disproportionately hard to undo later. We invested heavily upfront in data modelling, governance, and quality — choices that paid dividends as the platform scaled. Starting cloud-native is an opportunity to do it right from the beginning. Don't waste it by moving fast and cutting corners on the fundamentals.