Data-platform programmes, not BI projects
Most organisations call this "data analysis," but the failure mode is rarely the analysis itself. It's that the underlying data platform was never built to support repeatable, governed, observable analytics. We don't run BI projects. We run data-platform programmes — warehouse migrations, lakehouse adoption, governance, lineage — that give analysts a foundation worth their time.
For Australian engagements, data lakes, warehouses and analytics pipelines default to AWS Sydney (ap-southeast-2) so that personal and operational data stays onshore. We work to the Privacy Act 1988 and the Australian Privacy Principles, build to the OAIC Notifiable Data Breaches scheme, and for fintech and superannuation clients we align our controls with APRA CPS 234 — covering access management, encryption in transit and at rest, audit logging and incident response. Customer data is never used to train third-party models, and our Melbourne presence keeps 4.5–5 hours of AEST overlap with our engineering centre.
Data quality is where most data-platform programmes fail: not in the build, but in the operation. Reactive quality (someone notices a number is wrong, you investigate, you patch) is a tax on the analyst team. Proactive quality is a different posture.
From reactive to proactive
Data contracts at the producer boundary, Great Expectations or Soda checks at every transformation step, and dbt tests on every model move quality from "things broke and we fixed them" to "the build fails when a contract is broken, before the bad data lands." The cost of detection drops by an order of magnitude when it shifts left.
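To make "the build fails when a contract is broken" concrete, here is a minimal sketch in plain Python rather than Great Expectations, Soda or dbt. The `ORDER_CONTRACT` fields, the `ContractViolation` exception and the `validate_batch` helper are hypothetical names chosen for illustration; a real contract would more likely live in a schema registry or a versioned spec.

```python
from dataclasses import dataclass


class ContractViolation(Exception):
    """Raised when a batch breaks the producer contract."""


@dataclass(frozen=True)
class Field:
    name: str
    type_: type
    nullable: bool = False


# Hypothetical contract for an illustrative "orders" feed.
ORDER_CONTRACT = (
    Field("order_id", str),
    Field("amount_cents", int),
    Field("coupon_code", str, nullable=True),
)


def validate_batch(rows, contract=ORDER_CONTRACT):
    """Return the rows unchanged, or raise before anything is loaded."""
    errors = []
    for i, row in enumerate(rows):
        for field in contract:
            if field.name not in row:
                errors.append(f"row {i}: missing field '{field.name}'")
                continue
            value = row[field.name]
            if value is None:
                if not field.nullable:
                    errors.append(f"row {i}: '{field.name}' is null")
            elif not isinstance(value, field.type_):
                errors.append(
                    f"row {i}: '{field.name}' expected {field.type_.__name__}, "
                    f"got {type(value).__name__}"
                )
    if errors:
        # Failing here is the point: the bad batch never reaches the warehouse.
        raise ContractViolation("; ".join(errors))
    return rows
```

Calling `validate_batch` as the first step of a load job means a producer that starts sending `amount_cents` as a string stops the job at the boundary, instead of surfacing weeks later as a wrong number in a report.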
The governance stack
- dbt for tests-as-code — schema tests, referential integrity, freshness, custom assertions versioned in Git.
- Great Expectations or Soda for declarative quality checks that run inside the pipeline and fail loudly.
- Schema registry — Confluent Schema Registry or AWS Glue Data Catalog for contract enforcement between producers and consumers.
- PII tokenisation at ingest so personally identifying fields never reach the analytics layer in the clear.
- Audit trails — every transformation captured, for the kind of regulator review APRA CPS 234 or the OAIC NDB scheme assume.
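The PII tokenisation step in the list above could be sketched as follows. This is an illustration under stated assumptions, not a description of a specific production setup: it uses keyed hashing (HMAC-SHA256 from the Python standard library) as a stand-in for a vault-based tokenisation service, and the `PII_FIELDS` list and function names are hypothetical. In practice the key would come from a secrets manager, never from source code.

```python
import hashlib
import hmac

# Hypothetical list of columns treated as PII in this example.
PII_FIELDS = ("email", "phone")


def tokenise(value: str, key: bytes) -> str:
    """Deterministic keyed token: same input and key give the same token."""
    normalised = value.strip().lower()
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def tokenise_record(record: dict, key: bytes, pii_fields=PII_FIELDS) -> dict:
    """Return a copy of the record with PII fields replaced by tokens."""
    out = dict(record)
    for name in pii_fields:
        if out.get(name) is not None:
            out[name] = tokenise(str(out[name]), key)
    return out
```

Because the token is deterministic, analysts can still join and count on the tokenised column without ever seeing the raw value. The trade-off is that anyone holding the key can test guesses against a token, so key custody matters as much as the hashing itself.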
Master-data quality is its own discipline. On the EUDR / sustainability commodity importer programme, supplier records flow through standardisation, deduplication and validation layers because the regulatory exposure of a wrong supplier record at the 30 December 2026 EUDR cut-over is too high to leave to spreadsheet hygiene.
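As a rough illustration of those three layers, the sketch below standardises supplier names, deduplicates on the standardised key and validates each record. The field names (`name`, `country`) and the matching rule are assumptions made for the example, not the programme's actual logic; real supplier matching usually also needs fuzzy matching and registry identifiers.

```python
import re
import unicodedata


def standardise_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = "".join(c for c in decomposed if not unicodedata.combining(c))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", ascii_only.lower())
    return " ".join(cleaned.split())


def deduplicate(suppliers):
    """Keep the first record per (standardised name, country) key."""
    seen = {}
    for s in suppliers:
        key = (standardise_name(s["name"]), s["country"].strip().upper())
        seen.setdefault(key, s)
    return list(seen.values())


def validate(supplier):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not standardise_name(supplier.get("name", "")):
        problems.append("name is empty")
    if not re.fullmatch(r"[A-Za-z]{2}", supplier.get("country", "").strip()):
        problems.append("country is not a two-letter code")
    return problems
```

Keeping each layer as a separate function means each one can be tested and audited on its own, which is the property a spreadsheet cannot offer.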
Most data teams have monitoring on the pipeline (did the DAG run?) and none on the data (did the data look right?). Modern data observability changes this.
The observability stack
- Data observability — Monte Carlo, Soda or Lightup watching freshness, volume, distribution and schema drift on every critical table.
- OpenLineage for end-to-end lineage from source system through warehouse to BI tool, captured automatically from dbt, Airflow and Spark jobs.
- Column-level lineage, so that when an analyst asks "what is this number?" you can answer in minutes, not days, and impact analysis on a schema change takes minutes too.
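The two questions that column-level lineage answers, "what does this change break?" and "where did this number come from?", both reduce to a graph walk. The sketch below uses a hand-written dictionary in place of lineage captured by a tool such as OpenLineage; every table and column name in it is hypothetical.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns in its list.
LINEAGE = {
    "crm.customers.email": ["staging.customers.email_token"],
    "staging.customers.email_token": ["marts.dim_customer.email_token"],
    "erp.orders.amount": ["staging.orders.amount_cents"],
    "staging.orders.amount_cents": [
        "marts.fct_orders.amount_cents",
        "marts.fct_revenue.gross_revenue",
    ],
    "marts.fct_orders.amount_cents": ["bi.sales_dashboard.total_sales"],
    "marts.fct_revenue.gross_revenue": ["bi.sales_dashboard.total_sales"],
}


def downstream(column, lineage=LINEAGE):
    """Breadth-first walk: every column affected by a change to `column`."""
    affected, queue = [], deque([column])
    seen = {column}
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


def upstream(column, lineage=LINEAGE):
    """Reverse walk: every source column that feeds `column`."""
    parents = {}
    for src, targets in lineage.items():
        for tgt in targets:
            parents.setdefault(tgt, []).append(src)
    return downstream(column, parents)
```

Here `downstream("erp.orders.amount")` lists everything a schema change to that source column would touch, and `upstream("bi.sales_dashboard.total_sales")` traces a dashboard figure back to its source feed.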
Our TR Capital portfolio management programme is a working example: trade reconciliation and corporate-action processing where every record has provenance, every transformation is traceable, and every reconciliation break has a documented lineage path back to the source feed. The Odoo / EUDR programme follows the same pattern: supplier master data flowing through transformation, validation and reporting layers, all observable, all lineage-tracked, because opaque data platforms and regulator submissions don't coexist.
How we scope these programmes
Data-platform programmes look more expensive on paper than BI projects. The cost difference disappears within twelve months because the analyst team stops debugging the platform and starts producing analysis — the work the platform was built to enable in the first place.
We scope these engagements honestly, including saying when an existing platform is fine and the issue is upstream in product, process or instrumentation. If the answer is "you don't need a new warehouse, you need three data contracts and a lineage tool," we'll tell you that.