The National Data Library can deliver its vision, but it must pivot to build trust in the age of today’s web
This post builds on observations and conversations across government and industry (and internationally) since my previous post on “How can we find the Goldilocks Zone of our National Data Infrastructure?”.
When we’re building into a future where technology has such clear benefits, it is often hard to be heard above the noise of the possible. And yet I believe we must also ‘grasp the nettle’ of equally possible negative or unintended consequences. While we must not “stifle innovation and growth”, we must also ensure we don’t scuttle our ship in the process.
The first misconception I want to address is the idea that data is a technology. It is not.
Data is used to represent information, analysis, insights, opinions, facts, responsibilities and decisions. It carries many different kinds of value, as well as material rights and material risks. Data is used to find things out and to make decisions. Looking to its past, data is the plural of datum, from the Latin for ‘something given’.
I labour this point to make clear that data is not software (e.g. AI is software that uses and produces data, as are all algorithms codified in software), and it is also not hardware. A lot of debate about data (and AI in particular) seems to conflate these facts.
Treating data as if it were software is comparing apples and carts.
Context
Having been a tech entrepreneur for 30 years, I’ve sat in meetings with Silicon Valley investors where the “move fast and break things, then ask for forgiveness” mantra has been very present. Innovators love to build; building is fun and very rewarding — both intellectually and, sometimes, financially.
It’s common to worry about the consequences later, even when ‘later’ can be measured in decades. Those raising alarm bells are often badged as ‘activists’. It is a label I’ve never found helpful: we can draw parallels with the past, when raising everything from the health risks of smoking to catastrophic climate change risked people being boxed as ‘alarmist’ rather than ‘anxious-with-cause’.
I’ve always advocated for fair-weather innovation, while ensuring that we build in measures for foul weather. In the age of the web, commercial (and political) ambition can often be seduced by the fair-weather arguments, while policy is rarely able to move at the speed required to keep pace and safeguard us from foul-weather conditions.
A lot has changed since my previous posts on data infrastructure. We must now, right now, be proactive and preemptive in recognising that our threat model has changed (due to shifting geopolitical and technocratic-utopian positions). Foul weather is closer than we may have wished.
Trust isn’t just a buzzword; it underpins our society and, in a digital age, this means addressing our data infrastructure, ensuring our safety, de-risking innovation and enabling growth in turbulent times. Trust helps us build all of these, for the long term.
Going from vision to implementation is hard

There is a natural tendency towards overreach in most systems — this is a very human attribute. When trying to innovate, our bureaucratic systems can often feel frustrating (good entrepreneurs — both inside and outside of government — know how to navigate around, or sometimes ‘through’, such systems).
Today, in 2025, with data and the web and AI, the pace and scale of impact outpaces and out-scales any other time in history. We can create a critical mass, or a critical mess.
To that end, I believe we must take steps to better understand and balance what is in front of us. When it comes to enabling data sharing, whether in finance, health, transport, energy, property, water, or across all research, we need to first take stock of something that is very hard to build and yet can be lost in the blink of an eye: trust.
The National Data Library (NDL) represents significant intent from the UK Government to do data sharing differently, and we (citizens, consumers, businesses, sectors, markets, and the state) can all benefit from this. However, implementation really matters.
Icebreaker One has already made a strong case (selected for publication by the Wellcome Trust) for the National Data Library to:
- Focus on defined users and use cases rather than datasets, just like any effective data infrastructure.
- Begin life as a simple, decentralised version that would curate and improve the discovery of, and timely access to, public sector data for research.
- Experiment with more complex architectures for harmonising access to data drawn from across multiple government departments.
Cutting across all of this, the NDL must be built, from its core, in ways that engender trust. Whether the challenge is consumer distrust, competitive interest or geopolitical threats, we must have in place the people, processes and protections to cement trust between businesses (b2b), with consumers (b2c), and with our citizens.
The NDL can still maintain its grand vision to improve outcomes for all of us, connecting data, rapidly unlocking insights, and leading the way.
More than its technical architecture, the nature of its funding and staffing will shape what it can achieve. I’d argue we can do more together, and move faster, with common principles and parallel actions than with a large centralised effort. Not only should the data never be centralised (other than for the limited use cases that require it), but neither should absolute control.
We can go far together if we each do one thing well.
We can bind our work, together, with common principles and practices. There will be no ‘one hub’ — if you are a hub in a collection of hubs, you’re a node. Even a National Library is one node in a network of knowledge. There isn’t only one search engine in the world, nor will there be one AI.
We must also recognise, with a clear perspective, that one-time broad-based consent is just one limited use case. It may apply to certain, relatively static, data, but a different approach is required for highly variable data: data is rarely static. The conditions under which that ‘one-time’ consent was given will also change, as we have seen, with lived experience (e.g. changes in corporate and political governance).
In a world of rapid change, we must pivot rapidly but never forget the foundations of trust beneath our feet
Building on the Smart Data model, there are known ways to apply decentralised, user-controlled, real-time, granular consent. There are dynamic solutions in place today which enable peer-to-peer data sharing via secure APIs. Such systems can include immediate revocation (by the user, or by a neutral governing entity/representative), and they are designed and built to enable cross-sector interoperability.
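To make this concrete, here is a minimal, hypothetical sketch of granular, revocable consent. The names and fields are illustrative assumptions, not a specification of any existing Smart Data scheme:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical sketch of a granular, revocable consent record. Names and
# fields are illustrative assumptions, not any scheme's specification.
@dataclass
class ConsentGrant:
    subject_id: str                 # the person or SME granting access
    recipient_id: str               # the organisation receiving access
    purpose: str                    # the specific, declared use case
    scope: List[str]                # the minimum fields needed for that purpose
    granted_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    revoked_at: Optional[datetime] = None

    def revoke(self) -> None:
        """Immediate revocation, by the user or a neutral governing entity."""
        self.revoked_at = datetime.now(timezone.utc)

    def is_active(self) -> bool:
        return self.revoked_at is None


def may_share(grant: ConsentGrant, requested_fields: List[str]) -> bool:
    """Peer-to-peer sharing proceeds only while consent is active and the
    request stays within the granted scope."""
    return grant.is_active() and set(requested_fields) <= set(grant.scope)
```

The point is the shape, not the code: consent is scoped to a purpose, checked at the moment of access, and can be withdrawn in real time.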
Equally, such systems will not (and should not) be the approach for all use cases. What works will depend on the purpose (which will span personal, business, national and research uses, as well as small, medium and large aggregate/collective datasets). What matters is how they are governed.
This is where implementation matters. If we over-centralise and over-reach in our ambition we run the risk of repeating the litany of failed ‘IT projects’ that have cost the taxpayer billions.
I see many projects, even some with the word ‘governance’ in their titles, take a technology-led approach rather than a governance-led approach: data is not a technology—it enables insights about its subjects and those subjects have rights.
Recommendations
Without trust in the systems upon which we run our lives we will, at best, stifle innovation, productivity and growth. At worst, we risk undermining far more.
The good news is that we can move at pace, fast but not foolhardy, to build the data infrastructure we need for today and scale it for all our tomorrows: to deliver trust, ensure our safety and unlock the value of data for everyone.
Phase 1: Immediate reorientation on purpose and language (0–6 Months)
Step 1: Transition the “Library Card” model to a Trust-based, scalable model
- Pivot from the concept of ‘library’ to a ‘web of interoperability and trust’ (WIT)
For example, announce an evolution from the “National Data Library” to something like a “Web of Interoperability and Trust” (WIT) or “National Data Web” or “Trusted Data Web”, which would align with work already in place in the financial sector and give more scope to connecting existing initiatives, including Smart Data, without being specific about technology preferences.
- Define six exemplar use cases for pilot delivery in 2025
Pick six lighthouse use cases across sectors, assign to different entities for delivery, and shift governance to the model that best enables them (e.g. from “pre-approved datasets” to a federated trust framework where applicable).
Adopt an API-first architecture to allow controlled, on-demand data sharing, and use the pilots to produce quantitative evidence of the pros and cons of the approach.
Step 2: Introduce an adaptive Smart Data consent framework (6–12 months)
- Develop and mandate common principles
Common principles should include how ‘Data Consent/Permission’ applies, enabling consumers, SMEs, researchers, businesses and government departments to view, grant, and revoke access to datasets. These should be evidenced through use cases with specific cost-benefit and threat analysis.
- Require and mandate data best practices
Require that all third-party users (e.g. AI developers, financial institutions) justify the minimum data required per use case. Mandate that every material access request is logged, auditable, and policy-compliant (e.g. GDPR, Data Bill, and related policies). Where applicable, develop formal Schemes that codify multilateral contracts and enforce them. Note that Schemes can also be industry-led and voluntary, with appropriate governance: political signalling can be an efficient way to accelerate implementation.
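As an illustration of the ‘logged, auditable’ requirement, here is a hedged sketch of an append-only access log; the field names are assumptions, not a mandated schema:

```python
import hashlib
import json
from datetime import datetime, timezone

# Illustrative only: an append-only audit trail for data access requests,
# where each entry is chained to the previous one so tampering is
# detectable. Field names are assumptions, not a mandated schema.
def log_access_request(log: list, requester: str, dataset: str,
                       purpose: str, fields_requested: list) -> dict:
    previous_hash = log[-1]["entry_hash"] if log else "genesis"
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "requester": requester,
        "dataset": dataset,
        "purpose": purpose,
        "fields_requested": fields_requested,  # evidence of data minimisation
        "previous_hash": previous_hash,
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    log.append(entry)
    return entry
```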
Phase 2: Build systemic trust (6–18 Months)
Step 1: Shift to an API-enabled Data Access Model (including trust frameworks) based on user needs
In this instance, ‘API-enabled’ means an architectural choice aligned with the web, not a specific technology; it enables both interoperability and controls to be implemented. Other use cases won’t need ‘APIs’ (e.g. a CSV file of Open Data), but where security and access control are required, any technical solution must be able to operate at web scale. The key is developing the governance mechanisms that help implement trust frameworks.
- Support development of sandboxes
Support the rapid, cross-sector creation of sandbox environments with political and financial support where required.
- Support open communication and transparency, build collaboration and reciprocity with a broad stakeholder group, and address business, public and government skills and knowledge gaps.
Embed open communications, transparency and reciprocity into all programmes. Demonstrate to businesses and consumers that technologies such as AI can deliver value without undermining trust. This can be achieved by facilitating co-design across the value chain (e.g. government, industry and end users, whether b2b or b2c).
- Create systemic links with digital identity developments
Incorporate systems-level authentication (e.g. OAuth-based, which is already ubiquitous): data access should be controlled via digital identity-linked permissions for companies and research organisations. This is aligned with the Data Bill and FCA approaches. Digital identity is a huge topic which I do not want to dive into here, but I do assume that it will (also) be federated (there should and will be no ‘one place’ for it).
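As a rough sketch of what identity-linked permissions can look like in practice (the token structure and scope names below are assumptions, not any particular standard), an OAuth-style access check might be:

```python
# Hedged sketch of an OAuth-style, identity-linked permission check.
# The token structure and scope names are illustrative assumptions.
def is_access_permitted(token: dict, dataset: str, action: str) -> bool:
    """Grant access only if the token belongs to a verified organisation
    and carries a scope covering this dataset and action."""
    if token.get("org_verified") is not True:
        return False
    required_scope = f"{dataset}:{action}"  # e.g. "energy-usage:read"
    return required_scope in token.get("scopes", [])


# Example: a research organisation's token allows read access to one dataset.
token = {
    "sub": "org:research-institute-123",  # hypothetical organisation identity
    "org_verified": True,
    "scopes": ["energy-usage:read"],
}
assert is_access_permitted(token, "energy-usage", "read")
assert not is_access_permitted(token, "energy-usage", "write")
```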
Step 2: Maximise privacy enhancement for people and businesses (e.g. SMEs)
- Create mandates and incentives for privacy enhancement
Broaden and define Schemes to enable ‘Access Rulesets’ and enforce data-minimisation principles for citizens, consumers and businesses (including b2b SME data sharing), so datasets can be queried without bulk transfers (e.g. an app can check a fact without revealing the underlying data: ‘are they over 18? yes’ rather than ‘what’s their birthdate?’).
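A minimal sketch of that ‘answer, not the data’ pattern, assuming a hypothetical age-check service that holds the record:

```python
from datetime import date
from typing import Optional

# Hypothetical sketch of a data-minimising query: the data holder answers
# the question ("is this person over 18?") without releasing the underlying
# attribute (their date of birth).
def is_over_18(date_of_birth: date, today: Optional[date] = None) -> bool:
    today = today or date.today()
    age = today.year - date_of_birth.year - (
        (today.month, today.day) < (date_of_birth.month, date_of_birth.day)
    )
    return age >= 18


def answer_age_check(record: dict) -> dict:
    # Only the boolean answer leaves the holder; the birthdate does not.
    return {"over_18": is_over_18(record["date_of_birth"])}


print(answer_age_check({"date_of_birth": date(2001, 5, 14)}))  # {'over_18': True}
```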
Phase 3: National Smart Data Infrastructure (18–36 Months)
(yes, this may feel ambitious: why shouldn’t it be this rapid?)
Step 1: Legislate Data Interoperability
- Mandate interoperable systems where consent/permission or access control is required
Draft legislation ensuring every public-sector dataset must be ‘API-accessible’ under standardised, interoperable formats. As noted, not all public-sector data needs an API. In fact, Open Data should be made available in any way that makes it discoverable and usable, and even CSV files can be catalogued by other systems and made ‘API accessible’ (a short sketch follows this list). However, where some form of access control is needed, secure APIs are the standard that works for the web (e.g. Open Banking has 12 million monthly active users across the UK).
Also note that interoperability at the API level does not mean that all internal systems must do the same thing in the same way: data can be repurposed and presented without rebuilding entire estates (this will create an incentive for doing so over time, but it doesn’t need to happen first). Interoperability and decentralisation can work together, with appropriate governance, to enhance competition.
- Mandate open access and empower regulators to enforce compliance
Mandate open access to government datasets where possible, while protecting sensitive data via consent-based permissions, and democratically set rules. Pick regulators or code bodies to oversee the development of the specific rules and compliance required, including modes of redress. Note that open access and Open Data are different things.
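To illustrate the point about CSV files flagged in the first bullet above, here is a sketch of a catalogue entry that makes a plain file discoverable and ‘API accessible’. The metadata fields are assumptions, loosely in the spirit of catalogue vocabularies such as DCAT rather than any mandated standard, and the publisher and URL are hypothetical:

```python
# Illustrative catalogue entry: the dataset stays a plain CSV file, but a
# standard metadata record makes it discoverable and addressable by other
# systems. Field names are loosely DCAT-like assumptions, and the publisher
# and URL are hypothetical.
catalogue_entry = {
    "identifier": "example-dept/road-traffic-counts-2024",
    "title": "Road traffic counts 2024",
    "publisher": "Example Government Department",
    "licence": "OGL-UK-3.0",
    "access_level": "open",  # no access control needed for Open Data
    "distributions": [
        {
            "format": "text/csv",
            "download_url": "https://example.gov.uk/data/traffic-2024.csv",
        }
    ],
}


def list_distributions(entry: dict, media_type: str) -> list:
    """A cataloguing service can answer API queries over this metadata
    without the publisher rebuilding their internal systems."""
    return [d for d in entry["distributions"] if d["format"] == media_type]
```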
Step 2: Scale the Model Beyond Government Data
- Support the development of connected demonstrators and pilots
Launch pilots with commercial ecosystems to enable interoperability (e.g. real estate, sustainability finance, public health). This includes cross-border and global data-sharing ecosystems (e.g. finance, goods).
- Enable interoperability beyond the UK, within the desired framework
Develop cross-border data-sharing agreements (‘Schemes’). Ensure interoperability with international frameworks (e.g. EU Digital Identity Wallet, OECD AI Data Principles and related emerging programmes). Ensure that every data use and dataset is AI-ready and privacy-compliant via automated monitoring and governance.
Summary
Strategic Benefits
- Accelerates mission-driven government: Enables implementation of the government’s five missions via trusted, scalable data access.
- Supports economic growth: Unlocks interoperability for data-driven innovation across UK sectors.
- Protects society through accountable systems: Ensures that data-enabled systems are governed with clear and enforceable safeguards. Enables a coherent approach to data sovereignty across our economy and society, to minimise the risks of misuse and systemic harms in a rapidly evolving landscape.
- Future-proofs data governance: Aligns with existing data sharing frameworks, and with evolving standards for AI training, licensing, and consent. Licensing will be a cornerstone in all future data work whether AI-linked or not.
- Embeds assurability at scale: Operationalises assurability across domains.
Five WIT principles (for discussion)
- Interoperability by design: Common standards, APIs, and metadata frameworks to ensure frictionless cross-sector data use.
- Trust infrastructure: Consent-based access, verified credentials, and relevant-time provenance tracking.
- Federated governance: Sectoral autonomy with shared rules, coordinated through a Web of Interoperability and Trust.
- User empowerment: Individuals and organisations control their data-sharing preferences. Strong governance-led approaches apply to aggregate data, with the option to revoke.
- Composable architecture: Modular, based on user needs, and cross-sector, enabling use cases built on sharable registries, vocabularies, and open interfaces.
NB: This post contains (as always) personal opinions and thinking-in-progress (strong opinions, weakly held) as I navigate the maze(s) of data governance. Among many experiences, I sat on the MiData energy sector board; co-chaired the creation of the Open Banking Standard; was founding CEO of the Open Data Institute; and co-chaired the Smart Data Council. I run a non-profit (IB1.org) working on data governance at sector, national and international scale.
With thanks to Jack, Peter, Chris, Frank, Jeni, Simon, Julia, Kathryn, and many others who have helped shape my thinking on this.