AI bad data comment piece

The AI invisible tax: Why bad data is costing the UK millions

By Kit Zhang, Professor of Inclusive AI at Buckinghamshire New University (BNU) 

The government’s 18 March report suggested AI is a £90 billion golden ticket for the UK – but beneath the surface of this digital revolution is a massive “invisible tax” already draining millions in lost productivity, caused by shoddy, non-inclusive data. 

While the official report focuses on the potential for growth, it overlooks the friction caused when foundational AI models are built on flawed datasets that do not fully capture the breadth of the British public. As a Chartered Engineer, I see these "Large Language Models" and "Transformers" not as magical solutions, but as sophisticated machinery. Right now, that machinery is running hot on contaminated fuel. 

Professor Qichun (Kit) Zhang

What is “bad data”, and what is its real-world cost? 

“Bad data” does not simply refer to typos or errors. We are talking about exclusion. If an AI is trained on data that doesn’t reflect the whole of society, it becomes a broken tool. 

For example, if a company uses an AI recruitment algorithm trained only on the CVs of past successful hires – all of whom may happen to come overwhelmingly from one demographic – the AI learns to reject brilliant candidates from different backgrounds. This is not just a social failing – it is a massive waste of human capital. Similarly, in healthcare, a diagnostic AI trained on data from one ethnicity can fail to recognise life-threatening conditions in another. 
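The mechanism is simple enough to show in a few lines. The toy sketch below is not any real recruitment system – the data, feature names and scoring rule are all invented for illustration – but it shows how a model that only learns “what past hires looked like” quietly penalises an equally qualified candidate who differs on an irrelevant attribute.

```python
# Toy illustration (hypothetical data): a scorer trained only on
# past hires inherits whatever skew those hires happen to share.
from collections import Counter

# Historical "successful hires" – overwhelmingly from one background.
past_hires = [
    {"degree": "CS", "university": "Uni A"},
    {"degree": "CS", "university": "Uni A"},
    {"degree": "CS", "university": "Uni A"},
    {"degree": "Maths", "university": "Uni A"},
]

# "Training": count how often each feature value appears among hires.
counts = Counter()
for cv in past_hires:
    for key, value in cv.items():
        counts[(key, value)] += 1
total = len(past_hires)

def score(candidate):
    """Score a CV by how closely it resembles past successful hires."""
    return sum(counts[(k, v)] / total for k, v in candidate.items())

insider = {"degree": "CS", "university": "Uni A"}
outsider = {"degree": "CS", "university": "Uni B"}  # equally qualified

print(score(insider))   # resembles past hires, so scores highly
print(score(outsider))  # differs only in university, so is penalised
```

The model never sees a feature called “background”, yet it still downgrades the outsider, because resemblance to a skewed past is the only signal it was given.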

Consider the cautionary tale of Amazon’s experimental recruitment AI. In 2018, it was revealed the digital giant had to scrap an internal AI recruitment tool after it had “taught itself” to be biased against women, having been trained on a decade’s worth of CVs. Because the tech industry has historically been male-dominated, the AI concluded that being “male” was a success factor, rather than learning to find the best talent. Similarly, a major 2021 review by the UK government found some healthcare algorithms were less likely to refer black patients for specialist care because they were trained on historical spending data rather than actual health needs. The result was later diagnoses and expensive emergency treatment. 

Every time an AI makes an error like this, it is a direct hit to the UK’s productivity – an “invisible tax” that drains our national balance sheet, slowing the very growth the Treasury is counting on. 

The solution: What is sovereign AI? 

The government’s recent report on copyright was largely a “wait and see” exercise, but the UK cannot afford to wait. We need a new direction – sovereign AI. 

Sovereign AI offers a unique answer. In practice, it does not mean state-run or “regulated” AI in the traditional sense. It means treating AI as a vital infrastructure – like our energy grid or water supply. It involves the UK building and owning highly secure models, trained on data that is traceable, ethically sourced and legally sound. 

Rather than relying on the “wild west” approach to data harvesting – where models are built by scraping the entire internet, including biased or AI-generated rubbish – sovereign AI prioritises quality over quantity.  

Making models “inclusive-by-design” ensures they are accurate for the whole British public – making them more efficient and secure compared to the generic alternatives. 

The “transformers” powering today’s AI are, at their core, statistical mirrors. If the mirror is cracked by bias, the reflection of our economy will be distorted.  

At the Centre for AI and Future Inclusive Technologies (AIFIT), our principle is clear: safe, inclusive and socially responsible computing is the only route to high efficiency. The 18 March report provides a helpful map, but it isn’t the journey. To avoid this “invisible tax”, the UK must stop treating AI as a “black box” and start treating it as vital national infrastructure. If we want to be a global “AI maker” rather than a mere “taker”, we must ensure our models are as diverse and robust as the people of the UK.