By Ray Walsh
Ever since the government announced their "world beating" test and trace system, concerns have been raised over its ability to properly handle the masses of sensitive medical and location data involved.
Now we've reached the point of the inevitable blunder – caused not by a glitch in the system but by the complete mishandling of data. It has brought the entire arrangement into question, casting serious doubts over the privacy and security of personal information harvested from thousands of Brits each day.
The incident, which we now know left around 16,000 covid cases unreported, was caused by the use of a thirteen-year-old version of Excel. Yes, that Excel, the Microsoft Office default spreadsheet application used by mums and dads up and down the country to track the household budget.
Microsoft Excel is basic spreadsheet software that can handle only 1,048,576 rows of data, and that's if you're using a version released after 2007. Public Health England (PHE) was not. Its version could manage only 65,536 rows of data. So with more than 22,000 covid cases being reported each day, your average GCSE maths student could see there was always going to be a big problem.
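To put rough numbers on that, here is a back-of-the-envelope sketch in Python using only the figures quoted above. It assumes, purely for illustration, that each case takes up exactly one row and that rows pile up in a single sheet. The function names are made up for this example and say nothing about how PHE's files were actually laid out.

```python
# Figures quoted in the article.
OLD_ROW_LIMIT = 65_536       # rows the older version could manage
NEW_ROW_LIMIT = 1_048_576    # rows a newer version can manage
DAILY_CASES = 22_000         # cases reported per day

def full_days_that_fit(row_limit: int, rows_per_day: int) -> int:
    """Whole days of data that fit in one sheet.

    Assumption: one row per case, no header rows.
    """
    return row_limit // rows_per_day

def rows_that_do_not_fit(total_rows: int, row_limit: int) -> int:
    """Rows left over once the sheet is full (zero if everything fits)."""
    return max(0, total_rows - row_limit)

old_days = full_days_that_fit(OLD_ROW_LIMIT, DAILY_CASES)
new_days = full_days_that_fit(NEW_ROW_LIMIT, DAILY_CASES)
overflow_after_three_days = rows_that_do_not_fit(3 * DAILY_CASES, OLD_ROW_LIMIT)

print(old_days, new_days, overflow_after_three_days)
```

Under those assumptions the older limit is reached partway through the third day, and even the newer limit only buys a matter of weeks.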
Excel is completely unsuitable for handling such a vast and critical data project. This is because it's spreadsheet software and not a database.
Test and trace is inevitably processing vast amounts of population data, including sensitive health and location information. PHE simply picked the wrong tool for the job, and whether it was cutting corners or simple ineptitude, I doubt we'll ever know.
So why didn't the government intervene? A decent graduate-level data scientist or mid-level IT professional could have created a simple relational database that would have been more appropriate than Excel.
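As a rough idea of what "simple" could mean here, the sketch below uses SQLite through Python's standard library. The table name, the columns and the 100,000 dummy records are all invented for illustration. It is not a description of any system PHE runs, and a real service would need access controls, encryption and a proper server-grade database.

```python
import sqlite3

def build_results_store(n_rows: int) -> sqlite3.Connection:
    """Create an in-memory database and load n_rows dummy test results.

    The schema is hypothetical and kept deliberately small.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE test_results (
            specimen_id INTEGER PRIMARY KEY,
            reported_on TEXT NOT NULL,
            result      TEXT NOT NULL
                        CHECK (result IN ('positive', 'negative'))
        )
        """
    )
    dummy_rows = ((i, "2020-10-01", "positive") for i in range(1, n_rows + 1))
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO test_results VALUES (?, ?, ?)", dummy_rows)
    return conn

# Load more rows than the 65,536 limit mentioned earlier.
conn = build_results_store(100_000)
count = conn.execute("SELECT COUNT(*) FROM test_results").fetchone()[0]
print(count)
```

The point of the sketch is that a database rejects records it cannot store, such as a duplicate specimen ID, with an explicit error rather than dropping them quietly.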
Alternatively, almost any off-the-shelf business intelligence platform could have been used. These platforms are designed to handle big data and can scale to meet the needs of almost any project.
Indeed, if PHE and central government's core systems and dashboards were up to date and fit-for-purpose, it would have been quite easy to integrate the data capture process into them from the outset. No .csv files or .xls files – just a web portal. This is 2020, after all.
The decision to rely on spreadsheets suggests technical incompetence of the highest order. It casts doubt over the entire system – and the vast sums of taxpayer money invested in it.
Health secretary Matt Hancock admitted that, so far, only 51% of people affected by the error have actually been approached by contact tracers. As a result, there are people walking about today who may have been exposed to the virus and have absolutely no idea. Those people could be passing covid on to the most vulnerable in society – all thanks to the government and PHE's collective incompetence.
The government claims that everything is now under control. Apparently, we can all rest easy because the spreadsheet has been broken down into multiple spreadsheets.
This is hardly reassuring. It reveals a complete failure to understand the problem. Breaking down the single spreadsheet simply results in the creation of multiple ongoing issues rather than a solution to the first one.
If the government and PHE choose to continue using Excel, there is no way to guarantee that we won't end up in the same situation further down the line. Instead, PHE needs to ensure they are choosing the right tools for the job – or admit that they just aren't up to the job.
Ray Walsh is a digital privacy expert at ProPrivacy.
The opinions in Politics.co.uk's Comment and Analysis section are those of the author and are no reflection of the views of the website or its owners.