NHS England patient data 'uploaded to Google servers', Tory MP says (theguardian.com)
188 points by choult on March 3, 2014 | 176 comments



Surely PA Consulting should immediately be sued out of existence. This kind of behaviour must be considered beyond negligent, practically criminal.

I would strongly support throwing anyone involved in this into jail for a long time as a deterrent against future criminals.

This is just unbelievable.


Taking a sentient human being and throwing them in a cage is a profoundly violent act. I find it troubling that you guys so casually reach for it as a punitive tool, particularly when the subject has neither committed physical violence nor poses such a threat to others. Surely you clever people can think of forms of punishment/deterrence less destructive to both the individual and society as a whole.


> Taking a sentient human being and throwing them in a cage is a profoundly violent act. I find it troubling that you guys so casually reach for it as a punitive tool

We aren't reaching for it casually. Some of us consider privacy a fundamental value that must be defended, and regard an attack on our privacy with the same seriousness that we would regard an attack on our physical person.

Which is more of a danger to me, someone who punches me in the face on their drunk night out and gives me a bloody lip and a bit of pain for a few hours, or someone who betrays confidences that may have lifelong implications for my employability, insurance premiums and credit levels, ability to travel freely, and for that matter my self-respect and basic human dignity, before you even get to the kinds of more extreme and very physical dangers that could be posed by invasions of privacy if we consider the lessons of history?


>Some of us consider privacy a fundamental value that must be defended

The severity of a punishment can be tuned separately from the form of punishment. Imprisonment is not appropriate merely by dint of your emotional reaction to the crime itself.

>Which is more of a danger to me...

Sufficient to warrant throwing them in a cage to be brutalized by actually violent criminals, imposing a direct cost on society, and an indirect one by depriving society of that individual's productivity?

Probably neither.


Financial penalties have a long record of failing to produce the desired, legal behaviour. Further, they tend sooner or later to be passed on to the customers or clients, who in many cases are the ones originally wronged. The individuals actually responsible for the sanctioned behaviour are punished weakly or not at all, and so are barely deterred from repeating it.

As an individual, one could well go to prison for such misbehaviour. Corporate and government employment should not serve as an impenetrable shield and dilution of responsibility against such eventuality.

Incarceration is often described as having two goals: punishment for crimes committed, and prevention of further such crimes, the latter both by actual restraint and by aversion to the potential results.

It seems that stronger aversion is needed; we have a systemic problem with recurrence -- often by the same parties -- of this behaviour.


We could debate the relative effectiveness of different forms of penalty, and the relative importance of punishment/deterrent, ongoing protection, and rehabilitation, but I'm not sure this is the forum for it.

However, let me be clear: if the facts in this case really are as I've seen reported, then I have no problem with taking people who did this, throwing them in a cage, and depriving society of their "productivity" for a while. As far as I'm concerned, that kind of productivity is about as welcome as the banking executives who command "competitive compensation packages" for running their organisations into the ground or the politicians who once elected proceed to legislate for the highest bidder.


Poor record keeping is essentially providing blackmail material on thousands, tens of thousands, hundreds of thousands, or occasionally millions of people.

We put blackmailers in jail, and we should put people who provide enough sensitive material to blackmail an entire city in jail as well.


> Surely you clever people can think of forms of punishment/deterrence less destructive to both the individual and society as a whole.

OK, how about getting all their personal information and putting it on a public website?


What they did is destructive to individual privacy and liberty.

This may be a strange concept to you, but you can commit a crime non-physically. After all, lots of people think bankers should be arrested for having traded bad debt and precipitating the recession.


According to them, they got approval for doing that:

> The alternative was to upload it to the cloud using tools such as Google Storage and use BigQuery to extract data from it. As PA has an existing relationship with Google, we pursued this route (with appropriate approval). This showed that it is possible to get even sensitive data in the cloud and apply proper safeguards.


And what "appropriate approval" was that, exactly?

In general, exporting personal data outside of the EEA requires the explicit notification of the data subject under UK data protection law (among other consequences of the first Principle[1]). Moreover, the rules for even processing sensitive personal information, which includes health-related information, are significantly stronger than the general case.

They should never have been given that data in the first place, of course, and giving it to them should clearly be illegal on the part of whoever disclosed it. If it turns out not to have been, that will be a compelling case for dramatically strengthening the legal data protection and privacy framework in the UK. But I don't see how either the original source or PA Consulting can get around the basic conditions for processing sensitive personal data[2]. In particular, the most likely condition they might appeal to here in the absence of explicit consent reads:

"The processing is necessary for medical purposes, and is undertaken by a health professional or by someone who is subject to an equivalent duty of confidentiality." [Emphasis added]

Even once they had it, that still doesn't give them a free pass on exporting the data outside the EEA without notification (see [3]), or actually processing the data themselves for that matter.

[1] http://ico.org.uk/for_organisations/data_protection/the_guid...

[2] http://ico.org.uk/for_organisations/data_protection/the_guid...

[3] http://ico.org.uk/for_organisations/data_protection/the_guid...


I thought that was why Google hosted a lot of stuff in Ireland? So that companies could host EU data there without running afoul of the privacy laws?

I don't think the duty on a company is all that strong - the data has to stay in Europe, on a properly protected computer. Providing you needed a password or equivalent secret to get to the data they are probably okay legally.

Given the NHS is planning to sell poorly anonymized patient records at 10'000 for $10 imminently, I think we are complaining about the wrong problem.


You are wrong about the HES records and about the proposed GP records - there are criminal offences if the data is misused.

> Given the NHS is planning to sell poorly anonymized patient records at 10'000 for $10 imminently,

That's wrong too. It's okay to be against something, but only if you know what that thing actually is. The data would be pseudonymous within HSCIC and anonymous outside HSCIC. While there's a possibility of de-anonymisation, anyone doing so would be committing a criminal offence.


> I don't think the duty on a company is all that strong

You are mistaken. For sensitive personal data, there are much stricter rules on what you can do. Please see the second link I cited before.

> Given the NHS is planning to sell poorly anonymized patient records at 10'000 for $10 imminently

Are you referring to care.data? That programme is essentially dead, in the face of massive opposition, and it was even before the current round of disclosures about hospital records already being leaked.


Doesn't the "safe harbor" clause apply to US companies?

You know, that joke of a clause which says that US companies meet the requirements of our data protection law as long as they claim to meet them (and they only have to claim it)? Part of the new data protection law that was supposed to be voted on in the EU following/during the PRISM scandal was revoking that stupid clause, but I'm not sure what happened to that reform.


Safe Harbor avoids some of the issues with exporting data from the EEA at all.

It doesn't cover exporting the data without notifying the subjects appropriately.

It isn't even close to covering exporting sensitive personal data (which is a technical term explicitly including health-related information), for which much stronger rules apply under UK data protection legislation.


The statement from the offending party is that there was no individual identifying data. So what law?


People were able to identify individuals on the Earthware map produced from this data, which was publicly available online.

It doesn't really matter what the offending party says, if it is demonstrably untrue.


So apparently they don't have money to spend on building a secure, state-owned cloud storage service for patient records, right?

I wonder, does GCHQ use private cloud storage contractors, or did the government find the money to build an appropriate database for its spying material? :-)


This is just crazy. According to the article, this "big data" fits in 27 DVDs, which is roughly 1-2 TB of data. Do you really need BigQuery or whatever for this?


According to Wikipedia, DVDs of the largest capacity can hold 17.08 GB of data, but those are rare. If the article is correct about these details, the data could be ~460 GB at most, but is likely less than half of that if the DVDs in question were of normal capacity.
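For anyone who wants to check the arithmetic, here is a rough Python sketch. The 17.08 GB figure is the one quoted above; the 4.7 GB and 8.5 GB entries are the usual nominal single-layer and dual-layer capacities, and the assumption that every disc is completely full gives an upper bound only.

```python
# Nominal DVD capacities in decimal gigabytes. The 17.08 GB figure is the one
# quoted in the comment above; the other two are the common nominal sizes.
DVD_CAPACITY_GB = {
    "single-layer (DVD-5)": 4.7,
    "dual-layer (DVD-9)": 8.5,
    "double-sided dual-layer (DVD-18)": 17.08,
}

DISC_COUNT = 27  # number of discs reported in the article


def total_capacity_gb(discs: int, per_disc_gb: float) -> float:
    """Upper bound on the data held by `discs` completely full discs."""
    return discs * per_disc_gb


if __name__ == "__main__":
    for name, size in DVD_CAPACITY_GB.items():
        print(f"{name}: at most {total_capacity_gb(DISC_COUNT, size):.1f} GB")
```

Whichever disc type was used, the total stays well under half a terabyte.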


50 x 50GB Blu-ray discs, for over 2TB of storage, only cost $126 at Amazon.

A 2TB portable USB drive costs $100.

But no, let's just upload everything to Google Drive.


I would assume it's not about storage; it's about processing.

The bold byline of BigQuery (https://developers.google.com/bigquery/) is "Analyze terabytes of data in seconds."

Store the data locally? Easily. Set up the necessary back-end infrastructure to sift, twist, and churn the data efficiently? Now you're talking about hiring some people to set up an infrastructure for that... Or you could upload it to Google and use their infrastructure.


[deleted]


They didn't say they had approval from Google. That wouldn't make much sense: you don't really need approval from Google to use BigQuery, you just need to open a Google account and create a new BigQuery project on their Developers Console.


Given the scale of the data, it is likely that they needed to contact Google to lift the default quotas: https://developers.google.com/bigquery/quota-policy


Why? I don't see any limit being hit by loading 27 DVDs worth of data.


PA Consulting's statement:

PA purchased the commercially available Hospital Episode Statistics data set from the NHS Information Centre (now the Health and Social Care Information Centre). The data set does not contain information linked to specific individuals. The information is held securely in the cloud in accordance with conditions specified and approved by HSCIC.

This new approach to analytics can help the NHS improve patient care. We have been able to identify where services are needed most and to understand previously unseen side effects of drugs and treatments. Our approach protects patient confidentiality and allows insights to be derived at significantly lower cost, and a hundred times faster, than any traditional approach.

HSCIC's statement:

The NHS Information Centre (NHS IC) signed an agreement to share pseudonymised Hospital Episodes Statistics data with PA Consulting in November 2011.

This included Hospital Episode Statistics on Admitted Patient Care (1999/00 to Provisional 2011/12), Outpatient (2003/4 to Provisional 2011/12) and A&E (2007/8 to Provisional 2011/12). This agreement lasted to November 2012 and was amended in December 2012 to extend to November 2015.

The agreement obliged PA Consulting to abide by conditions to protect the confidentiality of the data, including restricting the data to a named list of individuals, a prohibition on sharing any information with risk of identifying individuals and a requirement to destroy the data after the agreement end date.

PA Consulting used a product called Google BigQuery to manipulate the datasets provided and the NHS IC was aware of this. The NHS IC had written confirmation from PA Consulting prior to the agreement being signed that no Google staff would be able to access the data; access continued to be restricted to the individuals named in the data sharing agreement.

http://www.paconsulting.com/introducing-pas-media-site/relea...

http://www.hscic.gov.uk/article/3948/Statement-Use-of-data-b...


> no Google staff would be able to access the data

Well that's obviously bullshit... But aside from that, if it's commercially available and pseudonymised, I can't see much wrong with it.


It can be done!

http://people.csail.mit.edu/nickolai/papers/popa-cryptdb-cac...

That said, I doubt that Google is doing it. More interesting is that this tech appeared two years ago. I thought the world would rush to pick it up, and as far as I know no one has!


I would strongly encourage everybody to contact the Information Commissioner's Office, as this must be a breach of data protection?

https://ico.org.uk/Global/contact_us

If anybody has details about exactly how this may be a breach of the Data Protection Act, then please post below.


Surely the people in power should have paid more attention to this before handing the data to a private third party. Those who ignored the problem before it emerged should not be allowed near any decision-making position in public office.

Rant apart, this seriously raises the question: do we not need an alternative (maybe a pan-European, subsidized, not-for-profit organization) to Skynet (sorry, I meant Google)?


> Rant apart, this seriously raises the question: do we not need an alternative (maybe a pan-European, subsidized, not-for-profit organization) to Skynet (sorry, I meant Google)?

The UK already has a program for accrediting and validating cloud providers (called G-Cloud[1]), which classifies services according to their suitability for storing confidential information, and from which Google is already excluded.

If this company used Google anyway, what makes you think they wouldn't have even if there were such an organization in place?

[1] http://en.wikipedia.org/wiki/UK_Government_G-Cloud


Given it is now run by the ex head of Barclays Bank and is the second largest recipient of government contracts in the UK, I wouldn't hold your breath.


If they have permission of government officials then what?

We can hold companies accountable, but how do you hold government accountable? In a meaningful way? Certainly we can find a myriad of excuses not to fire a government worker for a mistake; I am fine with doing the same here as well.

The key is to learn from it and put into place processes that stop it from recurring. We need to weigh the penalties against the harm caused. Frankly, if no one lost their life or livelihood I don't think seeking the outcome you suggest is warranted.


> Frankly, if no one lost their life or livelihood I don't think seeking the outcome you suggest is warranted.

I could not disagree more strongly. A betrayal of public trust on this scale, abusing privileged access to the most sensitive and private of personal data, should be met by severe penalties.

At a minimum the people who actually disclosed the data and the responsible executives at both the original NHS-related source and at PA Consulting should be facing jail time, and the executives barred from holding public office or directing companies for a very long time.

That the company in question should be legally obliterated and that Google should be formally notified and required to completely delete the personal data they are illegally holding should go without saying. If Google refuse to comply then any Google executive who sets foot on European soil should be jailed as well.


It all depends on the value we assign to privacy. My personal opinion is that the handling of patient data should legally be protected somewhere on a scale between banking data and state secrets.

If those kinds of data are mishandled, there _is_ punishment, no matter where you work.

But then I'm culturally biased (we here in Germany seem to be collectively more paranoid about privacy than most other populations).


>If they have permission of government officials then what? //

They wouldn't - without wilful negligence - accept such "permission" from anyone other than a senior official who had in-depth knowledge of the necessary requirements of privacy laws. A person in that position is unlikely to be acting lawfully and is likely to be aware of that - there's no way they should retain a post with responsibility over anything greater than a stapler after that.


To be fair, there are a few grey areas in the law as it currently stands in the UK that might be relevant if we're arguing about government permission. The main one is probably s251 NHS Act 2006, which grants the Secretary of State for Health certain powers to set aside the default confidentiality rules for specific medical purposes. I'm investigating whether those powers are a relevant factor in this case, but so far I've found no verifiable information either way.


>The main one is probably s251 NHS Act 2006, which grants the Secretary of State for Health certain powers to set aside the default confidentiality rules for specific medical purposes. //

Most laws seem to have these SoS exclusion clauses. In this particular case, S.251(1)¹ says

"The Secretary of State may by regulations make such provision for and in connection with requiring or regulating the processing of prescribed patient information for medical purposes as he considers necessary or expedient—

So it's by regulations², i.e. "rules". Whilst they don't need a new law to be passed, they're still a statutory instrument; it's not like this section allows the SoS to just decide by himself.

See for example S.252(2); and S.251(7), quoted here:

"Regulations under this section may not make provision for or in connection with the processing of prescribed patient information in a manner inconsistent with any provision made by or under the Data Protection Act 1998 (c 29)."

---

1 - http://www.legislation.gov.uk/ukpga/2006/41/part/13/crosshea...

2 - http://www.legislation.gov.uk/uksi/2002/1438/pdfs/uksi_20021..., http://www.hra.nhs.uk/documents/2014/02/cag-frequently-asked...


PA Consulting definitely think they were in the right here: they attended a recruitment event at my university and told us about how they did this for the NHS using Google tools. I figured they had permission from the NHS or whatever, and they also seemed to have some relationship with Google. My first thought is that this is just an MP looking for attention, but if the NHS genuinely didn't know then I agree it's surely criminal.


It's hard to know from the article just what happened and what data was uploaded.

But, even though they got approval, they may have committed a criminal offence.

This data initiative is really important. They've got to do something to win back trust. Someone has to lose a job and someone has to go to jail (if a crime was committed).


Rarr, scary database, rarrr.


PA Consulting are idiots and everybody who gave them this contract should be fired.

Saying "I didn't know" is no excuse as this is not the first time PA Consulting have lost data!

"The Home Secretary announced on 10 September that the government has terminated its contract with PA Consulting, following the recent high profile data loss

On 19 August PA Consulting formally notified the Home Office of the loss of a data stick containing sensitive information relating to the JTrack system which PA manage under contract to the Home Office

The data on JTrack relates to prisoners and other offenders in England and Wales."

http://www.scl.org/site.aspx?i=ne9297


And Obama seriously considers letting 3rd parties keep everyone's private data?

No, Mr. Obama, neither the NSA keeping the data nor 3rd parties keeping it is the solution. The solution is to stop spying on everyone.


Who do you think is holding your health records right now?


It's worse than that: Ben Goldacre is reporting ( https://twitter.com/bengoldacre/status/440475049880195073 ) that the data was made publicly available.

This is beyond parody.


Another update:

"No individuals directly named records online. But a massive breach of the most basic information security policies to prevent jigsaw."

https://twitter.com/bengoldacre/status/440488463008550912


We're going to need something longer than a tweet to explain what he's actually talking about.


I'd also assume the worst, even if there are no patient names out. There are too many examples of de-anonymization of purposefully anonymized data out there to warrant any belief that missing real world names alone should constitute much of a privacy blanket.


The dataset in question supposedly contains date of birth, gender and post code. For my family, that uniquely identifies every member, given the size of UK post codes.


If that means full postcodes, then that information will easily be sufficient to identify every member of the UK population, aside from outliers like twins or the occasional statistical fluke.


To add insult to injury they were probably getting paid vast amounts of taxpayers' money to be this incompetent.


It transpires that the publicly downloadable data was a mock dataset. How it took them 24 hours to work that out I've no idea, but I imagine there's a lot of headless chicken impressions going on behind the scenes at the moment.

https://twitter.com/bengoldacre/status/440576479248662528

It's still the case that real NHS data was uploaded to Google's servers.


It's beyond parody because it's just plain false. Par for the course for The Guardian over the last few months.

http://www.earthware.co.uk/default.aspx


It had nothing to do with the Guardian. What on earth are you talking about?


Ben Goldacre writes for The Guardian. What else would I be talking about?


recent (20m): "I’m sorry to say I’m also now aware of more @HSCIC stories breaking shortly. What a mess. They need to come clean asap, clean stables."

"Wait for details on story i’ve been tweeting today: twitter is first draft. Story is: small number rule breach, I believe, which is bad."

https://twitter.com/bengoldacre


This is the data in question: http://www.hscic.gov.uk/hes

It is anonymised [1], publicly licensable data.

Here is a list of users and uses. [2]

[1] "We apply a strict statistical disclosure control in accordance with the HES protocol, to all published HES data. This suppresses small numbers to stop people identifying themselves and others, to ensure that patient confidentiality is maintained."

[2] http://www.hscic.gov.uk/media/10495/Users-and-uses-of-HES/pd...


> It is anonymised

Not meaningfully, no. A UK postcode covers 20 households or fewer. If you have that plus gender and date of birth (as seems to be the case here), you almost always have a unique individual.
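To make the uniqueness claim concrete, here is a minimal Python sketch that counts how many records share each (postcode, gender, date of birth) combination. The records, field names and helper functions are all made up for illustration; nothing here comes from the real HES data.

```python
from collections import Counter

# Made-up records for illustration only; the postcodes are placeholders.
RECORDS = [
    {"postcode": "AB12 3CD", "gender": "F", "dob": "1980-05-17"},
    {"postcode": "AB12 3CD", "gender": "M", "dob": "1980-05-17"},
    {"postcode": "AB12 3CD", "gender": "F", "dob": "1952-11-02"},
    {"postcode": "AB12 3CE", "gender": "F", "dob": "1980-05-17"},
    # Twins at the same address are the kind of outlier that stays ambiguous.
    {"postcode": "AB12 3CE", "gender": "M", "dob": "2001-01-09"},
    {"postcode": "AB12 3CE", "gender": "M", "dob": "2001-01-09"},
]

QUASI_IDENTIFIERS = ("postcode", "gender", "dob")


def group_sizes(records, fields):
    """Count how many records share each combination of the given fields."""
    return Counter(tuple(r[f] for f in fields) for r in records)


def unique_fraction(records, fields):
    """Fraction of records whose combination of fields appears exactly once."""
    sizes = group_sizes(records, fields)
    unique = sum(1 for r in records if sizes[tuple(r[f] for f in fields)] == 1)
    return unique / len(records)


if __name__ == "__main__":
    print("unique on all three fields:", unique_fraction(RECORDS, QUASI_IDENTIFIERS))
    print("unique on gender alone:", unique_fraction(RECORDS, ("gender",)))
```

Any record in a group of size one can be re-identified by someone who already knows those three facts about the person, whether or not a name is attached.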


NHS number, date of birth, postcode, ethnicity and gender.

Is about as anonymous as wearing a different hat.


NHS England’s Chief Data Officer says that with the pseudonymous dataset (he calls it amber data) patients’ identifiers are removed, including "(their date of birth, postcode, and so on)". Also, I think it might be hospital number rather than NHS number in the data.

http://www.england.nhs.uk/2014/01/15/geraint-lewis/


NHS number is clearly ridiculous.

But, I can see why all the other data are statistically relevant in analysis.

Postcode, so you can track viral outbreak etc with a fair degree of locational accuracy.

Date of birth, so you could analyse if people born in certain seasons had higher incidences of various diseases. (You could probably drop the day and just have month, but there could be cases where there were more hospital deaths for babies born on a Monday, etc).

Gender and ethnicity are obvious.


A hash of the NHS number would give a UID without compromising the person's details. Providing a UID that is one-way linked would allow HSCIC to go to the actual data if a care situation warranted it.

I'd argue that the first half of the postcode and month of birth are ample for outside-HSCIC use. Monday-deaths is the kind of thing that's done internally anyway.
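A rough Python sketch of what that could look like. Two things here are assumptions rather than anything stated above: a keyed hash (HMAC-SHA256) is used instead of a bare hash, because NHS numbers are short enough that an unkeyed hash could be reversed by trying every possible number, and "month of birth" is taken to mean year and month. The key, function names and sample values are all hypothetical.

```python
import hashlib
import hmac
from datetime import date

# Hypothetical secret held only by the data custodian. In practice it would
# live in a key store, never in source code.
SECRET_KEY = b"example-key-held-only-by-the-custodian"


def pseudonymous_uid(nhs_number: str, key: bytes = SECRET_KEY) -> str:
    """Keyed hash of the NHS number: stable per patient, opaque without the key."""
    normalised = nhs_number.replace(" ", "")
    return hmac.new(key, normalised.encode("ascii"), hashlib.sha256).hexdigest()


def outward_code(postcode: str) -> str:
    """First half of a UK postcode, e.g. 'AB12 3CD' -> 'AB12'."""
    compact = postcode.replace(" ", "").upper()
    return compact[:-3]  # the second half is always the last three characters


def birth_month(dob: date) -> str:
    """Year and month only; the day is dropped."""
    return dob.strftime("%Y-%m")


def release_record(nhs_number: str, postcode: str, dob: date, gender: str) -> dict:
    """Build the reduced record intended for use outside the custodian."""
    return {
        "uid": pseudonymous_uid(nhs_number),
        "area": outward_code(postcode),
        "birth_month": birth_month(dob),
        "gender": gender,
    }


if __name__ == "__main__":
    # All values below are made up.
    print(release_record("000 000 0000", "AB12 3CD", date(1980, 5, 17), "F"))
```

Whoever holds the key can recompute the UID from a known NHS number to find the matching records, which gives the one-way link described above; anyone without the key sees only an opaque identifier.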


Tiny nitpick[0]: a UK postcode corresponds to a minimum of 1, and maximum of 100 (not 20), delivery points[1] (not households). Typically about 15 delivery points.

I fully agree with your main point, though, that the data are not meaningfully anonymised.

[0] http://www.poweredbypaf.com/wp-content/uploads/2013/11/Progr... (page 19, "Small User Postcode")

[1] https://en.wikipedia.org/wiki/Delivery_point


I have begun collaborating on a rare diseases project.

So imagine you have one or two people in the country with certain symptoms. How anonymous is that?


Do you still work in that field?

If so, you're probably one of the few people that IMO actually should have some access to that data.

Maybe you'll get lucky and the data will become a researcher free-for-all, like the Enron emails did. (Not likely, though.)


No, we are discussing all the security implications before we even get close to collecting the data.


I am happy to hear that. From my experience, some inroads have been made in universities regarding the ethics of research projects. Security of the computer systems used to store and handle the data, on the other hand, is way too often still not considered.


Small numbers (cells containing values less than 6) are supposed to be removed from the data or obscured, to prevent breaches of confidentiality.

Word document: http://www.hscic.gov.uk/media/1879/NHSIC-small-numbers-terms...
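As a rough illustration of that rule, here is a Python sketch that replaces any count below 6 with a marker. The threshold is the one mentioned above; the table layout, the `*` marker and the function name are made up for illustration, and the actual HSCIC procedure may differ.

```python
THRESHOLD = 6  # cells with values less than this are obscured


def suppress_small_numbers(table: dict, threshold: int = THRESHOLD, marker: str = "*") -> dict:
    """Return a copy of `table` with every count below `threshold` replaced by `marker`."""
    return {
        cell: (count if count >= threshold else marker)
        for cell, count in table.items()
    }


if __name__ == "__main__":
    # Made-up counts keyed by (area, condition).
    admissions = {
        ("AB12", "condition A"): 120,
        ("AB12", "condition B"): 3,
        ("AB13", "condition A"): 6,
    }
    for cell, value in suppress_small_numbers(admissions).items():
        print(cell, value)
```

Suppressing individual cells is not enough on its own when row or column totals are also published, since a suppressed value can sometimes be recovered by subtraction.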


I trust Google more than I do "PA Consulting". Which begs the question, how did we get here? Who in their right mind sends out 27 DVDs with probably unencrypted, highly sensitive medical data? Even if the recipient is trustworthy, the transport isn't.

This data needs to be on a locked-away government server that answers queries from third parties by throwing away half of the data and randomizing the remainder.


Basically. An alternative headline for this story could be "Contractor moves sensitive data from insecure, non-audited medium to secure, audited medium."


That would be "Contractor moves sensitive data from insecure, non-audited medium to secure, audited medium monitored by a foreign spy agency"


Correct. I disregarded the nationality issue in assuming that GCHQ having "NHS, please pass me your data because terrorists" access to the health records is equivalent to the NSA having "[GAG ORDER] Google, a secret court has ordered you..." access to the health records, given how closely those agencies seem to be working together based on the Snowden disclosures.

Others may place differing weights and values on their cloak-and-dagger outfit of choice. ;)


Sending things in the post may be the acceptable method of sending sensitive data. I'm guessing that the UK prosecutes people who mess with the post the same way that the US prosecutes people who mess with the mail.

EDIT: I'm not sure what I'm talking about. Was the complaint about "27 DVDs" just the amount of data or the method?


It seems to be about the _assumption_ that they were unencrypted.

We mail out encrypted DVDs semi-regularly. Works fine.


I've worked with anonymized patient data in the U.S. at a small consulting firm nobody's ever heard of. We received encrypted physical media via USPS and registered mail. It may seem byzantine, but we also worked exclusively on air-gapped servers.

Best practice in data security is pretty straightforward - you don't connect data to the internet if a single breach is catastrophic. We talk about things like the Target hack as catastrophic breaches, but they aren't. You can change your password or cancel your credit card. You can't change your medical history - once public, it is always public.


Actually, I wouldn't be surprised if there was a law stating that those DVDs be encrypted... I mean, the key might be in a text file on the DVD, but there are plenty of regs surrounding the transportation of patient data.


That's true; they may be secured in that fashion (does anybody know what NHS policy is?)

I was thinking "secure" in the sense of "Physically inside a datacenter with privileged access" as opposed to "Physically on 27 DVDs that are... Hm.... Where are those DVDs... Hey Bob, did you have the DVDs last? Shoot, I know I put those DVDs around here somewhere..."


Having worked with sensitive data on physical media - you couldn't be more mistaken about how carefully it is monitored and handled.


I don't understand why the data being on "Google servers" is generating such outrage. Google almost certainly has superior security to this "PA Consulting" or even the government itself.


I don't think Americans get how distrustful other nations are of American-based cloud providers. It is not a matter of Google's or any other provider's behaviour but of the US Government's (warrantless searches and the like). This lack of trust predates the current NSA-Snowden affair and goes back to the Patriot Act (IMHO).

Of course I can't speak for all nations or industries (or even companies), but in my part of the Canadian health care sector it is simply unthinkable to use a US-based cloud provider for anything to do with patient data.


As an American, I'll be a little surprised if that mistrust runs deeply in Britain, given how closely the Snowden disclosures revealed GCHQ and the NSA to be working. If the NSA had a vested interest in getting health records on a British citizen, I doubt it'd be difficult to get the British government to send them over. You know, to fight terrorism.

Of course, people aren't strictly rational actors, so I suppose one could hold the cognitive dissonance that one's private data is safer from the NSA physically stored in Europe than it is in the US.


> As an American, I'll be a little surprised if that mistrust runs deeply in Britain

Why? I don't approve of the NSA behaviour, and I don't approve of the GCHQ behaviour in conducting mass surveillance either. The fact that the latter is theoretically done by my government for my country's benefit doesn't make me think any better of the spy agencies or those in government whose laws and tax money allow it.

I would rather take my chances with the terrorists than put up with all the nonsense done in the name of fighting them today. Not only do I think that as a practical matter the nonsense is far more likely to harm me or those I care about, and not only do I strongly disapprove of all the time, money, media attention and other resources I consider wasted on most so-called anti-terror measures when much more deserving causes could have used those resources better, I also think the current culture of rampant paranoia and fear-mongering is helping the terrorists to win anyway.


Whatever the UK Government thinks, EU and UK Data Protection law should make this illegal.


I've worked with anonymized patient data in a litigation consulting context (in the U.S.). I worked at a small consulting firm nobody's ever heard of.

We worked exclusively on air-gapped servers behind several layers of physical security. Even encrypted data was never, ever sent over the public wire.

You don't connect sensitive data to the internet if a single breach is catastrophic. We talk about things like the Target hack as catastrophic breaches, but they aren't. You can change your password or cancel your credit card. You can't change your medical history - once public, it is always public.


I don't understand why the data being on "Google servers" is generating such outrage.

Because Google's systems demonstrably aren't secure (see numerous recent discussions about NSA etc.) and therefore aren't an appropriate choice to transmit or store sensitive personal data under UK data protection rules.


Is your concern that the NSA itself is targeting British health records? If so, what makes you think anyone else is better prepared for this adversary?

Or, if you're talking about someone other than the NSA (which you allude to with your use of "etc"), then who is it? What are they going to do, and why do you feel it's demonstrably more likely to happen with the data on Google's servers? Where would you put the data instead, and how would this help reduce the likelihood of the scenario?

"On a powered-down hard drive in a vault" doesn't count; presumably, PA was given the contract because the potential benefits of the research were believed to outweigh the privacy risks.


The NSA specifically isn't the point, though I do believe their blanket surveillance of the Internet (and the similar surveillance by other government spy agencies such as our own GCHQ) should be considered a hostile act and incur a proportionate response. I simply don't accept that there is any ethical legitimacy to such dragnet programmes or that they bring more benefit than the risk they obviously pose to free, peaceful, democratically governed civilisation.

In any case, Google aren't a part of any government. They are a private, profit-making corporation, and they are in the business of collecting as much data on people as possible for their own benefit, and they evidently cannot comply with the restrictions on handling personal data required by UK law even if they want to. Providing this kind of treasure trove to them should be unthinkable.


You didn't answer any of my questions.


I would leave the data in the hands of medical professionals who are subject to medical ethics and confidentiality, kept away from anyone who might have any other ambitions for it. And I would keep it within the range of European data protection law, which is in general considerably stronger (rightly so, IMHO) than the rules in places like the US, though in this particular case if HIPAA is relevant that may not actually be the case.

Edit: Incidentally:

presumably, PA was given the contract because the potential benefits of the research were believed to outweigh the privacy risks.

That "belief" would be a huge assumption, for which I see little evidence in this specific case, nor any historical pattern to suggest it is reasonable. In any case, it was clearly a bad assumption with hindsight, because the privacy risks are evidently not merely risks at this point.


> I would leave the data in the hands of medical professionals

Presumably, the medical professionals aren't also IT professionals. If they want their data to be on a hard drive, and accessible via a network, using some apps, then some group of non-medical professionals is going to need to maintain those services.

Who do you think that should be, and why do you think their systems would be more secure than Google's?

> kept away from anyone who might have any other ambitions for it

What ambitions are you implicitly accusing Google of having? Do you think Google's going to tap into their customers' private files and sell them to a third party?

If not, then what's the actual, non-vague scenario you're worried about?


Who do you think that should be, and why do you think their systems would be more secure than Google's?

The security isn't the only point here. They transferred the data outside of the jurisdiction where our laws apply, and they're not allowed to do that without fulfilling conditions that they appear not to have satisfied.

Note that it is not within the power of PA Consulting to vary these conditions, whatever any contract says, nor are HSCIC above the law in this respect (though some of the relatively recent and dubiously worded get-out-of-jail-free cards like s251 might protect them to some extent).

What ambitions are you implicitly accusing Google of having? Do you think Google's going to tap into their customers' private files and sell them to a third party?

I'm not implicitly accusing them, I'm openly stating that I think they would tap those files in a heartbeat if (a) it would help them to earn more from their advertising or other profit-generating activities, and (b) they thought they could get away with it.

I regard organisations like Google (and other big data miners like Facebook) as some of the most dangerous entities on the planet today. They respect little other than money, and they have consistently not just pushed the boundaries of what is acceptable behaviour but IMHO (and apparently in numerous other people's opinion and indeed in the law's opinion in many places and on many occasions) stepped far over the line. They can continue to do this because the regulators who should be reining them in are toothless and because they have an army of lawyers and lobbyists who exemplify just about everything that makes those professions unpopular.

I do very little with Google services myself, by deliberate choice, and I sure as hell do not consent to anyone breaching their duty of confidentiality regarding my medical records and giving them to Google either.


> I'm openly stating that I think they would tap those files in a heartbeat if (a) it would help them to earn more from their advertising or other profit-generating activities, and (b) they thought they could get away with it.

Wow. Do you feel that Google has ever done anything so flagrantly over the line before? If so, when?


Do you feel that Google has ever done anything so flagrantly over the line before? If so, when?

I'm not sure they've had the chance to do anything this flagrantly over the line, because I'm not aware that they've ever had access to this kind of data before. To be clear, if they didn't know they had this data, I don't think it's fair to blame them for anything anyway. That would clearly be unreasonable. But I think once notified they should be required to delete the data immediately, and I don't trust them as far as I can throw them not to abuse any access they do have if they can come up with enough legal sophistry to convince themselves it's somehow justified.

To answer your original question, for other over-the-line cases relating to privacy specifically, I consider some of the Google Street View practices seriously shady, for example. Not to mention some of the creepy things they have been known to do on just the normal Google web sites, where most people probably assume what they're typing doesn't get sent to Google until they hit search/send/whatever. And of course the whole attempt to unify everything under the banner of Google+ and real names is pretty much second only to Facebook in terms of turning everyone into a database record. Then there's Google Glass.

I'm firmly of the view that technology is neutral, and not inherently good or evil. It is how the technology is used that counts. And in that respect, I think Google have now proven to have evil tendencies many, many times.



If a malicious incident like the below happened then Google wouldn't even be liable, unlike data that is kept under UK control.

From http://gawker.com/5637234/gcreep-google-engineer-stalked-tee...

In at least four cases, Barksdale spied on minors' Google accounts without their consent, according to a source close to the incidents. In an incident this spring involving a 15-year-old boy who he'd befriended, Barksdale tapped into call logs from Google Voice, Google's Internet phone service, after the boy refused to tell him the name of his new girlfriend, according to our source. After accessing the kid's account to retrieve her name and phone number, Barksdale then taunted the boy and threatened to call her.

In other cases involving teens of both sexes, Barksdale exhibited a similar pattern of aggressively violating others' privacy, according to our source. He accessed contact lists and chat transcripts, and in one case quoted from an IM that he'd looked up behind the person's back. (He later apologized to one for retrieving the information without her knowledge.) In another incident, Barksdale unblocked himself from a Gtalk buddy list even though the teen in question had taken steps to cut communications with the Google engineer.


Data must only leave HSCIC under carefully controlled circumstances. It's pseudo-anon within HSCIC and anon outside HSCIC, but still with careful controls on who has access and what they can do with it.

I can't tell from the article what data PA Consulting had; nor why they had it; nor why they felt the need to upload it to Google. Even though it's been anonymised it should still be treated as sensitive confidential data.

There's a possibility that someone at PA Consulting has committed a criminal offence even though they "got permission" to upload it to Google.

The outrage about Google is around keeping data within the EU and thus protected by EU data protection laws.


I agree. I personally trust Google far more than the US or my own (UK) government.

It's the whole monopoly on force thing. I have no fear that Google will grab me out of my bed in the middle of the night.

With the UK government — even though I trust them to be pretty rational and fair, and to my knowledge I've done nothing that would warrant that treatment — there's always the worry that they have the ability to significantly reduce my quality of life (if they ever chose to) and I might not even deserve it.

It's always surprising to me the fear regular people have of companies doesn't extend to governments, who have far more power over their lives.


Because it's illegal to put MY data on a server in the US.


The same government that had completely penetrated Google's network for years without them realising?


I'm pretty sure "the government" in GP's context referred to the UK government, not the US.


It was the UK Government (GCHQ) that was breaking into Google's networks.


GCHQ has done some appalling stuff, but the Google network penetration was done by the NSA.


You're both right; it's a collaboration between the two:

>For the MUSCULAR project, the GCHQ directs all intake into a “buffer” that can hold three to five days of traffic before recycling storage space. From the buffer, custom-built NSA tools unpack and decode the special data formats that the two companies use inside their clouds. Then the data are sent through a series of filters to “select” information the NSA wants and “defeat” what it does not.

http://www.washingtonpost.com/world/national-security/nsa-in...


The report in question (linked in the article) provides "exceptional" evidence for the performance of Cloud technology by comparing a Google BigQuery search against an on-premises SQL Server query.

So, cloud is good because map-reduce performs better than relational databases...


This sounds all too like the project I am collaborating on. It will be moderately big data (a few terabytes at most). The guys developing it just now are on their third NoSQL database - Elasticsearch.

Them: "Look at how fast it is"

Me: "You only have 3GB of data in it"

Them: "It's so fast to develop, just connect Angular straight to Elasticsearch"

Me: "Absolutely no concern given to security"

Them: "The previous project used an SQL database and ended up having so many tables"

Me: "So it was probably properly designed"


I've seen this theme way too much recently: developers giving preference to their own convenience over the security of their application, or, even worse, their confidential data. Every time, however, it was due to incompetence; they didn't know what they were doing wrong.

Frameworks like Meteor.js encourage bad habits like this. Quoting straight from their homepage[1], "All the same APIs are available on the client and the server — including database APIs! — so the same code can easily run in either environment."

Running arbitrary database queries from the client cannot possibly be a good idea.

[1] https://www.meteor.com/
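To make the alternative concrete, here is a minimal sketch of the usual pattern: the client may only name a query from a fixed server-side allowlist and supply parameters, never send query text. This is not how any particular framework works; the table, the query name `admissions_by_year` and the helper `run_named_query` are all made up for illustration, and SQLite stands in for whatever database is really in use.

```python
import sqlite3

# Hypothetical allowlist: the only statements the server will ever run
# on behalf of a client. Clients pick a name; they never send SQL.
ALLOWED_QUERIES = {
    "admissions_by_year": "SELECT COUNT(*) FROM admissions WHERE year = ?",
}


def run_named_query(conn, name, params):
    """Run an allowlisted query with bound parameters, or refuse."""
    sql = ALLOWED_QUERIES.get(name)
    if sql is None:
        raise PermissionError(f"query {name!r} is not allowed")
    # Parameters are bound by the driver, not spliced into the SQL string.
    return conn.execute(sql, params).fetchall()


# Tiny in-memory database with made-up rows, just to exercise the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE admissions (id INTEGER PRIMARY KEY, year INTEGER)")
conn.executemany(
    "INSERT INTO admissions (year) VALUES (?)", [(2012,), (2013,), (2013,)]
)

result = run_named_query(conn, "admissions_by_year", (2013,))
print(result)
```

Anything not in the allowlist, such as a client asking for `"dump_everything"`, raises `PermissionError` instead of reaching the database.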


Ahh, hipsters.


Actually I didn't make the comment about the database being properly designed. I was at a loss as to what to say when that was the complaint.


Not even that - they say "We uploaded it into a traditional Microsoft SQL database on a 1TB drive". So it sounds like they were just using a single spindle. Purchasing an SSD for that SQL Server would probably have given comparable performance benefits.


It's small enough you can get servers that can hold it in memory....
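As a rough sanity check on that, here is the upper bound on the raw data size, assuming the 27 DVDs mentioned elsewhere in this thread were completely full standard discs. The per-disc capacities (4.7 GB single-layer, 8.5 GB dual-layer) are the nominal figures for DVDs; which kind was used, and how full they were, is my assumption and not something the report states.

```python
# Nominal DVD capacities in gigabytes (decimal GB).
SINGLE_LAYER_GB = 4.7
DUAL_LAYER_GB = 8.5

# Disc count taken from the thread; everything else is an assumption.
DISC_COUNT = 27

single_layer_total = DISC_COUNT * SINGLE_LAYER_GB
dual_layer_total = DISC_COUNT * DUAL_LAYER_GB

print(f"single-layer upper bound: {single_layer_total:.1f} GB")
print(f"dual-layer upper bound:   {dual_layer_total:.1f} GB")
```

Either way the total is a few hundred gigabytes at most, well under the 1TB drive the report mentions.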


The report also indicates that appropriate government approvals were obtained and that they protected the healthcare data appropriately. Might they have been using an appropriately secured Google cloud tool? I understand that this still doesn't tell you the country in which the data resides, but that's a lot different than posting data on the open internet or an unsecured cloud instance.


Why does the government repeatedly hire incompetent people?? They pay crazy amounts of money for it too.

I hope there is a very public investigation into this. We are losing privacy every day now and this is one area of our lives that needs to remain private at all costs. There is very little I can see to gain and lots to lose from losing privacy in health. Especially in a public system like the NHS.


Because they're all part of the old boys network. No other reason.


Sad but seems to be true.


I think there's 2 reasons: Often times the services are contracted out and those contracts are often subject to "lowest bid wins" rules. It is very difficult to judge a quality like competence especially when there's differing interpretations of purchasing legislation involved.

The other reason is that it is often difficult to fire people in the public sector. I believe it to be due to the impact of unions, which will protect even the most egregious offender out of principle. This builds a culture in which it is hard to get rid of people; even for a non-union position, the HR dismissal process is often the same. For example, public bodies often have a lot of employee supports (Employee Assistance programs, etc.), and before you dismiss anyone you must allow them to avail of these supports.


The government doesn't hire them, it contracts the work out to consulting firms... who in turn hire incompetent people.


Good. Imo, the fearmongering here is actually quite irrational. Google have more credibility (and money) to lose from a high publicity hack than government contractors who already act with impunity. If they'd invested in their own map-reduce deployment we'd only be hearing another story about government contractors wasting millions of £ in taxpayers' money on Big Brother data analysis.

> The extracted information will contain a person's NHS number, date of birth, postcode, ethnicity and gender.

Big woop? Your NHS# isn't used outside of the NHS or for anything of concern to most people, and your postcode (and address) is held on the unedited electoral roll by hundreds of organisations. Most people don't even opt-out of the edited register accessible for a small fee on 192.com

Why aren't us Brits worried about our credit histories and county court judgements being recorded and held by Equifax, an American company?

What specifically are people actually afraid of with regard to this data set sitting on Google's servers? I just don't get the regular public outcry about NHS data.


My postcode and date of birth is sufficient to uniquely identify me. What you seem to miss is that this then acts as a key into the medical information held in this data set. If it was just the NHS number, date of birth, postcode, ethnicity and gender that was available, nobody would have cared much.

That the electoral roll data is available is a further reason why this is bad: It means that by cross-indexing this data with the electoral roll or similar data, one can take the poorly semi-anonymised NHS data and undo a large percentage of the anonymisation, either completely or with a high degree of probability.
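A minimal sketch of that cross-indexing, using entirely made-up records: the field names, the `reidentify` helper and the "public register" carrying both postcode and date of birth are assumptions for illustration, not a description of the real HES extract or the real electoral roll.

```python
from collections import defaultdict

# Hypothetical pseudonymised records: no names, but quasi-identifiers remain.
hospital_records = [
    {"pseudo_id": "a91f", "postcode": "AB1 2CD", "dob": "1970-01-31", "diagnosis": "condition X"},
    {"pseudo_id": "c07e", "postcode": "EF3 4GH", "dob": "1985-06-15", "diagnosis": "condition Y"},
    {"pseudo_id": "d55b", "postcode": "ZZ9 9ZZ", "dob": "1990-12-01", "diagnosis": "condition Z"},
]

# Hypothetical public data set that carries names with the same fields.
public_register = [
    {"name": "Alice Example", "postcode": "AB1 2CD", "dob": "1970-01-31"},
    {"name": "Bob Example", "postcode": "EF3 4GH", "dob": "1985-06-15"},
    {"name": "Carol Example", "postcode": "EF3 4GH", "dob": "1985-06-15"},
]


def reidentify(records, register):
    """Link records to names where (postcode, dob) matches exactly one person."""
    names_by_key = defaultdict(list)
    for person in register:
        names_by_key[(person["postcode"], person["dob"])].append(person["name"])

    matches = {}
    for rec in records:
        candidates = names_by_key.get((rec["postcode"], rec["dob"]), [])
        if len(candidates) == 1:  # a unique match undoes the pseudonymisation
            matches[rec["pseudo_id"]] = (candidates[0], rec["diagnosis"])
    return matches


linked = reidentify(hospital_records, public_register)
print(linked)
```

In this toy data one record links to a single named person, one is ambiguous between two people sharing a postcode and birthday, and one has no match at all. Even the ambiguous case narrows the field to two candidates, which is the "high degree of probability" part.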

> What specifically are people actually afraid of with regard to this data set sitting on Google's servers? I just don't get the regular public outcry about NHS data.

The issue is not Google per se, but that this fast and loose handling of data that is in no way anonymous indicates that the government and consultancies involved do not in any way understand or respect the concerns people have about privacy and the protection of personal information.

We don't want, e.g., a future where employers can look up our health issues and decide to get rid of someone they see as a potential liability, or use it to help manufacture justifications to get rid of someone who is troublesome. Or one where relatives of someone with a cancer diagnosis receive ads about hospice care, possibly before they've even been told. Or any number of other gross invasions of privacy that this data becoming easily available could enable.


> Google have more credibility (and money) to lose from a high publicity hack than government contractors

No, Google's (along with other major USA based firms) credibility flew out the window the minute E. Snowden released the NSA documents. I don't think that any corporation who has to manage sensitive information is going to trust Google or any other USA based company in the post-Snowden Era. The risks outweigh the benefits.


Except that we're now in the post-Snowden era, and companies like Google have taken measures to harden their networks against (among other things) GCHQ-esque intrusion.

In contrast, the other party in this story---the NHS---apparently hands out sensitive medical data on physical DVDs. I suppose on the plus side one doesn't need to worry about GCHQ being interested in illicitly acquiring that data, as if they have a relevant interest in it they can just ask their neighboring government agency.

While this was clearly an inappropriate act on the part of the contractor, it's not inappropriate on the grounds of moving the data to a less secure medium or a less credible institution than its origin.


The best any US company can say is that they will only turn over your information when they obtain a legally binding gag order.

Until our culture changes, US companies can't get that faith back.


What from the leaked documents tells you they lack credibility? From what I remember it was England's own GCHQ that was wiretapping the leased lines between Google data centers.


>What specifically are people actually afraid of with regard to this data set? I just don't get the regular public outcry about NHS data.

Well specifically it would be all the sensitive medical information which for some reason you omitted to mention in your comment. People are only talking about DOB and post codes because that information can be used to identify an individual and associate them with the medical records.


Apologies, I should have specified I meant on Google's servers.

The HES database contains attendance records. You only need a single verified data point, such as a tabloid hack following you to the hospital one day, to remove pseudoanonymity. The debate over whether pseudoanonymised records or personalised records should be made available to organisations, in real terms, isn't distinct. You still only need one data point (an address, DOB etc) in the poorly pseudoanonymised set. Nothing really changes.

The implication seems to be the data is somehow less secure now it's in Google's cloud, but that doesn't quite fit the reality of what we know about how data permeates through these incompetent organisations to begin with. The fact that PA had the data on DVD rather than disk is already an indication that they are a joke. Do you know of any prolific transparent encryption solutions for optical media? Most likely this data was in plaintext. If they carry the data around on unencrypted DVD what is the likelihood that their own servers are secure, or at least more secure than Google's?

The bottom line is these records all exist and are necessary for the NHS to function, so a competent organisation may as well mine the data set. The issue, then, is that PA aren't competent, not that they use effective tools. Outrage is being misdirected.


Outrage is being misdirected.

No, there is plenty of justification for outrage all round. The NHS staff shouldn't have given the data to an untrustworthy organisation. That organisation shouldn't have given it to a data mining company under the jurisdiction of a foreign government. And that data mining company and foreign government will deserve similar outrage if they don't properly delete the illegally uploaded data as soon as possible after they are properly notified of the circumstances.

It wasn't necessary to share these records like this. You seem to be confusing access by clinicians, or at least legitimate medical researchers subject to similar medical ethics and confidentiality rules, with (as now alleged) just leaving it out there for literally anyone to find it.

Perhaps you aren't outraged by this, but I'll bet most people with a sensitive medical condition that might lead to unjustified discrimination would be. First they came for the HIV-positive, but I was not HIV-positive and I said nothing.


But it was a government contractor who uploaded and queried the data (PA), they just used Google's platform.

Frankly, this seems as related to Google as Nokia would be if someone used one of their cellphones to detonate an explosive. They're just the database provider.


The Google relevance is server location: if the servers are outside the UK, and especially if they are outside the EU, various data protection laws at each level would appear to have been breached.


It's not relevant that it's Google. The pertinent information [to me] is that it's not a certified host for this confidential data; that the data has been sent outside of the UK; and to a lesser extent [for me] that the data is by virtue of the host it's been sent to legally and undetectably accessible by a foreign government.


Yes, but putting "Google" in the headline is more likely to get clicks.


The shame of all of this noise is that resources going into medical research today end up getting spent on data security and building expensive, custom solutions that avoid using servers of a certain type or location in the name of privacy.

Sure, it would be more secure to conduct medical research without using computers at all, but what about all those people dying of nasty diseases? If I had 6 months to live, I probably wouldn't mind these "criminals" trying to find me a cure.

Instead, we have a deafening din of screaming about data privacy and little or no mention of the benefits of the medical research itself. If people could calm down a little bit about Big Brother, these guys could spend more time doing their jobs, helping sick people.


Medical data is a great tool but the problem is that these stories are poisoning public good will. There is no point telling people to calm down when they have just learned that records of every meeting they ever had with their doctor were available on the public internet and identifiable to anybody who knows their address and DOB. That is something that people rightly get upset about.

Additionally, it's not like these events are all just accidents or incompetence. The UK government made a policy decision to sell medical records to insurance companies[1].

Also, is it really true that release to the insurance industry is unacceptable to the HSCIC? Its own information governance assessment from August says that access to individual patients records can "enable insurance companies to accurately calculate actuarial risk so as to offer fair premiums to its [sic] customers. Such outcomes are an important aim of Open Data, an important government policy initiative."[2]

[1] http://www.telegraph.co.uk/health/nhs/10659147/Patient-recor...

[2] http://www.theguardian.com/commentisfree/2014/feb/28/care-da...


Not underplaying at all - your point is spot on - but this data only relates to hospital attendances and not GP interactions. Currently GP interactions are not available in the database, and that's the point of care.data.


Sorry. Yes you are quite right.

When I said public internet I was actually referring to the things Ben Goldacre has been tweeting ( https://twitter.com/bengoldacre/status/440475049880195073 ) and I'm not sure which data set he is talking about.


The shame of all of this noise is that resources going into medical research today ends up getting spent on data security and building expensive, custom solutions that avoid using servers of a certain type or location in the name of privacy.

If the various disclosures, legal or not, actual or planned, actually had anything to do with legitimate clinical care or medical research, I think a lot of us would look more kindly upon them. There seems to be little evidence that this is the case, and plenty of evidence that the data was or was going to be disclosed, for profit, to organisations who are not involved in either direct clinical care or legitimate medical research, such as insurers and foreign governments.


Government privacy breaches are one of the things I despise most about current Western society. I am - day in, day out - one of those guys calling for ministerial blood.

However I have long thought that proper open access to health data could be as revolutionary as, say, antibiotics. The government can do whatever the hell they like with my data - on the condition that anyone else can too.

Can you imagine what insights could be gained with canonical graph schemas for individual (but (pseudo|a)nonymised) health records and a bit of statistics/ML? I think it would change the world, but it will never happen unless people like us get our hands on it. No amount of management consultants will innovate on the same scale as the tech community; saving lives through Github and AWS sounds like the only thing I'd do with my weekend.

On a side note, I think the same argument could be applied to a great many public services. I recently emailed my doctor a letter from one of my private doctors in PDF format for addition to my records. Can you guess what happened next? Yep, he printed it out and gave it to a secretary to scan it back in, because the Java app that manages this stuff has very tightly controlled boundaries. Shortly after that I overstayed on a trip to a different part of the UK and desperately needed a top-up of my meds. The solution? Print a prescription and mail it to the pharmacy by next-day post, because Scottish NHS and English NHS computers are incapable of communicating with each other. How long would it be after going open source before all this BS is obliterated? I'm thinking months.


> The government can do whatever the hell they like with my data - on the condition that anyone else can too.

Wonderful.

Now consider how a potential employer might use your data:

Have you ever seen a GP for stress or mental health issues? Oh, maybe we are dis-inclined to hire you.

Suffer from back pain? Well that's one of the most given reasons for needing time off from work, which means you are a liability and we won't hire you.

Or even a business looking for partners:

Of these two evenly matched companies, this one over here has a CEO that has seen his Dr for stress and has had heart attacks in the past... he can't take the pressure and will be taking it easy when we need aggressive, let's go with the other one.

That's before you even touch what insurance companies will do, or whether schools will look into the mental health of little Timothy's family before determining whether to let the child into the school, or what retirement facilities might be available to you at what cost when you're 75+.

Once out in the open, this data is free for everyone to use. And it's going to be hard to then restrict how it is used by trying to bolt one gate after another.


Although I didn't say it explicitly in the particular sentence you quoted, I did say (pseudo|a)nonymously just after. I'm not suggesting identifiable health records should be searchable by all!

You raise interesting points. I think you are quite correct in saying these things are likely to happen. But I wonder whether hiding such information is actually the most efficient strategy. I mean if some divine power gave us all the ability to see inside each other's minds, and ergo all the bullshit, lies, and politicking that makes up a great deal of human interaction were to evaporate, wouldn't the world be a better, more forgiving place? Obviously this is all a bit esoteric, and the game-theoretical analysis of moving from the status quo to a world of almost creepy honesty would almost certainly show that it could never happen, but I find it a useful thought nonetheless.


I said before that I do want the problems with using real names to be fixed. And I am thinking that insurance for periodic doctor visits is probably flawed.


PA has a history of not looking after UK Government data very well; they famously left a USB stick on a train with a lot of confidential data on it...

http://news.bbc.co.uk/1/hi/7575989.stm


Leaving a USB stick on a train is one thing. Spending weeks to upload 27 DVDs' worth of confidential health data to Google servers is quite another! One is negligent. The other is, imho, criminal.


I shall be writing to my local MP about this, I'd encourage anyone else to do the same!


I agree it's the right thing to do, but I have very little faith in it making a difference.


This. Boilerplate reply and a permanent filing in the cylindrical filing cabinet awaits. If you're lucky, they'll use it to lean on whilst writing a cheque to the guy who just installed a duck house in their pond at your expense...


Funny. I've written to several of my local MPs and always received a direct response regarding the issues I have raised.

How many times have you received boilerplate replies?


I gave my local MP my opinion on extending the copyright law in the UK a few years back. The response was along the lines of "Our party believes this. You are in our constituency".

I wrote back explaining that I didn't believe that was the way democracy was supposed to work: the politicians are there to represent the people, not the other way around. The next reply was sympathetic to my view, but I have my doubts that my opinion was actually represented there.


Three times now to three different MPs. Gave up then. Glad you got something more.


I'm actually sorry to hear that. What topics were you contacting them about, if you don't mind my asking? I wrote about private issues, copyright and digital rights issues, and various local council issues (fences and paths). I got a good response every time.

It would be cool to compile a list of responsive and unresponsive MPs…


So was I at the time. It dented my confidence in politicians. This was then immediately followed by the expenses scandals in which one of the MPs I contacted was implicated.

Three different MPs. The issues were about planning permission being denied for a structure that everyone else down the road has, an issue I had with the CPS who lost the evidence after I was stabbed and refused to formally charge the person and the reduction of the number of parking spaces in my area by 50% resulting in people fighting in the streets over spaces (literally!).

Privacy, copyright and digital rights issues, I vote with my feet.

With other issues, I wield a solicitor now. It's a much better use of time and money and that is saying something.


In case you weren't aware, you can use the excellent https://www.writetothem.com/ to make it easier to send letters to your MP


I've very carefully opted out of every single program that the NHS has created for digital records going back years.

Whether that has done any good I have no idea, but I do have signed letters from all relevant organisations (doctor's surgery) saying that I've opted out.

This does have legal battle written all over it.


Don't the British have something like HIPAA in the US? If so PA Consulting would have had to follow those rules when using Google's infrastructure. Google's infrastructure passes many security levels and has just about every security certification (up to but _not_ including ITAR). There's nothing inherently insecure about doing this as long as they follow the rules. What are the rules about this over there?


A second scandal is now emerging out of this, as digital mapping firm Earthware is accused of posting HES data in Google Maps form on its website for all to see.

[1] http://www.independent.co.uk/life-style/health-and-families/...

[2] http://www.hscic.gov.uk/article/3947/Statement-Use-of-data-b...


Earthware's statement claims that they used mock data

HES Data Map Statement 3 March 2014 18:55 GMT. Earthware was contacted this morning by the HSCIC regarding a demo online map we had created to demonstrate how HES data might be displayed in a mapping environment. Earthware immediately withdrew this map from our website upon request from the HSCIC. Earthware would like to clarify the following: The map displayed mock data held by a third party who provided this data to Earthware via a web API. We do not hold nor have we ever held HES data on our servers. No patient identifiable data was ever displayed on the map. Earthware are confident that we have not breached any legal or regulatory rules regarding the licencing or publication of HES data. We will continue to co-operate fully with the HSCIC if required. http://www.earthware.co.uk/


Interestingly enough, the company behind this cluster#### has a previous and proven record of similar behaviour.[1][2] Sure, it takes conscious effort to upload multiple DVDs' worth of data, which already rules out accidents - but because this is not an isolated incident, I wouldn't rule out a corporate policy of willful neglect either.

"Fined and fired" is not a sufficient deterrent.

[1] http://www.theregister.co.uk/2008/09/11/pa_consulting_home_o...

[2] http://www.scl.org/site.aspx?i=ne9297


27 DVDs = not a lot of data. What were they doing, transcoding through paper tape?


> The data set was so large it took up 27 DVDs and took a couple of weeks to upload.

27 * 8 GiB (DVD-9) / 2 weeks = ~1.5 Mbps. Yep, that's British broadband bought from the lowest bidder.
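
For anyone who wants to check that estimate, here is a minimal sketch of the arithmetic. It assumes the same round figures as above (27 discs at 8 GiB each, uploaded over exactly 14 days of continuous transfer); the function name is just illustrative, and the real disc sizes and duration are not known.

```python
GIB = 2**30  # bytes in one gibibyte


def average_upload_mbps(discs: int, gib_per_disc: float, days: float) -> float:
    """Average sustained rate in megabits per second (1 Mbps = 10**6 bits/s)."""
    total_bits = discs * gib_per_disc * GIB * 8
    seconds = days * 24 * 60 * 60
    return total_bits / seconds / 1e6


# Assumed inputs: 27 dual-layer discs at ~8 GiB each, two weeks of upload time.
rate = average_upload_mbps(discs=27, gib_per_disc=8, days=14)
print(f"{rate:.2f} Mbps")
```

Under those assumptions the average works out to a little over 1.5 Mbps, in line with the figure above.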


I think it's tough to say. There's considerable overhead to packetized traffic. Also, it sounds like they had a simple setup, so there would be a delay in exchanging the DVDs after each one finishes. Some uploads probably finished over the weekend, too, with nobody there to start the next one.


That's nearly 4.5TB if it was gzipped plain-text.


Sooo... not a lot of data. That's kabdib's point.

This isn't the 90s or something, a couple of terabytes is no longer "big data".


Sneakernet would have been faster...


They're probably not allowed to take the data out of the building for security purposes ...


Once the data enters the US, it is subject to HIPAA. PA is criminally liable.


So, that's an interesting question, right?

If it's UK data, from the UK .gov, being stored to a server whose hardware is in the US, to be worked on by a UK consultancy, should it actually be subject to HIPAA?

Jurisdiction in the Internet is tricky business.


The risk of a "privacy disaster of unprecedented proportions" for NHS data was predicted by Glyn Moody just 3 weeks ago on TechDirt:

http://www.techdirt.com/articles/20140207/09552726132/uk-pol...


> The data set was so large it took up 27 DVDs and took a couple of weeks to upload.

Really? 27 DVDs' worth of data is only about 127GB of data and it took weeks to upload? I'm on a standard Comcast cable line and I could probably upload that in a few days at most.
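
A rough sketch of that estimate, assuming single-layer discs at 4.7 GB each and a hypothetical 5 Mbps uplink (the uplink speed is an assumption for illustration, not a measured figure for any particular line):

```python
def upload_days(total_gb: float, uplink_mbps: float) -> float:
    """Days needed to push total_gb (decimal gigabytes) through a steady uplink."""
    total_bits = total_gb * 1e9 * 8
    seconds = total_bits / (uplink_mbps * 1e6)
    return seconds / 86400  # seconds per day


# Assumed inputs: 27 single-layer DVDs (4.7 GB each), hypothetical 5 Mbps uplink.
total_gb = 27 * 4.7
days = upload_days(total_gb, uplink_mbps=5.0)
print(f"{total_gb:.1f} GB would take about {days:.2f} days")
```

This ignores protocol overhead and any time spent swapping discs, so a real transfer would take somewhat longer.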


All that is now needed is to cross-reference it with the ridiculously extensive CCTV footage that the Govt. in Britain has collected & continues to collect every day.

Perfect Surveillance at a granularity that was not possible before.


No no no - perfect healthcare at a granularity that was not possible before! Plays much better with the voters!


You may have a great future in Politics!



For fuck's sake.

I opted out of 'care.data' (what a stupid name), and now I find out my data was breached anyway?

I wonder if leaked people can start a class action lawsuit.


"Nonsense! We never uploaded any data to Google servers, we just put it in the cloud!"


It took weeks to upload? Never underestimate the bandwidth of a station wagon full of tapes...


We were shipping hard disks between the States and Spain. Unfortunately Spanish customs can make the delay unreasonably large (especially when someone put the value of the drive at $100).


>(especially when someone put the value of the drive at $100) //

Is that wrong? I see a Seagate Barracuda 3TB on Amazon at $110. Do you mean it should have included the data value or that they inflated the price or what?


Yes, but the drive wasn't what had the value (it was second hand). It got sent back afterwards.

It was publicly funded research data, to be processed and uploaded to the Genome Browser at UCSC. Had someone just put zero on it I think it would have gone a lot more smoothly.



