De-identified data for research | 15 May 2014 | Meeting note

Thursday 15 May 2014, 10:00-12:00

Attendees

Melanie Wright – University of Essex
Marcus Besley – BIS
Frances Pottier – BIS
Vanessa Cuthill – ESRC
Tanvi Desai – University of Essex
Tracey Gyatang – NPC
Johanna Hutchinson – BIS
David Knight – DoH
James Denman – CLG
Maria Sigala – ESRC
Nicky Tarry – DWP
Edgar Whitley – LSE
Steve Bond – ONS
Simon Meats – CO
Naqina Akram – DWP
Olivia Varney-Winter – RSS
Oliver Butler – Law Commission
Melanie Wright – ADS
Steve Pavis – ADRC Scotland
David Ford – ADRC Wales
Rufus Rottenberg – CO
Peter Elias – ESRC
Peter Lawrence – CO
Iain Bourne – ICO
Peter Elias – University of Warwick
Daniele Bega – HMRC
Jackie Riley – HMRC

Meeting note

  1. Welcome and introductions

Previously provided – rationale for legislation

There was a discussion about the proposed purpose of the legislation: to create a single common approach that all public bodies wishing to share data for research purposes can adopt, with set standards that all can apply easily, consistently and with certainty. This should benefit not only the public bodies, which will be able to share such data more consistently, safely and easily, but crucially the public as well: rather than a varying landscape of uncertainty about who can do what with what, there will be a single, consistent, clear, transparent and tested/overseen system that can be verified, scrutinised and understood by all. At the same time, the other existing powers and methods of sharing data will continue to exist and be capable of use.

  1. There followed three presentations regarding the rationale for legislation; these covered the academic perspective, government perspective and the third sector perspective.

Government perspective

Three issues: (1) why we need it; (2) why we need more, better, faster, more flexible; (3) why we need it to be more accessible outside government.

  • We have seen the benefits of use of admin data and data linking (both admin to admin data and admin to survey data), but what we do now only gets us so far
  • Society is becoming more dynamic, faster moving so policy development and evaluation needs to be faster and more joined up
  • Joined up government policy needs data linking for government, academia and third sector
  • Open data agenda – we want & need innovations from outside government.
  • Big Data – Public sector data uses catching up with private sector?
  • Efficiency and effectiveness of policy development, for individuals, society and the economy
  • International competitiveness – we can’t afford to react later than competitor countries – a 2 year disadvantage cannot be acceptable.
  • Need to move from what often became a default “No, not yet, too complicated, very risky, will take ages to work out” to a default “Yes, within sensible limits laid down by DPA and HRA”
  • Complex legislation beyond DPA and HRA. Statutory departments like HMRC have no powers to share data unless explicitly stated in legislation. Others have varying legislation that prohibits or permits data sharing. There are often explicit gateways which state what data can be shared with which organisation for what purpose. This has affected departments’ abilities to use common law powers. The result is highly complex arrangements that differ between departments, and it can be unclear whether a new sharing proposal meets existing legislative frameworks.
  • Need to get greater consistency between Departments to support clarity, confidence, efficiency and focus on privacy issues that would matter to the data subject. We need to share more de-identified data to respect their interests and address concerns. Best impact, best value.
  1. The Administrative Data Liaison Service gave a presentation entitled: Why greater sharing of de-identified data is desirable – an academic perspective.

Please see attached document: Legal barriers to and benefits of access to administrative data: Initial summary of responses to an email survey initiated by the ESRC funded Administrative Data Liaison Service.

  1. A representative of the Third Sector then spoke about two examples of where data sharing would have been beneficial to the NPC data lab. These are summarised in the attached case study paper.
  1. There was then an opportunity for questions.

Q. Are we saying there are things that we cannot currently do, when in fact we can currently do them but with great difficulty?

A. It is true that some data sharing can and does take place under current legislation; however this happens under dozens of different gateways. It is slow, inefficient and unwieldy. It is also open to interpretation and has led to lengthy, complex and costly legal negotiations and debates.

Q. This legislation would solve the legal barriers but is the greater barrier a cultural one?

A. Yes, this may well be true. However, this legislation is the first step towards changing the culture. We recognise that we need up to date, efficient research, and we have evidence that current legislation does not permit that. We appreciate that changing the culture surrounding data sharing will take time, but we believe that a transparent legislative process with agreed governance will kick start this change.

Q. So is legislation in order to ‘nudge’ cultural change appropriate?

A. Whilst we hope that legislation does lead to a cultural change, the legislation is being proposed predominantly to facilitate up to date and efficient research that is able to respond to a fast changing world.

7. A presentation by the ADRN on Governance and Transparency.

Initial procedures papers will be available at the end of May 2014.

There will be a UK Statistics Authority led governance board. They have held one meeting and a second is booked for early June. The board will oversee all ADRN activities. They will ratify all proposals and procedures.

All prospective ADRC users will have to be accredited. The full details of the accreditation process will be available in early June. Applications will be made to an Independent Approvals Panel. The panel will be formed of individuals who are independent of the ADRN. They will be an expert disinterested body. It is anticipated that this panel will meet ‘virtually’ on a monthly basis and face to face on a quarterly basis. Therefore, applicants will receive the outcome of their application in a timely manner. The ADRN will provide advice to researchers on preparing a report for the panel. Requirements will be laid out in full but are expected to include the need for applicants to:

  • be affiliated with an RCUK approved institution;
  • be a fit and proper person;
  • be able to carry out the research independently, or under the supervision of an appropriate supervisor;
  • have a clear understanding and approach to privacy;
  • undertake an agreed training programme (to include legal access, legal sanctions, disclosure control and manual output scrutiny)

Their project must:

  • have passed through an ethical review process;
  • have demonstrable public benefit;
  • not be related to any operational government function;
  • be feasible;
  • have sufficient scientific merit;
  • not be able to be carried out in any other way.

All research outputs must be made accessible in the public domain. Researchers must inform the ADRN of the publication and/or presentation of any research. Copies of the documentation, any code and/or syntax used must be made available to the ADRN. In addition, researchers must submit a summary of research in accessible language (maximum two pages) on completion.

Following consideration the panel can accept, reject or request further clarification on the proposal. There will be an appeals procedure for rejected proposals to the Board of the ADRN. The data holder can decline to provide the data at any time, regardless of the panel decision.

The ADRN will publish information on all researchers, project proposals and datasets used.

There is some uncertainty about the number of applications that are anticipated. ESRC have reviewed applications for current data labs and secure environments for early modelling. They have prepared an options analysis and strategies for the various application scale possibilities, including a slow start with a sharp increase, and a high early uptake. It is speculated that there may be some pent up demand that may create an early peak. ESRC will also programme research activities that make use of the resources.

8. A presentation on Handling of data – TTP and security. Please see document: An example of an existing data linkage model: The case of the Farr Institute @ Scotland.

There was also a presentation about the key features of Secure Anonymised Information Linkage (SAIL) in Wales. They have 320 individual data suppliers and have completed over 250 projects to date.

9. The ADRN gave some further information regarding training for prospective researchers. The training would be compulsory and would be delivered face to face. It aims to establish a culture of responsibility and confidentiality. It will provide education on the legal frameworks, personal responsibility and disclosure control. Pilot training will be rolled out during the test period.

Following training, the researcher will be required to sign a legal service agreement and a confidentiality agreement. These are countersigned by their institution. There will be a variety of sanctions and penalties that will be applied if there are any breaches of these agreements. The policy is currently under development and is based upon the UK Data Secure Lab policy. ESRC can withhold funding from both the individual and the institution. They can also ban the individual and/or the institution from future research.

10. Group work

A) Do you agree with the rationale for the proposal?

Those present agreed with the rationale as a means of achieving a level playing field in terms of vires to share de-identified data. The following reasons were given:

  • researchers currently don’t have a good understanding of what gateways are available;
  • the legislation supersedes lots of ‘bits’ of legislation that currently exist;
  • the current ‘odds and ends’ approach to legislation lacks transparency;
  • there are serious inefficiencies in the current process.

Two doubts were:

  • If the legislation was only used for ‘nudging’ attitudes then it was disproportionate; however a response to this criticism was that cultural change may result from the legislation, but its principal purpose was to create a level playing field, which was a proportionate use of legislation
  • Unsure how well the research strand sits with the other two strands

B) What are the risks?

  • Proposals are not the “cure all” and do not solve issues of budgets, culture and organisational resistance – but they may help the cultural issue.
  • Withdrawal of funding from the ADRC network.
  • Confusion in media around terms such as identified/de-identified/pseudonymised/disclosure control
  • Repeat of the problems around care.data

C) Are the safeguards appropriate?

Suggested safeguards for exploration and discussion

  • what options are there for independent external scrutiny, including independent security analysis and accreditation of the safe havens – or is UKSA the only body that can accredit the safe havens?
  • communication of the safeguards to the public is key, as vagueness leads to mistrust;
  • commercial companies can be partners but they will never see the data;
  • we must be clear that the safe havens never sell data;
  • we must be clear about exactly what purposes data will and will not be used for;
  • prioritise transparency.
  • In order to avoid confusion with care.data, data from NHS organisations should be excluded from the power – access for researchers to health data will be from the Health and Social Care Information Centre
  • explore whether Trusted Third Parties Indexers should be public sector organisations only – this might reduce fear that the data would be mishandled.
  • in the messaging around the proposal there would need to be a justification as to why individuals were not given the opportunity to “opt out” of having their data shared, given that such an opt out was made available for health data held by GPs in care.data. In response to this, it was pointed out that there are existing gateways for data to be shared where there is no opt out. In addition, the data that is the subject of these proposals would not be able to leave the safe haven.

11. Due to time restrictions it was requested that any further points or questions be made via correspondence.

12. Rufus proposed that the next stage of the process be carried out by correspondence. The next meeting on de-identified data will be at 70 Whitehall on Friday 13 June 13.00-15.00.

This entry was posted in Meeting notes by Tim Hughes.