WO2014075836A1 - Pseudonymisation and re-identification of identifiers - Google Patents

Pseudonymisation and re-identification of identifiers

Info

Publication number
WO2014075836A1
Authority
WO
WIPO (PCT)
Prior art keywords
identifier
identification
pseudonymisation
pseudonym
encrypted
Prior art date
Application number
PCT/EP2013/069319
Other languages
French (fr)
Inventor
Harald AAMOT
Original Assignee
Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts
Publication of WO2014075836A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G06F21/6254 Protecting personal data, e.g. for financial or medical purposes by anonymising data, e.g. decorrelating personal data from the owner's identification

Definitions

  • the invention relates to pseudonymisation and re-identification of identifiers, in particular, of patient identifiers, e.g. for translational research, such as cancer research.
  • a patient identifier is usually encrypted to provide a pseudonym that is forwarded to research institutes or scientists along with the patient's medical data.
  • a known method for providing pseudonyms is shown in Fig. 1.
  • the patient in Fig. 1 is associated with a PID 111.
  • the physician enters the patient's data including the patient's PID and his or her medical data into a system, which encrypts the PID to generate a cipher 122.
  • the cipher 122 is pseudonymised to provide a pseudonym 123.
  • the encryption method is a non-injective encryption method.
  • the medical data of a single patient are associated with different pseudonyms 123. Therefore, the scientists operating on the patient's medical data cannot determine whether two sets of medical data belong to the same patient or to different patients.
  • next generation sequencing methods have provided a wealth of new possibilities for the characterization of tumors of individual patients and laid the basis for new treatment possibilities. If research is conducted with anonymized data as described above, the ability to make a retrospective linkage to clinically relevant information, which is a necessity in biomedical translational research, is barred. Anonymization withdraws the possibility to re-identify a corresponding patient, thus a direct benefit from research results for this patient is impossible. This is also an ethical constraint, as under defined circumstances, individual research results in genetic and genomic research might lead to new treatment possibilities and therefore should be offered to study participants in a clinically relevant timeframe.
  • next-generation-sequencing technologies and other high-throughput methods imply that many different persons and often external organizations are involved in the data collection and analysis process and therefore impose additional risks to patient privacy.
  • a secure pseudonymisation is desirable. Therefore, a data protection and privacy concept with solid pseudonymisation and a streamlined re-identification process to transfer the results back to the clinic is needed.
  • a trusted third party pseudonymises a patient identifier PID to yield a pseudonym PSN.
  • the PSN may then be provided to a scientist along with the patient's medical data.
  • the TTP may be asked for the PID associated with a specific PSN.
  • the pseudonymisation method shown in Fig. 2 is reversible, e.g., by using mapping tables or mapping functions. To preserve privacy, this mapping has to be secure.
  • the invention provides a pseudonymisation method, wherein the method comprises receiving an identifier, and in response thereto: Encrypting said identifier using at least a first injective encryption method to provide a pseudonym, encrypting said identifier using a public key of an asymmetric encryption key system to provide an encrypted identifier, and storing said encrypted identifier along with said pseudonym.
  • the injective encryption method may be a strictly injective encryption method, i.e. a method that provides injective encryption in a strict mathematical sense.
  • any other method of mapping an identifier on a pseudonym may alternatively be used that does not provide a collision between any two identifiers.
  • the injective encryption method for all practical purposes of this disclosure, may be a method of generating a unique pseudonym for each identifier.
  • the method thus generates different pseudonyms, while the same pseudonym is generated when the method is applied repeatedly to the same identifier.
  • a collision-resistant hashing method such as HMAC or PBKDF2
  • the injective encryption method may thus e.g. comprise a strictly injective encryption method or a collision-resistant hashing method.
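
A minimal sketch of the collision-resistant hashing variant described above, assuming HMAC-SHA256 from the Python standard library. The key value and the function name `make_pseudonym` are illustrative assumptions and do not come from the disclosure.

```python
import hashlib
import hmac

# Hypothetical secret key held by the pseudonymisation system (assumption).
HMAC_KEY = b"example-pseudonymisation-key"


def make_pseudonym(identifier: str, key: bytes = HMAC_KEY) -> str:
    """Map an identifier to a deterministic keyed-hash pseudonym (hex string)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

Applying `make_pseudonym` twice to the same identifier yields the same pseudonym, while different identifiers yield different pseudonyms for all practical purposes.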
  • the identifier is a patient identifier
  • multiple sets of medical data may easily be associated with the same patient.
  • the patient identifier may easily be re-identified by decrypting the encrypted patient identifier. Re-identification thus does not require that the first encryption method may easily or quickly be reversed. In contrast, the method does not require any decryption of the pseudonym itself at all. Moreover, re-identification does not require a table including the pseudonyms and the associated identifiers. Hence, the pseudonymisation method of the invention provides a high level of security.
  • the identifier may comprise a number, a text, a combination thereof or any other information.
  • the identifier is uniquely associated with a concrete or abstract person or object, such as a group, entity, organization, unit, account, transaction, or any combination thereof.
  • the identifier may be a patient identifier, a bank account identifier, a data retention/preservation identifier in telecommunications or a financial transaction identifier.
  • the private key of the asymmetric encryption key system is not needed.
  • the private key of the asymmetric encryption key system is not stored or otherwise present at the entity performing the pseudonymisation method, such as, for example, a pseudonymisation system, such that this entity may not re-identify a given pseudonym.
  • the private key may in some instances only be stored at a re-identification system, which is separate from the pseudonymisation system, or may only be available to an ombudsman.
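
The core method (derive a pseudonym, encrypt the identifier with a public key, store both together) can be sketched as follows. This is an illustration under stated assumptions: the class and function names are hypothetical, HMAC-SHA256 stands in for the injective method, and the asymmetric step uses toy textbook RSA without padding, which is not secure and only serves to keep the example self-contained. A real deployment would use a vetted cryptographic library.

```python
import hashlib
import hmac


def toy_keypair():
    """Return ((n, e), (n, d)) for a TOY textbook-RSA key pair.

    Illustration only: no padding and a small modulus, so this is NOT secure.
    """
    p, q = 2**107 - 1, 2**127 - 1          # two known Mersenne primes
    n, e = p * q, 65537
    d = pow(e, -1, (p - 1) * (q - 1))      # modular inverse (Python 3.8+)
    return (n, e), (n, d)


class PseudonymisationSystem:
    """Hypothetical system that holds a keyed-hash secret and the PUBLIC key only."""

    def __init__(self, hmac_key: bytes, public_key):
        self._hmac_key = hmac_key
        self._public_key = public_key
        self.reid_memory = {}              # pseudonym -> encrypted identifier

    def pseudonymise(self, identifier: str) -> str:
        raw = identifier.encode("utf-8")
        # Step 1: map the identifier to a pseudonym (keyed hash).
        psn = hmac.new(self._hmac_key, raw, hashlib.sha256).hexdigest()
        # Step 2: encrypt the identifier with the public key.
        n, e = self._public_key
        m = int.from_bytes(raw, "big")
        if m >= n:
            raise ValueError("identifier too long for the toy modulus")
        # Step 3: store the encrypted identifier along with the pseudonym.
        self.reid_memory[psn] = pow(m, e, n)
        return psn
```

Because the object never receives the private key, it can create and store entries but cannot decrypt them, which mirrors the separation between pseudonymisation and re-identification described above.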
  • said step of encrypting said identifier to provide a pseudonym comprises encrypting said identifier using said first injective encryption method to provide an encrypted code, and encrypting said encrypted code using at least a second injective encryption method to provide said pseudonym.
  • two encryption methods are used to provide the pseudonym. Hence, the security of pseudonymisation is increased.
  • the first and the second injective encryption methods may be the same or may be different.
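
A sketch of the two-stage variant, assuming HMAC-SHA256 as the first method and PBKDF2-HMAC-SHA256 as the second. The key, salt and iteration count are hypothetical values chosen for illustration; the disclosure leaves the choice of methods open.

```python
import hashlib
import hmac

FIRST_KEY = b"example-first-stage-key"      # hypothetical secret
SECOND_SALT = b"example-second-stage-salt"  # hypothetical fixed salt
ITERATIONS = 10_000                         # hypothetical work factor


def first_stage(identifier: str) -> bytes:
    """First method: identifier -> encrypted code."""
    return hmac.new(FIRST_KEY, identifier.encode("utf-8"), hashlib.sha256).digest()


def second_stage(encrypted_code: bytes) -> str:
    """Second method: encrypted code -> pseudonym (hex string)."""
    return hashlib.pbkdf2_hmac("sha256", encrypted_code, SECOND_SALT, ITERATIONS).hex()


def pseudonymise(identifier: str) -> str:
    """Chain both stages; the result is deterministic for a given identifier."""
    return second_stage(first_stage(identifier))
```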
  • the method further comprises, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key: Encrypting said identifier using said additional public key to provide an additional encrypted identifier, and storing said additional encrypted identifier along with said pseudonym.
  • the additional asymmetric encryption key systems may each be associated with a different re-identification system, ombudsman or clinic. This allows re-identification of a given pseudonym using any one of the asymmetric encryption key systems. It is therefore not required to contact a specific individual dedicated ombudsman, but re-identification may be performed by any ombudsman having a private key of any of the asymmetric encryption key systems used for encryption. This is also beneficial in case one of the private keys is lost. In that case, the private key of any of the other ombudsmen may be used to decrypt the corresponding encrypted identifier.
  • the method further comprises receiving a data request comprising a requested pseudonym, and, in response thereto, retrieving a requested encrypted identifier as an encrypted identifier that was previously stored along with said requested pseudonym and, in response thereto, returning said requested encrypted identifier.
  • This embodiment allows fast re-identification as only the encrypted identifier stored along with the requested pseudonym needs to be returned rather than the entire contents of the re-identification memory.
  • the data request is received from a re-identification system and the requested encrypted identifier is returned to said re-identification system.
  • the method further comprises the step of outputting said pseudonym.
  • the step of receiving the identifier may comprise receiving the identifier via an interface such as a user interface and the method may comprise outputting said pseudonym via said interface.
  • the pseudonym may be output via output means which are different from input means used to receive the identifier.
  • the method may comprise storing the pseudonym, e.g. in a database such as a research database, a transaction database, an account database, etc.
  • the method further comprises receiving data, in particular, medical data.
  • the method may further comprise outputting said pseudonym along with said data, e.g. via output means, and/or storing said pseudonym along with said data.
  • the method further comprises encrypting the data in response to receiving them and before outputting and/or storing them.
  • the method further comprises, in response to receiving the identifier, a step of determining whether the identifier has been encrypted to provide a pseudonym before, preferably using a pseudonymisation database.
  • These embodiments may further comprise, when it is determined that the identifier has not been encrypted to provide a pseudonym before, storing an indication in the pseudonymisation database.
  • the indication may, for example, comprise the identifier or any result of an injective mapping thereof, such as a pseudonym or an encrypted code provided for the identifier as specified above.
  • the steps of encrypting the identifier to provide an encrypted identifier and of storing the encrypted identifier along with the pseudonym are performed for those identifiers, for which it was determined that the identifier has not been encrypted to provide a pseudonym before.
  • the method may further comprise retrieving the pseudonym from the pseudonymisation database in response to determining that the identifier has been encrypted to provide a pseudonym before. In some of these embodiments, the method may further comprise storing the pseudonym in the pseudonymisation database in case it was determined that the identifier has not been encrypted to provide a pseudonym before.
  • the method further comprises the step of querying a pseudonymisation database for the encrypted code and, when the encrypted code is found in the pseudonymisation database, retrieving the pseudonym from the pseudonymisation database.
  • the method may further comprise, when the encrypted code is not found in the pseudonymisation database, in response to encrypting the identifier to provide the pseudonym, storing the pseudonym along with the encrypted code in the pseudonymisation database.
  • the steps of encrypting the identifier to provide the encrypted identifier and of storing the encrypted identifier along with the pseudonym are performed in response to the encrypted code not being found in the pseudonymisation database.
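
The lookup logic around the pseudonymisation database can be sketched with an in-memory SQLite table. The table layout, the key values and the function names are assumptions made for this example, and HMAC-SHA256 is used for both stages.

```python
import hashlib
import hmac
import sqlite3
from typing import Tuple

FIRST_KEY = b"example-first-stage-key"     # hypothetical secret
SECOND_KEY = b"example-second-stage-key"   # hypothetical secret


def open_pseudonymisation_db() -> sqlite3.Connection:
    """Create an in-memory pseudonymisation database (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE pseudonyms ("
        "encrypted_code TEXT PRIMARY KEY, pseudonym TEXT NOT NULL)"
    )
    return db


def lookup_or_create(db: sqlite3.Connection, identifier: str) -> Tuple[str, bool]:
    """Return (pseudonym, is_new); is_new is True if the code was not found."""
    code = hmac.new(FIRST_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    row = db.execute(
        "SELECT pseudonym FROM pseudonyms WHERE encrypted_code = ?", (code,)
    ).fetchone()
    if row is not None:
        return row[0], False               # known identifier: reuse the pseudonym
    psn = hmac.new(SECOND_KEY, code.encode("ascii"), hashlib.sha256).hexdigest()
    db.execute(
        "INSERT INTO pseudonyms (encrypted_code, pseudonym) VALUES (?, ?)",
        (code, psn),
    )
    db.commit()
    return psn, True
```

A caller would encrypt the identifier with the public key and store the encrypted identifier only when `is_new` is `True`, so that the asymmetric step runs once per identifier.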
  • the method further comprises, in response to a deletion request, deleting the encrypted identifier(s), e.g. from the re-identification memory and/or the re-identification database.
  • the deletion request may e.g. include the identifier, the encrypted identifier and/or the pseudonym.
  • the re-identification data stored for a given identifier may be deleted, such that future re-identification is prevented. This may e.g. be useful when, initially having given his or her permission to store data to allow re-identification, a patient later revokes the permission.
  • the method further comprises, in response to a deletion or change request, deleting or changing the pseudonym, e.g. in the re-identification memory and/or the re-identification database.
  • the deletion or change request may e.g. include the identifier, the encrypted identifier and/or the pseudonym.
  • the re-identification data stored for a given identifier or pseudonym may be deleted or changed, such that future re-identification is prevented. This may e.g. be useful when a patient's identity has been revealed for a given pseudonym or an encryption method used to generate the pseudonym is not considered safe anymore.
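
A deletion request that carries either the identifier or the pseudonym might be handled as sketched here. The re-identification memory is modelled as a plain dictionary and the names are hypothetical; HMAC-SHA256 is assumed as the pseudonymisation method.

```python
import hashlib
import hmac
from typing import Dict, Optional

HMAC_KEY = b"example-pseudonymisation-key"   # hypothetical secret


def make_pseudonym(identifier: str) -> str:
    return hmac.new(HMAC_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def handle_deletion_request(
    reid_memory: Dict[str, bytes],
    identifier: Optional[str] = None,
    pseudonym: Optional[str] = None,
) -> bool:
    """Delete the re-identification entry; return True if an entry was removed."""
    if pseudonym is None:
        if identifier is None:
            raise ValueError("request must include an identifier or a pseudonym")
        pseudonym = make_pseudonym(identifier)   # recompute, no lookup table needed
    return reid_memory.pop(pseudonym, None) is not None
```

Once the entry is gone, a later data request for that pseudonym finds no encrypted identifier, so re-identification is no longer possible through this memory.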
  • the steps of receiving the identifier and/or receiving the data request may use an interface, such as e.g. a PHP (Hypertext Preprocessor) interface or a Web Service (e.g. SOAP) interface.
  • a PHP Hypertext Preprocessor
  • SOAP Web Service
  • the latter is preferred as it allows incorporating the interface in a desired software program, web page, office application, such as a Microsoft Office™ application, or the like.
  • it allows the use of a broad variety of programming languages to implement the method of the invention.
  • the method further includes printing the pseudonym and/or displaying the pseudonym on a screen.
  • the pseudonym may be printed in text, such as plain text, and/or as a barcode, such as a Code39, Code128 and/or 2-dimensional barcode, such as a QR (Quick Response) code, e.g. on a piece of paper, a sticker, a label, such as an adhesive label, a tag, an adhesive tape, or the like.
  • any or all of the pseudonymisation method may be performed by a pseudonymisation system and, in particular, by a pseudonymisation logic included therein.
  • the invention provides a pseudonymisation system, comprising input means adapted to receive identifiers, a re-identification memory and pseudonymisation logic coupled to said input means and to said re-identification memory.
  • Said pseudonymisation logic is adapted to: Receive an identifier via the input means, and in response thereto: encrypt said identifier using at least a first injective encryption method, in particular, using a Keyed-Hash Message Authentication Code (HMAC) method or a Password-Based Key Derivation Function 2 (PBKDF2) method, to provide a pseudonym, encrypt said identifier using a public key of an asymmetric encryption key system to provide an encrypted identifier, and store said encrypted identifier along with said pseudonym in said re-identification memory.
  • HMAC Keyed-Hash Message Authentication Code
  • PBKDF2 Password-Based Key Derivation Function 2
  • the re-identification memory may be a buffer, e.g. only storing entries for a single pseudonym.
  • the re-identification memory is adapted to be coupled to an internal or external re-identification database and further adapted to transmit all or parts of its content to the re-identification database in response to the pseudonymisation logic encrypting the identifier to provide the pseudonym and the encrypted identifier.
  • the re-identification memory or the re-identification database may e.g. be implemented using Microsoft Access™, Microsoft SQL Server™, such as versions 2000, 2005, 2008, 2012, MySQL™, PostgreSQL™, Oracle™ or IBM DB2™. Other database management systems may be used in alternative embodiments.
  • the re-identification memory comprises previously provided pseudonyms along with encrypted identifiers provided for the same identifier. It is preferred that, for every identifier received via the input means, at least one encrypted identifier and a pseudonym are stored in the re-identification memory. This may comprise that, in case it is determined that the identifier has not been encrypted before to provide a pseudonym, the pseudonymisation logic stores the encrypted identifier along with the pseudonym in the re-identification memory in response to encrypting the identifier to provide the pseudonym and encrypting the identifier to provide the encrypted identifier.
  • the pseudonymisation system further comprises output means coupled to the pseudonymisation logic and the pseudonymisation logic is further adapted to, in response to encrypting the identifier to provide a pseudonym, output said pseudonym via said output means.
  • the pseudonymisation system further comprises a public key memory coupled to said pseudonymisation logic, wherein said public key memory is adapted to hold said public key of said asymmetric encryption key system and wherein said pseudonymisation logic is adapted to retrieve said public key from said public key memory, wherein said public key memory is preferably detachable from said pseudonymisation logic.
  • the pseudonymisation system further comprises coupling means adapted to couple the public key memory to the pseudonymisation logic. Allowing the public key memory to be detached from the pseudonymisation logic allows for easy replacement of public keys and for storing it in a safe place when not needed.
  • said pseudonymisation logic is adapted to encrypt said identifier to provide a pseudonym by encrypting said identifier using said first injective encryption method to provide an encrypted code, and encrypting said encrypted code using at least a second injective encryption method to provide said pseudonym.
  • the pseudonymisation system further comprises a pseudonymisation database coupled to the pseudonymisation logic.
  • the pseudonymisation logic may be adapted to store the identifier, the encrypted code or the pseudonym in the pseudonymisation database as specified in more detail above.
  • said first injective encryption method comprises an AES-256 encryption, in CBC mode with a predetermined initialization vector or in ECB mode, that is used to encrypt the identifier.
  • AES Advanced Encryption Standard
  • said first injective encryption method comprises at least one of a Keyed-Hash Message Authentication Code (HMAC) method and a Password-Based Key Derivation Function 2 (PBKDF2) method.
  • the Keyed-Hash Message Authentication Code is specified in the Federal Information Processing Standards Publication 198 available from http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf and
  • the HMAC allows for injective encryption and thus provides a unique encrypted code for each given identifier.
  • the HMAC can further be used directly as a pseudonym, because it is secured against dictionary attacks with a cryptographic key.
  • the PBKDF2 adds additional entropy to a one-way function like the HMAC.
  • an AES-256 encryption e.g. in CBC mode with a predetermined initialization vector or in ECB mode, of the identifier is used to provide an encrypted code, and subsequently, the Keyed-Hash Message Authentication Code (HMAC) method and/or the Password-Based Key Derivation Function 2 (PBKDF2) are applied to the encrypted code to provide the pseudonym.
  • said pseudonymisation logic is adapted to, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key: Encrypt said identifier using said additional public key to provide an additional encrypted identifier, and store said additional encrypted identifier along with said pseudonym in said re-identification memory.
  • the pseudonymisation logic may, in particular, be adapted to perform these steps in response to receiving the identifier via the input means.
  • the pseudonymisation system further comprises a re-identification interface adapted to receive data requests and coupled to said pseudonymisation logic, wherein said pseudonymisation logic is further adapted to: Receive a data request comprising a requested pseudonym via said re-identification interface, and in response thereto: Retrieve a requested encrypted identifier from said re-identification memory as an encrypted identifier that is stored along with said requested pseudonym in said re-identification memory and return said requested encrypted identifier via said re-identification interface.
  • said pseudonymisation logic comprises separate first and second logic portions coupled to each other, wherein said first logic portion is coupled to said input means and is adapted to receive said identifier via said input means and encrypt said identifier using at least said first injective encryption method to provide an intermediate pseudonymised code.
  • the second logic portion is coupled to said re-identification memory and is adapted to receive said intermediate pseudonymised code from said first logic portion, encrypt said intermediate pseudonymised code using at least a third injective encryption method to provide said pseudonym, and store said encrypted identifier along with said pseudonym in said re-identification memory.
  • the first or second logic portions may be adapted to encrypt the identifier using the public key of the asymmetric encryption key system to provide the encrypted identifier.
  • the first and second logic portions may be spatially separated from each other.
  • the first and second logic portions may, for example, be coupled via a network like the Internet, a wide area network (WAN) or an intranet.
  • WAN wide area network
  • this embodiment allows a fast re-identification of a pseudonym by an authorized re-identification system or ombudsman, as the encrypted identifier and the pseudonym are stored in the re-identification memory coupled to the second logic portion.
  • the input means further incorporates a Reverse Turing Test (RTT) to prevent abusive repetitive service usage.
  • RTT Reverse Turing Test
  • the pseudonymisation system can determine if a machine or a human is using the service and refuse service use to a machine.
  • the invention provides a re-identification method comprising: Receiving a re-identification request including a requested pseudonym, transmitting, in response to receiving said re-identification request, a data request including said requested pseudonym to a pseudonymisation system or a re-identification database, receiving, in response to transmitting said data request, a requested encrypted identifier, and decrypting, in response to receiving said requested encrypted identifier, said requested encrypted identifier using a private key of an asymmetric encryption key system to provide a decrypted identifier.
  • This method provides a fast re-identification as it only requires requesting an encrypted identifier from a single entity, namely, the pseudonymisation system or a re-identification database which may be external or internal to a re-identification system performing the steps of the re-identification method. Moreover, with the method of the invention, it is not necessary to transmit the private key of the asymmetric encryption key system over a network to perform the re-identification, such that security is further improved.
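
The re-identification steps can be sketched as follows. The class name and the `request_encrypted_identifier` callable (standing in for the data interface to the pseudonymisation system or re-identification database) are hypothetical. As a loud caveat, the asymmetric step again uses toy textbook RSA without padding, which is not secure and is only meant to keep the sketch self-contained.

```python
from typing import Callable, Optional, Tuple


def toy_keypair():
    """Return ((n, e), (n, d)) for a TOY textbook-RSA key pair (NOT secure)."""
    p, q = 2**107 - 1, 2**127 - 1          # two known Mersenne primes
    n, e = p * q, 65537
    d = pow(e, -1, (p - 1) * (q - 1))      # modular inverse (Python 3.8+)
    return (n, e), (n, d)


class ReidentificationSystem:
    """Hypothetical re-identification logic holding the PRIVATE key locally."""

    def __init__(
        self,
        private_key: Tuple[int, int],
        request_encrypted_identifier: Callable[[str], Optional[int]],
    ):
        self._private_key = private_key
        self._request = request_encrypted_identifier

    def reidentify(self, requested_pseudonym: str) -> Optional[str]:
        # Transmit the data request and receive the encrypted identifier.
        cipher = self._request(requested_pseudonym)
        if cipher is None:
            return None                    # no entry stored for this pseudonym
        # Decrypt locally; the private key never leaves this system.
        n, d = self._private_key
        m = pow(cipher, d, n)
        return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")
```

Only the pseudonym and the encrypted identifier cross the data interface, which reflects the point above that the private key does not need to be transmitted over a network.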
  • the steps of receiving the re-identification request, transmitting the data request and/or receiving the requested encrypted identifier may use an interface, such as e.g. a PHP interface or a SOAP/Web Service interface, as specified above.
  • the method further comprises, e.g. in response to receiving the re-identification request, combining at least two partial keys to obtain the private key of the asymmetric encryption key system.
  • the partial keys may e.g. be obtained from different ombudsmen.
  • the partial keys may be obtained in response to receiving the re-identification request.
  • the partial keys may each be stored in a respective private key memory as described below, such as e.g. on different chip cards.
  • the generated private key may be deleted after decrypting the encrypted identifier to further improve data security.
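
The text does not say how partial keys are combined. One simple possibility, shown here purely as an assumption, is a two-share XOR split of the serialized private key: each ombudsman holds one share, and neither share alone reveals the key. The function names are hypothetical.

```python
import secrets
from typing import Tuple


def split_key(private_key: bytes) -> Tuple[bytes, bytes]:
    """Split a serialized private key into two XOR shares (illustrative scheme)."""
    share_a = secrets.token_bytes(len(private_key))            # random pad
    share_b = bytes(x ^ y for x, y in zip(private_key, share_a))
    return share_a, share_b


def combine_partial_keys(share_a: bytes, share_b: bytes) -> bytes:
    """Recombine both shares to recover the private key."""
    if len(share_a) != len(share_b):
        raise ValueError("partial keys must have the same length")
    return bytes(x ^ y for x, y in zip(share_a, share_b))
```

In line with the passage above, the recombined key would be held only for the duration of one decryption and discarded afterwards.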
  • the invention provides a re-identification system comprising input and output means, a data interface and re-identification logic coupled to said input and output means and to said data interface.
  • Said re-identification logic is adapted to: Receive a re-identification request via said input means, said re-identification request including a requested pseudonym, transmit, in response to receiving said re-identification request, a data request including said requested pseudonym to a pseudonymisation system or a re-identification database via said data interface, receive, in response to transmitting said data request, a requested encrypted identifier via said data interface, and decrypt said requested encrypted identifier using a private key of an asymmetric encryption key system to provide a decrypted identifier, and output said decrypted identifier via said output means.
  • the re-identification system further comprises a private key memory including said private key of said asymmetric encryption key system, said private key memory coupled to said re-identification logic, and wherein said re-identification logic is adapted to retrieve said private key from said private key memory, wherein, preferably, said private key memory is detachable from said re-identification logic.
  • the re-identification system further comprises a key interface coupled to the re-identification logic and further adapted to be coupled to a private key memory.
  • the private key memory may be part of the re-identification system or may be separate. These embodiments allow the private key memory to be detached from the re-identification logic and to be stored in a safe place when not needed.
  • the private key memory may, e.g. be included in a chip card, for example, including an RF chip.
  • the key interface may comprise a card reader.
  • the key interface may be adapted to be coupled to the private key memory in a wired or a wireless manner, e.g. via electrical contacts or contactless.
  • the key interface may have an antenna and a receiver or transceiver electrically coupled to the antenna, to wirelessly couple the key interface to the private key memory.
  • the private key memory may also include or be electrically coupled to an antenna.
  • the private key memory may, e.g. be integrated in an authorization memory, such as an identity card that a user, such as an ombudsman, may already possess for other purposes.
  • the private key memory may be integrated on a private chip card, an official identity card, such as an electronic passport or identification (ID) card, an electronic access card for a building, a vehicle, etc., a badge, such as an employee identification badge, an identity card issued by a company, such as an insurance card, an identity card issued by an organization, such as a membership card, or the like.
  • the private key memory is integrated in a Health Professional Card
  • the key interface comprises a card reader adapted to couple to and read from a Health Professional Card.
  • the Health Professional Card may be an identity card that is issued to a physician or a health institution by a governmental or non-governmental institution, such as a health insurance company or an association of physicians.
  • the key interface may comprise an eHealth-BCS terminal.
  • the private key memory stores the private key as a protected key.
  • the private key may be protected by a personal identification number (PIN), a password/passphrase and/or a hardware security module (HSM).
  • PIN personal identification number
  • HSM hardware security module
  • the re-identification server may request a specific entry of the re-identification memory or database, in which encrypted identifiers are stored along with a pseudonym. Even in case an unauthorized entity should gain access to the re-identification memory or database, the information provided therein will be useless to this entity as long as it does not possess the private key of the asymmetric encryption key system used for encryption. As these entries thus do not directly allow for re-identification, e.g., of a patient, they may be made available to a requesting entity without requiring a high level of security checking.
  • the data request includes an identification indicator of the re-identification system, an ombudsman and/or an asymmetric encryption key system.
  • the pseudonymisation logic of the pseudonymisation system may be adapted to retrieve a requested encrypted identifier from the re-identification memory that was provided using the public key of the re-identification system, ombudsman or an asymmetric encryption key system, respectively, indicated by the indicator.
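
When several key systems are in use, the re-identification memory can be pictured as a mapping from pseudonym to one encrypted identifier per key system, with the indicator in the data request selecting the right entry. The class, method and parameter names here are hypothetical, and the encrypted identifiers are treated as opaque bytes.

```python
from typing import Dict, Optional


class ReidentificationMemory:
    """Hypothetical store: pseudonym -> {key system indicator -> encrypted identifier}."""

    def __init__(self) -> None:
        self._entries: Dict[str, Dict[str, bytes]] = {}

    def store(self, pseudonym: str, key_id: str, encrypted_identifier: bytes) -> None:
        """Store one encrypted identifier per key system along with the pseudonym."""
        self._entries.setdefault(pseudonym, {})[key_id] = encrypted_identifier

    def handle_data_request(self, pseudonym: str, key_id: str) -> Optional[bytes]:
        """Return the entry encrypted for the indicated key system, if any."""
        return self._entries.get(pseudonym, {}).get(key_id)
```

A requester identified as, say, `"ombudsman-a"` receives only the ciphertext produced with that ombudsman's public key, which is the one its own private key can decrypt.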
  • the invention provides a computer-readable medium having instructions stored thereon that, when executed by a computer, cause the computer to perform the method of any of the aforementioned kinds.
  • the method may in some embodiments be implemented using a programming language.
  • these embodiments allow for easy integration of all or parts of the method in a conventional hospital information system.
  • the method may be implemented using the language PHP.
  • PHP is a scripting language that can be embedded into HTML and may be interpreted by a hypertext pre-processor.
  • Object oriented programming is supported and can be combined with scripting and rapid web development.
  • Fig. 1 illustrates a known method for encryption and pseudonymisation
  • Fig. 2 illustrates known pseudonymisation and re-identification methods
  • Fig. 3 illustrates embodiments of the pseudonymisation and re-identification systems of the invention
  • Fig. 4 illustrates embodiments of the pseudonymisation and re-identification systems of the invention
  • Fig. 5 illustrates a pseudonymisation method according to an embodiment
  • Fig. 6 illustrates a re-identification method according to an embodiment
  • Fig. 7 illustrates a pseudonymisation system according to an embodiment
  • Fig. 8 illustrates a re-identification system according to an embodiment
  • Fig. 9 shows a user interface by which a user may enter a patient identifier according to an embodiment
  • Fig. 10 shows another user interface by which the user is informed about a pseudonym according to an embodiment
  • Fig. 11 illustrates pseudonymisation and re-identification systems according to an embodiment.
  • a hospital information system may have patient registration modules that assign a unique patient identifier (PID) to a patient to constitute a mature identity management.
  • PID patient identifier
  • SPL sample processing lab
  • biospecimen like blood and tissue have to be annotated with contextual information like project name, tissue source site, sample, or study participant.
  • Working with personal data at a SPL is not necessary and therefore not acceptable, although patients may have consented to the processing of personal data where needed.
  • a pseudonym PSN instead of personal data or the PID is attached to a sample that is sent to a SPL.
  • pseudonymisation of the PID into a PSN is performed before a sample and its associated data leaves the hospital to get processed somewhere else.
  • the PSN provided using the method of the invention is repetitively unambiguous for a patient, so that several samples from the same patient like, e.g., a tumor and a subsequent control sample can be correlated by the researchers and also utilized for secondary use.
  • the invention allows re-identification by several persons independently from each other. Each of these persons is able to compute back the PID from a PSN, i.e., to perform re-identification. In the context of this disclosure, these persons are called ombudsmen, because they safeguard the privacy of the patients but are able to identify individuals from a pseudonym if this is necessary.
  • any public key used for encrypting the patient identifier to provide an encrypted patient identifier may be made available to other entities, organisations, systems, servers and/or people, depending on the requirements of the application.
  • the present invention provides a safe pseudonymisation and re-identification, e.g. with regard to the following threats:
  • a system or hardware theft is a serious threat.
  • a dictionary attack is possible.
  • a dictionary attack with lookup or rainbow tables can be made uneconomical by applying a deterministic Salt to the HMAC.
  • the computing cost of a dictionary attack by an insider with the secret key may be controlled in a linear way, e.g. by applying the function PBKDF2.
  • a high iteration parameter for PBKDF2 and security measures against physical theft and hijacking provide effective security against this threat.
  • Social engineering regarding the ombudsmen is a serious threat.
  • a measure that must be taken against social engineering is the education of the ombudsmen so they can develop consciousness.
  • An additional or alternative measure is to apply a personal identification number (PIN) or passphrase to the private key of their asymmetric encryption system as specified above.
  • PIN personal identification number
  • RTT Reverse Turing Test
  • Fig. 3 shows pseudonymisation and re-identification systems that are provided according to an embodiment of the invention.
  • medical data (MDAT) 18 and a patient identifier (PID) 11 are entered into a computer.
  • the MDAT 18 and the PID 11 are then transferred to a trusted third party (TTP) 2 which operates as a pseudonymisation system. Transmission of MDAT 18 and PID 11 from the hospital 1 to TTP 2 may be unencrypted or encrypted using any known method.
  • at TTP 2, a private key of the TTP is used to compute a unique HMAC 22 of the PID 11 using a first injective encryption method.
  • the HMAC 22 is then injectively mapped on a pseudonym (PSN) 23 using a second injective encryption method.
  • the PSN 23 is then transmitted from the TTP 2 to a research database 3 along with the patient's medical data 18.
  • the HMAC 22 may directly be used as a pseudonym, such that the step 27 may be omitted. Accordingly, in these embodiments, the HMAC 22 is transferred to the research database 3.
  • the TTP 2 provides three different encrypted patient identifiers 29 from the PID 11 using different public keys 31 of three different asymmetric encryption key systems associated with different ombudsmen 4.
  • each of the ombudsmen is associated with a different re-identification system.
  • the private keys of the asymmetric encryption key systems are not stored or otherwise available to TTP 2.
  • each private key is stored at a respective re-identification system associated with the asymmetric key system, or is otherwise available to one of the ombudsmen 4.
  • TTP 2 stores each of the encrypted PIDs 29 along with the PSN 23.
  • the encrypted PIDs 29 are stored along with the PSN 23 in a re-identification memory of the TTP 2.
  • In case one of the ombudsmen 4 seeks to obtain the decrypted PID for a given requested pseudonym, the ombudsman 4 sends a data request to the TTP 2 and thereby notifies TTP 2 of the requested PSN. TTP 2 then checks the re-identification memory for the requested PSN and provides the ombudsman 4 with the corresponding encrypted PID that was previously stored along with the requested PSN. In particular, TTP 2 may return the encrypted PID stored along with the requested PSN that was previously provided using the public key 31 of the asymmetric key system associated with the ombudsman 4 from which the data request was received. The ombudsman 4 then uses the private key 32 of his asymmetric encryption system to decrypt the encrypted PID 29 received from the TTP 2. As a result of the decryption, a decrypted PID is obtained.
  • Fig. 4 illustrates pseudonymisation and re-identification systems that are similar to the systems illustrated in Fig. 3.
  • the PID 11 included in the patient identifying data (IDAT) 12 is encrypted using the PBKDF2 method as a first encryption method.
  • the PBKDF2 uses a secret symmetric key of a pseudonymisation service provider 2' and further adds additional entropy called Salt.
  • the PBKDF2 is applied with a deterministic salt (dSalt). This avoids the determination problem which was previously described with reference to Fig. 1.
  • the PBKDF2 provides a derived key (DK) which corresponds to an encrypted code 22' for the PID 11.
  • the DK is further used to provide a pseudonym PSN 23' by a second encryption method.
  • the pseudonym 23' is then stored along with the patient's medical data MDAT 18 for secondary use in a research database 3'.
  • the PID 11 is also encrypted to provide a plurality of encrypted PIDs 29'.
  • Each of these encryptions uses a different public key 31' of an asymmetric key system associated with a different ombudsman 4'.
  • the encrypted PIDs 29' are stored along with the pseudonym PSN 23' in a re-identification database.
  • the re-identification database is separate from the pseudonymisation system and the re-identification system, while, in other embodiments, it may be comprised by any of these.
  • An ombudsman 4' seeking to re-identify a given requested PSN 23' may query the re-identification database for an encrypted PID 29' that corresponds to the PSN 23' and which was provided using the public key 31' of that ombudsman's 4' asymmetric encryption system. The ombudsman 4' may then decrypt the encrypted PID 29' retrieved from the re-identification database using the private key 32' of his asymmetric encryption system.
  • Fig. 5 illustrates a pseudonymisation method according to an embodiment.
  • a pseudonymisation system receives a patient identifier PID in step 500 and performs a first injective encryption method to compute an encrypted code such as an HMAC or a derived key from the received PID in step 502.
  • in step 504, it is checked whether the encrypted code is already present in a pseudonymisation database. In case the encrypted code is found in the pseudonymisation database, the pseudonym PSN associated with the encrypted code is retrieved from the pseudonymisation database in step 506. In that case, the method continues with step 512.
  • in step 512, it is checked whether public keys of additional asymmetric encryption key systems have been added since the last retrieval of the PSN. If yes, the method proceeds with step 514.
  • a pseudonym PSN is generated in step 508 and stored in a pseudonymisation database along with the encrypted code at step 510. This allows the PSN to be retrieved later on in step 506 when the same PID is received again at step 500. Subsequently, for each ombudsman, an encrypted patient identifier is provided at step 514. To this effect, the PID is encrypted for each ombudsman using a public key of an asymmetric encryption system of that ombudsman.
  • the encrypted patient identifier provided in step 514 is then stored in a re-identification memory along with the pseudonym at step 516.
  • the PSN may be stored with each of the encrypted PIDs in a single table. Alternatively or additionally, the PSN may be stored with each of the encrypted PIDs in a separate table, such that there is a different table for each of the asymmetric encryption key systems.
  • the re-identification memory is a buffer and the contents of the buffer are transmitted to an external re-identification database when encryption of the PID to provide the PSN and the encrypted patient identifiers has completed. When an encrypted PID has been provided for each ombudsman, the method proceeds with step 518.
  • Fig. 6 illustrates a re-identification method according to an embodiment.
  • a re-identification request is received.
  • the re-identification request includes a requested pseudonym.
  • the method proceeds at step 552 with transmitting a data request to a pseudonymisation system.
  • the data request includes the requested PSN.
  • the data request may be directed to pseudonymisation logic or a re-identification memory of the pseudonymisation system or to a re-identification database.
  • the method continues in step 554 with receiving an encrypted patient identifier from the pseudonymisation system.
  • the received encrypted patient identifier is decrypted at step 556 using a private key of an asymmetric encryption key system to provide a decrypted patient identifier.
  • the method of Fig. 6 may be performed by a re-identification system of the invention.
  • Fig. 7 illustrates a pseudonymisation system 100 according to an embodiment.
  • the system 100 comprises input means 102 to receive a patient identifier PID.
  • pseudonymisation logic 104 is coupled to the input means 102 and receives the PID.
  • Pseudonymisation logic 104 encrypts the PID to provide a PSN, which is then output via output means 108 of the system 100.
  • the pseudonymisation logic 104 is coupled to a public key memory 120 holding one or more public keys of asymmetric encryption key systems. Using each of the public keys stored in the public key memory 120, the pseudonymisation logic 104 encrypts the PID to provide one or more encrypted PIDs, one for each public key.
  • the pseudonymisation logic 104 then stores the pseudonym along with the encrypted PIDs in a re-identification memory 110.
  • the pseudonymisation system 100 further comprises a re-identification interface 112 coupled to the pseudonymisation logic 104.
  • the pseudonymisation logic 104 may receive a data request via the re-identification interface 112, the data request including a requested pseudonym.
  • the pseudonymisation logic 104 may, in response, retrieve an encrypted patient identifier from the re-identification memory 110 and transmit the retrieved encrypted patient identifier via the re-identification interface 112.
  • system 100 further comprises a pseudonymisation database 106 coupled to the pseudonymisation logic 104.
  • the pseudonymisation database 106 may receive indications for previously provided pseudonyms from the pseudonymisation logic 104 in response to the pseudonymisation logic 104 encrypting the PID to provide the pseudonym.
  • pseudonymisation logic 104 may, in response to receiving a PID via input means 102, query the pseudonymisation database 106 whether a pseudonym has been provided for the received PID before.
  • logic 104 may omit some or all of the encryption of the PID to provide a pseudonym and may, instead, retrieve a pseudonym for the received PID from the pseudonymisation database 106. Similarly, the pseudonymisation logic 104 may encrypt the PID to provide encrypted PID(s) only when it is determined that a pseudonym has not been provided for the received PID before.
  • Fig. 8 illustrates a re-identification system 200 according to an embodiment.
  • the system 200 comprises input means 202 shown on the right-hand side of Fig. 8.
  • the input means 202 may, e.g. comprise a keyboard, a touch screen, a storage medium, an electrical or optical interface, an e-mail and/or a user interface such as a graphical user interface.
  • the input means 202 is coupled to re-identification logic 204.
  • the re-identification logic 204 may receive a re-identification request via the input means 202.
  • the re-identification request includes a requested pseudonym.
  • the re-identification logic 204 transmits a data request to a re-identification database and/or a pseudonymisation server such as the pseudonymisation server 100 shown in Fig. 7.
  • the re-identification logic 204 is coupled to a data interface 212 of the re-identification system 200 which may be coupled with the re-identification interface 112 of the pseudonymisation system of Fig. 7.
  • the data request includes the requested pseudonym as previously received in the re-identification request.
  • the re-identification logic 204 of Fig. 8 receives an encrypted patient identifier via the data interface 212 and decrypts the received encrypted patient identifier.
  • the re-identification logic 204 uses the private key of an asymmetric key system.
  • the private key may be stored in a private key memory of the re-identification system as indicated by reference number 220.
  • the re-identification system 200 may be adapted to be coupled to a private key memory, e.g. via a private key memory interface.
  • the private key memory interface may, e.g. comprise any of a card slot, a USB interface, a receptacle for an optical, magnetic and/or electrical storage medium, etc.
  • the private key memory 220 may be detachable from the re-identification logic 204.
  • the re-identification logic 204 provides a decrypted patient identifier and outputs the decrypted patient identifier via output means 210 of the re-identification system 200.
  • the output means 210 may, e.g. comprise a display, a storage medium, an electrical or optical interface, an e-mail and/or a graphical user interface.
  • Fig. 9 shows a screenshot of a user interface that may be provided to the user requesting a first pseudonym according to an embodiment of the invention.
  • the user interface 600 shows a popup window 616 on top, in which the user is asked to identify himself by entering a user name and a password. The user will only be allowed to enter data and to request a pseudonym for a PID if he successfully completes authentication using window 616.
  • user interface 600 includes an input field 602, in which the user may enter a patient identifier. The user will only be able to access this field when he has successfully authenticated himself before using the popup window 616. Moreover, the user is informed in box 603 that the PID corresponds to the unique patient identifier that is also used in the hospital information system. The user may further select a project to which he wishes to enter data using drop-down list 604. Similarly, using drop-down list 606, the user may specify a sample type like, for example "tumor". Moreover, using drop-down list 608, the user may specify a sample number. The user may further indicate whether the sample refers to DNA or RNA using drop-down list 610.
  • RTT is performed by the "Completely Automated Public Turing test to tell Computers and Humans Apart" (CAPTCHA) method in that the user is requested to enter a captcha in field 612. Having entered all required data, the user may click on the button "Pseudonymize" 614. When the button 614 is clicked, the data entered by the user are transferred to a pseudonymisation system to perform a pseudonymisation method.
  • CAPTCHA Completely Automated Public Turing test to tell Computers and Humans Apart
  • the system may be adapted to perform batch pseudonymisation, e.g., with a spreadsheet upload function.
  • Batch pseudonymisation with spreadsheets is beneficial for the pseudonymisation of a plurality of biospecimens that were, e.g., gained at once. Batch sizes may be limited to a certain amount due to security issues with malicious service abuse.
  • the authentication mechanism for the pseudonymisation system may, e.g. be realized with the "Lightweight Directory Access Protocol" (LDAP) querying an institutional domain server.
  • LDAP Lightweight Directory Access Protocol
  • Fig. 10 shows a screenshot of another user interface 700 that is provided to the user in response to completion of the pseudonymisation method. This may require that the user has first filled out the fields in user interface 600 of Fig. 9 and requested pseudonymisation by clicking on the button 614.
  • the user interface 700 of Fig. 10 shows a sample code 702, both, as a barcode and as a plain text.
  • the sample code is composed of a project category, project number, pseudonym, sample type, sample counter and a DNA/RNA adjunct, if applicable, as indicated in box 703.
  • the interface 700 also shows the pseudonym 704 alone as a barcode and as a text.
  • the user is provided with a button labelled "print Accompanying Ticket" 706. By clicking on the button 706, the accompanying ticket may be printed out and sent together with the biospecimen to a sample processing lab.
  • Fig. 11 shows pseudonymisation and re-identification systems according to an embodiment.
  • the pseudonymisation system comprises a trusted third party TTP1 20", which is similar to the TTP 2 shown in Fig. 3. However, the system of Fig. 11 includes a further trusted third party TTP2 21".
  • TTP1 receives and encrypts a PID 11 using a first injective encryption method to provide an intermediate pseudonymised code (PSN1) 22", which may correspond to an encrypted code as described elsewhere in this disclosure.
  • PSN1 pseudonymised code
  • TTP2 21" receives the intermediate pseudonymised code 22" provided by the TTP1 20" and performs an additional injective encryption method on the encrypted code 22" to provide a further intermediate pseudonymised code (PSN2) 24" for the same PID 11. While, in some embodiments, the PSN2 24" may be used as a pseudonym, Fig. 11 illustrates an optional additional step 28, by which pseudonym 23" is provided by applying yet a further injective encryption method on the further intermediate pseudonymised code 24". PSN 23" is then transmitted to a research database 3", where PSN 23" is stored along with the medical data MDAT 18 that are transferred from the clinic via TTP1 20" and TTP2 21" to the research database 3". In embodiments, in which step 28 is omitted, the further intermediate pseudonymised code 24" generated by TTP2 21" is transmitted to the research database 3", where it is stored along with MDAT 18.
  • TTP1 20" and TTP2 21" may be spatially separate from each other. In some instances, they may be coupled via a network, such as the internet, a WAN or an intranet. TTP1 and TTP2 may both be considered part of a pseudonymisation system according to an embodiment, wherein, in particular, TTP1 20" may comprise a first logic portion and TTP2 21" may comprise a second logic portion as defined above.
  • TTP1 20" encrypts the PID 11 using each of a plurality of public keys 31" associated with different ombudsmen 4" to provide a plurality of encrypted patient identifiers 29".
  • TTP1 20" further forwards the encrypted patient identifiers 29" to TTP2 21".
  • TTP2 21" stores the pseudonym 23", or, in some embodiments, the intermediate pseudonymised codes 24", along with the encrypted patient identifiers 29" received from TTP1 20" in a re-identification memory located at TTP2 21".
  • An ombudsman 4" seeking to decrypt a requested PSN may then transmit a data request to TTP2 21", wherein the data request includes the requested PSN.
  • TTP2 21" queries the re-identification memory and retrieves from the re-identification memory an encrypted PID 29" that was previously stored in the re-identification memory along with a pseudonym that matches the requested pseudonym.
  • TTP2 21" returns the retrieved encrypted PID 29" to the ombudsman 4".
  • the ombudsman 4" may then decrypt the encrypted PID 29" received from TTP2 21" using the private key 32" of his asymmetric encryption system to provide a decrypted patient identifier for the requested pseudonym.
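The two-stage arrangement of Fig. 11 described above can be sketched roughly as follows. This is only an illustration under stated assumptions: HMAC-SHA256 is used here as a stand-in for each injective encryption method, and the function names and demo keys are hypothetical and do not come from the disclosure.

```python
import hashlib
import hmac
from typing import Optional


def keyed_code(secret_key: bytes, value: bytes) -> bytes:
    """Deterministic keyed hash (HMAC-SHA256): same input gives same output."""
    return hmac.new(secret_key, value, hashlib.sha256).digest()


def ttp1_stage(pid: str, ttp1_key: bytes) -> bytes:
    """First trusted third party: PID -> intermediate code (PSN1)."""
    return keyed_code(ttp1_key, pid.encode("utf-8"))


def ttp2_stage(psn1: bytes, ttp2_key: bytes,
               step28_key: Optional[bytes] = None) -> str:
    """Second trusted third party: PSN1 -> PSN2, plus the optional step 28."""
    psn2 = keyed_code(ttp2_key, psn1)
    if step28_key is None:
        return psn2.hex()  # PSN2 itself serves as the pseudonym
    return keyed_code(step28_key, psn2).hex()


# Hypothetical demo keys; each party would hold only its own key.
TTP1_KEY = b"ttp1-demo-key"
TTP2_KEY = b"ttp2-demo-key"

psn = ttp2_stage(ttp1_stage("PID-0042", TTP1_KEY), TTP2_KEY)
```

In this sketch TTP2 never sees the PID, only the intermediate code, which mirrors the separation between the two parties.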

Abstract

The invention provides methods, systems and computer-readable media for pseudonymisation and re-identification of identifiers. In particular, the invention provides a pseudonymisation method, wherein the method comprises: receiving an identifier, and in response thereto: encrypting said identifier using at least a first injective encryption method to provide a pseudonym, encrypting said identifier using a public key of an asymmetric encryption key system to provide an encrypted identifier, and storing said encrypted identifier along with said pseudonym.

Description

Pseudonymisation and Re-Identification of Identifiers
FIELD
The invention relates to pseudonymisation and re-identification of identifiers, in particular, of patient identifiers, e.g. for translational research, such as cancer research.
BACKGROUND
The usage of patient data for research generally imposes risks concerning the privacy and informational self-determination of the patient. Therefore, a patient identifier (PID) is usually encrypted to provide a pseudonym that is forwarded to research institutes or scientists along with the patient's medical data. A known method for providing pseudonyms is shown in Fig. 1. The patient in Fig. 1 is associated with a PID 111. The physician enters the patient's data including the patient's PID and his or her medical data into a system, which encrypts the PID to generate a cipher 122. Subsequently, the cipher 122 is pseudonymised to provide a pseudonym 123. The encryption method used in Fig. 1 produces different ciphers 122 when used repeatedly for the same PID 111, i.e., the encryption method is a non-injective encryption method. As different ciphers 122 are provided for the same PID 111, the medical data of a single patient are associated with different pseudonyms 123. Therefore, the scientists operating on the patient's medical data cannot determine whether two sets of medical data belong to the same patient or to different patients.
However, next generation sequencing methods have provided a wealth of new possibilities for the characterization of tumors of individual patients and laid the basis for new treatment possibilities. If research is conducted with anonymized data as described above, the ability to make a retrospective linkage to clinically relevant information, which is a necessity in biomedical translational research, is barred. Anonymization withdraws the possibility to re-identify a corresponding patient, thus a direct benefit from research results for this patient is impossible. This is also an ethical constraint, as under defined circumstances, individual research results in genetic and genomic research might lead to new treatment possibilities and therefore should be offered to study participants in a clinically relevant timeframe. Also in a scenario where clinical follow-up data is added later to a biological sample to study outcome effects, it would be beneficial to associate the follow-up data with the same pseudonym as the original sample. In addition, next-generation-sequencing technologies and other high-throughput methods imply that many different persons and often external organizations are involved in the data collection and analysis process and therefore impose additional risks to patient privacy. To retain a semantic reference between patient and sample, but still comply with data privacy requirements, a secure pseudonymisation is desirable. Therefore, a data protection and privacy concept with solid pseudonymisation and a streamlined re-identification process to transfer the results back to the clinic is needed.
Known pseudonymisation concepts allowing re-identification of the patient usually rely on external trusted third parties as shown, e.g., in Fig. 2. In the method of Fig. 2, a trusted third party (TTP) pseudonymises a patient identifier PID to yield a pseudonym PSN. The PSN may then be provided to a scientist along with the patient's medical data. In case re-identification is needed, the TTP may be asked for the PID associated with a specific PSN. To allow re-identification, the pseudonymisation method shown in Fig. 2 is reversible, e.g., by using mapping tables or mapping functions. To preserve privacy, this mapping has to be secure. To this effect, a symmetric encryption scheme is used and re-identification has to be restricted by organizational matters, since the IT system that implements the illustrated method can both encrypt and decrypt the mapping information. Therefore, also outsourcing of the pseudonymisation service is possible but has to be combined with exhaustive contracts that regulate which entity is authorized to re-identify which patients. The service provider then has to look up this contract for each re-identification request and requires documented evidence that the requester is entitled to re-identify the patient from the given pseudonym. Therefore, a complex, bureaucratic and lengthy re-identification process with human interaction is predetermined for this approach.
In Elger B.S. et al., "Strategies for health data exchange for secondary, cross-institutional clinical research", Computer Methods and Programs in Biomedicine 2010, 99(3):230-251, generation of a pseudonym for a PID is performed using a symmetric AES-256 encryption. Moreover, an additional integrity protection is performed. The reversal of the pseudonym back to the PID can only be performed by the clinical centre holding the proper AES-256 decryption key. While this moves the competence for pseudonym reversal away from the TTP to the clinical centre, a more secure pseudonymisation of patient identifiers is desirable.
SUMMARY
It is an objective of the invention to allow secure pseudonymisation of an identifier and fast re-identification thereof. This problem is solved by the pseudonymisation method of claim 1, the pseudonymisation system of claim 5, the re-identification method of claim 12, the re-identification system of claim 13 and the computer-readable medium of claim 15. Preferred embodiments are addressed in the dependent claims.
In a first aspect, the invention provides a pseudonymisation method, wherein the method comprises receiving an identifier, and in response thereto: encrypting said identifier using at least a first injective encryption method to provide a pseudonym, encrypting said identifier using a public key of an asymmetric encryption key system to provide an encrypted identifier, and storing said encrypted identifier along with said pseudonym.
As the first encryption method is an injective encryption method, repeatedly encrypting the same identifier will always yield the same pseudonym. To this effect, the injective encryption method may be a strictly injective encryption method, i.e. a method that provides injective encryption in a strict mathematical sense. However, as the variety of possible identifiers may be limited, depending on the application, any other method of mapping an identifier on a pseudonym may alternatively be used that does not provide a collision between any two identifiers. Hence, the injective encryption method, for all practical purposes of this disclosure, may be a method of generating a unique pseudonym for each identifier. For any two different identifiers, the method thus generates different pseudonyms, while the same pseudonym is generated when the method is applied repeatedly to the same identifier. One example of such a method that may be used as an injective encryption method is a collision-resistant hashing method, such as HMAC or PBKDF2, as will be described in more detail below. The injective encryption method may thus e.g. comprise a strictly injective encryption method or a collision-resistant hashing method. Hence, e.g., in embodiments, in which the identifier is a patient identifier, multiple sets of medical data may easily be associated with the same patient. Moreover, the patient identifier may easily be re-identified by decrypting the encrypted patient identifier. Re-identification thus does not require that the first encryption method may easily or quickly be reversed. In contrast, the method does not require any decryption of the pseudonym itself at all. Moreover, re-identification does not require a table including the pseudonyms and the associated identifiers. Hence, the pseudonymisation method of the invention provides a high level of security.
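The repeatability property described above may be illustrated with a minimal Python sketch using the standard library. The secret key, the deterministic salt, the iteration count and the way the secret key is combined with the identifier before PBKDF2 are all assumptions made for this illustration; the disclosure does not fix them.

```python
import hashlib
import hmac

# Hypothetical demo values, not taken from the disclosure.
SECRET_KEY = b"demo-secret-key"
D_SALT = b"demo-deterministic-salt"
ITERATIONS = 100_000  # assumed; a higher count raises the attacker's cost linearly


def hmac_code(pid: str) -> str:
    """Keyed hash of the identifier; repeated calls give the same code."""
    return hmac.new(SECRET_KEY, pid.encode("utf-8"), hashlib.sha256).hexdigest()


def pbkdf2_code(pid: str) -> str:
    """PBKDF2-derived key for the identifier, using a deterministic salt.

    Assumption: the secret key is mixed in by first computing an HMAC of
    the identifier and feeding that digest to PBKDF2 as the password.
    """
    keyed = hmac.new(SECRET_KEY, pid.encode("utf-8"), hashlib.sha256).digest()
    derived = hashlib.pbkdf2_hmac("sha256", keyed, D_SALT, ITERATIONS)
    return derived.hex()
```

Because neither function uses randomness, two samples submitted with the same identifier receive the same code, while different identifiers are expected to receive different codes up to the collision resistance of the hash.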
While the above-mentioned problem has been illustrated by means of medical research and patient identifiers, the skilled person will understand that embodiments of the invention are also useful in other areas, in which secure pseudonymisation of an identifier and fast re-identification thereof is desirable. The identifier may thus be used to identify different objects or persons based on the requirements of the application.
The identifier may comprise a number, a text, a combination thereof or any other information. In some embodiments, the identifier is uniquely associated with a concrete or abstract person or object, such as a group, entity, organization, unit, account, transaction, or any combination thereof. For example, the identifier may be a patient identifier, a bank account identifier, a data retention/preservation identifier in telecommunications or a financial transaction identifier.
For executing the pseudonymisation method, the private key of the asymmetric encryption key system is not needed. It is even preferred that the private key of the asymmetric encryption key system is not stored or otherwise present at the entity performing the pseudonymisation method like, for example a pseudonymisation system, such that this entity may not re-identify a given pseudonym. In contrast, the private key may in some instances only be stored at a re-identification system, which is separate from the pseudonymisation system, or may only be available to an ombudsman.
In a preferred embodiment, said step of encrypting said identifier to provide a pseudonym comprises encrypting said identifier using said first injective encryption method to provide an encrypted code, and encrypting said encrypted code using at least a second injective encryption method to provide said pseudonym. In this embodiment, two encryption methods are used to provide the pseudonym. Hence, the security of pseudonymisation is increased. The first and the second injective encryption methods may be the same or may be different.
According to a preferred embodiment, the method further comprises, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key: Encrypting said identifier using said additional public key to provide an additional encrypted identifier, and storing said additional encrypted identifier along with said pseudonym.
The additional asymmetric encryption key systems may each be associated with a different re-identification system, ombudsman or clinic. This allows re-identification of a given pseudonym using any one of the asymmetric encryption key systems. It is therefore not required to contact a specific individual dedicated ombudsman, but re-identification may be performed by any ombudsman having a private key of any of the asymmetric encryption key systems used for encryption. This is also beneficial in case one of the private keys is lost. In that case, the private key of any of the other ombudsmen may be used to decrypt the corresponding encrypted identifier.
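The idea that any one ombudsman can re-identify independently may be sketched as follows. Important assumption: to stay self-contained, the sketch uses textbook RSA without padding on small fixed primes, which is NOT secure and is not the scheme of the disclosure; a real deployment would rely on a vetted asymmetric scheme from a cryptographic library. All names are hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[int, int]  # (modulus, exponent)


def toy_keypair(p: int, q: int, e: int = 65537) -> Tuple[Key, Key]:
    """Build a textbook RSA key pair from two primes (illustration only)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (n, e), (n, d)


def toy_encrypt(pid: str, public_key: Key) -> int:
    """Encrypt the identifier with a public key (no padding: insecure)."""
    n, e = public_key
    m = int.from_bytes(pid.encode("utf-8"), "big")
    if m >= n:
        raise ValueError("identifier too long for this toy modulus")
    return pow(m, e, n)


def toy_decrypt(cipher: int, private_key: Key) -> str:
    """Recover the identifier with the matching private key."""
    n, d = private_key
    m = pow(cipher, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


# Three hypothetical ombudsmen; Mersenne primes keep the demo short.
M61, M89, M107, M127 = 2**61 - 1, 2**89 - 1, 2**107 - 1, 2**127 - 1
keypairs: Dict[str, Tuple[Key, Key]] = {
    "ombudsman_a": toy_keypair(M61, M89),
    "ombudsman_b": toy_keypair(M89, M107),
    "ombudsman_c": toy_keypair(M107, M127),
}

pid = "PID-0042"
# One encrypted identifier per public key, to be stored along with the pseudonym.
encrypted_pids = {name: toy_encrypt(pid, pub)
                  for name, (pub, _priv) in keypairs.items()}
```

Each entry of `encrypted_pids` can be decrypted only with the private key of the corresponding ombudsman, so losing one private key leaves the other entries usable.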
In a preferred embodiment, the method further comprises receiving a data request comprising a requested pseudonym, and, in response thereto, retrieving a requested encrypted identifier as an encrypted identifier that was previously stored along with said requested pseudonym and returning said requested encrypted identifier. This embodiment allows fast re-identification, as only the encrypted identifier stored along with the requested pseudonym needs to be returned rather than the entire contents of the re-identification memory. In some instances, the data request is received from a re-identification system and the requested encrypted identifier is returned to said re-identification system.
In some embodiments, the method further comprises the step of outputting said pseudonym. In particular, the step of receiving the identifier may comprise receiving the identifier via an interface such as a user interface and the method may comprise outputting said pseudonym via said interface. Alternatively, the pseudonym may be output via output means which are different from input means used to receive the identifier. Additionally or alternatively, the method may comprise storing the pseudonym, e.g. in a database such as a research database, a transaction database, an account database, etc. According to some embodiments, the method further comprises receiving data, in particular, medical data. In these embodiments, the method may further comprise outputting said pseudonym along with said data, e.g. via output means, and/or storing said pseudonym along with said data. In some instances, the method further comprises encrypting the data in response to receiving them and before outputting and/or storing them.
In some embodiments, the method further comprises, in response to receiving the identifier, a step of determining whether the identifier has been encrypted to provide a pseudonym before, preferably using a pseudonymisation database. These embodiments may further comprise, when it is determined that the identifier has not been encrypted to provide a pseudonym before, storing an indication in the pseudonymisation database. The indication may, for example, comprise the identifier or any result of an injective mapping thereof, such as a pseudonym or an encrypted code provided for the identifier as specified above. It is, in particular, preferred that the steps of encrypting the identifier to provide an encrypted identifier and of storing the encrypted identifier along with the pseudonym are performed for those identifiers, for which it was determined that the identifier has not been encrypted to provide a pseudonym before.
In some embodiments, the method may further comprise retrieving the pseudonym from the pseudonymisation database in response to determining that the identifier has been encrypted to provide a pseudonym before. In some of these embodiments, the method may further comprise storing the pseudonym in the pseudonymisation database in case it was determined that the identifier has not been encrypted to provide a pseudonym before.
It is preferred that the method further comprises the step of querying a pseudonymisation database for the encrypted code and, when the encrypted code is found in the pseudonymisation database, retrieving the pseudonym from the pseudonymisation database. The method may further comprise, when the encrypted code is not found in the pseudonymisation database, in response to encrypting the identifier to provide the pseudonym, storing the pseudonym along with the encrypted code in the pseudonymisation database. In some embodiments, the steps of encrypting the identifier to provide the encrypted identifier and of storing the encrypted identifier along with the pseudonym are performed in response to the encrypted code not being found in the pseudonymisation database.
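The lookup described above may be sketched as follows. This is a minimal illustration, assuming HMAC-SHA-256 for both injective methods and plain dictionaries in place of the pseudonymisation database and the re-identification memory; the key, the function names and the `encrypt_identifier` callback (standing in for the public-key encryption step) are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"example-secret-key"  # hypothetical; never hard-code real keys

pseudonymisation_db = {}      # encrypted code -> pseudonym
reidentification_memory = {}  # pseudonym -> encrypted identifier


def first_method(identifier):
    """First injective method: keyed hash of the identifier."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).digest()


def second_method(code):
    """Second injective method: keyed hash of the encrypted code."""
    return hmac.new(SECRET_KEY, b"psn|" + code, hashlib.sha256).hexdigest()


def pseudonymise(identifier, encrypt_identifier):
    """Return the pseudonym, creating and storing it only on first use."""
    code = first_method(identifier)
    known = pseudonymisation_db.get(code)
    if known is not None:
        # Encrypted code found: retrieve the stored pseudonym.
        return known
    # Encrypted code not found: derive the pseudonym and store both records.
    psn = second_method(code)
    pseudonymisation_db[code] = psn
    reidentification_memory[psn] = encrypt_identifier(identifier)
    return psn
```

In this sketch the public-key encryption is performed only when the encrypted code is not yet known, so repeated requests for the same identifier neither create duplicate entries nor repeat the asymmetric encryption.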
In a preferred embodiment, the method further comprises, in response to a deletion request, deleting the encrypted identifier(s), e.g. from the re-identification memory and/or the re-identification database. The deletion request may e.g. include the identifier, the encrypted identifier and/or the pseudonym. In this embodiment, the re-identification data stored for a given identifier may be deleted, such that future re-identification is prevented. This may e.g. be useful when a patient, having initially given his or her permission to store data to allow re-identification, later revokes the permission.
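A deletion request keyed by pseudonym might be handled as in the following sketch. The dictionary layout (pseudonym mapped to its encrypted identifiers) and the function name are assumptions made for illustration only.

```python
def delete_reidentification_data(memory: dict, pseudonym: str) -> bool:
    """Remove all encrypted identifiers stored for a pseudonym.

    Returns True if an entry existed and was deleted, False otherwise.
    """
    return memory.pop(pseudonym, None) is not None
```

After a successful deletion, a later data request for the same pseudonym finds no encrypted identifier, so re-identification is no longer possible from this memory.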
In a preferred embodiment, the method further comprises, in response to a deletion or change request, deleting or changing the pseudonym, e.g. in the re-identification memory and/or the re-identification database. The deletion or change request may e.g. include the identifier, the encrypted identifier and/or the pseudonym. In this embodiment, the re-identification data stored for a given identifier or pseudonym may be deleted or changed, such that future re-identification is prevented. This may e.g. be useful when a patient's identity has been revealed for a given pseudonym or an encryption method used to generate the pseudonym is no longer considered safe.
In some embodiments, the steps of receiving the identifier and/or receiving the data request may use an interface, such as e.g. a PHP (Hypertext Preprocessor) interface or a Web Service (e.g. SOAP) interface. The latter is preferred as it allows incorporating the interface in a desired software program, web page, office application, such as a Microsoft Office™ application, or the like. In addition, it allows the use of a broad variety of programming languages to implement the method of the invention.
In a preferred embodiment, the method further includes printing the pseudonym and/or displaying the pseudonym on a screen. For example, the pseudonym may be printed in text, such as plain text, and/or as a barcode, such as a Code39, Code128 and/or 2-dimensional barcode, such as a QR (Quick Response) code, e.g. on a piece of paper, a sticker, a label, such as an adhesive label, a tag, an adhesive tape, or the like.
In some embodiments, any or all steps of the pseudonymisation method may be performed by a pseudonymisation system and, in particular, by a pseudonymisation logic included therein.
In a further aspect, the invention provides a pseudonymisation system, comprising input means adapted to receive identifiers, a re-identification memory and pseudonymisation logic coupled to said input means and to said re-identification memory. Said pseudonymisation logic is adapted to: receive an identifier via the input means, and in response thereto: encrypt said identifier using at least a first injective encryption method, in particular, using a Keyed-Hash Message Authentication Code (HMAC) method or a Password-Based Key Derivation Function 2 (PBKDF2) method, to provide a pseudonym, encrypt said identifier using a public key of an asymmetric encryption key system to provide an encrypted identifier, and store said encrypted identifier along with said pseudonym in said re-identification memory.
In some instances, the re-identification memory may be a buffer, e.g. only storing entries for a single pseudonym. In these embodiments, it is preferred that the re-identification memory is adapted to be coupled to an internal or external re-identification database and further adapted to transmit all or parts of its content to the re-identification database in response to the pseudonymisation logic encrypting the identifier to provide the pseudonym and the encrypted identifier. The re-identification memory or the re-identification database may e.g. be implemented using Microsoft Access™, Microsoft SQL Server™, such as versions 2000, 2005, 2008, 2012, MySQL™, PostgreSQL™, Oracle™ or IBM DB2™. Other database management systems may be used in alternative embodiments.
In some embodiments, the re-identification memory comprises previously provided pseudonyms along with encrypted identifiers provided for the same identifier. It is preferred that, for every identifier received via the input means, at least one encrypted identifier and a pseudonym are stored in the re-identification memory. This may comprise that, in case it is determined that the identifier has not been encrypted before to provide a pseudonym, the pseudonymisation logic stores the encrypted identifier along with the pseudonym in the re-identification memory in response to encrypting the identifier to provide the pseudonym and encrypting the identifier to provide the encrypted identifier.
In some embodiments, the pseudonymisation system further comprises output means coupled to the pseudonymisation logic and the pseudonymisation logic is further adapted to, in response to encrypting the identifier to provide a pseudonym, output said pseudonym via said output means.
According to a preferred embodiment, the pseudonymisation system further comprises a public key memory coupled to said pseudonymisation logic, wherein said public key memory is adapted to hold said public key of said asymmetric encryption key system and wherein said pseudonymisation logic is adapted to retrieve said public key from said public key memory, wherein said public key memory is preferably detachable from said pseudonymisation logic. In some embodiments, the pseudonymisation system further comprises coupling means adapted to couple the public key memory to the pseudonymisation logic. Allowing the public key memory to be detached from the pseudonymisation logic allows for easy replacement of public keys and for storing it in a safe place when not needed.
In a preferred embodiment, said pseudonymisation logic is adapted to encrypt said identifier to provide a pseudonym by encrypting said identifier using said first injective encryption method to provide an encrypted code, and encrypting said encrypted code using at least a second injective encryption method to provide said pseudonym.
In some embodiments, the pseudonymisation system further comprises a pseudonymisation database coupled to the pseudonymisation logic. The pseudonymisation logic may be adapted to store the identifier, the encrypted code or the pseudonym in the pseudonymisation database as specified in more detail above.
According to a preferred embodiment, said first injective encryption method comprises an AES-256 encryption of the identifier in CBC mode with a predetermined initialization vector or in ECB mode. The Advanced Encryption Standard (AES) is described in http://csrc.nist.gov/publications/fips/fips197/fips-197.pdf, which is hereby incorporated by reference.
According to a preferred embodiment, said first injective encryption method comprises at least one of a Keyed-Hash Message Authentication Code (HMAC) method and a Password-Based Key Derivation Function 2 (PBKDF2) method. The Keyed-Hash Message Authentication Code is specified in the Federal Information Processing Standards Publication 198 available from http://csrc.nist.gov/publications/fips/fips198/fips-198a.pdf and http://csrc.nist.gov/publications/fips/fips198-1/FIPS-198-1_final.pdf, which are hereby incorporated by reference. The HMAC allows for injective encryption and thus provides a unique encrypted code for each given identifier. The HMAC can further be used directly as a pseudonym, because it is secured against dictionary attacks with a cryptographic key. Moreover, details of the Password-Based Key Derivation Function 2 (PBKDF2) are described in ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-5v2/pkcs5v2-0.pdf, which is hereby incorporated by reference. The PBKDF2 adds additional entropy to a one-way function like the HMAC.
In a preferred embodiment, an AES-256 encryption, e.g. in CBC mode with a predetermined initialization vector or in ECB mode, of the identifier is used to provide an encrypted code, and subsequently, the Keyed-Hash Message Authentication Code (HMAC) method and/or the Password-Based Key Derivation Function 2 (PBKDF2) are applied to the encrypted code to provide the pseudonym.
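A chained derivation of this kind may be sketched with Python's standard library as follows. The sketch deviates from the preferred embodiment in one labelled respect: the AES-256 stage is omitted because the standard library offers no AES, so HMAC-SHA-256 provides the encrypted code and PBKDF2-HMAC-SHA-256 provides the pseudonym. The key, the salt, the iteration count, the output length and the function names are illustrative assumptions.

```python
import hashlib
import hmac

# Hypothetical secret of the pseudonymisation service; a real deployment
# would load it from protected storage instead of hard-coding it.
SECRET_KEY = b"example-secret-key-of-the-service"
# Hypothetical fixed (deterministic) salt and iteration count.
D_SALT = b"example-deterministic-salt"
ITERATIONS = 10_000


def encrypted_code(identifier: str) -> bytes:
    """First stage: keyed hash (HMAC-SHA-256) of the identifier."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).digest()


def pseudonym(identifier: str) -> str:
    """Second stage: PBKDF2 over the encrypted code, returned as hex."""
    code = encrypted_code(identifier)
    derived = hashlib.pbkdf2_hmac("sha256", code, D_SALT, ITERATIONS, dklen=16)
    return derived.hex()
```

Because the key, salt and iteration count are fixed, the same identifier always yields the same pseudonym, which is the property needed to link several samples of one patient.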
In a preferred embodiment, said pseudonymisation logic is adapted to, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key: encrypt said identifier using said additional public key to provide an additional encrypted identifier, and store said additional encrypted identifier along with said pseudonym in said re-identification memory. The pseudonymisation logic may, in particular, be adapted to perform these steps in response to receiving the identifier via the input means.
According to a preferred embodiment, the pseudonymisation system further comprises a re-identification interface adapted to receive data requests and coupled to said pseudonymisation logic, wherein said pseudonymisation logic is further adapted to: receive a data request comprising a requested pseudonym via said re-identification interface, and in response thereto: retrieve a requested encrypted identifier from said re-identification memory as an encrypted identifier that is stored along with said requested pseudonym in said re-identification memory and return said requested encrypted identifier via said re-identification interface.
In a preferred embodiment, said pseudonymisation logic comprises separate first and second logic portions coupled to each other, wherein said first logic portion is coupled to said input means and is adapted to receive said identifier via said input means and encrypt said identifier using at least said first injective encryption method to provide an intermediate pseudonymised code. In this embodiment, the second logic portion is coupled to said re-identification memory and is adapted to receive said intermediate pseudonymised code from said first logic portion, encrypt said intermediate pseudonymised code using at least a third injective encryption method to provide said pseudonym, and store said encrypted identifier along with said pseudonym in said re-identification memory. The first or second logic portions may be adapted to encrypt the identifier using the public key of the asymmetric encryption key system to provide the encrypted identifier. The first and second logic portions may be spatially separated from each other. The first and second logic portions may, for example, be coupled via a network like the Internet, a wide area network (WAN) or an intranet. In this embodiment, the risk of unauthorized re-identification of the identifier by decrypting the pseudonym is reduced as two different logic portions, each having their own encryption method, are used. Moreover, even if an attacker gained access to the private key of either the first or the third encryption method, he could not directly decrypt the pseudonym. Still, this embodiment allows a fast re-identification of a pseudonym by an authorized re-identification system or ombudsman, as the encrypted identifier and the pseudonym are stored in the re-identification memory coupled to the second logic portion.
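The division into two logic portions may be sketched as follows. The sketch assumes that each portion applies HMAC-SHA-256 with its own secret key and that the encrypted identifier is handed over as opaque bytes; the class and method names are hypothetical, and the network coupling between the portions is replaced by a direct function call.

```python
import hashlib
import hmac


class FirstLogicPortion:
    """Receives the identifier and provides the intermediate pseudonymised code."""

    def __init__(self, key: bytes):
        self._key = key  # known only to the first portion

    def intermediate_code(self, identifier: str) -> bytes:
        return hmac.new(self._key, identifier.encode("utf-8"), hashlib.sha256).digest()


class SecondLogicPortion:
    """Turns the intermediate code into the pseudonym and stores the entry."""

    def __init__(self, key: bytes, reidentification_memory: dict):
        self._key = key  # known only to the second portion
        self._memory = reidentification_memory

    def finalise(self, intermediate: bytes, encrypted_identifier: bytes) -> str:
        psn = hmac.new(self._key, intermediate, hashlib.sha256).hexdigest()
        # Store the encrypted identifier along with the pseudonym.
        self._memory[psn] = encrypted_identifier
        return psn
```

In this arrangement the second portion never sees the plain identifier, and neither key alone suffices to recompute a pseudonym from an identifier.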
In some embodiments, the input means further incorporates a Reverse Turing Test (RTT) to prevent abusive repetitive service usage. With this test, the pseudonymisation system can determine if a machine or a human is using the service and refuse service use to a machine.
In a further aspect, the invention provides a re-identification method comprising: receiving a re-identification request including a requested pseudonym, transmitting, in response to receiving said re-identification request, a data request including said requested pseudonym to a pseudonymisation system or a re-identification database, receiving, in response to transmitting said data request, a requested encrypted identifier, and decrypting, in response to receiving said requested encrypted identifier, said requested encrypted identifier using a private key of an asymmetric encryption key system to provide a decrypted identifier.
This method provides fast re-identification, as it only requires requesting an encrypted identifier from a single entity, namely the pseudonymisation system or a re-identification database, which may be external or internal to a re-identification system performing the steps of the re-identification method. Moreover, with the method of the invention, it is not necessary to transmit the private key of the asymmetric encryption key system over a network to perform the re-identification, such that security is further improved.
In some embodiments, the steps of receiving the re-identification request, transmitting the data request and/or receiving the requested encrypted identifier may use an interface, such as e.g. a PHP interface or a SOAP/Web Service interface, as specified above.
In some embodiments, the method further comprises, e.g. in response to receiving the re-identification request, combining at least two partial keys to obtain the private key of the asymmetric encryption key system. The partial keys may e.g. be obtained from different ombudsmen. For example, the partial keys may be obtained in response to receiving the re-identification request. The partial keys may each be stored in a respective private key memory as described below, such as e.g. on different chip cards. In some instances, the generated private key may be deleted after decrypting the encrypted identifier to further improve data security.
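The description does not specify how the partial keys are combined. One simple possibility, shown here purely as an assumption, is XOR-based secret splitting, in which every partial key is required to reconstruct the private key and any incomplete set of partial keys reveals nothing about it. The function names are hypothetical.

```python
import secrets


def split_key(private_key: bytes, shares: int = 2) -> list:
    """Split a key into `shares` partial keys; all are needed to recombine."""
    if shares < 2:
        raise ValueError("at least two partial keys are required")
    parts = [secrets.token_bytes(len(private_key)) for _ in range(shares - 1)]
    last = private_key
    for part in parts:
        # Fold each random part into the final partial key.
        last = bytes(a ^ b for a, b in zip(last, part))
    return parts + [last]


def combine_keys(parts: list) -> bytes:
    """XOR all partial keys together to recover the private key."""
    result = bytes(len(parts[0]))
    for part in parts:
        result = bytes(a ^ b for a, b in zip(result, part))
    return result
```

A re-identification system following this sketch would call `combine_keys` with the partial keys read from the ombudsmen's cards, decrypt the encrypted identifier, and then discard the recombined key.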
In a further aspect, the invention provides a re-identification system comprising input and output means, a data interface and re-identification logic coupled to said input and output means and to said data interface. Said re-identification logic is adapted to: receive a re-identification request via said input means, said re-identification request including a requested pseudonym, transmit, in response to receiving said re-identification request, a data request including said requested pseudonym to a pseudonymisation system or a re-identification database via said data interface, receive, in response to transmitting said data request, a requested encrypted identifier via said data interface, decrypt said requested encrypted identifier using a private key of an asymmetric encryption key system to provide a decrypted identifier, and output said decrypted identifier via said output means.
According to a preferred embodiment, the re-identification system further comprises a private key memory including said private key of said asymmetric encryption key system, said private key memory coupled to said re-identification logic, and wherein said re-identification logic is adapted to retrieve said private key from said private key memory, wherein, preferably, said private key memory is detachable from said re-identification logic.
In some embodiments, the re-identification system further comprises a key interface coupled to the re-identification logic and further adapted to be coupled to a private key memory. In these embodiments, the private key memory may be part of the re-identification system or may be separate. These embodiments allow the private key memory to be detached from the re-identification logic and to be stored in a safe place when not needed. The private key memory may, e.g. be included in a chip card, for example, including an RF chip. In these embodiments, the key interface may comprise a card reader. Additionally or alternatively, the key interface may be adapted to be coupled to the private key memory in a wired or a wireless manner, e.g. via electrical contacts or contactless. For example, the key interface may have an antenna and a receiver or transceiver electrically coupled to the antenna, to wirelessly couple the key interface to the private key memory. Accordingly, the private key memory may also include or be electrically coupled to an antenna.
The private key memory may, e.g. be integrated in an authorization memory, such as an identity card that a user, such as an ombudsman, may already possess for other purposes. In some embodiments, the private key memory may be integrated on a private chip card, an official identity card, such as an electronic passport or identification (ID) card, an electronic access card for a building, a vehicle, etc., a badge, such as an employee identification badge, an identity card issued by a company, such as an insurance card, an identity card issued by an organization, such as a membership card, or the like.
In a preferred embodiment, the private key memory is integrated in a Health Professional Card, and the key interface comprises a card reader adapted to couple to and read from a Health Professional Card. The Health Professional Card may be an identity card that is issued to a physician or a health institution by a governmental or non-governmental institution, such as a health insurance company or an association of physicians. For example, the key interface may comprise an eHealth-BCS terminal.
It is, in particular, preferred that the private key memory stores the private key as a protected key. In more detail, the private key may be protected by a personal identification number (PIN), a password/passphrase and/or a hardware security module (HSM).
In order to process a pseudonym, the re-identification server may request a specific entry of the re-identification memory or database, in which encrypted identifiers are stored along with a pseudonym. Even in case an unauthorized entity should gain access to the re-identification memory or database, the information provided therein will be useless to this entity as long as it does not possess the private key of the asymmetric encryption key system used for encryption. As these entries thus do not directly allow for re-identification, e.g., of a patient, they may be made available to a requesting entity without requiring a high level of security checking. In some instances, all or parts of the contents of the re-identification memory or database may be transmitted to a re-identification server without human interaction and/or in an automated fashion.
In some embodiments, the data request includes an identification indicator of the re-identification system, an ombudsman and/or an asymmetric encryption key system. Correspondingly, the pseudonymisation logic of the pseudonymisation system may be adapted to retrieve a requested encrypted identifier from the re-identification memory that was provided using the public key of the re-identification system, ombudsman or an asymmetric encryption key system, respectively, indicated by the indicator.
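Handling of a data request that carries such an indicator may be sketched as follows. The nested dictionary (pseudonym, then indicator, then encrypted identifier) and the function name are assumptions for illustration; encrypted identifiers are treated as opaque bytes.

```python
from typing import Optional


def handle_data_request(memory: dict, requested_pseudonym: str,
                        key_indicator: str) -> Optional[bytes]:
    """Return the encrypted identifier stored for the requested pseudonym
    under the key system named by the indicator, or None if there is none."""
    entries = memory.get(requested_pseudonym, {})
    return entries.get(key_indicator)
```

Only the single matching entry is returned, so the requester receives the ciphertext that its own private key can decrypt and nothing else from the memory.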
In yet a further aspect, the invention provides a computer-readable medium having instructions stored thereon that, when executed by a computer, cause the computer to perform the method of any of the aforementioned kinds. To this effect, the method may in some embodiments be implemented using a programming language. Hence, these embodiments allow for easy integration of all or parts of the method in a conventional hospital information system. In particular, the method may be implemented using the language PHP. PHP is a scripting language that can be embedded into HTML and may be interpreted by a hypertext pre-processor. Object oriented programming is supported and can be combined with scripting and rapid web development.
Further embodiments and benefits become evident from the following description in connection with the accompanying drawings.
DESCRIPTION OF THE DRAWINGS
The foregoing aspects and many of the attendant advantages will become more readily appreciated as the same become better understood with reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
Fig. 1 illustrates a known method for encryption and pseudonymisation,
Fig. 2 illustrates known pseudonymisation and re-identification methods,
Fig. 3 illustrates embodiments of the pseudonymisation and re-identification systems of the invention,
Fig. 4 illustrates embodiments of the pseudonymisation and re-identification systems of the invention,
Fig. 5 illustrates a pseudonymisation method according to an embodiment,
Fig. 6 illustrates a re-identification method according to an embodiment,
Fig. 7 illustrates a pseudonymisation system according to an embodiment,
Fig. 8 illustrates a re-identification system according to an embodiment,
Fig. 9 shows a user interface by which a user may enter a patient identifier according to an embodiment,
Fig. 10 shows another user interface by which the user is informed about a pseudonym according to an embodiment and
Fig. 11 illustrates pseudonymisation and re-identification systems according to an embodiment.
DETAILED DESCRIPTION
A hospital information system may have patient registration modules that assign a unique patient identifier (PID) to a patient to constitute a mature identity management. If biological samples and medical data are sent from caregivers at the hospital to a sample processing lab (SPL) and the sequencing facility for genetic analysis, biospecimens such as blood and tissue have to be annotated with contextual information like project name, tissue source site, sample, or study participant. Working with personal data at an SPL is not necessary and therefore not acceptable, although patients may have consented to the processing of personal data where needed. Hence, according to the invention, a pseudonym PSN instead of personal data or the PID is attached to a sample that is sent to an SPL. Therefore, pseudonymisation of the PID into a PSN is performed before a sample and its associated data leave the hospital to be processed elsewhere. The PSN provided using the method of the invention is repetitively unambiguous for a patient, so that several samples from the same patient, e.g. a tumor sample and a subsequent control sample, can be correlated by the researchers and also utilized for secondary use.
Additionally, the invention allows re-identification by several persons independently from each other. Each of these persons is able to compute back the PID from a PSN, i.e., to perform re-identification. In the context of this disclosure, these persons are called ombudsmen, because they safeguard the privacy of the patients but are able to identify individuals from a pseudonym if this is necessary. However, the skilled person will readily understand that, alternatively or additionally to ombudsmen, the private key of any asymmetric encryption key system used for encrypting the patient identifier to provide an encrypted patient identifier may be made available to other entities, organisations, systems, servers and/or people, depending on the requirements of the application.
As record linkage at the research site is made with a pseudonym that has been derived from a mature and stable PID, imprecision is low and error-tolerant record linkage mechanisms which require personal information like names and birthdates are not needed. From a data privacy point of view, record linkage with a PSN in accordance with the invention is better than record linkage with personal information or a PID. To provide a controlled on-demand aggregation of clinical data that contains the PID with translational data that contains a PSN, pseudonym generation according to the invention is repetitively unambiguous for all data sources. Hence, a data warehouse for translational research objectives with records linked by a PSN is facilitated.
Moreover, the present invention provides a safe pseudonymisation and re-identification, e.g. with regard to the following threats:
• database theft,
• system or hardware theft,
• social engineering and
• malicious service use.
Database theft, e.g. with regard to the re-identification memory or database, does not have bad outcomes because any deterministic mapping information is secured. In particular, to decrypt the encrypted patient identifier stored in the re-identification memory or database, the private key of the ombudsman is required. Only the information relating to which and how many pseudonyms have been created is available to the attacker in plain text. A cryptanalysis is thus rendered futile as the cryptographic keys are not available to the attacker.
A system or hardware theft is a serious threat. With the secret key of the pseudonymisation service provider and all data from the re-identification memory or database and, in some embodiments, the pseudonymisation database, a dictionary attack is possible. A dictionary attack with lookup or rainbow tables can be made uneconomical by applying a deterministic Salt to the HMAC. The computing cost of a dictionary attack by an insider with the secret key may be controlled in a linear way, e.g. by applying the function PBKDF2. A high iteration parameter for PBKDF2 and security measures against physical theft and hijacking provide effective security against this threat.
Social engineering regarding the ombudsmen is a serious threat. A measure that must be taken against social engineering is the education of the ombudsmen so they can develop consciousness. An additional or alternative measure is to apply a personal identification number (PIN) or passphrase to the private key of their asymmetric encryption system as specified above.
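The linear relation between the PBKDF2 iteration parameter and the cost of an insider's dictionary attack can be made concrete with a rough estimate. The model below is an assumption: it counts one HMAC evaluation per iteration and per candidate identifier, and the throughput figure is hypothetical rather than measured.

```python
def dictionary_attack_seconds(candidates: int, iterations: int,
                              hmacs_per_second: float) -> float:
    """Rough time to try every candidate identifier once.

    Work grows linearly with both the number of candidates and the
    PBKDF2 iteration count.
    """
    return candidates * iterations / hmacs_per_second


# Hypothetical figures: 10 million candidate PIDs, 100,000 iterations,
# and an attacker computing one million HMACs per second.
seconds = dictionary_attack_seconds(10_000_000, 100_000, 1_000_000.0)
days = seconds / 86_400  # about 11.6 days
```

Doubling the iteration parameter doubles this estimate, while the legitimate service pays the extra cost only once per newly pseudonymised identifier.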
To prevent malicious service use, repetitive access to the pseudonymisation system with high frequency by a machine in order to guess PID-PSN relations may be inhibited. An authentication procedure to gain access to the pseudonymisation system may be employed. This at least prohibits unauthorized access. Repetitive system usage may be prevented by a "Reverse Turing Test" (RTT). With this test, the pseudonymisation server can determine if a machine or a human is trying to access the system and refuse access to a machine. A human is not fast enough to perform a service abuse attack. Therefore an RTT, authentication and/or authorization mechanisms may be employed by the methods and systems of the invention as further security measures against this threat.
Fig. 3 shows pseudonymisation and re-identification systems that are provided according to an embodiment of the invention. At a hospital 1, medical data (MDAT) 18 and a patient identifier (PID) 11 are entered into a computer. The MDAT 18 and the PID 11 are then transferred to a trusted third party (TTP) 2 which operates as a pseudonymisation system. Transmission of MDAT 18 and PID 11 from the hospital 1 to TTP 2 may be unencrypted or encrypted using any known method. At TTP 2, a private key of the TTP is used to compute a unique HMAC 22 of the PID 11 using a first injective encryption method. In an additional step 27, the HMAC 22 is then injectively mapped on a pseudonym (PSN) 23 using a second injective encryption method. The PSN 23 is then transmitted from the TTP 2 to a research database 3 along with the patient's medical data 18. In an alternative embodiment, the HMAC 22 may directly be used as a pseudonym, such that the step 27 may be omitted. Accordingly, in these embodiments, the HMAC 22 is transferred to the research database 3.
Returning to Fig. 3, the TTP 2 provides three different encrypted patient identifiers 29 from the PID 11 using different public keys 31 of three different asymmetric encryption key systems associated with different ombudsmen 4. In the embodiment of Fig. 3, each of the ombudsmen is associated with a different re-identification system. The private keys of the asymmetric encryption key systems are not stored or otherwise available to TTP 2. In contrast, each private key is stored at a respective re-identification system associated with the asymmetric key system, or is otherwise available to one of the ombudsmen 4. TTP 2 stores each of the encrypted PIDs 29 along with the PSN 23. In the embodiment of Fig. 3, the encrypted PIDs 29 are stored along with the PSN 23 in a re-identification memory of the TTP 2.
In case one of the ombudsmen 4 seeks to obtain the decrypted PID for a given requested pseudonym, the ombudsman 4 sends a data request to the TTP 2 and thereby notifies TTP 2 of the requested PSN. TTP 2 then checks the re-identification memory for the requested PSN and provides the ombudsman 4 with the corresponding encrypted PID that was previously stored along with the requested PSN. In particular, TTP 2 may return the encrypted PID stored along with the requested PSN that was previously provided using the public key 31 of the asymmetric key system associated with the ombudsman 4 from which the data request was received. The ombudsman 4 then uses the private key 32 of his asymmetric encryption system to decrypt the encrypted PID 29 received from the TTP 2. As a result of the decryption, a decrypted PID is obtained.
Fig. 4 illustrates pseudonymisation and re-identification systems that are similar to the systems illustrated in Fig. 3. As a difference, in the pseudonymisation system of Fig. 4, the PID 11 included in the patient identifying data (IDAT) 12 is encrypted using the PBKDF2 method as a first encryption method. The PBKDF2 uses a secret symmetric key of a pseudonymisation service provider 2' and further adds additional entropy called Salt. In particular, in the embodiment of Fig. 4 the PBKDF2 is applied with a deterministic salt (dSalt). This avoids the determination problem which was previously described with reference to Fig. 1. As a result, the PBKDF2 provides a derived key (DK) which corresponds to an encrypted code 22' for the PID 11. The DK is further used to provide a pseudonym PSN 23' by a second encryption method. The pseudonym 23' is then stored along with the patient's medical data MDAT 18 for secondary use in a research database 3'.
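As a hedged illustration of the PBKDF2 variant of Fig. 4, the following sketch derives a DK from a PID with a deterministic salt. How the secret key and the dSalt are combined is not specified in this passage; the sketch assumes that the dSalt is an HMAC of the PID under the provider's secret key and that the PID serves as the PBKDF2 password. The iteration count, the hash function and all names are likewise assumptions.

```python
import hashlib
import hmac

ITERATIONS = 100_000  # assumed work factor, not taken from the description


def deterministic_salt(secret_key: bytes, pid: str) -> bytes:
    """Assumed dSalt construction: keyed hash of the PID under the provider's secret key."""
    return hmac.new(secret_key, b"dsalt:" + pid.encode("utf-8"), hashlib.sha256).digest()


def derived_key(secret_key: bytes, pid: str) -> bytes:
    """Compute the DK (encrypted code 22') for a PID using PBKDF2-HMAC-SHA-256."""
    salt = deterministic_salt(secret_key, pid)
    return hashlib.pbkdf2_hmac("sha256", pid.encode("utf-8"), salt, ITERATIONS, dklen=32)
```

Because the salt depends only on the secret key and the PID, the same PID always produces the same DK, whereas a random salt would produce a different DK on every call.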
Similar to Fig. 3, the PID 11 is also encrypted to provide a plurality of encrypted PIDs 29'. Each of these encryptions uses a different public key 31' of an asymmetric key system associated with a different ombudsman 4'. The encrypted PIDs 29' are stored along with the pseudonym PSN 23' in a re-identification database. In this embodiment, the re-identification database is separate from the pseudonymisation system and the re-identification system, while, in other embodiments, it may be comprised by any of these. An ombudsman 4' seeking to re-identify a given requested PSN 23' may query the re-identification database for an encrypted PID 29' that corresponds to the PSN 23' and which was provided using the public key 31' of that ombudsman's 4' asymmetric encryption system. The ombudsman 4' may then decrypt the encrypted PID 29' retrieved from the re-identification database using the private key 32' of his asymmetric encryption system.
Fig. 5 illustrates a pseudonymisation method according to an embodiment. A pseudonymisation system receives a patient identifier PID in step 500 and performs a first injective encryption method to compute an encrypted code such as an HMAC or a derived key from the received PID in step 502. In step 504, it is checked whether the encrypted code is already present in a pseudonymisation database. In case the encrypted code is found in the pseudonymisation database, the pseudonym PSN associated with the encrypted code is retrieved from the pseudonymisation database in step 506. In that case, the method continues with step 512. At step 512, it is checked whether public keys of additional asymmetric encryption key systems have been added since the last retrieval of the PSN. If yes, the method proceeds with step 514. If no, the retrieved pseudonym is returned at step 518.
In case the encrypted code is not found in the database in step 504, a pseudonym PSN is generated in step 508 and stored in a pseudonymisation database along with the encrypted code at step 510. This allows the PSN to be retrieved later on in step 506 when the same PID is received again at step 500.
Subsequently, for each ombudsman, an encrypted patient identifier is provided at step 514. To this effect, the PID is encrypted for each ombudsman using a public key of an asymmetric encryption system of that ombudsman. The encrypted patient identifier provided in step 514 is then stored in a re-identification memory along with the pseudonym at step 516. In some embodiments, the PSN may be stored with each of the encrypted PIDs in a single table. Alternatively or additionally, the PSN may be stored with each of the encrypted PIDs in a separate table, such that there is a different table for each of the asymmetric encryption key systems. In some embodiments, the re-identification memory is a buffer and the contents of the buffer are transmitted to an external re-identification database when encryption of the PID to provide the PSN and the encrypted patient identifiers has completed. When an encrypted PID has been provided for each ombudsman, the method proceeds with step 518.
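The flow of Fig. 5 may be sketched as follows, under stated assumptions. The two encryption stages and the per-ombudsman public-key encryption are passed in as callables so that any concrete methods can be plugged in. The class name, the in-memory dictionaries and the way step 512 is realised (checking which ombudsmen already have an entry for the PSN) are hypothetical choices made for illustration.

```python
from typing import Callable, Dict

Encryptor = Callable[[bytes], bytes]  # stands in for "encrypt with one public key"


class PseudonymisationSystem:
    def __init__(
        self,
        first_stage: Callable[[str], bytes],
        second_stage: Callable[[bytes], str],
        public_keys: Dict[str, Encryptor],
    ) -> None:
        self.first_stage = first_stage
        self.second_stage = second_stage
        self.public_keys = public_keys  # ombudsman id -> encryptor
        self.pseudonymisation_db: Dict[bytes, str] = {}  # encrypted code -> PSN
        self.reid_memory: Dict[str, Dict[str, bytes]] = {}  # PSN -> {ombudsman id: encrypted PID}

    def pseudonymise(self, pid: str) -> str:  # step 500: PID received
        code = self.first_stage(pid)  # step 502
        psn = self.pseudonymisation_db.get(code)  # steps 504 and 506
        if psn is None:
            psn = self.second_stage(code)  # step 508
            self.pseudonymisation_db[code] = psn  # step 510
        stored = self.reid_memory.setdefault(psn, {})
        for ombudsman, encrypt in self.public_keys.items():
            if ombudsman not in stored:  # step 512: key not yet covered for this PSN
                stored[ombudsman] = encrypt(pid.encode("utf-8"))  # steps 514 and 516
        return psn  # step 518
```

In this sketch a repeated request for a known PID skips the second stage and only encrypts the PID again for ombudsmen whose public keys were added after the PSN was first stored.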
Fig. 6 illustrates a re-identification method according to an embodiment. At step 550, a re-identification request is received. The re-identification request includes a requested pseudonym. In response to receiving the re-identification request, the method proceeds at step 552 with transmitting a data request to a pseudonymisation system. The data request includes the requested PSN. In particular, the data request may be directed to pseudonymisation logic or a re-identification memory of the pseudonymisation system or to a re-identification database. In response to transmitting the data request, the method continues in step 554 with receiving an encrypted patient identifier from the pseudonymisation system. In response thereto, the received encrypted patient identifier is decrypted at step 556 using a private key of an asymmetric encryption key system to provide a decrypted patient identifier. In some embodiments, the method of Fig. 6 may be performed by a re-identification system of the invention.
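Steps 550 to 556 of Fig. 6 may be illustrated by the following minimal sketch. The data request and the private-key decryption are modelled as callables; their names and signatures are hypothetical, and the error raised for an unknown pseudonym is an assumption, since the description does not state how that case is handled.

```python
from typing import Callable, Optional


def reidentify(
    requested_psn: str,
    fetch_encrypted_pid: Callable[[str], Optional[bytes]],
    decrypt_with_private_key: Callable[[bytes], bytes],
) -> str:
    """Resolve a requested pseudonym (step 550) to a decrypted patient identifier."""
    # Steps 552 and 554: send the data request and receive the encrypted PID.
    encrypted_pid = fetch_encrypted_pid(requested_psn)
    if encrypted_pid is None:
        # Assumed behaviour for pseudonyms that are not in the re-identification memory.
        raise LookupError(f"no encrypted identifier stored for {requested_psn!r}")
    # Step 556: decrypt with the ombudsman's private key.
    return decrypt_with_private_key(encrypted_pid).decode("utf-8")
```

The private key is used only inside `decrypt_with_private_key`, which mirrors the separation described above: the party holding the re-identification memory never needs access to it.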
Fig. 7 illustrates a pseudonymisation system 100 according to an embodiment. The system 100 comprises input means 102 to receive a patient identifier PID. Moreover, pseudonymisation logic 104 is coupled to the input means 102 and receives the PID. Pseudonymisation logic 104 encrypts the PID to provide a PSN, which is then output via output means 108 of the system 100. Furthermore, the pseudonymisation logic 104 is coupled to a public key memory 120 holding one or more public keys of asymmetric encryption key systems. Using each of the public keys stored in the public key memory 120, the pseudonymisation logic 104 encrypts the PID to provide one or more encrypted PIDs, one for each public key. The pseudonymisation logic 104 then stores the pseudonym along with the encrypted PIDs in a re-identification memory 110.
The pseudonymisation system 100 further comprises a re-identification interface 112 coupled to the pseudonymisation logic 104. The pseudonymisation logic 104 may receive a data request via the re-identification interface 112, the data request including a requested pseudonym. As described elsewhere in this disclosure, the pseudonymisation logic 104 may, in response, retrieve an encrypted patient identifier from the re-identification memory 110 and transmit the retrieved encrypted patient identifier via the re-identification interface 112.
To prevent unnecessary repetitions of some or all of the encryption operations, system 100 further comprises a pseudonymisation database 106 coupled to the pseudonymisation logic 104. The pseudonymisation database 106 may receive indications for previously provided pseudonyms from the pseudonymisation logic 104 in response to the pseudonymisation logic 104 encrypting the PID to provide the pseudonym. Correspondingly, pseudonymisation logic 104 may, in response to receiving a PID via input means 102, query the pseudonymisation database 106 whether a pseudonym has been provided for the received PID before. If yes, logic 104 may omit some or all of the encryption of the PID to provide a pseudonym and may, instead, retrieve a pseudonym for the received PID from the pseudonymisation database 106. Similarly, the pseudonymisation logic 104 may encrypt the PID to provide encrypted PID(s) only when it is determined that a pseudonym has not been provided for the received PID before.
Fig. 8 illustrates a re-identification system 200 according to an embodiment. The system 200 comprises input means 202 shown on the right-hand side of Fig. 8. The input means 202 may, e.g., comprise a keyboard, a touch screen, a storage medium, an electrical or optical interface, an e-mail and/or a user interface such as a graphical user interface. The input means 202 is coupled to re-identification logic 204. The re-identification logic 204 may receive a re-identification request via the input means 202. The re-identification request includes a requested pseudonym. In response to receiving the re-identification request, the re-identification logic 204 transmits a data request to a re-identification database and/or a pseudonymisation server such as the pseudonymisation server 100 shown in Fig. 7. To this effect, the re-identification logic 204 is coupled to a data interface 212 of the re-identification system 200 which may be coupled with the re-identification interface 112 of the pseudonymisation system of Fig. 7. The data request includes the requested pseudonym as previously received in the re-identification request.
In response to transmitting the data request, the re-identification logic 204 of Fig. 8 receives an encrypted patient identifier via the data interface 212 and decrypts the received encrypted patient identifier. To this effect, the re-identification logic 204 uses the private key of an asymmetric key system. The private key may be stored in a private key memory of the re-identification system as indicated by reference number 220. Alternatively, the re-identification system 200 may be adapted to be coupled to a private key memory, e.g. via a private key memory interface. The private key memory interface may, e.g., comprise any of a card slot, a USB interface, a receptacle for an optical, magnetic and/or electrical storage medium, etc. Also in embodiments in which the private key memory 220 is part of the re-identification system 200, the private key memory 220 may be detachable from the re-identification logic 204.
As a result of decrypting the received encrypted patient identifier, the re-identification logic 204 provides a decrypted patient identifier and outputs the decrypted patient identifier via output means 210 of the re-identification system 200. The output means 210 may, e.g., comprise a display, a storage medium, an electrical or optical interface, an e-mail and/or a graphical user interface.
Fig. 9 shows a screenshot of a user interface that may be provided to the user requesting a first pseudonym according to an embodiment of the invention. The user interface 600 shows a popup window 616 on top, in which the user is asked to identify himself by entering a user name and a password. The user will only be allowed to enter data and to request a pseudonym for a PID if he successfully completes authentication using window 616.
Underneath the popup window 616, user interface 600 includes an input field 602, in which the user may enter a patient identifier. The user will only be able to access this field when he has successfully authenticated himself before using the popup window 616. Moreover, the user is informed in box 603 that the PID corresponds to the unique patient identifier that is also used in the hospital information system. The user may further select a project for which he wishes to enter data using drop-down list 604. Similarly, using drop-down list 606, the user may specify a sample type such as "tumor". Moreover, using drop-down list 608, the user may specify a sample number. The user may further indicate whether the sample refers to DNA or RNA using drop-down list 610. For security reasons, RTT is performed by the "Completely Automated Public Turing test to tell Computers and Humans Apart" (CAPTCHA) method in that the user is requested to enter a captcha in field 612. Having entered all required data, the user may click on the button "Pseudonymize" 614. When the button 614 is clicked, the data entered by the user are transferred to a pseudonymisation system to perform a pseudonymisation method.
Additionally or alternatively to receiving a single patient identifier, the system may be adapted to perform batch pseudonymisation, e.g., with a spreadsheet upload function. Batch pseudonymisation with spreadsheets is beneficial for the pseudonymisation of a plurality of biospecimens that were, e.g., obtained at once. Batch sizes may be limited to guard against malicious abuse of the service. The authentication mechanism for the pseudonymisation system may, e.g., be realized with the "Lightweight Directory Access Protocol" (LDAP) querying an institutional domain server.
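A size-limited batch pseudonymisation of an uploaded spreadsheet may, for example, look like the following sketch. It assumes that the spreadsheet is exported as CSV text with a column named `pid`; the column name, the limit of 100 rows and the function names are hypothetical and are not taken from the description.

```python
import csv
import io
from typing import Callable, Dict

MAX_BATCH_SIZE = 100  # assumed limit; the description does not give a number


def pseudonymise_batch(
    csv_text: str,
    pseudonymise: Callable[[str], str],
    max_batch: int = MAX_BATCH_SIZE,
) -> Dict[str, str]:
    """Pseudonymise every PID in the 'pid' column, rejecting oversized batches."""
    pids = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        pid = (row.get("pid") or "").strip()
        if pid:
            pids.append(pid)
    if len(pids) > max_batch:
        # Reject the whole batch before any pseudonym is computed.
        raise ValueError(f"batch of {len(pids)} exceeds limit of {max_batch}")
    return {pid: pseudonymise(pid) for pid in pids}
```

Checking the size before any pseudonym is computed means that an oversized upload does not trigger any encryption work.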
Fig. 10 shows a screenshot of another user interface 700 that is provided to the user in response to completion of the pseudonymisation method. This may require that the user has first filled out the fields in user interface 600 of Fig. 9 and requested pseudonymisation by clicking on the button 614. The user interface 700 of Fig. 10 shows a sample code 702, both as a barcode and as plain text. In the embodiment of Fig. 10, the sample code is composed of a project category, project number, pseudonym, sample type, sample counter and, if applicable, a DNA/RNA adjunct, as indicated in box 703. Moreover, the interface 700 also shows the pseudonym 704 alone as a barcode and as text. In addition, the user is provided with a button labelled "print Accompanying Ticket" 706. By clicking on the button 706, the accompanying ticket may be printed out and sent together with the biospecimen to a sample processing lab.
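The composition of the sample code 702 from its parts may be illustrated as follows. Only the list of components is taken from the description; the separator, the zero-padding widths and the field names are assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleCode:
    project_category: str
    project_number: int
    pseudonym: str
    sample_type: str
    sample_counter: int
    nucleic_acid: str = ""  # "DNA", "RNA", or empty when not applicable

    def render(self, sep: str = "-") -> str:
        """Join the components in the order listed for box 703 (separator is assumed)."""
        parts = [
            self.project_category,
            f"{self.project_number:03d}",
            self.pseudonym,
            self.sample_type,
            f"{self.sample_counter:02d}",
        ]
        if self.nucleic_acid:
            parts.append(self.nucleic_acid)
        return sep.join(parts)
```

The rendered string could then be passed to any barcode generator to produce the barcode shown next to the plain text.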
Fig. 11 shows pseudonymisation and re-identification systems according to an embodiment. The pseudonymisation system comprises a trusted third party TTP1 20", which is similar to the TTP 2 shown in Fig. 3. However, the system of Fig. 11 includes a further trusted third party TTP2 21". TTP1 receives and encrypts a PID 11 using a first injective encryption method to provide an intermediate pseudonymised code (PSN1) 22", which may correspond to an encrypted code as described elsewhere in this disclosure. TTP2 21" receives the intermediate pseudonymised code 22" provided by the TTP1 20" and performs an additional injective encryption method on the encrypted code 22" to provide a further intermediate pseudonymised code (PSN2) 24" for the same PID 11. While, in some embodiments, the PSN2 24" may be used as a pseudonym, Fig. 11 illustrates an optional additional step 28, by which pseudonym 23" is provided by applying yet a further injective encryption method to the further intermediate pseudonymised code 24". PSN 23" is then transmitted to a research database 3", where PSN 23" is stored along with the medical data MDAT 18 that are transferred from the clinic via TTP1 20" and TTP2 21" to the research database 3". In embodiments in which step 28 is omitted, the further intermediate pseudonymised code 24" generated by TTP2 21" is transmitted to the research database 3", where it is stored along with MDAT 18.
TTP1 20" and TTP2 21" may be spatially separate from each other. In some instances, they may be coupled via a network, such as the internet, a WAN or an intranet. TTP1 and TTP2 may both be considered part of a pseudonymisation system according to an embodiment, wherein, in particular, TTP1 20" may comprise a first logic portion and TTP2 21" may comprise a second logic portion as defined above.
Moreover, TTP1 20" encrypts the PID 11 using each of a plurality of public keys 31" associated with different ombudsmen 4" to provide a plurality of encrypted patient identifiers 29". TTP1 20" further forwards the encrypted patient identifiers 29" to TTP2 21". In response, TTP2 21" stores the pseudonym 23", or, in some embodiments, the intermediate pseudonymised codes 24", along with the encrypted patient identifiers 29" received from TTP1 20" in a re-identification memory located at TTP2 21". An ombudsman 4" seeking to decrypt a requested PSN may then transmit a data request to TTP2 21", wherein the data request includes the requested PSN. TTP2 21" then queries the re-identification memory and retrieves from the re-identification memory an encrypted PID 29" that was previously stored in the re-identification memory along with a pseudonym that matches the requested pseudonym. In response, TTP2 21" returns the retrieved encrypted PID 29" to the ombudsman 4". The ombudsman 4" may then decrypt the encrypted PID 29" received from TTP2 21" using the private key 32" of his asymmetric encryption system to provide a decrypted patient identifier for the requested pseudonym.
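The division of work between TTP1 and TTP2 in Fig. 11 may be sketched as follows. The sketch assumes HMAC-SHA-256 as a stand-in for each of the encryption methods and models the public-key encryption as callables; the class names, key handling and in-memory store are hypothetical. Note that, in this sketch, TTP2 receives only PSN1 and the already encrypted PIDs, never the PID itself.

```python
import hashlib
import hmac
from typing import Callable, Dict, Optional, Tuple

Encryptor = Callable[[bytes], bytes]  # stands in for "encrypt with one public key"


def keyed_code(key: bytes, data: bytes) -> bytes:
    """Assumed stand-in for an injective encryption method (HMAC-SHA-256)."""
    return hmac.new(key, data, hashlib.sha256).digest()


class TTP1:
    def __init__(self, key: bytes, public_keys: Dict[str, Encryptor]) -> None:
        self.key = key
        self.public_keys = public_keys  # ombudsman id -> encryptor

    def process(self, pid: str) -> Tuple[bytes, Dict[str, bytes]]:
        """Return PSN1 and one encrypted PID per ombudsman for forwarding to TTP2."""
        raw = pid.encode("utf-8")
        psn1 = keyed_code(self.key, raw)
        encrypted = {name: encrypt(raw) for name, encrypt in self.public_keys.items()}
        return psn1, encrypted


class TTP2:
    def __init__(self, key: bytes, final_key: Optional[bytes] = None) -> None:
        self.key = key
        self.final_key = final_key  # enables the optional step 28 when set
        self.reid_memory: Dict[str, Dict[str, bytes]] = {}

    def process(self, psn1: bytes, encrypted_pids: Dict[str, bytes]) -> str:
        """Derive the pseudonym from PSN1 and store it with the encrypted PIDs."""
        psn2 = keyed_code(self.key, psn1)
        final = keyed_code(self.final_key, psn2) if self.final_key else psn2
        pseudonym = final.hex()
        self.reid_memory[pseudonym] = dict(encrypted_pids)
        return pseudonym

    def lookup(self, requested_psn: str, ombudsman: str) -> bytes:
        """Answer a data request with the encrypted PID for the requesting ombudsman."""
        return self.reid_memory[requested_psn][ombudsman]
```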
It is to be understood that many modifications may be provided to the exemplifying embodiments of the method, the system or the computer readable medium without leaving the scope of the invention. Consequently, the invention may be practiced within the scope of the claims differently from the examples described. Also, the described features and characteristics may be of importance for the invention in any combination. While some embodiments have been described in terms of patient identifiers, the skilled person will readily see that the described systems and methods may also be employed for identifiers of persons, groups, entities, units, organizations, accounts, transactions, etc.

Claims

1. A pseudonymisation method, wherein the method comprises:
receiving (500) an identifier (11), and in response thereto:
encrypting said identifier (11) using at least a first injective encryption method to provide a pseudonym (23; 23'; 23"),
encrypting (514) said identifier (11) using a public key (31; 31'; 31") of an asymmetric encryption key system to provide an encrypted identifier (29; 29'; 29"), and
storing (516) said encrypted identifier (29; 29'; 29") along with said pseudonym (23; 23'; 23").
2. The method of claim 1, wherein said step of encrypting said identifier (11) to provide a pseudonym (23; 23'; 23") comprises:
encrypting (502) said identifier (11) using said first injective encryption method to provide an encrypted code (22; 22'; 22") and
encrypting (508) said encrypted code (22; 22'; 22") using at least a second injective encryption method to provide said pseudonym (23; 23'; 23").
3. The method of any of the preceding claims, further comprising, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key (31; 31'; 31"):
encrypting (514) said identifier (11) using said additional public key (31; 31'; 31") to provide an additional encrypted identifier (29; 29'; 29"), and
storing (516) said additional encrypted identifier (29; 29'; 29") along with said pseudonym (23; 23'; 23").
4. The method of any of the preceding claims, further comprising:
receiving a data request comprising a requested pseudonym (23; 23'; 23") and, in response thereto:
retrieving a requested encrypted identifier as an encrypted identifier (29; 29'; 29") that was previously stored along with said requested pseudonym (23; 23'; 23") and, in response thereto, returning said requested encrypted identifier (29; 29'; 29").
5. A pseudonymisation system (100), comprising:
input means (102) adapted to receive identifiers,
a re-identification memory (110), and
pseudonymisation logic (104) coupled to said input means (102) and to said re-identification memory (110), said pseudonymisation logic (104) adapted to:
receive an identifier (11) via said input means (102), and in response thereto:
encrypt said identifier (11) received by the input means (102) using at least a first injective encryption method to provide a pseudonym (23; 23'; 23"),
encrypt said identifier (11) using a public key (31; 31'; 31") of an asymmetric encryption key system to provide an encrypted identifier (29; 29'; 29"), and
store said encrypted identifier (29; 29'; 29") along with said pseudonym (23; 23'; 23") in said re-identification memory (110).
6. The pseudonymisation system of claim 5, further comprising a public key memory (120) coupled to said pseudonymisation logic (104), wherein said public key memory (120) is adapted to hold said public key (31; 31'; 31") of said asymmetric encryption key system and wherein said pseudonymisation logic (104) is adapted to retrieve said public key (31; 31'; 31") from said public key memory (120), wherein said public key memory (120) is preferably detachable from said pseudonymisation logic (104).
7. The pseudonymisation system of claim 5 or 6, wherein said pseudonymisation logic (104) is adapted to encrypt said identifier (11) to provide a pseudonym (23; 23'; 23") by:
encrypting said identifier (11) using said first injective encryption method to provide an encrypted code (22; 22'; 22") and
encrypting said encrypted code (22; 22'; 22") using at least a second injective encryption method to provide said pseudonym (23; 23'; 23").
8. The pseudonymisation system of any of claims 5 to 7, wherein said first injective encryption method comprises at least one of a Keyed-Hash Message Authentication Code (HMAC) method and a Password-Based Key Derivation Function 2 (PBKDF2) method.
9. The pseudonymisation system of any of claims 5 to 8, wherein said pseudonymisation logic (104) is adapted to, for each of one or more additional asymmetric encryption key systems, each comprising an additional public key (31; 31'; 31"):
encrypt said identifier (11) using said additional public key (31; 31'; 31") to provide an additional encrypted identifier (29; 29'; 29"), and
store said additional encrypted identifier (29; 29'; 29") along with said pseudonym (23; 23'; 23") in said re-identification memory (110).
10. The pseudonymisation system of any of claims 5 to 9, further comprising a re-identification interface (112) adapted to receive data requests and coupled to said pseudonymisation logic (104), wherein said pseudonymisation logic (104) is further adapted to:
receive a data request comprising a requested pseudonym (23; 23'; 23") via said re-identification interface (112), and in response thereto:
retrieve a requested encrypted identifier from said re-identification memory (110) as an encrypted identifier (29; 29'; 29") that is stored along with said requested pseudonym (23; 23'; 23") in said re-identification memory (110) and
return said requested encrypted identifier (29; 29'; 29") via said re-identification interface (112).
11. The pseudonymisation system of any of claims 5 to 10, wherein said pseudonymisation logic (104) comprises separate first and second logic portions coupled to each other, wherein said first logic portion is coupled to said input means (102) and is adapted to:
receive said identifier (11) via said input means (102), and
encrypt said identifier (11) using at least said first injective encryption method to provide an intermediate pseudonymized code (22"), and
wherein said second logic portion is coupled to said re-identification memory (110) and is adapted to:
receive said intermediate pseudonymized code (22") from said first logic portion,
encrypt said intermediate pseudonymized code (22") using at least a third injective encryption method to provide said pseudonym (23"), and
store said encrypted identifier (29") along with said pseudonym (23") in said re-identification memory (110).
12. A re-identification method comprising:
receiving a re-identification request including a requested pseudonym (23; 23'; 23"),
transmitting, in response to receiving said re-identification request, a data request including said requested pseudonym (23; 23'; 23") to a pseudonymisation system (100) or re-identification database,
receiving, in response to transmitting said data request, a requested encrypted identifier (29; 29'; 29"), and
decrypting, in response to receiving said requested encrypted identifier (29; 29'; 29"), said requested encrypted identifier (29; 29'; 29") using a private key (32; 32'; 32") of an asymmetric encryption key system to provide a decrypted identifier.
13. A re-identification system (200) comprising:
input (202) and output (210) means,
a data interface (212), and
re-identification logic (204) coupled to said input (202) and output (210) means and to said data interface (212), said re-identification logic (204) adapted to:
receive a re-identification request via said input means (202), said re-identification request including a requested pseudonym (23; 23'; 23"),
transmit, in response to receiving said re-identification request, a data request including said requested pseudonym (23; 23'; 23") to a pseudonymisation system (100) or a re-identification database via said data interface (212),
receive, in response to transmitting said data request, a requested encrypted identifier (29; 29'; 29") via said data interface (212),
decrypt said requested encrypted identifier (29; 29'; 29") using a private key (32; 32'; 32") of an asymmetric encryption key system to provide a decrypted identifier, and
output said decrypted identifier via said output means (210).
14. The re-identification system of claim 13, further comprising a private key memory (220) including said private key (32; 32'; 32") of said asymmetric encryption key system, said private key memory (220) coupled to said re-identification logic (204), and wherein said re-identification logic (204) is adapted to retrieve said private key (32; 32'; 32") from said private key memory (220), wherein, preferably, said private key memory (220) is detachable from said re-identification logic (204).
15. A computer-readable medium having instructions stored thereon that, when executed by a computer, cause the computer to perform the method of any of claims 1 to 4 and 12.
PCT/EP2013/069319 2012-11-16 2013-09-18 Pseudonymisation and re-identification of identifiers WO2014075836A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP12193020.0 2012-11-16
EP12193020 2012-11-16

Publications (1)

Publication Number Publication Date
WO2014075836A1 true WO2014075836A1 (en) 2014-05-22

Family

ID=47178505

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2013/069319 WO2014075836A1 (en) 2012-11-16 2013-09-18 Pseudonymisation and re-identification of identifiers

Country Status (1)

Country Link
WO (1) WO2014075836A1 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070192139A1 (en) * 2003-04-22 2007-08-16 Ammon Cookson Systems and methods for patient re-identification
US20090265788A1 (en) * 2006-03-17 2009-10-22 Deutsche Telekom Ag Method and device for the pseudonymization of digital data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ELGER B.S. ET AL.: "Strategies for health data exchange for secondary, cross- institutional clinical research", COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, vol. 99, no. 3, 2010, pages 230 - 251
POMMERENING KLAUS ET AL: "Secondary use of the EHR via pseudonymisation", STUDIES IN HEALTH TECHNOLOGY AND INFORMATICS, I O S PRESS, AMSTERDAM, NL, vol. 103, 1 January 2004 (2004-01-01), pages 441 - 446, XP002488507, ISSN: 0926-9630 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11468192B2 (en) 2017-11-01 2022-10-11 Green Market Square Limited Runtime control of automation accuracy using adjustable thresholds
US10657287B2 (en) 2017-11-01 2020-05-19 International Business Machines Corporation Identification of pseudonymized data within data sources
US10747903B2 (en) 2017-11-01 2020-08-18 International Business Machines Corporation Identification of pseudonymized data within data sources
US11093645B2 (en) * 2018-02-23 2021-08-17 International Business Machines Corporation Coordinated de-identification of a dataset across a network
US20190303617A1 (en) * 2018-02-23 2019-10-03 International Business Machines Corporation Coordinated de-identification of a dataset across a network
US11093639B2 (en) * 2018-02-23 2021-08-17 International Business Machines Corporation Coordinated de-identification of a dataset across a network
US11138338B2 (en) 2018-03-20 2021-10-05 Micro Focus Llc Statistical property preserving pseudonymization
US11106821B2 (en) 2018-03-20 2021-08-31 Micro Focus Llc Determining pseudonym values using tweak-based encryption
US11157652B2 (en) 2018-05-16 2021-10-26 Microsoft Technology Licensing, Llc. Obfuscation and deletion of personal data in a loosely-coupled distributed system
WO2019222006A1 (en) * 2018-05-16 2019-11-21 Microsoft Technology Licensing, Llc Obfuscation and deletion of personal data in a loosely-coupled distributed system
CN109981285B (en) * 2019-03-11 2020-10-09 北京纬百科技有限公司 Password protection method, password verification method and system
CN109981285A (en) * 2019-03-11 2019-07-05 北京纬百科技有限公司 A kind of password protection method, password method of calibration and system
CN114144783A (en) * 2019-07-15 2022-03-04 艾克斯坦德尔有限公司 Cryptographic pseudonym mapping method, computer system, computer program and computer-readable medium
WO2021043789A1 (en) * 2019-09-06 2021-03-11 Koninklijke Philips N.V. Provenance verification for selective disclosure of attributes
WO2021061295A1 (en) * 2019-09-27 2021-04-01 Mastercard International Incorporated Method and system for securing personally identifiable information
US11270026B2 (en) 2019-09-27 2022-03-08 Mastercard International Incorporated Method and system for securing personally identifiable information
EP3805963A1 (en) * 2019-10-11 2021-04-14 Koninklijke Philips N.V. Provenance verification for selective disclosure of attributes
EP4191455A1 (en) * 2021-12-06 2023-06-07 HERE Global B.V. Method and apparatus for managing user requests related to pseudonymous or anonymous data
US20230179577A1 (en) * 2021-12-06 2023-06-08 Here Global B.V. Method and apparatus for managing user requests related to pseudonymous or anonymous data

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13762855

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13762855

Country of ref document: EP

Kind code of ref document: A1