Only pseudonymized case data is processed in ARS, and the RKI cannot trace a case back to the person concerned.
To ensure this, a multi-level concept with organizational and technical measures was developed. This pseudonymization concept is set up in such a way that it can also be used by other RKI surveillance systems that are permitted to process pseudonymized case data in accordance with §13 IfSG. The aim is to achieve a high level of security for the pseudonymization procedure and at the same time to minimize the effort involved, especially for the data senders.
Pseudonymization procedure for surveillance systems that process pseudonymized case data
Description of the procedure
Pseudonymization in DEMIS is a two-stage process. First, the data sender transmits pseudonyms that are independent of any surveillance system to the DEMIS backend. There, in a second stage, surveillance system-specific pseudonyms are generated from them.
Pseudonym creation by the data sender
The data sender creates a pseudonym from information that does not change and that uniquely identifies the patient. This could be the insurance number, or a combination such as first name + surname + date of birth. It is up to the data sender to select the most reliable plain text information in their data.
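The selection of plain text information can be sketched as follows. This is only an illustration: the function names, the normalization rules (Unicode NFKC, collapsed whitespace, upper case) and the "|" separator are assumptions that do not come from the concept itself. The point is that the same patient must always yield exactly the same plain text, otherwise the resulting pseudonyms will differ.

```python
import unicodedata
from typing import Optional


def normalize(value: str) -> str:
    """Bring a field into a stable form (assumed rules, not prescribed by the concept)."""
    unified = unicodedata.normalize("NFKC", value)
    return " ".join(unified.split()).upper()


def select_plain_text(
    insurance_number: Optional[str] = None,
    first_name: Optional[str] = None,
    surname: Optional[str] = None,
    birth_date: Optional[str] = None,
) -> str:
    """Prefer the insurance number, otherwise fall back to name + date of birth."""
    if insurance_number:
        return normalize(insurance_number)
    if first_name and surname and birth_date:
        return "|".join(normalize(part) for part in (first_name, surname, birth_date))
    raise ValueError("no stable, unique plain text information available")


print(select_plain_text(insurance_number=" k004567123 "))  # K004567123
```

Which fields count as "most reliable" remains a decision of the data sender; the fallback order shown here is just one possible choice.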
The concept involves the data sender transmitting two pseudonyms, i.e. a pair of pseudonyms, for each patient. Both pseudonyms are derived from the same plain text information using the same cryptographic hash procedure, HMAC-SHA256, but with different secret codes, “Secret-1” and “Secret-2”.
HMAC-SHA256 is a keyed hash procedure whose underlying hash function is SHA256. The German Federal Office for Information Security (BSI) describes SHA256 as a cryptographically strong hash function in its Technical Guideline BSI TR-02102-1, "Cryptographic Mechanisms: Recommendations and Key Lengths", version 2023-01.
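A minimal sketch of pseudonym pair creation with Python's standard library is shown here. The function names, the UTF-8 encoding of the plain text, the hexadecimal output encoding and the secret values are assumptions made for illustration; real secret codes must be chosen and stored securely by the data sender.

```python
import hashlib
import hmac


def make_pseudonym(plain_text: str, secret: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of the plain text under one secret code."""
    return hmac.new(secret, plain_text.encode("utf-8"), hashlib.sha256).hexdigest()


def make_pseudonym_pair(plain_text: str, secret_1: bytes, secret_2: bytes) -> tuple[str, str]:
    """Derive both pseudonyms from the same plain text with two different secret codes."""
    return make_pseudonym(plain_text, secret_1), make_pseudonym(plain_text, secret_2)


# Placeholder secret codes, for illustration only.
SECRET_1 = b"placeholder-secret-1"
SECRET_2 = b"placeholder-secret-2"

pair = make_pseudonym_pair("K004567123", SECRET_1, SECRET_2)
print(pair)
```

Each pseudonym is a 64-character hexadecimal string, because SHA256 produces a 256-bit digest. The same plain text and secret code always produce the same pseudonym, which is what allows transmissions for one patient to be recognized as belonging together.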
The data sender is responsible for selecting and replacing the secret codes. This process is not coordinated or controlled by the RKI. Different data senders therefore use different secret codes. As a consequence, the RKI has no possibility to merge the data of a patient across different data senders. Only two secret codes need to be defined at any one time for the data sender's entire database; it is not necessary to define secret codes for each patient to be pseudonymized.
The two secret codes, Secret-1 and Secret-2, are replaced alternately over time: every 5 years exactly ONE of them is replaced, while the other remains valid for a further 5 years and is then replaced in turn.
Thanks to this procedure (keyword "overlapping pseudonyms"), pseudonyms for the same patient can be merged over time in the DEMIS backend despite the change of secret codes, because consecutive pseudonym pairs always share one pseudonym. This merging can no longer take place in the DEMIS backend if the transmissions are so far apart in time that both secret codes have been replaced in between and the pairs no longer share a pseudonym (keyword “clinical episodes”).
The data sender transmits the same pseudonym pair for a patient for all RKI projects and systems that use this pseudonymization concept.
Example:
The following example schematically shows the interaction of plain text information, secret codes and resulting pseudonym pairs over time at the data sender.
In this scenario, data from a fictional patient "Max Mustermann" with the insurance number "K004567123" is to be transmitted to ARS. The year refers to the time of data transmission.
2020: In the first year, the data sender defines two secret codes, Secret-1 and Secret-2. These are used to generate the pseudonyms P1 and P2 from the insurance number “K004567123” of the patient “Max Mustermann” using the hash procedure HMAC-SHA256. Both pseudonyms are transmitted together as a pseudonym pair.
2021-2024: In the following four years, Secret-1 and Secret-2 are still valid and result in the same pseudonyms P1 and P2 as in the previous year.
2025: In the next year, Secret-1 is replaced by a newly defined Secret-3. Secret-2 is still valid. Pseudonyms P3 (new) and P2 (old) are created.
2026-2029: In the following four years, Secret-3 and Secret-2 are still valid and result in the same pseudonyms P3 and P2 as in the previous year.
2030: In the next year, Secret-2 is replaced by a newly defined Secret-4. Secret-3 is still valid. Pseudonyms P3 (old) and P4 (new) are created.
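The timeline above can be reproduced with a short script. The secret values and the helper names are placeholders chosen for this sketch; only the rotation years and the insurance number are taken from the example.

```python
import hashlib
import hmac

PLAIN_TEXT = "K004567123"  # insurance number from the example


def pseudonym(secret: bytes) -> str:
    """Hex-encoded HMAC-SHA256 of the plain text under one secret code."""
    return hmac.new(secret, PLAIN_TEXT.encode("utf-8"), hashlib.sha256).hexdigest()


def active_secrets(year: int) -> tuple[bytes, bytes]:
    """Secret codes valid in a given year: one of the two is replaced every 5 years."""
    first = b"placeholder-secret-1" if year < 2025 else b"placeholder-secret-3"
    second = b"placeholder-secret-2" if year < 2030 else b"placeholder-secret-4"
    return first, second


pairs = {
    year: tuple(pseudonym(secret) for secret in active_secrets(year))
    for year in range(2020, 2031)
}

# Label pseudonyms P1, P2, ... in order of first appearance and print the pair per year.
labels: dict[str, str] = {}
for year, pair in pairs.items():
    for value in pair:
        labels.setdefault(value, f"P{len(labels) + 1}")
    print(year, [labels[value] for value in pair])
```

The pairs of any two consecutive years share at least one pseudonym, so transmissions can be chained together. The pairs from 2024 and 2030, by contrast, have no pseudonym in common and can only be connected through a transmission made between 2025 and 2029.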
Transmission of pseudonyms to the DEMIS backend
To transmit data to the DEMIS backend in FHIR format, the two patient pseudonyms are embedded as identifiers in the Patient resource, which is transmitted together with the other resources in a Bundle.
Patient
  id : 43511fe3-aa5f-73fb-7e59-2313ba0ca76c
  meta
    profile : https://demis.rki.de/fhir/ars/StructureDefinition/Patient
  identifier
    type
      coding
        system : http://terminology.hl7.org/CodeSystem/v2-0203
        code : ANON
    system : https://demis.rki.de/fhir/sid/SurveillancePatientPseudonym
    value : f68ae1e4b4fb59f68b6e30ac51618aa8c5e7249735517ab5344fb900484e2a47
  identifier
    type
      coding
        system : http://terminology.hl7.org/CodeSystem/v2-0203
        code : ANON
    system : https://demis.rki.de/fhir/sid/SurveillancePatientPseudonym
    value : 2d00467c311c659aed5f64b3daeb61be7793ef5410545c3b69634a18c7c7b429
  gender : male
  birthDate : 1983-06
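The Patient resource shown above could be assembled as FHIR JSON roughly as follows. The helper names are hypothetical, and the JSON layout (lists for `meta.profile`, `identifier` and `coding`) follows the general FHIR JSON representation rather than anything specific to this concept. The values are those of the example resource.

```python
import json

PSEUDONYM_SYSTEM = "https://demis.rki.de/fhir/sid/SurveillancePatientPseudonym"
PATIENT_PROFILE = "https://demis.rki.de/fhir/ars/StructureDefinition/Patient"


def pseudonym_identifier(value: str) -> dict:
    """Build one identifier entry that carries a pseudonym."""
    return {
        "type": {
            "coding": [
                {
                    "system": "http://terminology.hl7.org/CodeSystem/v2-0203",
                    "code": "ANON",
                }
            ]
        },
        "system": PSEUDONYM_SYSTEM,
        "value": value,
    }


def build_patient(patient_id: str, pseudonym_pair: tuple, gender: str, birth_date: str) -> dict:
    """Build a Patient resource with one identifier per pseudonym of the pair."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "meta": {"profile": [PATIENT_PROFILE]},
        "identifier": [pseudonym_identifier(value) for value in pseudonym_pair],
        "gender": gender,
        "birthDate": birth_date,
    }


patient = build_patient(
    "43511fe3-aa5f-73fb-7e59-2313ba0ca76c",
    (
        "f68ae1e4b4fb59f68b6e30ac51618aa8c5e7249735517ab5344fb900484e2a47",
        "2d00467c311c659aed5f64b3daeb61be7793ef5410545c3b69634a18c7c7b429",
    ),
    "male",
    "1983-06",
)
print(json.dumps(patient, indent=2))
```

Note that the example resource contains only the month of birth, not the full date, and no name or address.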
Pseudonym processing and generation in the DEMIS backend
The processing of pseudonyms in the DEMIS backend occurs without any further action required by the data sender. The following steps, carried out in the backend, are outlined here for completeness:
The surveillance system for which the data was submitted is determined. Pseudonyms are extracted from the Patient resource and re-encrypted using a surveillance system-specific secret code. This secret code is not known to the surveillance system itself. This ensures that different projects or systems within the RKI (Robert Koch Institute) receive system-specific pseudonyms, preventing data linkage between different RKI surveillance systems based on the patient pseudonym. At the same time, if necessary and defined in advance in accordance with data protection requirements, the same secret code can be used across multiple systems to enable cross-system pseudonym consistency. This additional encryption step also ensures that the RKI cannot access the originally submitted pseudonyms, further protecting against the re-identification of individuals.
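The effect of the system-specific step can be illustrated with a small sketch. The concept does not state which algorithm the backend uses for re-encryption; a further HMAC-SHA256 pass is assumed here purely for illustration, and the system names other than ARS, the secret values and the function name are hypothetical.

```python
import hashlib
import hmac

# Hypothetical backend-side secret codes, one per surveillance system.
# Systems that are allowed to share pseudonyms would be given the same secret code.
SYSTEM_SECRETS = {
    "ARS": b"placeholder-backend-secret-ars",
    "OTHER-SYSTEM": b"placeholder-backend-secret-other",
}


def system_specific_pseudonym(submitted_pseudonym: str, system: str) -> str:
    """Derive a system-specific pseudonym from a pseudonym submitted by a data sender."""
    secret = SYSTEM_SECRETS[system]
    return hmac.new(secret, submitted_pseudonym.encode("ascii"), hashlib.sha256).hexdigest()


submitted = "f68ae1e4b4fb59f68b6e30ac51618aa8c5e7249735517ab5344fb900484e2a47"
ars_value = system_specific_pseudonym(submitted, "ARS")
other_value = system_specific_pseudonym(submitted, "OTHER-SYSTEM")
print(ars_value != other_value)  # True
```

Because the derived values differ per system and the backend secret code is not known to the surveillance system, records for the same patient cannot be matched across systems by comparing pseudonyms, and the submitted pseudonym cannot be recomputed from the derived one.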
The re-encrypted pseudonym pairs are linked on the basis of the pseudonyms they share. For example, pseudonyms P1, P2, P3, P4 submitted over time for the same patient may be linked together. This could theoretically allow patient data to be linked over an indefinite time span. To mitigate this, a maximum linkage duration is defined for each surveillance system. For ARS (Antibiotic Resistance Surveillance), this duration is 5 years. Based on this, the system calculates time ranges for each patient and assigns a unique pseudonym for each period.
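Linking pairs that share a pseudonym amounts to finding connected groups. The following sketch uses a union-find structure for this; it is one possible technique, not a description of the actual backend implementation, and it does not cover the subsequent split into time ranges. The labels P1 to P4 and Q1, Q2 stand in for pseudonym values.

```python
def link_pseudonym_pairs(pairs):
    """Group pseudonyms that are connected through shared members of submitted pairs."""
    parent = {}

    def find(item):
        # Follow parent links to the group representative, shortening the path on the way.
        parent.setdefault(item, item)
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for first, second in pairs:
        root_first, root_second = find(first), find(second)
        if root_first != root_second:
            parent[root_second] = root_first

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


submitted_pairs = [
    ("P1", "P2"),  # transmissions 2020-2024
    ("P3", "P2"),  # transmissions 2025-2029
    ("P3", "P4"),  # transmissions from 2030
    ("Q1", "Q2"),  # a different patient
]
groups = link_pseudonym_pairs(submitted_pairs)
print(sorted(sorted(group) for group in groups))
# [['P1', 'P2', 'P3', 'P4'], ['Q1', 'Q2']]
```

Without the middle pair ("P3", "P2"), the pseudonyms P1/P2 and P3/P4 would end up in separate groups, which corresponds to the case of transmissions that are too far apart to be merged.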
Only the final pseudonym generated in the DEMIS backend is transmitted to the RKI.