I. Introduction
Electronic Health Record (EHR) systems [1] have been increasingly used as an effective method to share patients' records among different hospitals. However, it is still a challenge to access scattered patient data through multiple EHRs because existing EHRs are regionally limited or belong to affiliated hospitals. According to a report published by the Office of the National Coordinator for Health Information Technology (ONC) [2], the main barrier to accessing patient records lies in the difficulty of finding providers' addresses. So far, several projects have tried to overcome these problems; however, the solutions they have produced are complex and involve redesigning or upgrading existing EHR systems, which would require substantial expense. Among them, one of the most active programs is run by the CommonWell Health Alliance [3], a nonprofit association in the United States. It supports EHRs, care providers, and healthcare information technology (HIT) vendors in connecting to its nationwide interoperability network via certified integration platforms and intermediaries. It uses a centralized system that allows patients and doctors to search for a patient's scattered medical records [4]. Such a centralized architecture has drawbacks: it risks becoming a single point of failure and a bottleneck for data flow as the system grows.
In an EHR system, whenever patient records are accessed, every such event must be recorded in a log file for later audit of the access history. The log file is used for reconstructing the past state of medical records, and it can serve as a legal document [5-7]. Thus, we should firmly protect the log file from illegal access and make it immutable if possible.
In this paper, we propose a decentralized system to address problems in sharing patient records among EHRs without relying on a high-end centralized system. Our system has three major features: (1) a trusted directory of patient data in EHRs, which guarantees access to the data as well as its integrity; (2) strengthened security in dealing with patient data, achieved by utilizing a particular encryption scheme and providing a transparent and undeniable audit trail based on an immutable access log; and (3) scalability to cover multiple existing EHRs of regional or core hospitals with minimal modification, together with availability of the system without relying on a centralized supervisory system.
We design the system following the Health Insurance Portability and Accountability Act (HIPAA) technical safeguard [8] and ISO/TS 18308 [5] for the interoperability, data integrity, auditability, and availability of the system. To accomplish our goals, we adopt blockchain technology, especially the permissioned consortium type [9], using the Hyperledger Fabric (HLF) platform. Multiple hospitals gather to form a consortium having a private peer-to-peer network, and permission to join it is determined based on consensus among the members.
HLF is an open-source platform whose essential components are available in several programming languages. In addition, it provides the Byzantine fault tolerant consensus protocol [10] for ordering transactions into blocks. Moreover, it allows end-to-end [11] throughput of more than 3,500 transactions per second. It is a project [12] hosted by the Linux Foundation, and contributions to the project are made by Digital Asset and IBM.
II. Methods
1. Hyperledger Fabric
In HLF, there are several key components (Table 1) that play pivotal roles in the system. In addition, it provides three phases of consensus (Table 2) to validate transactions before uploading them to the ledger. HLF provides a variety of special designated chaincodes called system chaincodes to perform certain privileged tasks. Examples of system chaincodes are Configuration, Life Cycle, Query, Endorser, and Validator system chaincodes. In our study, we designed several prerequisite chaincodes and implemented them in our prototype system.
2. System Conceptual Design
We built a private subnet of an HLF network, called a channel, in which the same ledger is shared among the hospital members (Figure 1). Organizations or departments within them can constitute independent channels with relevant ledgers according to their needs. In practice, medical data is usually too large to handle directly in a ledger; therefore, the data is kept in an EHR, and only its address is recorded in the ledger. Storage is called on-chain or off-chain according to whether the data is kept in the ledger or not [15]. The ledger also contains the hash values of the data. This guarantees data integrity: once a hash value is written in the ledger it becomes immutable, so the user can check whether the data has been altered.
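The integrity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `LedgerEntry` fields, the `ehr://` address scheme, and the choice of SHA-256 are hypothetical and are not taken from our prototype.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEntry:
    """On-chain pointer to a record kept off-chain in an EHR (illustrative fields)."""
    record_address: str  # location of the record in the EHR (hypothetical scheme)
    record_hash: str     # hex SHA-256 digest of the record bytes at upload time


def make_entry(record_address: str, record_bytes: bytes) -> LedgerEntry:
    """Build the entry that would be written to the ledger at upload time."""
    return LedgerEntry(record_address, hashlib.sha256(record_bytes).hexdigest())


def is_unaltered(entry: LedgerEntry, fetched_bytes: bytes) -> bool:
    """Recompute the digest of the fetched record and compare it with the ledger value."""
    return hashlib.sha256(fetched_bytes).hexdigest() == entry.record_hash


record = b"<ClinicalDocument>...</ClinicalDocument>"
entry = make_entry("ehr://hospital_a/records/0001", record)
print(is_unaltered(entry, record))                # True
print(is_unaltered(entry, record + b"tampered"))  # False
```

Only the address and the digest are on-chain in this sketch; the record itself never enters the ledger, which is what keeps the ledger small.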
In our system, we assume that a client of HLF (Table 1) is a doctor, nurse, or clerk who helps patients to upload or share their medical records. Clients from medical institutions issue various types of transactions and store them in a ledger. The ledger contains patient metadata, including demographics, and these data are used in retrieval requests to find the transactions related to a specific patient within a specified range of block timestamps. Thus, the ledger functions as a registry of patient IDs that doctors can use to search for their patients' records stored in other EHRs. In addition, each transaction contains the client's request metadata, chaincode execution results, and medical record metadata, such as the hospital ID, the hash of the medical records stored in an EHR, and so forth. Consequently, these data can be used for auditing purposes.
For an individual patient, the enrolment ID (eID) issued by a membership service provider (MSP) is used as the channel patient ID in the system. Each transaction in the ledger contains an eID, which is hashed after being concatenated with random data, a so-called salt [19], and stored in a delimited format that records the hash algorithm type and the salt alongside the hash value.
This format is nearly the same as the one the Linux system uses to store its users' hashed passwords with salts. Here, “$” is used as a delimiter between neighboring fields; “n” represents the hash algorithm type; and the values 1, 5, and 6 correspond to MD5, SHA-256, and SHA-512, respectively. The salt is a string of up to 16 random alphanumeric characters.
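A minimal Python sketch of such a salted eID hash follows. Several details are assumptions made for illustration: the field order `$n$salt$digest`, the simple concatenation `salt + eID`, the hexadecimal digest encoding, and the function names. Note also that Linux password hashing uses dedicated iterated schemes rather than the single hash call shown here.

```python
import hashlib
import secrets
import string

# Algorithm identifiers as listed in the text: 1 = MD5, 5 = SHA-256, 6 = SHA-512.
ALGORITHMS = {"1": hashlib.md5, "5": hashlib.sha256, "6": hashlib.sha512}
SALT_ALPHABET = string.ascii_letters + string.digits


def new_salt(length: int = 16) -> str:
    """Return a random alphanumeric salt of up to 16 characters."""
    if not 1 <= length <= 16:
        raise ValueError("salt length must be between 1 and 16")
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(length))


def hash_eid(eid: str, salt: str, n: str = "6") -> str:
    """Return '$n$salt$digest' for an enrolment ID (field order is an assumption)."""
    digest = ALGORITHMS[n]((salt + eid).encode("utf-8")).hexdigest()
    return f"${n}${salt}${digest}"


def matches(eid: str, stored: str) -> bool:
    """Re-hash the eID with the stored algorithm type and salt, then compare."""
    _, n, salt, _ = stored.split("$")
    return hash_eid(eid, salt, n) == stored


stored = hash_eid("eid-alice", new_salt())
print(matches("eid-alice", stored))  # True
print(matches("eid-bob", stored))    # False
```

Because a fresh salt is drawn for every transaction, two entries for the same patient carry different hashed eIDs, and a match can only be confirmed by someone who already knows the eID.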
3. Cryptographic Scheme
Before patient data is uploaded to the EHR system with the patient's consent, the data is encrypted using an adequate symmetric key. Then the symmetric key is asymmetrically encrypted using the patient's public key and attached to the encrypted data. This hybrid encryption makes the procedure efficient in terms of both speed and convenience: symmetric-key encryption is faster than asymmetric-key encryption for large data, while asymmetric-key encryption is more convenient for protecting a small cryptographic key.
To read patient data, a proxy downloads it from the relevant EHR and sends it to the receiver. However, when the receiver is not the patient, the encrypted symmetric key attached to the data must be transformed so that it can be decrypted with the receiver's private key. To do this, we use a proxy re-encryption scheme (Figure 2) in which the patient generates the proxy re-encryption key by mathematically combining their private key and the receiver's public key using the AFGH algorithm [20,21]. After receiving the newly made re-encryption key, the proxy re-encrypts the symmetric key for the receiver. In that process, the symmetric key is not disclosed to the proxy. Without this scheme, the proxy would have to send the data to the patient to have it encrypted with the receiver's public key.
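The flow of proxy re-encryption can be illustrated with a toy example. The sketch below is not the AFGH algorithm, which requires bilinear pairings; it substitutes a simpler ElGamal-based construction in the style of BBS, whose re-encryption key needs the receiver's private key rather than the public key. The group parameters are deliberately tiny, all function names are hypothetical, and the integer 1234 merely stands in for a symmetric key.

```python
import secrets

# Toy parameters for illustration only: P = 2*Q + 1 with Q prime, and G
# generates the subgroup of order Q. Real systems need far larger groups.
P, Q, G = 2039, 1019, 4


def keygen() -> tuple[int, int]:
    """Return (private, public) with private in [1, Q-1] and public = G^private mod P."""
    sk = secrets.randbelow(Q - 1) + 1
    return sk, pow(G, sk, P)


def encrypt(pk: int, m: int) -> tuple[int, int]:
    """Encrypt m under pk = G^a as the pair (m * G^r, G^(a*r))."""
    r = secrets.randbelow(Q - 1) + 1
    return (m * pow(G, r, P)) % P, pow(pk, r, P)


def rekey(sk_from: int, sk_to: int) -> int:
    """BBS-style re-encryption key b/a mod Q (unlike AFGH, uses the receiver's private key)."""
    return (sk_to * pow(sk_from, -1, Q)) % Q


def reencrypt(rk: int, ct: tuple[int, int]) -> tuple[int, int]:
    """Proxy step: turn G^(a*r) into G^(b*r) without ever recovering m."""
    c1, c2 = ct
    return c1, pow(c2, rk, P)


def decrypt(sk: int, ct: tuple[int, int]) -> int:
    """Recover G^r from the second component, then divide it out of the first."""
    c1, c2 = ct
    g_r = pow(c2, pow(sk, -1, Q), P)
    return (c1 * pow(g_r, -1, P)) % P


patient_sk, patient_pk = keygen()
doctor_sk, doctor_pk = keygen()
wrapped = encrypt(patient_pk, 1234)  # symmetric key wrapped for the patient
for_doctor = reencrypt(rekey(patient_sk, doctor_sk), wrapped)
print(decrypt(doctor_sk, for_doctor))  # 1234
```

The proxy only ever handles `wrapped`, the re-encryption key, and `for_doctor`; it never holds a private key, and the wrapped value is never decrypted on the proxy.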
4. Web-Based Application
Our system provides a web-based application for clients in each hospital to make access requests to the ledger or EHR. The web-based application is the front-end application program available in a hospital or clinic. A hospital can have a single peer or many peers according to its scale, while a small clinic functions as a client without a peer. For identifying participants across the system, doctors in each hospital are assumed to have their ECerts.
The web-based application offers web-based user interfaces and essential interactive functions for communication between participants in the system. Patients use it to generate key pairs and to register and enrol their identities in the system to obtain ECerts. In addition, they can generate proxy re-encryption keys and send them to the proxy. The client, on the other hand, uses the web-based application to create a transaction proposal and submit it to the blockchain system for tasks such as verifying a patient's identity and creating, uploading, and sharing medical records, metadata, and so forth.
III. Results
1. Developed Chaincodes
In our prototype system, we installed five chaincodes that carry out the business logic. Each chaincode consists of many functions, which contain all the business logic and typically read and update the ledger state. In an actual system, each chaincode needs to be agreed on by all the member hospitals before being deployed in the system. Table 3 presents details of the proposed chaincodes.
2. Use Case Scenarios
We simulated use cases using the prototype system. In Figures 3, 4, and 5, which describe a practical situation, we assume that a patient, whom we call Alice, visits Hospital_A for the first time. There, Alice is diagnosed with cancer, and her doctor, Dr. Bob, recommends that she go to the central hospital to see a cancer specialist. Dr. Bob uploads Alice's records with her consent to the hospital's EHR. Then Alice moves to the central hospital, and the cancer specialist accesses Alice's data in the EHR that belongs to Hospital_A.
1) First visit to a hospital
Alice makes a first visit to Hospital_A (Figure 3). To enrol in the hospital, she provides her demographic information or her national insurance number to a clerk. This information is used for registering her in the patient identity source of the hospital and issuing an ECert for her. The ECert and private key need to be stored in a secure storage device, for instance, an IC card or USB memory. After the local certificate authority (CA) issues the ECert, the clerk must store the hash value of Alice's eID and her individual patient ID in the ledger.
2) Uploading patient's record with metadata and consent
When a patient's records are uploaded to the EHR system (Figure 4), Alice gives the doctor her consent, with conditions, for sharing her records with third parties or her relatives. Then, the doctor encrypts Alice's record using an adequate symmetric key, encrypts that key using Alice's public key, and attaches the encrypted key to the record. Finally, the doctor uploads Alice's record to Hospital_A's EHR system and writes the consent for the record and the address of the data location to the ledger.
3) Requesting patient's record
Alice goes to see a specialist in the central hospital (Figure 5), where she registers as a new patient, if needed, and provides the ECert previously issued in Hospital_A. When treating Alice, the doctor wants to obtain her previous records, so he sends a transaction proposal requesting the metadata of Alice's records for a certain period, together with the previous hospital's ID. Then, each endorsing peer simulates the transaction proposal by executing chaincodes and returns its chaincode result to the proxy of the hospital where the doctor runs the client application. The application compares the query results, and if they all match, it lets the doctor select the necessary records from them to make a list of the patient's records that he wants to obtain. After receiving the list, the proxy asks Alice to generate the proxy re-encryption key. Then, the proxy downloads the records in the list from the relevant EHRs and re-encrypts the encrypted symmetric key attached to each record using the re-encryption key. After that, the proxy sends Alice's records to the doctor.
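The comparison of query results returned by the endorsing peers can be sketched as follows. This is an illustration only: the response payloads are modeled as plain dictionaries with hypothetical keys, and the function is not part of any HLF SDK.

```python
import json
from typing import Any


def all_endorsements_match(responses: list[dict[str, Any]]) -> bool:
    """Return True when every endorsing peer returned the same query result."""
    if not responses:
        return False
    # Serialize with sorted keys so that key order does not affect the comparison.
    canonical = {json.dumps(r, sort_keys=True) for r in responses}
    return len(canonical) == 1


peer_a = {"record_id": "r1", "hospital_id": "Hospital_A"}
peer_b = {"hospital_id": "Hospital_A", "record_id": "r1"}
peer_c = {"record_id": "r2", "hospital_id": "Hospital_A"}
print(all_endorsements_match([peer_a, peer_b]))          # True
print(all_endorsements_match([peer_a, peer_b, peer_c]))  # False
```

Only when the check succeeds would the application present the returned metadata to the doctor for selection.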
3. Prototype System
A prototype system was built on a small scale for testing on a local network with four Windows PCs for patients to use the patient web application, four Linux PCs for doctors to use the doctor web application, and four proxies for four hospitals. In addition, we used two Windows PCs as EHRs. The HLF platform was run on Docker for executing chaincodes. For EHR records, we dealt with standardized data, such as HL7/CDA and DICOM image data. We varied the number of PCs in the system configuration to assess the performance, including that of the chaincode logic. As a result, querying data in the blockchain, encrypting and decrypting the records, and transferring files each took slightly more time as the number of PCs increased.
The above prototype is not the same as an actual working environment. The system and chaincode functionality may require specific modification to suit consortium privacy policies and the legal requirements set by the governing authority.
IV. Discussion
In the implementation of the system, all the verification steps are essential for security purposes. To protect patient privacy, we adopted the Advanced Encryption Standard (AES) algorithm for symmetric-key encryption of patient data and the Elliptic Curve ElGamal (EC-ElGamal) algorithm for asymmetric-key encryption of the symmetric key in the proxy re-encryption scheme. The asymmetric-key pair is also used for the signature on the transaction proposal. However, to further strengthen security, a patient can have a separate key pair for signatures, different from the one used for encryption. The former is generated by using the HLF function, the latter by importing a function of EC-ElGamal encryption using EC cryptography. When a patient chooses to have two pairs of keys, he or she bears a greater burden to keep them secret. If a patient loses these private keys, a key escrow system is assumed to be used for retrieving the lost keys or symmetric keys from the ECert issuer or the hospital, only for decryption of the patient data. In any case, retrieved keys must be used only temporarily, until new keys and a new ECert are issued for the patient.
We hash the eID with a salt so that the transactions related to one patient do not all carry the same hashed eID, which would otherwise allow the patient's records to be traced along the ledger. On the other hand, this technique lengthens the processing time needed to find a patient when querying the data. To make the process faster, doctors can supply several relevant query keywords when requesting the data. These keywords include not only eIDs but also timestamps and hospital IDs.
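The trade-off can be seen in a short Python sketch. Because each transaction has its own salt, a query must re-hash the eID once per candidate transaction, and the hospital ID and timestamp filters reduce the number of candidates. The `Tx` fields, the `$n$salt$digest` field order, and the function names are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

ALGORITHMS = {"1": hashlib.md5, "5": hashlib.sha256, "6": hashlib.sha512}


@dataclass(frozen=True)
class Tx:
    """Minimal transaction metadata (illustrative field names)."""
    hashed_eid: str   # '$n$salt$digest', field order assumed
    hospital_id: str
    timestamp: int    # seconds since the epoch


def salted_hash(eid: str, salt: str, n: str = "6") -> str:
    """Hash salt + eID and return it in the assumed '$n$salt$digest' layout."""
    digest = ALGORITHMS[n]((salt + eid).encode("utf-8")).hexdigest()
    return f"${n}${salt}${digest}"


def find_patient_txs(ledger: list[Tx], eid: str, hospital_id: Optional[str] = None,
                     start: Optional[int] = None, end: Optional[int] = None) -> list[Tx]:
    """Return the transactions of one patient, applying the cheap filters first."""
    hits = []
    for tx in ledger:
        if hospital_id is not None and tx.hospital_id != hospital_id:
            continue
        if start is not None and tx.timestamp < start:
            continue
        if end is not None and tx.timestamp > end:
            continue
        # Expensive step: one hash computation per remaining candidate.
        _, n, salt, _ = tx.hashed_eid.split("$")
        if salted_hash(eid, salt, n) == tx.hashed_eid:
            hits.append(tx)
    return hits


ledger = [
    Tx(salted_hash("eid-alice", "a1B2"), "Hospital_A", 100),
    Tx(salted_hash("eid-alice", "c3D4"), "Hospital_B", 200),
    Tx(salted_hash("eid-carol", "e5F6"), "Hospital_A", 150),
]
hits = find_patient_txs(ledger, "eid-alice", hospital_id="Hospital_A", start=50, end=180)
print(len(hits))  # 1
```

Without the extra keywords, the loop would hash the eID against every transaction in the ledger, which is the cost discussed above.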
The proxy's roles are to connect different EHRs through a secured communication network, download the medical records, and re-encrypt the symmetric key attached to the patient's data. This scheme shortens the processing time for transferring a patient's data securely; otherwise, the data would have to be sent to the patient, decrypted using the patient's private key, and encrypted again using the receiver's public key before being sent back to the proxy and then to the receiver. For proxy re-encryption, we adopt the AFGH algorithm because it uses the receiver's public key rather than the private key as in the BBS algorithm [22], where the receiver's private key is created and used transiently only for receiving the data.
To strengthen privacy in access to records, patients can attach conditions to their consent in the transaction that shares records with third parties. Furthermore, the ledger retains the data-sharing events and the relevant persons' information, which facilitates the auditing procedure.
There have been several projects to establish a medical information-sharing system based on the blockchain. Among them, MedRec [23] is an early study applying the private Ethereum platform to EMRs. In Ethereum, an executable program run in the network is called a smart contract instead of a chaincode. Ethereum requires mining mechanisms to sustain the distributed ledger, which is a time-delayed process with miners competing in proof of work, although it is not difficult to configure a private platform with a short block time of less than 10 seconds. Medical stakeholders, such as researchers, public health authorities, and so forth, need to be incentivized to participate actively as miners. To address these issues, MedRec 2.0 is currently under development [24].
Ancile [13] is another blockchain-based system using the private Ethereum platform, which applies a technique that is similar to ours for medical record management, adopting the on-chain and off-chain concept. Ancile uses distributed proxies for re-encryption, called blinding re-encryption, by splitting the ciphertext for re-encryption between multiple nodes.
On the other hand, Dubovitskaya et al. [25] use HLF in a cloud system. In this system, the data structure consists of key-value pairs. The key is a hash of a combination of the symmetric key and uniquely identifiable information (UII) of the patient, and the value is the record metadata. To reduce the vulnerability of the system, patients encrypt each piece of their data using different symmetric keys. However, this incurs a heavy burden of key management, because patients need to choose the corresponding symmetric key for generating a key number every time they query for the data.
Our system is a consortium network. If other medical institutions want to access this network, they must make a request to register as a member of this network. Otherwise, a non-member institution can communicate through the member institutions. Peers are the trusted elements from each medical institution, so each institution needs to strengthen its own security to protect its peers from illegal access. At the same time, every medical institution needs to agree on the chaincode logic before the chaincodes are deployed in the system. Thus, our blockchain system can also run effectively in a cloud system, even though the cloud takes the opposite standpoint with respect to decentralization. Cloud computing can provide a solution to the blockchain size problem, in which the ledger grows gradually over time and peers have difficulty storing and processing it.
In conclusion, our system can be used to constitute a large-scale EHR system. It can be flexibly configured as a top layer over existing EHR systems to strengthen security in the management and exchange of medical records. Our system takes on the roles of a patient identifier, a trusted access log, and a registry of patient records. Even though our system does not offer explicit incentives to participants, as other blockchain-based systems do by issuing a cryptocurrency, it will still benefit users and stakeholders, including healthcare service providers and the government. We expect that our research can help patients to find their medical histories more easily when they visit other hospitals. As future work, we are going to test our system in a real hospital environment, and we will prepare to deal with non-standardized data in that real-world field test.