%0 Journal Article
%@ 2561-326X
%I JMIR Publications
%V 4
%N 5
%P e14064
%T Privacy-Preserving Deep Learning for the Detection of Protected Health Information in Real-World Data: Comparative Evaluation
%A Festag,Sven
%A Spreckelsen,Cord
%+ Institute of Medical Statistics, Computer and Data Sciences, Jena University Hospital, Bachstraße 18, Jena, Germany, 49 3641 9 398360, Cord.Spreckelsen@med.uni-jena.de
%K privacy-preserving protocols
%K neural networks
%K health informatics
%K distributed machine learning
%D 2020
%7 5.5.2020
%9 Original Paper
%J JMIR Form Res
%G English
%X Background: Collaborative privacy-preserving training methods allow for the integration of locally stored private data sets into machine learning approaches while ensuring confidentiality and nondisclosure. Objective: In this work we assess the performance of a state-of-the-art neural network approach for the detection of protected health information in texts trained in a collaborative privacy-preserving way. Methods: The training adopts distributed selective stochastic gradient descent (ie, it works by exchanging local learning results achieved on private data sets). Five networks were trained on separated real-world clinical data sets by using the privacy-protecting protocol. In total, the data sets contain 1304 real longitudinal patient records for 296 patients. Results: These networks reached a mean F1 value of 0.955. The gold standard centralized training that is based on the union of all sets and does not take data security into consideration reaches a final value of 0.962. Conclusions: Using real-world clinical data, our study shows that detection of protected health information can be secured by collaborative privacy-preserving training. In general, the approach shows the feasibility of deep learning on distributed and confidential clinical data while ensuring data protection.
%M 32369025
%R 10.2196/14064
%U https://formative.jmir.org/2020/5/e14064
%U https://doi.org/10.2196/14064
%U http://www.ncbi.nlm.nih.gov/pubmed/32369025