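%0 Journal Article
%@ 2369-2960
%I JMIR Publications
%V 8
%N 9
%P e34472
%T Privacy of Study Participants in Open-Access Health and Demographic Surveillance System Data: Requirements Analysis for Data Anonymization
%A Templ,Matthias
%A Kanjala,Chifundo
%A Siems,Inken
%+ Institute of Data Analysis and Process Design, Zurich University of Applied Sciences, Winterthur, 8404, Switzerland, 41 793221578, matthias.templ@zhaw.ch
%K longitudinal and event history data
%K low- and middle-income countries
%K LMIC
%K anonymization
%K health and demographic surveillance system
%D 2022
%7 2.9.2022
%9 Original Paper
%J JMIR Public Health and Surveillance
%G English
%X Background: Data anonymization and sharing have become popular topics for individuals, organizations, and countries worldwide. Open sharing of anonymized data containing sensitive information about individuals makes the most sense when the utility of the data can be preserved and the risk of disclosure can be kept below an acceptable level. In this case, researchers can use the data without access restrictions. Objective: This study aims to highlight the requirements and possible solutions for sharing health surveillance event history data. The challenges lie in the anonymization of multiple event dates and time-varying variables. Methods: A sequential approach that adds noise to event dates is proposed. This approach maintains the event order and preserves the average time between events. In addition, a distance-based matching approach for estimating risk under a nosy neighbor scenario is proposed. Regarding the key variables that change over time, such as educational level or occupation, we make 2 proposals: one based on limiting the intermediate statuses of the individual and the other to achieve k-anonymity in subsets of the data. The proposed approaches were applied to the Karonga health and demographic surveillance system (HDSS) core residency data set, which contains longitudinal data from 1995 to the end of 2016 and includes 280,381 events with time-varying socioeconomic variables and demographic information. Results: An anonymized version of the event history data, including longitudinal information on individuals over time, with high data utility, was created. Conclusions: The proposed anonymization of event history data comprising static and time-varying variables, applied to HDSS data, led to acceptable disclosure risk and preserved utility, so that the data can be shared as public use data. High utility was achieved even with the highest level of noise added to the core event dates. The details are important to ensure consistency or credibility. Importantly, the sequential noise addition approach presented in this study not only maintains the event order recorded in the original data but also maintains the time between events. We proposed an approach that preserves the data utility well but limits the number of response categories for the time-varying variables.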
Furthermore, using distance-based neighborhood matching, we simulated an attack under a nosy neighbor situation and under a worst-case scenario in which attackers have full information on the original data. We showed that the disclosure risk is very low, even when assuming that the attacker’s database and information are optimal. The HDSS and medical science research communities in low- and middle-income country settings will be the primary beneficiaries of the results and methods presented in this paper; however, the results will be useful for anyone working on anonymizing longitudinal event history data with time-varying variables for the purposes of sharing.
%M 36053573
%R 10.2196/34472
%U https://publichealth.www.mybigtv.com/2022/9/e34472
%U https://doi.org/10.2196/34472
%U http://www.ncbi.nlm.nih.gov/pubmed/36053573