
By giving participants wearables and internet access, the American Life in Realtime study narrows the gap in who digital health data represents, showing that inclusive, rigorous design can make AI-driven healthcare fairer.
Study: American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health.
In a recent article in PNAS Nexus, researchers developed a longitudinal and nationally representative health study called American Life in Realtime (ALiR) to collect person-generated health data (PGHD) through study-provided wearable and internet-connected devices.
Their approach addresses the limitations of existing PGHD studies that depend on personal devices and often exclude disadvantaged populations. ALiR can thus serve as a benchmark for fair and generalizable digital health research.
Addressing historical underrepresentation
Precision health aims to improve disease prevention and treatment by tailoring strategies to individuals’ unique biological, social, and environmental contexts. A key component of this approach is PGHD, which is collected through everyday digital tools such as smartphones and wearable devices.
These data provide continuous insights into behaviors and exposures responsible for most modifiable health risks, making them vital for identifying health inequities and improving outcomes among marginalized groups.
However, the field lacks benchmark PGHD datasets, i.e., standardized, representative, and validated data resources that enable fair and reproducible development of artificial intelligence (AI) models. The authors note that an ideal PGHD benchmark should represent population diversity, include repeatedly validated measures, be longitudinal, contain sufficient data quality and quantity, and be widely accessible, all criteria that ALiR fulfills.
Current datasets, such as the National Institutes of Health’s All of Us and the UK Biobank, underrepresent Black, Indigenous, older, and lower-income populations and often rely on irregular or unstructured data. This limits model generalizability and risks worsening disparities through biased predictions.
The coronavirus disease 2019 (COVID-19) pandemic underscored these challenges, revealing how social inequities amplify disease burdens. Many PGHD-based COVID-19 detection studies relied on convenience samples that excluded disadvantaged individuals, partly because of recruitment barriers such as limited technology access or mistrust.
To overcome these biases, the ALiR study was established. It uses probability-based sampling and study-provided hardware to promote inclusion and create a benchmark for equitable precision health research.
Designing the study
The ALiR study was designed as a longitudinal and nationally representative digital health cohort using best practices in probability sampling, benchmarking, and FAIR (Findable, Accessible, Interoperable, Reusable) data standards.
Participants were randomly selected from the Understanding America Study (UAS), a large address-based panel of U.S. adults. Individuals consenting to participate received a wearable device and access to a custom mobile app for continuous biometric tracking and short, frequent surveys.
These surveys, conducted every one to three days, gathered information on physical and mental health, behaviors, demographics, environmental and social exposures, and structural determinants such as income, housing, and discrimination.
Data were linked to contextual datasets, including healthcare records, weather, air quality, and crime, to enrich environmental and health information. The study also provided electronic tablets to participants lacking Internet access to minimize selection bias and ensure the inclusion of underrepresented groups.
Between August 2021 and March 2022, 2,468 UAS members were invited, with oversampling of racial/ethnic minorities and lower-education groups. Of those, 1,386 consented (56%), and 1,038 of the consenting members enrolled (75%).
Logistic regression and random forest analyses found that nonconsent was most strongly associated with older age, while nonenrollment was linked to lower education.
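The oversampling described above can be illustrated with a small stratified draw. This is a minimal sketch, not the study's sampling procedure: the frame, the stratum labels "A" and "B", the sampling rates, and the function name `draw_oversample` are all hypothetical. It shows the general idea that units in an oversampled stratum are selected at a higher rate and then carry a smaller design weight (the inverse of their selection probability), so the weighted sample still adds up to the full frame.

```python
import random


def draw_oversample(frame, rates, seed=0):
    """Select units stratum by stratum at different sampling rates.

    frame: list of (unit_id, stratum) pairs (a hypothetical sampling frame).
    rates: dict mapping stratum -> sampling fraction in (0, 1].
    Returns (unit_id, stratum, design_weight) tuples, where the design
    weight is the inverse of the unit's selection probability.
    """
    rng = random.Random(seed)
    by_stratum = {}
    for unit_id, stratum in frame:
        by_stratum.setdefault(stratum, []).append(unit_id)

    selected = []
    for stratum, units in sorted(by_stratum.items()):
        n = round(len(units) * rates[stratum])
        if n == 0:
            continue  # nothing drawn from this stratum
        for unit_id in rng.sample(units, n):
            selected.append((unit_id, stratum, len(units) / n))
    return selected


# Hypothetical frame: 600 units in stratum "A", 400 in stratum "B".
frame = [(i, "A") for i in range(600)] + [(i, "B") for i in range(600, 1000)]

# Oversample stratum "B": 10% of A, 30% of B (illustrative rates only).
sample = draw_oversample(frame, {"A": 0.10, "B": 0.30})

for stratum in ("A", "B"):
    rows = [w for _, s, w in sample if s == stratum]
    print(stratum, "selected:", len(rows), "design weight:", round(rows[0], 2))
```

In this toy setup stratum "B" makes up 40% of the frame but two thirds of the sample; the design weights are what restore its 40% share in weighted estimates.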
ALiR’s performance
ALiR achieved broad representativeness across U.S. population characteristics, including personality traits, health, demographics, and socioeconomic status.
Racial and ethnic minorities were overrepresented (54% vs. 38% in the population), while White individuals were underrepresented (46% vs. 62%), aligning with deliberate oversampling to improve inclusivity.
Participants with low income or limited digital access were well represented: 77% had no prior wearable device, and 2% had no internet access before receiving study-provided hardware. Weighting adjustments corrected most minor demographic imbalances, though retirees and those with hypertension remained slightly underrepresented.
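To make the idea of weighting adjustments concrete, here is a minimal sketch of one-variable post-stratification using the shares reported above (54% vs. 38% and 46% vs. 62%). It is an assumption-laden illustration rather than the authors' procedure, which is not described in this article: the group keys, the function names, and the 100-person toy sample are hypothetical. Each group's weight is its population share divided by its sample share.

```python
def poststratification_weights(sample_shares, population_shares):
    """Weight for each group = population share / sample share."""
    return {g: population_shares[g] / sample_shares[g] for g in sample_shares}


def weighted_share(groups, weights, target):
    """Weighted proportion of respondents who belong to `target`."""
    total = sum(weights[g] for g in groups)
    in_target = sum(weights[g] for g in groups if g == target)
    return in_target / total


# Shares taken from the article; the dictionary keys are illustrative labels.
sample_shares = {"minority": 0.54, "white": 0.46}
population_shares = {"minority": 0.38, "white": 0.62}
weights = poststratification_weights(sample_shares, population_shares)

# Toy sample of 100 respondents matching the reported sample shares.
groups = ["minority"] * 54 + ["white"] * 46

print({g: round(w, 3) for g, w in weights.items()})
print(round(weighted_share(groups, weights, "minority"), 2))
```

Oversampled respondents receive a weight below 1 and undersampled respondents a weight above 1, so weighted estimates reflect the population mix while the raw sample still contains enough people from each group to analyze separately.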
Compared to convenience-based wearable studies, such as the All of Us Fitbit “bring-your-own-device” (BYOD) dataset, ALiR demonstrated far superior population alignment and diversity. When used to train a COVID-19 infection classification model, ALiR-based models achieved robust performance both in-sample and out-of-sample, indicating strong generalizability across all demographic subgroups.
Specifically, ALiR’s model achieved an area under the curve (AUC) of 0.84 when tested both in-sample and out-of-sample, maintaining consistent performance across all subgroups.
In contrast, an identically trained model based on All of Us data achieved an AUC of 0.93 in-sample but dropped to 0.68 out-of-sample, a 35% loss in accuracy, with the sharpest declines (22 to 40%) among older females and non-White participants.
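For readers unfamiliar with the metric, the sketch below shows how an AUC can be computed overall and per subgroup. It is not the study's evaluation code: the labels, scores, and group names are made-up toy values, and `auc` and `auc_by_group` are hypothetical helper names. The AUC here is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute the AUC separately within each subgroup."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Toy data (hypothetical): 1 = infected, 0 = not infected.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.5]
groups = ["g1"] * 4 + ["g2"] * 4

print("overall:", auc(labels, scores))
print("by group:", auc_by_group(labels, scores, groups))
```

Running the same functions on data the model was trained on and on held-out data from a different population gives the in-sample and out-of-sample figures; large gaps between subgroups in the held-out data are the kind of disparity the comparison above describes.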
Conclusions
ALiR is the first longitudinal population-based study to integrate wearable device data with repeatedly validated health and behavioral measures, offering a benchmark for equitable precision health research.
Its probability-based sampling, hardware provision, and oversampling strategies effectively minimized bias, achieving broad U.S. demographic and socioeconomic representation and improving on convenience-sample and “bring-your-own-device” studies like All of Us.
ALiR’s COVID-19 model performed robustly across diverse groups, showing that smaller, high-quality, representative samples can yield more generalizable results than larger, biased datasets.
However, some biases persisted, particularly underrepresentation of older adults despite device provision, suggesting that barriers beyond technology access, such as mistrust or disinterest, affect participation. The study also focused on consent and enrollment, with ongoing work addressing long-term engagement. The authors emphasize that the ALiR dataset and accompanying study app code will be publicly available in late 2025, providing an open resource for developing and validating equitable AI models.
In summary, ALiR not only sets a public benchmark for inclusive digital health research but also demonstrates that thoughtful study design can overcome long-standing barriers to representation. By providing a methodologically sound framework, ALiR supports the development of more generalizable AI models and contributes to improving equity in digital and precision health research.
Journal reference:
- Chaturvedi, R.R., Angrisani, M., Troxel, W.M., Jain, M., Gutsche, T., Ortega, E., Boch, A., Liang, C., Sima, S., Mezlini, A., Daza, E.J., Boodaghidizaji, M., Suen, S., Chaturvedi, A.R., Ghasemkhani, H., Ardekani, A.M., Kapteyn, A. (2025). American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health. PNAS Nexus 4(10). DOI: 10.1093/pnasnexus/pgaf295. https://academic.oup.com/pnasnexus/article/4/10/pgaf295/8275735