Background: Clinical data collection requires correct and complete data sets to support sound statistical analysis and valid conclusions. While much effort in randomized clinical trials is devoted to data monitoring, this is rarely the case in observational studies, owing to high case numbers and often-restricted resources. We have developed a valid and cost-effective monitoring tool that can substantially improve data quality in observational research.
Methods: An automated digital monitoring system for cohort studies developed by the German Rheumatism Research Centre (DRFZ) was tested within the disease register RABBIT-SpA, a longitudinal observational study including patients with axial spondyloarthritis and psoriatic arthritis. Physicians and patients complete electronic case report forms (eCRFs) twice a year for up to 10 years. Automatic plausibility checks were implemented to verify all data after entry into the eCRF. To identify conflicts that cannot be found by this approach, all possible conflicts were compiled into a catalog. This “conflict catalog” was used to create queries, which are displayed as part of the eCRF. The proportions of queried eCRFs and of answered queries were analyzed using descriptive methods. For the analysis of responses, each conflict was classified either as a single conflict (affecting individual items) or as a conflict that required the entire eCRF to be queried.
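The catalog-driven query step described in the Methods can be illustrated with a minimal Python sketch. It is not the DRFZ implementation: the field names (`crp`, `esr`, `gc_therapy`, `gc_dose_mg`), the rule codes, and the `build_queries` helper are hypothetical and chosen only to show how a catalog of rules might yield single-item queries or a query for an entire eCRF.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class Rule:
    """One entry of a (hypothetical) conflict catalog."""
    code: str
    message: str
    applies: Callable[[dict], bool]  # returns True when the conflict is present


# Hypothetical catalog entries; the real catalog is not reproduced here.
CATALOG: List[Rule] = [
    Rule("CRP_MISSING", "CRP value is missing.",
         lambda form: form.get("crp") is None),
    Rule("ESR_MISSING", "ESR value is missing.",
         lambda form: form.get("esr") is None),
    Rule("GC_DOSE_MISSING", "Glucocorticoid therapy reported without a dose.",
         lambda form: form.get("gc_therapy") is True
         and form.get("gc_dose_mg") is None),
]


def build_queries(form: Optional[dict]) -> List[Dict[str, str]]:
    """Return the queries to display for one expected eCRF.

    A form that was never submitted (None) yields one query for the
    entire eCRF; otherwise each matching catalog rule yields one
    single-item query.
    """
    if form is None:
        return [{"scope": "entire_ecrf", "code": "ECRF_MISSING",
                 "message": "The eCRF for this visit has not been submitted."}]
    return [{"scope": "single_item", "code": rule.code, "message": rule.message}
            for rule in CATALOG if rule.applies(form)]


# Usage: one submitted form with two conflicts, and one missing form.
submitted = {"crp": 4.2, "esr": None, "gc_therapy": True, "gc_dose_mg": None}
for query in build_queries(submitted) + build_queries(None):
    print(query["scope"], query["code"])
```

Keeping the rules as data rather than as hard-coded checks means a new conflict type can be added to the catalog without changing the code that generates the queries.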
Results: Data from 1883 patients were analyzed. A total of n = 3145 eCRFs submitted between baseline (T0) and T3 (12 months) had conflicts (40-64%). Between 56% and 100% of the queries regarding completely missing eCRFs were answered. A mean of 1.4 to 2.4 single conflicts occurred per eCRF, of which 59-69% were answered. The most common missing values were CRP, ESR, Schober’s test, data on systemic glucocorticoid therapy, and presence of enthesitis.
Conclusion: Ensuring high data quality in large observational cohort studies is a major challenge that requires careful monitoring. An automated monitoring process was successfully implemented and well accepted by the study centers. Two-thirds of the queries were answered with new data. While conventional manual monitoring is resource-intensive and may itself create new sources of error, automated processes are a convenient way to improve data quality.