We are currently living through an unprecedented global pandemic. All of our lives are impacted by COVID-19, whether we know someone personally affected by the virus or not. “Physical distancing” is quickly becoming the new normal in many communities.
The critical importance of public health is taking center stage, and as a supporter of the scientific community, the team at Evidence Partners is doing everything we can to help.
To support researchers working on COVID-19, we have launched two initiatives that we hope will accelerate your important work:
- Researchers working on COVID-19 can now access a new version of DistillerSR for free. This soon-to-be-released version contains advanced AI features that accelerate the screening process for faster results.
- We applied the pre-release AI features from DistillerSR to the open-source CORD-19 dataset and other sources to create a tagged reference set of COVID-19 relevant articles. We are offering this as a free download for researchers who need access to COVID-19 literature quickly. The reference set is updated each week as new material becomes available.
You can find out more information about our initiatives right here.
In this post we’d like to share some of the methods we used to create the COVID-19 tagged reference set. Many people have asked how we built it, so here’s an overview:
Preparing to Screen
Because the reference set is updated every week, we start each cycle by downloading the most recent editions of CORD-19 and ClinicalTrials.gov. Once the references are imported, we run the DistillerSR deduplication tool to remove duplicates from the data. For example, we deduped against the ‘Cord_uid’ field, which, on our April 17th update, removed 49,925 duplicate references, most of them copies of references uploaded in prior weeks. We also deduped against the ‘Pubmed_id’ and ‘DOI’ fields (on April 17th, that dedupe removed 975 additional references). The rest of the list was cleaned using the “Extreme Precision” setting in DistillerSR.
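DistillerSR's deduplication tool is proprietary, but the field-based step described above can be illustrated with a minimal sketch: keep the first reference seen for each value of a key field (here a hypothetical `cord_uid` key standing in for the ‘Cord_uid’ field) and drop later records that repeat it.

```python
# Illustrative sketch of field-based deduplication (not DistillerSR's
# actual implementation): keep the first reference seen for each key.
def dedupe_by_field(references, field):
    """Drop references whose value in `field` was already seen."""
    seen = set()
    kept = []
    for ref in references:
        key = ref.get(field)
        if key is not None and key in seen:
            continue  # duplicate of an earlier reference; skip it
        if key is not None:
            seen.add(key)
        kept.append(ref)
    return kept

# Hypothetical records: the second entry re-uploads cord_uid "a1".
refs = [
    {"cord_uid": "a1", "title": "Study A"},
    {"cord_uid": "a1", "title": "Study A (reupload)"},
    {"cord_uid": "b2", "title": "Study B"},
]
unique = dedupe_by_field(refs, "cord_uid")  # keeps Study A and Study B
```

In practice this would be run once per key field (‘Cord_uid’, then ‘Pubmed_id’ and ‘DOI’), which is why each pass can remove additional duplicates the previous one missed.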
Human + AI Screening
We used a DistillerSR AI classifier to identify references and systematic reviews that relate specifically to COVID-19 (precision: 64.5%, recall: 99%) and tagged those that met our inclusion and exclusion thresholds.
Next, a human reviewed the references labeled by the COVID-19 classifier. A conflict check compared the human responses to those of the AI, and corrections were made to each response set as required. The human reviewers then screened approximately 200 references beyond those screened by the AI to look for anything the classifier might have missed.
This process leveraged AI-powered Continuous Reprioritization, a feature that uses AI to prioritize references based on the likelihood of inclusion. As you continue to screen, the AI continuously re-ranks references to bubble the most promising ones to the top.
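Conceptually, reprioritization amounts to re-sorting the unscreened queue by the model's predicted probability of inclusion after each screening pass. The feature itself is proprietary; this is only a sketch of the ranking idea, with made-up scores standing in for the classifier's output.

```python
# Illustrative sketch of continuous reprioritization: re-rank the
# unscreened references so the most likely includes come first.
def reprioritize(unscreened, predict_proba):
    """Sort references by predicted inclusion probability, highest first."""
    return sorted(unscreened, key=predict_proba, reverse=True)

# Hypothetical scores standing in for a retrained classifier's output.
scores = {"ref_a": 0.12, "ref_b": 0.91, "ref_c": 0.55}
queue = reprioritize(list(scores), scores.get)  # ref_b rises to the top
```

As screening decisions accumulate and the model is retrained, the scores change, so the queue is re-ranked continuously rather than fixed once at the start.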
We are updating the reference set on a weekly basis as new literature about COVID-19 becomes available. As we do this, our COVID-19 AI classifier is continuously retrained with the new data, increasing its accuracy. All new references are dual screened using the AI as one of the screeners.
Thanks to the AI tools and other features in DistillerSR, we were able to create and maintain this reference set with one person working less than three hours per week.
All of this is in the hope that DistillerSR and our efforts will assist researchers who are working tirelessly to combat COVID-19. In an unprecedented time, we feel it’s incumbent upon us to do our best to be part of the solution.
If you are a researcher working on COVID-19, maybe we can help. You can learn more about our initiatives here.