Salesforce Research used data from the CORD-19 Challenge to create COVID-19 Search, an AI-powered search engine that equips scientists and researchers with the most relevant COVID-19 research. Sponsored by the White House and involving a number of leading AI and health policy groups (the NIH, Georgetown, AI2, CZI and MSR), the ongoing CORD-19 Challenge aims to catalyze the development of search algorithms and engines that help researchers and policymakers better understand and combat COVID-19. The challenge maintains a growing corpus of coronavirus-related publications and makes it easily accessible to the public.
“From February to May 2020, the number of scientific papers published on COVID-19 skyrocketed from 29 000 to more than 138 000. As people around the world step up to help, the number will continue to grow exponentially, with projections to swell to more than one million by the end of 2020,” says Andre Esteva, PhD, Head of Medical AI, Salesforce Research.
“That’s good news for the medical community and policymakers working on vaccines and treatments — but only if they’re able to efficiently search the growing body of research,” adds Anuprit Kale, Lead Data Scientist, Salesforce. “As papers are data-rich and can be hundreds of pages long, finding what you’re looking for, in the time crunch of a global pandemic, can be a challenge.”
Introducing COVID-19 Search
Drawing on its deep experience in natural language processing (NLP), Salesforce Research assembled a team of its experts to develop a search engine that would support research efforts as more information pours into public archives. Within a few months, the team had developed COVID-19 Search to help users easily search through rigorous scientific information in their efforts to stem the tide of the global pandemic.
“Searching scientific publications requires different techniques from traditional keyword-matching search engines,” says Kale. “It’s critical that a COVID-19 search engine interprets the proper meaning in a given search, going beyond finding results based on the frequency with which words appear in documents. And with long documents, it’s valuable to quickly surface relevant passages in search results.”
“COVID-19 Search addresses this by combining text retrieval and NLP (including semantic search, state-of-the-art question answering, and abstractive summarization) to better understand the question and surface the most relevant scientific results,” explains Esteva.
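The difference between frequency-based keyword matching and meaning-aware search can be illustrated with a toy example. The Python sketch below is not the COVID-19 Search implementation: the three documents, the `CONCEPTS` lookup table and all function names are hypothetical, and the lookup table is only a crude stand-in for the learned language models a real semantic search engine would use.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; these are not CORD-19 documents.
DOCS = {
    "a": "Masks masks masks: a history of masks in theatre.",
    "b": "Face coverings reduce transmission of respiratory viruses.",
    "c": "Vaccine trials report strong antibody responses.",
}

# Hypothetical stand-in for a learned semantic model:
# different surface words are mapped to one shared concept.
CONCEPTS = {
    "mask": "face_covering", "masks": "face_covering",
    "coverings": "face_covering",
    "spread": "transmission", "transmission": "transmission",
    "virus": "virus", "viruses": "virus",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_score(query, doc):
    """Sum how often each query word appears verbatim in the document."""
    counts = Counter(tokenize(doc))
    return sum(counts[token] for token in tokenize(query))


def concept_vector(text):
    """Count the concepts mentioned in the text, ignoring unknown words."""
    return Counter(CONCEPTS[t] for t in tokenize(text) if t in CONCEPTS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[key] * v[key] for key in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def semantic_score(query, doc):
    """Compare query and document by shared concepts, not exact words."""
    return cosine(concept_vector(query), concept_vector(doc))


def rank(query, scorer):
    """Return document ids ordered from best to worst match."""
    return sorted(DOCS, key=lambda d: scorer(query, DOCS[d]), reverse=True)


if __name__ == "__main__":
    query = "do masks stop virus spread"
    print("keyword: ", rank(query, keyword_score))
    print("semantic:", rank(query, semantic_score))
```

For the query "do masks stop virus spread", the keyword scorer puts document "a" first simply because it repeats the word "masks", while the concept-based scorer puts document "b" first even though it shares no exact word with the query.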
COVID-19 Search is designed to serve those on the front lines of medicine and policymaking and to accelerate the search for effective vaccines and treatments. To learn more, visit the COVID-19 Search website or read the full paper on the research behind it.