The COVID-19 pandemic has resulted in more than 100 million infections and more than 2 million deaths. The global crisis spans more than 200 countries. Large-scale testing, social distancing, and face masks have been critical measures to help contain the spread of the infection. Even with the onset of vaccination programs, the WHO highlights that large-scale testing and precautionary measures must be followed for the next couple of years. While the list of symptoms is regularly updated, it is established that in symptomatic cases COVID-19 seriously impairs the normal functioning of the respiratory system. Does this alter the acoustic characteristics of breath, cough, and speech sounds produced through the respiratory system? This is an open question waiting for scientific insights. A COVID-19 diagnosis methodology based on acoustic signal analysis, if successful, can provide a remote, scalable, and economical means for testing individuals. This can supplement existing nucleotide-based COVID-19 testing methods, such as RT-PCR and RAT.
The DiCOVA Challenge is designed to find scientific and engineering insights into this question by enabling participants to analyze an acoustic dataset gathered from COVID-19 positive and non-COVID-19 individuals. The selected findings will be presented in a special session at Interspeech 2021, the flagship conference of the global speech science and technology community, to be held in Brno from Aug 31-Sept 3, 2021. The timeliness and global societal importance of the challenge warrant focussed effort from researchers across the globe, including from medical and respiratory sciences, signal processing, and machine learning. We look forward to your participation!
The leaderboard saw participation from 29 teams.
Each team was given a maximum of 25 attempts to evaluate their system performance against the hidden blind test labels. Many of these systems achieved AUCs better than that of the baseline system.
There was good diversity in the kinds of features used by the teams. These ranged from simple hand-crafted acoustic features (such as zero-crossing rate (ZCR) and energy) to advanced acoustic representations (embeddings) obtained using pre-trained DNNs; a sketch of such feature extraction is shown below.
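As a concrete illustration, here is a minimal sketch of this kind of hand-crafted feature extraction using the librosa library. The file name, frame settings, and mean/std pooling are illustrative assumptions for the example, not the official challenge pipeline or any particular team's system.

```python
import librosa
import numpy as np

# Load a recording (hypothetical file name, native sampling rate)
y, sr = librosa.load("cough_sample.wav", sr=None)

# Frame-level zero-crossing rate and short-time energy (RMS)
zcr = librosa.feature.zero_crossing_rate(y, frame_length=1024, hop_length=512)
rms = librosa.feature.rms(y=y, frame_length=1024, hop_length=512)

# MFCCs, a common step up from the simplest hand-crafted features
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

# Pool frame-level features into one fixed-length vector per recording
features = np.concatenate([
    zcr.mean(axis=1), zcr.std(axis=1),
    rms.mean(axis=1), rms.std(axis=1),
    mfcc.mean(axis=1), mfcc.std(axis=1),
])
print(features.shape)  # (30,): 2 stats each for ZCR, RMS, and 13 MFCCs
```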
The novelty of the task also led teams to experiment with diverse kinds of classifiers.
The challenge task required handling class imbalance in the data. For this, several teams experimented with data augmentation (adding noise, reverberation, pitch shifting, etc., or incorporating cough files from other public datasets, such as COUGHVID) and with system fusion; a sketch of two such augmentations is shown below.
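For illustration, here is a hedged sketch of two of the augmentations mentioned above, noise addition and pitch shifting, using numpy and librosa. The SNR and semitone values are arbitrary choices for the example, not those used by any particular team.

```python
import numpy as np
import librosa

def add_noise(y, snr_db=20.0):
    """Add white Gaussian noise to a waveform at a target SNR (in dB)."""
    signal_power = np.mean(y ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = np.random.randn(len(y)) * np.sqrt(noise_power)
    return y + noise

# Hypothetical input recording
y, sr = librosa.load("cough_sample.wav", sr=None)

y_noisy = add_noise(y, snr_db=15.0)                          # noise addition
y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=2)  # up 2 semitones
```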
The best performance was posted by team T-1 with an AUC of 87.04%, significantly improving over the baseline system performance (69.85%). This was followed by two close competitors: team T-2 with 85.43% AUC and team T-3 with 85.35% AUC. It was wonderful to see nine teams score above 80% AUC!
The evaluation was open for 22 days. In the initial days, only a few teams evaluated their systems. As the days passed, leaderboard activity gathered pace and teams steadily improved their AUCs.
How did the best AUC on the leaderboard change over evaluation days?
Does more evaluation by a team imply a better AUC? There is some correlation :)!
How does the performance on the test set compare against the performance on the validation set?
An important metric in evaluating a diagnostic tool is its specificity at a given sensitivity. For the challenge, we evaluated the specificity at 80% sensitivity. Below we show how the different systems fared on this metric. The best specificity obtained was 83.33%, by team T-1.
And finally, here are the ROCs of the 29 systems, corresponding to the best system of each team.
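For readers who want to reproduce these metrics on their own systems, here is a minimal sketch of computing the AUC and the specificity at 80% sensitivity from an ROC curve using scikit-learn. The labels and scores below are placeholder toy arrays, not challenge data.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

labels = np.array([0, 0, 1, 1, 0, 1, 0, 1])                   # 1 = COVID-19 positive
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9])  # classifier probabilities

auc = roc_auc_score(labels, scores)

fpr, tpr, _ = roc_curve(labels, scores)
# Specificity = 1 - FPR at the first operating point with sensitivity >= 0.8
idx = np.argmax(tpr >= 0.8)
specificity_at_80 = 1.0 - fpr[idx]
print(f"AUC = {auc:.4f}, specificity at 80% sensitivity = {specificity_at_80:.4f}")
```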
This special session features two tracks, and you can participate in one or both of them. Track-1 is focussed only on cough sound recordings, while Track-2 is open to broader sound categories, such as cough, breath, sustained phonation, and continuous speech.
You are encouraged to submit your findings to the DiCOVA Special Session at Interspeech 2021 for peer review and subsequent consideration for presentation (and publication) at the conference. For this, we require you to participate in one or both of the tracks.