Can we diagnose COVID-19 with speech?

On 11 February 2020, the World Health Organization (WHO) announced “COVID-19” as the name of the disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and later characterised the outbreak as a global pandemic (World Health Organization, 2020). Consequently, measures such as national lockdowns and social distancing rules were introduced around the world in an attempt to contain such a transmissible disease (Singh & Singh, 2020). Arguably, the most reliable long-term protection against COVID-19 lies in preventing and containing the disease rather than in treatment or an outright cure.

Over the past decade, scholars have explored the use of speech to aid the diagnosis of illness. For example, soft and breathy speech, resulting from reduced control of the vocal tract, is a plausible marker of Parkinson’s disease (Brabenec, Mekyska, Galaz, & Rektorova, 2017); prosodic features such as rhythm, tone, and pitch have been linked to conditions such as “post-traumatic stress disorder, traumatic brain injury, and depression” (Brown et al., 2020, p.1). Thus, it is plausible that speech can serve as a biomarker for diagnosing illness.

Therefore, in this project, I aim to build a diagnostic tool for COVID-19 detection based on speech data and to carry out sociophonetic experiments on the resulting models. Because COVID-19 is a respiratory disease whose symptoms affect pulmonic and paralinguistic sounds such as coughing, wheezing, and voicing, I predict that these sounds will show acoustic patterns that can be used to train two machine learning models, namely a Convolutional Neural Network (CNN) and a Multilayer Perceptron (MLP). Such a technology would not only be portable and require less time to return results, but would also reveal speech features affected by COVID-19 that remain unexplored. Furthermore, used alongside laboratory testing, such a system could be a life-saving resource for economically developing areas of the world that cannot afford modern medical resources such as RT-PCR test kits and vaccines.
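To make the idea of "acoustic patterns" concrete, here is a minimal sketch of how a recording might be turned into model inputs. It is an illustration under stated assumptions, not the project's actual pipeline: the function name `log_spectrogram`, the 16 kHz sample rate, the 25 ms frame and 10 ms hop, and the synthetic 200 Hz tone standing in for real speech are all hypothetical choices.

```python
import numpy as np

def log_spectrogram(signal, frame_len=400, hop=160):
    """Return a (frames, bins) log-magnitude spectrogram using a Hann window."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one.
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitude + 1e-8)  # small offset avoids log(0)

# Assumed stand-in for a real recording: one second of a 200 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 200 * t)

# 2-D time-frequency input of the kind a CNN could take.
spec = log_spectrogram(signal)

# Fixed-length 1-D summary (per-bin mean and standard deviation) of the kind an MLP could take.
summary = np.concatenate([spec.mean(axis=0), spec.std(axis=0)])
```

The two outputs mirror the two model families named above: the spectrogram keeps the time-frequency layout, while the summary vector discards timing in exchange for a fixed input size.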

Importantly, the majority of the literature in this area overlooks the (socio-)phonetic and linguistic foundations of the analysis (Banerjee et al., 2019; Brown et al., 2020). Therefore, what makes this research original is the set of sociophonetic experiments that will be run on the system. These include exploring which vowel carries the most ‘COVID-19 information’, how allophonic composition contributes to classification, and whether anatomical differences due to gender, i.e., males having a longer vocal tract (VT) than females, lead to significant differences in results. Rather than attributing the shortcomings of the system solely to algorithm design, it is just as important to consider VT anatomy and articulatory phonetics, to better understand what the data can reveal about speech production in COVID-19 patients.
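Why VT length might matter acoustically can be illustrated with the textbook approximation of the vocal tract as a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). The sketch below is only that approximation: the speed of sound and the two tract lengths are assumed, illustrative values, not measurements from this project, and real vowels deviate from a uniform tube.

```python
SPEED_OF_SOUND_CM_PER_S = 35000.0  # assumed value for warm, humid air in the vocal tract

def tube_resonances(vt_length_cm, n_formants=3):
    """Resonances (Hz) of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * SPEED_OF_SOUND_CM_PER_S / (4.0 * vt_length_cm)
            for n in range(1, n_formants + 1)]

# Illustrative lengths only (hypothetical, not taken from the data).
longer = tube_resonances(17.5)   # longer vocal tract
shorter = tube_resonances(14.5)  # shorter vocal tract

for n, (f_long, f_short) in enumerate(zip(longer, shorter), start=1):
    print(f"F{n}: {f_long:.0f} Hz (17.5 cm) vs {f_short:.0f} Hz (14.5 cm)")
```

Under this model a shorter tract raises every resonance by the same ratio, so a classifier trained on spectral features could pick up on speaker anatomy as well as on illness, which is one reason to examine results by gender.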

References:

Banerjee, D., Islam, K., Xue, K., Mei, G., Xiao, L., Zhang, G., . . . Li, J. (2019). A deep transfer learning approach for improved post-traumatic stress disorder diagnosis. Knowledge and Information Systems, 60(3), 1693-1724.

Brabenec, L., Mekyska, J., Galaz, Z., & Rektorova, I. (2017). Speech disorders in Parkinson’s disease: early diagnostics and effects of medication and brain stimulation. Journal of Neural Transmission, 124(3), 303-334.

Brown, C., Chauhan, J., Grammenos, A., Han, J., Hasthanasombat, A., Spathis, D., . . . Mascolo, C. (2020). Exploring automatic diagnosis of COVID-19 from crowdsourced respiratory sound data. arXiv preprint arXiv:2006.05919.

World Health Organization. (2020). Naming the coronavirus disease (COVID-19) and the virus that causes it. Brazilian Journal of Implantology and Health Sciences, 2(3).

Singh, J., & Singh, J. (2020). COVID-19 and its impact on society. Electronic Research Journal of Social Sciences and Humanities, 2.
