Abstract: This article provides a brief overview of examples from published work covering various areas of the use of artificial intelligence technologies in solving problems caused by the COVID-19 pandemic. Three groups of technologies have been identified: monitoring of network resources, modeling of epidemic processes, and systems for monitoring the mobility of citizens. Monitoring of information flows provides an alternative to the traditional methods of detecting the onset of epidemics, gives a general picture of the dynamics of infections, and helps determine the mood in society for timely management decisions. Epidemic modeling makes it possible to predict the scale of a disaster, assess immediate and long-term consequences, and determine the effectiveness of planned protective and preventive measures. Various systems for monitoring the mobility of citizens serve to warn citizens about danger and to identify violations of the introduced restrictions. To effectively combat possible future pandemics, it is necessary to unite the efforts of many states at the international level and create effective systems of integrated information technology support for anti-epidemic services.
Keywords: COVID-19 pandemic, artificial intelligence, information flows, databases, monitoring of network resources, modeling.
Many experts draw attention to the threat of new epidemics caused by pathogens with previously unknown characteristics. This threat requires the improvement of systems of anti-epidemic protection of the population. In the modern world, information technologies based on artificial intelligence (AI) have gained great importance and provide many opportunities to solve this problem. The COVID-19 pandemic spread across the planet in a matter of months. Research and development in the field of epidemiology has intensified since its onset. Already in the first year of the pandemic, many examples appeared of the use of such technical solutions in anti-epidemic control systems. The number of thematic publications on this issue is constantly increasing.
The purpose of this review is to draw attention to the diversity of applications of software and tools based on artificial intelligence technologies in organizing both preventive work and response measures in the event of an epidemic spread of infectious diseases. A concentrated description of such examples can increase the interest of health authorities in the capabilities of systems based on artificial intelligence and, as a result, become an incentive for the targeted development and financing of special programs to create integrated systems of information technology support for anti-epidemic services. Andy Chun (2020) notes that in China, AI is being used to fight the virus on all fronts. With its ability to learn quickly, AI saves humans time in sequencing the genome of SARS-CoV-2, designing lab tests, analysing CAT scans and making new vaccines.
Issues to be considered below include:
- monitoring of information flows on the Internet, which is used to timely detect the beginning of an epidemic, to track the dynamics of its spread, to assess the level of panic in society;
- modeling of epidemic processes to predict the development of a pandemic, as well as to evaluate the effectiveness of various interventions in order to develop recommendations for government bodies;
- means, systems and methods for preventing the spread of the disease, including monitoring compliance with the restrictive measures introduced by the authorities.
Monitoring of information flows
One of the key tasks of responding to the large-scale spread of infectious diseases is the timely detection of the very fact of the origin of the epidemic.
A key issue in the fight against epidemics for the health authorities in most countries has become a noticeable delay in the response to both the very fact of the onset of the epidemic and the identification of pathogens. Such a delay, as a rule, is due to the need to consistently perform a number of organizational procedures, which in total take a long time. The first stage is the collection of current statistical information and its comparative processing, which results in the identification and analysis of increases in cases of diseases that develop in a similar way. Once such events are established, samples obtained from sick people are examined in specialized laboratories in order to isolate and study infectious agents. The isolated pathogens are tested for their resistance to external conditions. This makes it possible to determine the environment and possible channels of their spread. Measures to counteract the spread of infection are developed and tested for effectiveness based on the data obtained during the research.
Thus, traditional procedures take a significant amount of time to identify new types of infectious agents and to develop responses to their emergence. As a result, the development of timely measures to localize and suppress foci of infection is difficult. Even for a well-studied disease such as the flu, whose symptoms doctors routinely report as part of a surveillance program, it can take several weeks to identify and respond to an outbreak. For diseases that are not monitored so regularly, the delay can be catastrophic.
M. Eisenstein (2018) points out that in recent years, alternative ways to solve this problem have arisen using indirect data circulating in the information space. Based on a summary of a number of successful examples of such use of data, he concludes that it is expedient to create systems for targeted monitoring of information flows. The examples he analyzes demonstrate the capabilities of modern tools for monitoring information flows on the Internet for the detection of epidemics. M. Eisenstein concludes that, with appropriate attention, the data generated on the Internet can be used to identify various outbreaks of infectious diseases in a more timely manner.
The author focuses on research conducted by Google, which monitors the frequency of various search queries. Using a special algorithm, the company’s specialists identified the most relevant queries in relation to the emergence of an influenza epidemic, and then used them to identify the rate of spread of the disease. Comparison of the growth rates of such requests with the results of registration of patients made it possible to conclude that it is possible to predict an outbreak of influenza in certain places with an accuracy of one day, which was demonstrated in subsequent periods after the start of work in 2008.
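The comparison of query growth with patient registration described above can be illustrated with a lagged correlation. The following Python fragment is only a minimal sketch and not the algorithm used by Google: the function names `pearson` and `best_lead` are hypothetical, and the weekly numbers are invented so that the case series follows the query series one step later.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def best_lead(queries, cases, max_lead=3):
    """Return (lead, r): the number of steps by which the query series
    leads the case series with the highest correlation."""
    best = None
    for lead in range(max_lead + 1):
        # Pair queries at step t with registered cases at step t + lead.
        q = queries[: len(queries) - lead]
        c = cases[lead:]
        r = pearson(q, c)
        if best is None or r > best[1]:
            best = (lead, r)
    return best

# Invented illustrative data: cases repeat the query curve one step later.
queries = [10, 12, 20, 35, 50, 40, 25, 15]
cases = [4, 5, 6, 10, 17.5, 25, 20, 12.5]
lead, r = best_lead(queries, cases)
print(lead, round(r, 3))
```

A high correlation at a positive lead suggests that the query signal runs ahead of official registration, which is the property that makes such data attractive for early warning.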
The primary positive results of the proposed approach obtained at the initial stage initiated the research of other scientists in this field. For example, in Brazil, Twitter messages have been used to predict the spread of Dengue fever. Similarly, a database of Google and Twitter searches predicted the spread of Zika in Latin America weeks before official outbreak announcements were made by public health officials.
Unfortunately, it was found that the developed algorithm makes it possible to track the speed of the spread of an epidemic, but does not always serve as an indicator of its onset.
For example, the Google algorithm «missed» the beginning of the H1N1 pandemic in 2009. It is assumed that the reason was the algorithm’s focus on fixing the signs of one specific disease — the flu epidemic. During the current COVID-19 pandemic, this algorithm also failed. Experts who tried to deal with this problem concluded that the probable cause of the failure was the increased attention to this topic in the field of news information flows. The panic sown by media reports has significantly distorted the nature and frequency spectrum of queries from search engine users.
Therefore, the problem of obtaining relevant data based on the processing of information flows for use in order to timely detect the onset of epidemics remains. It is associated not only with the difficulties in identifying the spectrum of required data and tuning (training) algorithms. Another difficulty is related to the limited availability of these tools directly to epidemiologists. Due to the proprietary nature of the listed search services, specialists responsible for the field of epidemiology cannot get access to the data they need without involving the developers of these services themselves. And this reduces the efficiency and variability of research.
The promise of detecting epidemics by monitoring information flows is further supported by other examples cited in articles by Cory Stieg (2020) and John McCormick (2020). Thus, Cory Stieg (2020) writes that «A little after midnight on Dec. 30, artificial intelligence platform BlueDot picked up on a cluster of “unusual pneumonia” cases happening around a market in Wuhan, China, and flagged it. BlueDot had spotted what would come to be known as COVID-19, nine days before the World Health Organization released its statement alerting people to the emergence of a novel coronavirus… The key to BlueDot is big data. It uses natural language processing and machine learning to cull data from hundreds of thousands of sources, including statements from official public health organizations, digital media, global airline ticketing data, livestock health reports and population demographics». John McCormick (2020) describes HealthMap, a program affiliated with the nonprofit Boston Children’s Hospital that monitors infectious diseases. HealthMap has created a digital map using artificial intelligence and other technologies to constantly track the spread of the novel coronavirus.
Search engines are not the only source of information about the possible beginning of an infectious epidemic. Much more relevant information can be obtained from online communities, especially those with a relevant professional profile. One such source already exists today: ProMED-mail, a network community of doctors that unites more than 70,000 people from all over the world. Members of this community exchange medical observations, and their reports on infectious diseases are accumulated in one database. If used properly, the flow of such data on the Internet can give public health systems a significant head start in mobilizing an outbreak response. In particular, it was this professional network community that drew attention to the emergence of new infectious diseases when doctors first identified an outbreak of severe acute respiratory syndrome (SARS) in China in 2003, and it also reported on Middle East respiratory syndrome (MERS) in Saudi Arabia in 2012.
All of the above examples indicate the possibility of identifying disease outbreaks by a variety of information features extracted from social networks on the Internet.
Yet, the most reliable and trustworthy source of information on infectious diseases should be direct data from medical records, which immediately reflect the symptoms, periods, and nature of the disease. It is necessary to create a single database of morbidity. The data must be presented in an anonymized generalized form so as not to violate the principle of confidentiality of information.
Such an array of data will make it possible to more quickly identify cases of infectious diseases and take timely action. It is clear that it is impossible to process it manually. However, with such a database, similar signs of diseases in certain local areas can be quickly detected using data processing by specialized AI-based algorithms. Unfortunately, there are no regulations for the timely entry of such data into a single database anywhere in the world, there is no prompt access to such information, and there are no automated means for their purposeful processing. As the experience of the COVID-19 pandemic has shown, the rate of spread of infectious diseases due to the high mobility of the planet’s inhabitants and the processes of constant migration can be extremely high, and this increases the risk of new pandemics. Therefore, the target task of WHO should be the creation of such an information base on the incidence of the population on a planetary scale, regulations for working with information and algorithms for its processing on a global scale.
Until this task is solved, the complex integration of other heterogeneous data sources within one or more monitoring systems can play a positive role in creating a system for the timely detection of epidemics and improving the responsiveness.
Monitoring of network resources can be used not only to identify, control and predict the dynamics of the spread of infectious diseases, but also to assess the mood in society.
Hou et al. (2020) describe an equally interesting example of using data obtained by monitoring information flows. The researchers proposed a complex technology for analyzing a variety of information to assess the mood in society. An analysis of the behavior of residents was used to develop recommendations for shaping the current information policy of the authorities in the crisis conditions of the pandemic.
The study by Chinese scientists aimed to assess public attention, risk perception, emotions, and behavioral responses to the COVID-19 outbreak in real time based on social media surveillance data. To this end, the authors collected and processed data from the most popular Chinese online platforms: the Sina Weibo social network, the Baidu search engine and the Ali e-commerce marketplace, for the period from December 1, 2019 to February 15, 2020. Weibo message graphs and Baidu search data were used to create indexes of public attention. Public intent and actual adoption of recommended protective measures, as well as panic buying driven by rumors and misinformation, were measured by the Baidu and Ali indices. Qualitative messages on Weibo were analyzed with a word-count-based linguistic text analysis program. Public opinion and emotional reactions were assessed with respect to both the available reports of epidemiological events and statements by government bodies.
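The word-count approach to message analysis mentioned above can be sketched in a few lines of Python. This is an illustration only, not the program used by the authors: the lexicon `LEXICON`, the function `emotion_shares` and the sample posts are invented, and English words are used for simplicity, whereas real Weibo messages would first require Chinese word segmentation.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; real tools use large validated dictionaries.
LEXICON = {
    "negative": {"fear", "panic", "worried", "angry"},
    "positive": {"hope", "thanks", "safe", "confident"},
}

def emotion_shares(posts):
    """Return the share of all words that fall into each emotion category."""
    counts = Counter()
    total_words = 0
    for post in posts:
        for word in re.findall(r"[a-z']+", post.lower()):
            total_words += 1
            for category, words in LEXICON.items():
                if word in words:
                    counts[category] += 1
    return {
        category: (counts[category] / total_words if total_words else 0.0)
        for category in LEXICON
    }

posts = ["Panic and fear in the city", "We hope to stay safe"]
shares = emotion_shares(posts)
print(shares)
```

Computing such shares separately for each day gives a time series of public emotion that can be compared with the dates of epidemiological events and government announcements.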
Research has identified missed opportunities for early response to the COVID-19 outbreak. Negative public emotions were caused by untimely publication of objective information. Because of this, more energetic intervention by government agencies was subsequently required to eliminate panic. There have been rumors and misinformation about remedies and treatments that have led to panic buying during the outbreak. Subsequently, the public reacted quickly to government announcements and adopted the recommended behaviors in accordance with the published guidelines. Thus, it is shown that in times of crisis, timely detection and clarification of rumors effectively reduces irrational behavior in society.
The authors conclude that competent real-time monitoring of information in social networks can provide both a prompt assessment of the public’s reaction to the measures taken to combat the epidemic and the quality of risk communication, and the timely detection of rumors. This knowledge allows you to purposefully manage the behavior of the population and reduce the negative consequences.
Therefore, this kind of monitoring should also be included in epidemic preparedness and response systems.
The above examples show the possibilities of using specialized algorithms for processing current information flows to analyze the situation and respond in a timely manner to its change.
Modeling of epidemic processes
The second area of work to combat the pandemic is modeling the situation and forecasting its development under various response scenarios.
The results of forecasting the spread of the pandemic in various countries based on the analysis of incoming data on the growth rate of the incidence are presented by B.M. Ndiaye et al. (2020). The authors described the results of using machine-learning tools to analyze the coronavirus pandemic on a local and global scale. The well-known standard SIR model was adopted as the basic model for the development of the epidemic. Based on open data on infection cases posted by that time on the COVID-19 Data Hub network resource, the authors estimated the key parameters of the model using two types of neural network. The obtained characteristics were used to make a forecast regarding the inflection points and the possible duration of the pandemic in countries such as China, Italy, Iran and Senegal, as well as for the world as a whole. The authors of the article emphasized that, unfortunately, the data available at that time were insufficient for the correct training of neural networks. It seems that such an approach would be more adequate if data on morbidity were considered in combination with data on the density of economic ties and population mobility in a global dimension.
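For reference, the standard SIR model divides the population into susceptible (S), infected (I) and recovered (R) groups, with dS/dt = -βSI/N, dI/dt = βSI/N - γI and dR/dt = γI. The following Python fragment is a minimal sketch of this model with simple Euler integration. The function name `simulate_sir` and the parameter values are illustrative assumptions and are not taken from the cited work, where the parameters were estimated from data with neural networks.

```python
def simulate_sir(population, infected0, beta, gamma, days, dt=0.1):
    """Integrate the SIR equations with Euler steps.

    beta  - transmission rate per day (assumed value, not fitted)
    gamma - recovery rate per day (assumed value, not fitted)
    Returns a list of (S, I, R) tuples, one per day, starting at day 0.
    """
    s = float(population - infected0)
    i = float(infected0)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history

# Illustrative run: one million people, ten initial cases, beta/gamma = 3.
history = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=200)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print(peak_day, round(history[peak_day][1]))
```

The day on which the infected curve reaches its maximum plays the role of the turning point of the epidemic in such a forecast; with parameters fitted to real data, the same calculation gives country-specific estimates.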
Mathematical modeling of the spread of infection was used not only to predict the extent of the development of the disease, but also to assess the effectiveness of anti-epidemic measures. Thus, the work of Italian researchers G.Giordano et al (2020) presents the results of modeling the development of the epidemic with an assessment of the impact of various scenario measures of sanitary distancing. Taking into account the features of the spread of COVID-19 infection established at the initial stage, the authors, based on SIR modeling approaches, proposed an extended model called SIDARTHE, in which, in addition to the traditional categories, the registration of detected and undetected cases of infection was introduced, and differences in the severity of the disease are taken into account, requiring the placement of some patients in intensive care units. Re-infection in the model is not taken into account due to the extremely small number of detected cases. It is shown that in the long run the model is not sensitive to the initial data. To calibrate the model parameters, the national data on the evolution of the epidemic obtained by the time of its application were used. Based on a model calibrated against real data, the authors considered possible long-term scenarios demonstrating the impact of various countermeasures to contain the spread of infection. Based on the model, studies were carried out and predictive graphs of the population dynamics of the spread of morbidity under various restrictive requirements for maintaining social distance were presented, which later served as a rationale for developing management decisions.
Results of studies on a similar topic, but based on other modeling principles, are presented by Silva P.C.L. et al. (2020).
The authors discuss the principles of building a multi-agent model of society and the results of modeling the processes of the spread of the disease, as well as their impact on economic performance under various scenarios of response from government agencies.
In their work, the authors use the agent-based modeling paradigm not only to assess purely epidemiological processes, but also the economic consequences of the COVID-19 epidemic. The following categories are represented in the model as agents: people, households (families), manufacturing or service enterprises, government bodies, the healthcare system with its medical institutions. The paper considers several different scenarios of a possible state response: from non-intervention to very severe restrictions on the movement of people and the functioning of industries. These scenarios assessed both the rate of spread of the epidemic and the economic impact.
Despite the fact that the interaction of various categories of agents is considered rather conditionally, especially in its economic aspects, such modeling allows one to obtain a qualitative picture of the effectiveness of certain response measures for comparing them with each other. The results obtained allow us to develop appropriate recommendations for the authorities.
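The agent-based idea can be illustrated with a deliberately simplified Python sketch. It is not the model from the cited work: only people are represented as agents, households, enterprises, government and the economy are omitted, and the function `run_abm` and all parameter values are assumptions chosen for illustration. A restriction scenario is modeled simply as a smaller number of daily contacts.

```python
import random

SUSCEPTIBLE, RECOVERED = -2, -1  # positive values = infectious days left

def run_abm(n_agents, contacts_per_day, p_transmit, infectious_days,
            days, n_seed=5, seed=0):
    """Simulate random daily contacts and return the total number of
    agents that were ever infected."""
    rng = random.Random(seed)
    state = [SUSCEPTIBLE] * n_agents
    for agent in rng.sample(range(n_agents), n_seed):
        state[agent] = infectious_days
    ever_infected = n_seed
    for _ in range(days):
        infectious = [a for a, s in enumerate(state) if s > 0]
        newly_infected = set()
        for a in infectious:
            for _ in range(contacts_per_day):
                b = rng.randrange(n_agents)  # random person met today
                if state[b] == SUSCEPTIBLE and rng.random() < p_transmit:
                    newly_infected.add(b)
        for a in infectious:
            state[a] -= 1
            if state[a] == 0:
                state[a] = RECOVERED
        for b in newly_infected:  # become infectious from the next day
            state[b] = infectious_days
        ever_infected += len(newly_infected)
    return ever_infected

# Two illustrative scenarios: no intervention and restricted mobility.
no_intervention = run_abm(500, 10, 0.05, 10, 60)
restricted = run_abm(500, 1, 0.05, 10, 60)
print(no_intervention, restricted)
```

Even such a conditional model allows scenarios to be compared with each other qualitatively, which is the main use of agent-based simulation noted above.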
Thus, mathematical modeling of epidemics is of interest for developing and evaluating the effectiveness of various scenarios for government response in the event of epidemics. Among the recommended measures, in addition to purely medical ones, various measures are used to limit the mobility of citizens in order to curb the spread of infection. However, control over the actual implementation of such measures requires additional efforts on the part of the state. Various technological solutions have been proposed in different countries for these purposes.
Kashkin S.Yu. et al. (2020) describe some interesting examples of the practical application of such technologies based on the use of artificial intelligence.
For example, in China, a system for remotely detecting sick people was quickly implemented. For this purpose, law enforcement officers were equipped with specially designed helmets that can detect and flag people with a high body temperature. In addition, for the period of quarantine, the Health Check system was introduced, which operates on the popular Alipay and WeChat platforms and automatically generated special QR codes. Depending on the person’s status, the color of the pass determined the degree of freedom of movement: green (freedom of movement), orange (seven days of quarantine) or red (14 days of quarantine).
In Israel, a phone application was launched, the purpose of which was to inform mobile device users about the danger of contact with potential carriers of a viral infection.
In South Korea, a comprehensive system for total monitoring of infected people and of citizens who had been in contact with them was deployed. With the help of artificial intelligence, the GPS coordinates of the citizens under surveillance, their bank card transactions, their use of transport and data from video surveillance cameras were analyzed continuously. If a person violated the prescribed regime, the information was immediately received by a specially created monitoring center. Thanks to the timely implementation of this system, South Korea became one of the few countries that did without general self-isolation.
Singapore implemented the TraceTogether mobile application, created to combat the spread of infection, which also made it possible to abandon the mass isolation of citizens. A distinctive feature of this approach is the data collection method, which is focused on identifying and fixing Bluetooth contacts between mobile devices. Due to this, meetings of people were tracked only in the zone of danger of infection, which made it possible not to conduct a total monitoring of all the movements of each person. In the event that a person fell ill, the system automatically alerted everyone with whom he had contact in the last 14 days to take preventive measures.
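The principle of recording only proximity contacts and alerting those met within the last 14 days can be sketched as follows. This Python fragment is an illustration under stated assumptions and does not reproduce the actual TraceTogether protocol: the class `ContactLog`, its methods and the device identifiers are hypothetical, and a real system would also need anonymized identifiers and secure storage.

```python
from collections import defaultdict
from datetime import date, timedelta

class ContactLog:
    """Stores who met whom and on which day; no locations are kept."""

    def __init__(self, window_days=14):
        self.window = timedelta(days=window_days)
        self._contacts = defaultdict(list)  # device id -> [(other id, day)]

    def record(self, device_a, device_b, day):
        """Register a proximity contact between two devices."""
        self._contacts[device_a].append((device_b, day))
        self._contacts[device_b].append((device_a, day))

    def to_notify(self, sick_device, reported_on):
        """Return devices that met the sick person within the window."""
        cutoff = reported_on - self.window
        return {
            other
            for other, day in self._contacts.get(sick_device, [])
            if cutoff <= day <= reported_on
        }

log = ContactLog()
log.record("A", "B", date(2020, 3, 1))
log.record("A", "C", date(2020, 3, 10))
log.record("A", "D", date(2020, 2, 1))  # too old to matter
alerts = log.to_notify("A", reported_on=date(2020, 3, 12))
print(sorted(alerts))
```

Because only pairs of identifiers and dates are stored, the sketch reflects the design choice described above: contacts within the zone of possible infection are tracked without recording every movement of each person.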
Italy also launched an application that helped track the route of a person infected with the virus and warn people who had come into contact with that person. An important aspect of this system was the fulfillment of confidentiality requirements: when alerts are distributed, subscribers’ data is not disclosed, and only the movements of their smartphones are recorded.
Similar control measures using ICT (information and communication technology) tools have been implemented in Moscow and the surrounding regions. One of them was the control over the movements of citizens using video surveillance systems during the period of strict self-isolation, including the analysis of the movement of personal cars. Movements were allowed only for special categories of workers who had special passes. Violators automatically received a fine. The second measure was the use of «social monitoring» technology. For this, the location of patients with confirmed coronavirus who were prescribed home treatment was determined by the geolocation of their mobile phone.
The use of artificial intelligence based on various means of communication and telemonitoring makes it possible to control the epidemiological situation, predict the evolution of disease outbreaks, and protect the population, even if by restricting freedom of movement.
Some more examples are given in the review CAHAI (2020).
When developing systems using AI based on neural networks, great attention should be paid to the quality of the initial data. It has already been noted above that B.M. Ndiaye et al. (2020) questioned their own results due to the extremely small amount of data available at that time for training the neural networks used. Some additional aspects of the quality of publicly available data are addressed by Danilova I. (2020), although from a slightly different angle. She was interested in how society assesses the effectiveness of the efforts of state bodies in the fight against the pandemic by comparing factual data from different countries. Her attention is focused on the problem of the comparability of these data.
The author notes that information on the number of cases and deaths from COVID-19 is now collected by countries online and becomes quickly available to everyone. Open data about the epidemic is essential for controlling the disease and informing the public. They are used both by researchers and governments to analyze the epidemic and model its future development, as well as by the media, bloggers and other opinion makers. Based on the published quantitative data on illnesses and deaths from COVID-19, many different conclusions are drawn, which are quickly disseminated on the Internet through social media platforms. But, despite the availability and openness of these data themselves, which exist today, much less is known about the criteria by which these data are collected and what their limitations are. The analysis of available information on how data on the number of cases and deaths in different countries is formed, given in the mentioned article, shows that very often the data are not comparable with each other. Different countries use different criteria both for testing for the virus and identifying cases, and for determining deaths from COVID-19. Moreover, the criteria themselves may also change over time as the situation is reassessed. The results may significantly depend on the quality of the tests used, on the number of tests carried out, on the rules for registering certain parameters. Thus, there were cases when all deaths were recorded in the number of deaths from COVID-19 only based on the considerations that additional funding was allocated to medical clinics for working with such patients.
Misunderstanding of the limitations of the data used and of the degree of their comparability can lead to false conclusions and interpretations. Researchers usually use such data to calibrate models, including models based on training neural networks, which in itself can cause systematic errors and lead to erroneous management decisions. In addition, comparing countries by COVID-19 incidence and mortality based on data collected according to different criteria can have far-reaching political implications, namely in terms of how people, based on this information, evaluate the efforts made by the governments of their countries to reduce morbidity and mortality. Therefore, when using any data from different sources, there should be a deep analysis of the methods and conditions for obtaining them, especially if these sources reflect data obtained in different countries.
The analysis of scientific publications shows a wide variety of possible practical applications of modern technologies related to the processing of big data by artificial intelligence methods in a broad interpretation of these concepts. These include the use of data on the dynamics of certain user activity on the Internet for early detection of the emergence and course of epidemic processes, sentiment analysis in online communities to adjust the information policy of government authorities, models for predicting the development of a pandemic using neural networks based on the dynamics of messages about new cases of infection and other information on the development of diseases, specialized methods and systems for monitoring the movements of infected persons, as well as warnings about the danger of people potentially in contact with them, and many others. Based on such technologies, complex systems of information technology support for epidemiological services can be created and deployed. Given the accumulated positive experience in the implementation and operation of individual systems in various countries, it can be expected that international cooperation in this area can become the most effective for creating complex systems.
The reported study was funded by RFBR and CNPq, FASIE, DBT, DST, MOST, NSFC, SAMRC according to the research project № 20-51-80002.
References
1. Andy Chun. In a time of coronavirus, China’s investment in AI is paying off in a big way. SCMP. Mar 18 2020. https://www.scmp.com/comment/opinion/article/3075553/time-coronavirus-chinas-investment-ai-paying-big-way?fbclid=IwAR3JdxPGOGaZ641HBCA-t2aasnXM9VgOSSZMYCtSfb2eGZDinOOpSWyJeVo.
2. CAHAI (2020). AI and control of Covid-19 coronavirus. Overview carried out by the Ad hoc Committee on Artificial Intelligence secretariat. https://www.coe.int/en/web/artificial-intelligence/ai-and-control-of-covid-19-coronavirus.
3. COVID-19 Data Hub, available on https://www.tableau.com/covid-19-coronavirus-data-resources.
4. Cory Stieg. How this Canadian start-up spotted coronavirus before everyone else knew about it. Health and Wellness, Mar 3 2020. https://www.cnbc.com/2020/03/03/bluedot-used-artificial-intelligence-to-predict-coronavirus-spread.html.
5. Danilova I. (2020). Morbidity and mortality from COVID-19. The problem of data comparability. Demographic Review, 7(1), 6-26. https://doi.org/10.17323/demreview.v7i1.10818.
6. Eisenstein M. (2018) Infection forecasts powered by big data. Nature. 2018 Mar 8;555(7695):S2-S4. doi: 10.1038/d41586-018-02473-5. PMID: 29517020.
7. John McCormick. Online Map Tracks Coronavirus Outbreak in Real Time. WSJ. March 5, 2020. https://www.wsj.com/articles/online-map-tracks-coronavirus-outbreak-in-real-time-11583354911
8. Giordano G., Blanchini F., Bruno R., Colaneri P., Di Filippo A., Di Matteo A., Colaneri M. et al. (2020), “A SIDARTHE model of COVID-19 epidemic in Italy,” arXiv preprint arXiv:2003.09861.
9. Kashkin S.Yu., Tishchenko S.A., Altukhov A.V. (2020) Legal Regulation of the Artificial Intelligence Application for Combatting the Spread of COVID-19: Problems and Prospects based on World Experience. Lex Russica. 2020;(7):105-114. (In Russ.) https://doi.org/10.17803/1729-5920.2020.164.7.105-114.
10. Ndiaye B.M., Tendeng L., and Seck D. (2020) Analysis of the COVID-19 pandemic by SIR model and machine learning technics for forecasting / arXiv:2004.01574v1 [q-bio.PE] 3 Apr 2020.
11. Silva P.C.L., Batista P.V.C., Lima H.S., Alves M.A., Guimarães F.G., Silva R.C.P. (2020) COVID-ABS: An agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions // Chaos, Solitons & Fractals. 2020. P. 37. E-print: arXiv:2006.10532 [cs.AI]. URL: https://doi.org/10.1016/j.chaos.2020.110088.
12. Hou Z., Du F., Jiang H., Zhou X., and Lin L. (2020) “Assessment of public attention, risk perception, emotional and behavioural responses to the COVID-19 outbreak: social media surveillance in China” (3/6/2020).