Abstract
During the global pandemic, distance learning occupies an increasingly important place in teaching and learning because of its great potential. This paper proposes a web-based app built around a proposed 8-layered lightweight, customized convolutional neural network (LCCNN) for COVID-19 recognition. Five-channel data augmentation is proposed and used to help the model avoid overfitting. The LCCNN achieves an accuracy of 91.78%, which is higher than that of eight state-of-the-art methods. The results show that this web-based app provides a valuable diagnostic perspective on patients and is an excellent way to facilitate medical education. Our LCCNN model is explainable for both radiologists and distance education users: heat maps are generated in which the lesions are clearly spotted. The LCCNN can detect from CT images the presence of lesions caused by COVID-19. The web-based app has a clear and simple interface that is easy to use. With the help of this app, teachers can provide distance education and clearly guide students to understand the damage caused by COVID-19, which can increase interaction with students and stimulate their interest in learning.
1 Introduction
At the beginning of the COVID-19 outbreak in late 2019 [1,2,3], most people did not realize that it would become a global pandemic of such magnitude [4, 5]. The pandemic has spread worldwide and poses an enormous threat to people's lives, with the number of confirmed cases still rising rapidly in all regions [6, 7]. Viral variants such as delta [8, 9] and omicron [10, 11] have repeatedly put the pandemic situation under strain. Fortunately, with the joint efforts of governments [12, 13], medical personnel [14, 15], and citizens [16, 17], the pandemic has been largely contained in many areas, and work and production are resuming in an orderly manner [18, 19].
COVID-19 is a lung disease caused by SARS-CoV-2 [20]. The virus is transmitted mainly by airborne droplets and contact [21] but can also be transmitted through objects or other surfaces [22]. Symptoms of infection include fever, cough, malaise, and breathlessness [23]. Some patients may have more severe symptoms [24], such as pneumonia [25], lung infections [26], and loss of taste or smell [27, 28]. Severe infections can also cause septic shock [29], sudden drops in blood pressure, a lack of oxygen to the body's organs, and death [30]. People over 60 years of age with a smoking history and high blood pressure are relatively more likely to be infected [31, 32].
As a result of the pandemic, educational arrangements and requirements [33, 34] have been adjusted in schools around the world. Many courses taught offline have been moved to online distance education. For teachers [35, 36], transitioning from traditional face-to-face teaching to online distance education is quite challenging [37]. Faced with unfamiliar teaching methods, teachers need to constantly explore and improve their teaching approaches rather than simply copying their original teaching solutions. If teachers do not have adequate IT knowledge [38], completing online teaching may not be easy. For students [39, 40], distance online education means learning alone, which lacks the engagement of classroom lessons and can easily lead to fatigue. Distance education is limited because it cannot be taught face-to-face, so teachers must use rich and varied online resources to fill some teaching gaps [41]. If the teacher's instructional design lacks interactivity, the classroom becomes less engaging for the students and the overall outcome poorer. This requires teachers to make more use of network resources, including images, audio, video, and supplementary teaching platforms, to optimize the design of new teaching programs and improve teaching effectiveness in distance education.
Distance education often uses computer multimedia technology [42], computer network technology [43], and communication technology [44]. Distance education is a cross-regional mode of teaching and learning: there is no requirement on the location of students or teachers [45, 46]. Both the way information is transmitted and the place of learning are flexible. It allows students to learn without the hindrance of time or space, thus allowing for personalized learning. The advantage of distance learning over face-to-face education is that it offers students in poorer areas more opportunities to learn at a low cost.
Computer multimedia technology and network technology can provide teachers and students with a wide range of high-quality teaching resources. Software, computers, mobile phones, and other hardware can support distance learning. For example, abstract theories can be visualized by drawing images with the powerful computing capabilities of computers. In communication between people, computers enhance resource sharing, the collection and organization of information, and the creation of databases. Database resources in text, audio, and video are increasingly used in education as new computer resources.
Currently, many studies are attempting to use computer network resources to help with distance education. Severino et al. (2021) [47] developed an online platform to help students up to second grade with basic learning. In the learning plan, multiple lessons were segmented according to the learning abilities of students of different ages, allowing for an improved user experience. Lowry et al. (2022) [48] developed a high-fidelity simulation platform. Combined with instructional videos, it allows students to collaborate remotely to simulate laparoscopic surgery. The students made corrections and practiced repeatedly based on feedback from the platform. The difference in performance between the students instructed by a teacher and those who practiced on the platform was small, and there was a significant improvement in the students' surgical performance after practicing with the platform. Zheng et al. (2022) [49] designed a simulation teaching resource for non-electrical students with the theme of safe electricity use. Students can practice the theory they have learnt by conducting realistic and otherwise dangerous experiments on the platform. Hopefully, this will improve on and compensate for the shortcomings of traditional teaching resources and teaching models. Lin (2022) [50] combined a variety of theoretical knowledge that students need to master to build a stable simulation platform. Students used this platform to simulate and practice foreign trade transactions. The technology in this learning platform can be changed according to the needs of teaching and learning to ensure that the content does not become outdated. Computer resources are also used in many other areas, including transportation systems [51], emotion recognition [52], and action recognition [53].
We reviewed some advanced COVID-19 detection methods. Wang et al. (2020) [54] proposed a weakly supervised deep learning framework, using a pre-trained UNet for segmentation and feeding 3D images into a 3D deep neural network to obtain DeCovNet. Wu (2020) [55] combined wavelet Renyi entropy [56], a feedforward neural network, and the 3SBBO algorithm to obtain the better-performing WRE + 3SBBO. The framework in [57] is similar in idea to [55] and consists of three stages: El-kenawy et al. (2020) [57] proposed FSVC, combining a CNN with the guided whale optimization algorithm [58]. The dataset used in [59] was chest X-ray images; the proposed COVID-Net method offers portability, availability, and rapid triaging. The 6 L-CNN [60] was a six-layer convolutional neural network that combined max pooling and batch normalization, and the Adam algorithm [61] improved its detection of COVID-19 patients. WE-CSO, used in [60, 63], was based on wavelet entropy and cat swarm optimization [62]. The CNN in DLM [64] was trained to reach a higher accuracy rate. The GLCM-PSO method proposed in [65] combined the grey-level co-occurrence matrix and PSO [66].
The approaches above still share some common disadvantages. These models are not explainable and have relatively low accuracy in recognizing whether the input image contains lesion regions. Although they can assist in diagnosing COVID-19, they impose demanding requirements on the user, who needs to have some medical knowledge. Also, their software interfaces are not easy to work with.
Our proposed lightweight, customized CNN (LCCNN)-based model and distance education app for COVID-19 recognition can solve the above problems. Our contributions are as follows:
- We propose an 8-layer lightweight, customized convolutional neural network.
- Five-channel data augmentation is proposed and used to help the model avoid overfitting.
- Our LCCNN model performs better than eight state-of-the-art models.
- Our LCCNN model costs fewer resources than six transfer learning models.
- Our LCCNN model is explainable for both radiologists and distance education users.
- Our LCCNN model is integrated into a distance education-based web app.
The rest of the paper is structured as follows. Section 2 describes the dataset we used in the course of our experiments. Section 3 describes the proposed 8-layer lightweight, customized convolutional neural network model (LCCNN). Section 4 discusses the results of the experiments conducted using our proposed model and the validation and compares the results with other existing state-of-the-art approaches. In Section 5, we conclude the research of this paper.
2 Dataset
The dataset we used is from the paper [67]. A local hospital generated the dataset from 142 COVID-19 patients (95 males and 47 females) and 142 healthy people (88 males and 54 females). After CT images were taken of both groups of participants, the images were transferred to the medical image PACS. Two experienced physicians selected clear and appropriate images for the dataset. The total dataset consists of 640 images. Figure 1 shows two samples from the dataset used in this study. The resolution of all images is 1024 × 1024.
3 Methodology
Table 1 gives the list of abbreviations used in this study. Convolutional neural networks (CNNs) are designed based on the neural system that transmits signals in the human body: each neuron responds only to a part of the surrounding units within its receptive range. CNNs are feed-forward neural networks and have outstanding performance in large-scale image processing.
In the CNN, the signals enter the input layer, are treated with linear combinations and activation functions, and then flow to the next layer. The signals are processed through each hidden layer and output to the output layer. Such signal delivery is a forward propagation process. The equation (Eq. 1) for forward propagation is as follows:

\({a}_{j}^{l}=\sigma \left(\sum_{k=1}^{n}{w}_{jk}^{l}{a}_{k}^{l-1}+{b}_{j}^{l}\right)\)    (1)
where \(n\) represents the number of feature maps in the last layer, i.e., \(\left(l-1\right)\)th layer. \({a}_{j}^{l}\) is the activation value of the \(j\)th neuron in the \(l\)th layer, which is also the output of the activation function. \({w}_{jk}^{l}\) denotes the weight of the \(k\)th neuron in the \(\left(l-1\right)\)th layer to the \(j\)th neuron in the \(l\)th layer. \({b}_{j}^{l}\) is the bias of the \(j\)th neuron in the \(l\)th layer. \(\sigma\) represents the non-linear activation function.
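For concreteness, below is a minimal NumPy sketch of Eq. 1 for one fully connected layer; the choice of ReLU for \(\sigma\) and the layer sizes are assumptions for illustration only.

```python
# Forward propagation through one layer (Eq. 1): a^l = sigma(W^l a^{l-1} + b^l).
import numpy as np

def forward_layer(a_prev, W, b):
    """a_prev: activations of layer l-1, shape (n,); W: (n_l, n); b: (n_l,)."""
    z = W @ a_prev + b          # weighted input z^l
    return np.maximum(z, 0.0)   # a^l = sigma(z^l); here sigma is assumed to be ReLU

a0 = np.random.rand(8)                       # input activations (hypothetical sizes)
W1, b1 = np.random.randn(4, 8), np.zeros(4)  # weights and biases of layer 1
print(forward_layer(a0, W1, b1).shape)       # (4,)
```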
Backpropagation updates the weights in the direction from the output layer to the input layer. It is used to train the parameters of feedforward neural networks, iterating continuously to optimize the model parameters based on the calculated error. In training the neural network, forward and backward propagation rely on each other. The usual cost function is the quadratic cost function. On this basis, assuming that the _target data is one-dimensional, the equation (Eq. 2) [68] for calculating the cost \(C\) between the output value and the actual value is:

\(C=\frac{1}{m}\sum_{i=1}^{m}{C}_{i}=\frac{1}{2m}\sum_{i=1}^{m}{\left({y}_{oi}-{y}_{i}\right)}^{2}\)    (2)
where \(m\) denotes the number of samples, \({y}_{oi}\) denotes the output of the \(i\)th sample, \({y}_{i}\) denotes the _target data of the \(i\)th sample, and \({C}_{i}\) denotes the squared error of the output data and the _target data of the \(i\)th sample.
The backpropagation process can be expressed by the equation (Eq. 3) [68]:

\({\delta }_{i}^{L}=\left({a}_{i}^{L}-{y}_{i}\right)\odot {f}^{\prime}\left({z}_{i}^{L}\right),\qquad {\delta }_{i}^{l}={\left({W}^{l+1}\right)}^{T}{\delta }_{i}^{l+1}\odot {f}^{\prime}\left({z}_{i}^{l}\right)\)    (3)
where \({z}_{i}^{l}\) denotes the input data at the \(l\)th layer of the \(i\)th sample, \({a}_{i}^{l}\) denotes the output data at the \(l\)th layer of the \(i\)th sample, and \({W}^{l}\) denotes the weight of the \(l\)th layer. \(f\) denotes the activation function. \(L\) denotes the output layer, and \({\delta }_{i}^{l}\) denotes the error at the \(l\)th layer of the \(i\)th sample. The superscript \(T\) represents the matrix transpose, and \(\odot\) denotes element-wise multiplication.
3.1 Convolutional layers
The function of the convolutional layer is to extract features from the input data. The shallower convolutional layers extract local information, while deeper layers capture global information. A convolutional layer contains several convolutional kernels to create different feature maps. In a convolution layer, a kernel, known as a filter, slides over the input image according to the stride size. A convolutional kernel is a small matrix. During the convolution operation on an image, each value in the kernel is multiplied by the corresponding pixel value covered by the kernel. These multiplied values are then added together to obtain the value of the _target pixel in the feature map. Figure 2 shows a schematic diagram of a single convolution operation. After one convolution, a new feature map of the image is generated.
Single-layer convolution refers to a convolution operation using one convolution kernel. If the input image has three color channels (red, green, and blue), the convolution kernels must also have three channels. In three-channel convolution, the convolution kernels have a length, a width, and a number of channels. The three-channel convolution performs convolution on each of the three channels of the image, resulting in the final output feature map. Figure 3 shows the process of a three-channel input image undergoing convolution with the stride set to 1. After multiple convolutions, the output feature map can contain much information about the image, which can be used for image recognition tasks.
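The following minimal NumPy sketch illustrates the single-channel convolution of Fig. 2 (stride 1, no padding); the image and kernel values are arbitrary examples, not taken from the paper.

```python
# Single-channel 2D convolution: multiply each kernel value by the pixel it
# covers, then sum, to produce one value of the output feature map.
import numpy as np

def conv2d(image, kernel, stride=1):
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            patch = image[r*stride:r*stride+kh, c*stride:c*stride+kw]
            out[r, c] = np.sum(patch * kernel)  # multiply-accumulate
    return out

img = np.arange(25, dtype=float).reshape(5, 5)  # toy 5 x 5 image
k = np.array([[1., 0.], [0., -1.]])             # toy 2 x 2 kernel
print(conv2d(img, k).shape)  # (4, 4): the output shrinks without padding
```

A multi-channel convolution would additionally sum the per-channel results into a single output map, as described for Fig. 3.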
3.2 Hyperparameters in convolutional layers
The hyperparameters in the convolution layer include the convolution kernel size, the stride size, and the padding scheme. The distance that the center of the convolution kernel moves between two adjacent convolution operations is the stride size. Researchers can control the granularity of feature extraction by adjusting the stride size. These hyperparameters determine the size of the output feature map of the convolution layer. The convolution kernel size can be specified as any value smaller than the size of the input image. The size of the convolution kernel determines the size of the receptive field, which represents the range of the convolution kernel's effect. A larger convolution kernel has a wider receptive field and can capture more complex features, while a smaller convolution kernel has a smaller receptive field and can only capture simple features. The equation (Eq. 4) for calculating the receptive field size is as follows:

\({R}_{i}=\left({R}_{i-1}-1\right)\times t+{K}_{{size}_{i}}\)    (4)
where \({R}_{i}\) denotes the receptive field size of the \(i\)th layer, \(i\) denotes the index of the current feature layer, \(t\) is the stride size of the convolution, and \({K}_{{size}_{i}}\) is the size of the convolution kernel in this layer.
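As an illustration, the short Python sketch below applies the recurrence of Eq. 4 from the deepest layer back toward the input; the kernel sizes and strides shown are arbitrary examples, not the LCCNN's exact settings.

```python
# Receptive field of the final feature map on the input image, obtained by
# applying R = (R - 1) * t + K layer by layer from deep to shallow (Eq. 4).
def receptive_field(kernel_sizes, strides):
    R = 1  # one pixel in the final feature map
    for k, t in reversed(list(zip(kernel_sizes, strides))):
        R = (R - 1) * t + k
    return R

# Hypothetical 3-layer stack: 3x3 kernels, first layer with stride 2.
print(receptive_field(kernel_sizes=[3, 3, 3], strides=[2, 1, 1]))  # 11
```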
If the stride size is set small, adjacent receptive fields overlap. If the stride size is set larger, adjacent receptive fields no longer overlap, but some parts of the image may not be covered, so information from the original image is missed. Typically, the convolution kernel is a square with an odd side length and is located by its center.
As demonstrated in Fig. 2, the shape of its output is reduced after the convolution operation is performed on the image. Scholars apply padding operations to preserve the image’s size while still undergoing convolution. Padding involves adding additional rows or columns, usually filled with zeros, around the edges of the input. This can be visualized in Fig. 4, where the image’s shape remains constant after the convolution operation. By padding the image and then convolving it to extract features, we can focus on the features at the edges, making them as important as the features in the middle.
In addition to the common single-channel and multi-channel convolutions, researchers have also attempted convolution on multi-dimensional data. 3D CNN models use 3D images as input. The structure of a 3D CNN is similar to that of a standard 2D CNN model, but the convolution layer uses 3D kernels for filtering. In a 2D CNN, the convolution kernel moves in two directions, while in a 3D convolution, it moves in three directions. 3D convolution requires more computational power and more memory for storing parameters and feature maps than a 2D CNN.
3.3 Pooling layers
Pooling layers are usually located in the middle of successive convolutional layers. Unlike convolutional layers, pooling layers do not have parameters to learn. Common pooling operations are average pooling, max pooling, and random pooling.
Average pooling aims to keep more of the image's background information. Max pooling reduces the shift in the estimated mean caused by parameter errors in the convolution layer and preserves more of the texture information.
Using max pooling as the example, we first set up a sliding window and take the largest value from the image region covered by the window as the value at the corresponding position of the output feature map. The window then slides by the set stride, moving left to right and top to bottom. The example in Fig. 5 shows a 2 × 2 window with a stride of 2. The initial window is in the blue area of the diagram; the maximum value of 28 is taken from the window and passed to the following feature map.
Average pooling mainly reduces the data size by taking the average value in each neighborhood, thus reducing computational costs. Max pooling, on the other hand, reduces the data size by taking the maximum value in each neighborhood, thus preserving important features. This study chooses max pooling. The max-pooling operation passes only the maximum value in each rectangular region, the part with the strongest response, to the next layer. Using max pooling means that features can be identified no matter where they are located in the image. Because input images contain objects in varying positions, this invariance helps obtain a model with good performance.
When pooling is performed, the output results are less affected, even if there are minor deviations in the input data. The number of channels of input and output data will not change after the pooling. The number of parameters that need to be computed by the model is reduced, redundant information is cut, and the network’s complexity is decreased.
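Below is a minimal NumPy sketch of the 2 × 2, stride-2 max pooling described above; the example feature map is constructed so that the first window's maximum is 28, matching the Fig. 5 walkthrough.

```python
# 2 x 2 max pooling with stride 2: keep only the strongest response per window.
import numpy as np

def max_pool2d(x, size=2, stride=2):
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for r in range(oh):
        for c in range(ow):
            window = x[r*stride:r*stride+size, c*stride:c*stride+size]
            out[r, c] = window.max()
    return out

fmap = np.array([[12, 28,  7,  3],
                 [ 5,  9, 11,  2],
                 [ 1,  4, 16,  8],
                 [ 6,  0, 13, 10]], dtype=float)
print(max_pool2d(fmap))  # [[28. 11.] [ 6. 16.]] -- halves each spatial dimension
```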
3.4 ReLU activation function
The activation function, an improvement proposed to solve linearly inseparable problems, maps a neuron's input to its output. In real problems, the data distribution is overwhelmingly non-linear, and it is difficult to solve them using purely linear neural network computations. It is, therefore, necessary to incorporate non-linear activation functions to enhance the learning capability of the network and make the neural network applicable to a wider range of models.
Various activation functions are available, each with different properties suited to different situations. The Sigmoid activation function was the first activation function used in neural networks and is often used in binary classification problems. It produces clear predictions but is prone to vanishing gradients. The Sigmoid activation function is not zero-centered, and convergence is slow when there are too many layers, making deep networks hard to train.
The Tanh function speeds up convergence compared with the Sigmoid activation function but requires more computation for both forward and backward propagation. The ReLU activation function is prevalent in deep learning. It does not suffer from the vanishing gradient problem, converges quickly, and is computationally fast. Its generality allows it to be used in many studies with a wide range of uses. The equations (Eq. 5) of the Sigmoid, Tanh, and ReLU activation functions, in turn, are as follows, and the function plots are shown in Fig. 6.

\(Sigmoid\left(x\right)=\frac{1}{1+{e}^{-x}},\qquad Tanh\left(x\right)=\frac{{e}^{x}-{e}^{-x}}{{e}^{x}+{e}^{-x}},\qquad ReLU\left(x\right)=\mathrm{max}\left(0,x\right)\)    (5)
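For reference, the three activation functions in Eq. 5 can be written in a few lines of NumPy:

```python
# The three activation functions of Eq. 5.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    return np.tanh(x)          # equals (e^x - e^-x) / (e^x + e^-x)

def relu(x):
    return np.maximum(0.0, x)  # zero for negative inputs, identity otherwise

x = np.array([-2.0, 0.0, 2.0])
print(sigmoid(x), tanh(x), relu(x))
```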
3.5 Proposed LCCNN model
The most important layers in our proposed model are the six convolutional layers and the two fully connected layers. Our input image has a single channel, so the input layer size is 256 × 256 × 1. Con1 is a convolutional layer with 16 kernels of size 3 × 3 × 1 and a stride of 2. After a max pooling operation with a window size of 2 × 2, the output size is 64 × 64 × 16. Con2 is a convolutional layer with 32 kernels of size 3 × 3 × 16 and a stride of 1. After a max pooling operation with a window size of 2 × 2, the output size is 32 × 32 × 32. Con3 is a convolutional layer with 64 kernels of size 3 × 3 × 32 and a stride of 1. After a max pooling operation with a window size of 2 × 2, the output size is 16 × 16 × 64.

Con4 is a convolutional layer with 64 kernels of size 3 × 3 × 64 and a stride of 1. After a max pooling operation with a window size of 2 × 2, the output size is 8 × 8 × 64. Con5 is a convolutional layer with 128 kernels of size 3 × 3 × 64 and a stride of 1; the output size is 8 × 8 × 128. Con6 is a convolutional layer with 128 kernels of size 3 × 3 × 128 and a stride of 1; the output size is 8 × 8 × 128. The two fully connected layers have sizes 200 × 8192 and 2 × 200. The details of the model are shown in Table 2.
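A minimal PyTorch sketch of this architecture follows; the 'same' padding, the placement of the ReLU activations, and the omission of the final softmax are assumptions inferred from the stated output sizes, not the authors' exact implementation.

```python
# Sketch of the 8-layer LCCNN: six conv layers (Con1-Con6) plus two FC layers.
import torch
import torch.nn as nn

class LCCNN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        def block(cin, cout, stride=1, pool=True):
            layers = [nn.Conv2d(cin, cout, 3, stride=stride, padding=1),
                      nn.ReLU(inplace=True)]
            if pool:
                layers.append(nn.MaxPool2d(2))  # 2 x 2 window, stride 2
            return layers

        self.features = nn.Sequential(
            *block(1, 16, stride=2),       # Con1: 256 -> 128, pool -> 64 x 64 x 16
            *block(16, 32),                # Con2: pool -> 32 x 32 x 32
            *block(32, 64),                # Con3: pool -> 16 x 16 x 64
            *block(64, 64),                # Con4: pool -> 8 x 8 x 64
            *block(64, 128, pool=False),   # Con5: 8 x 8 x 128
            *block(128, 128, pool=False),  # Con6: 8 x 8 x 128
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                  # 8 * 8 * 128 = 8192
            nn.Linear(8192, 200),          # FC1: 200 x 8192
            nn.ReLU(inplace=True),
            nn.Linear(200, num_classes),   # FC2: 2 x 200
        )

    def forward(self, x):
        return self.classifier(self.features(x))

x = torch.randn(1, 1, 256, 256)  # a single-channel 256 x 256 CT slice
print(LCCNN()(x).shape)          # torch.Size([1, 2])
```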
3.6 Explainability of LCCNN
Common methods used to interpret CNN models are class activation mapping (CAM) and Gradient-weighted class activation mapping (Grad-CAM). Class activation mapping uses different colors to indicate the regions associated with the _target class.
Grad-CAM improves on CAM by deriving the weights from gradients. Grad-CAM is a weighted sum of feature maps, followed by a ReLU operator. It can be calculated as in Ref. [69] (Eqs. 6, 7):

\({\alpha }_{k}^{c}=\frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial {y}^{c}}{\partial {A}_{ij}^{k}}\)    (6)

\({L}_{Grad\text{-}CAM}^{c}=ReLU\left(\sum_{k}{\alpha }_{k}^{c}{A}^{k}\right)\)    (7)
\({\alpha }_{k}^{c}\) denotes the weight obtained after the gradient of the return flow has been globally averaged for a _target class \(c\). \({A}_{ij}^{k}\) represents the \(k\)th channel of feature map \(A\), where the spatial index is (\(i,j\)). \({y}^{c}\) represents the score predicted by the network for class \(c\) before the softmax. \(Z\) is the spatial resolution of the feature map, which equals the width of the feature layer multiplied by its height. \(\partial {y}^{c}/\partial {A}_{ij}^{k}\) represents the gradient via backpropagation.
CAM requires a CNN with a global average pooling architecture, making the method difficult to apply to most CNN architectures. The structure of the CNN does not restrict Grad-CAM. Grad-CAM uses the gradient information flowing into the last convolutional layer of the CNN to measure the importance of each neuron to the decision.
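The sketch below implements Eqs. 6 and 7 with PyTorch hooks, reusing the LCCNN sketch from Section 3.5; hooking the last convolutional layer (Con6) and bilinearly upsampling the map to the input size are our assumptions for illustration.

```python
# Grad-CAM sketch: capture the last conv layer's activations and gradients,
# average the gradients spatially (Eq. 6), weight-sum and ReLU (Eq. 7).
import torch
import torch.nn.functional as F

def grad_cam(model, x, class_idx, _target_layer):
    feats, grads = {}, {}
    h1 = _target_layer.register_forward_hook(lambda m, i, o: feats.update(A=o))
    h2 = _target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(dA=go[0]))
    score = model(x)[0, class_idx]  # y^c, the class score before softmax
    model.zero_grad()
    score.backward()
    h1.remove(); h2.remove()
    alpha = grads["dA"].mean(dim=(2, 3), keepdim=True)            # Eq. 6
    cam = F.relu((alpha * feats["A"]).sum(dim=1, keepdim=True))   # Eq. 7
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                        align_corners=False)
    return cam / (cam.max() + 1e-8)  # normalize to [0, 1] for display

model = LCCNN()  # the LCCNN sketched in Section 3.5
x = torch.randn(1, 1, 256, 256)
heat = grad_cam(model, x, class_idx=1, _target_layer=model.features[-2])
print(heat.shape)  # torch.Size([1, 1, 256, 256])
```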
3.7 Cross validation
Cross-validation is one of the essential methods used by scientists when performing statistical analysis. In practice, it is often necessary to verify a model's stability and its ability to generalize to a new dataset. It ensures that the model learned from the dataset captures most of the correct information without containing too much noise; in other words, that the model's bias and variance are both small.
In \(K\)-fold cross-validation, the dataset is divided into \(K\) smaller subsets of almost equal size. The process of holding out each fold in turn as a test set and averaging the results can be represented by the equation (Eq. 8):

\(e=\frac{1}{K}\sum_{i=1}^{K}{E}_{i}\)    (8)
\({E}_{i}\) represents the output result of the \(i\)th test, and \(e\) represents the average of the results of the \(K\) tests.

The obtained dataset is divided into training and testing sets in cross-validation. As shown in Fig. 7, we divide the dataset into ten folds (D1, D2, …, D10) and then perform ten tests, using a different fold as the testing set each time. In the end, we obtain ten models' results and take the average as the final result.
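A sketch of the 10\(\times\)10-fold protocol with scikit-learn is shown below; `train_eval` is a hypothetical helper that trains the LCCNN on one split and returns its test accuracy \({E}_{i}\).

```python
# 10 runs of 10-fold cross-validation; report mean and standard deviation,
# matching the MSD format used in Table 4.
import numpy as np
from sklearn.model_selection import KFold

def ten_by_tenfold(X, y, train_eval, runs=10, folds=10):
    scores = []
    for run in range(runs):
        kf = KFold(n_splits=folds, shuffle=True, random_state=run)
        for train_idx, test_idx in kf.split(X):
            E_i = train_eval(X[train_idx], y[train_idx],
                             X[test_idx], y[test_idx])  # hypothetical helper
            scores.append(E_i)
    return np.mean(scores), np.std(scores)  # Eq. 8 average, plus its spread
```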
3.8 Proposed five-channel data augmentation
In many practical projects, scholars struggle to obtain enough data to complete their tasks. We can expand existing data through data augmentation. Data augmentation makes the model's classification more stable and helps the model recognize images under different conditions. It also allows existing datasets to be used to greater effect by increasing the amount of data the model is trained on.
Several common data augmentation methods include adding noise, cropping, flipping, rotation, scaling, and brightness adjustment. We want the proposed five-channel data augmentation (FDA) method to be easy to implement. Geometry-based data augmentation (GDA) methods are easy to perform. Unfortunately, it is often necessary to inspect the data generated by GDA manually to check whether the image labels need to be redefined. Besides being easy to realize, we also want the data augmentation to enhance the model's generalization ability. Noise-injection data augmentation (NIDA) can improve the robustness and generalization of the model.

Based on the above reasoning, we chose five data augmentation methods from GDA and NIDA: translation, scaling, rotation, horizontal shear, and Gaussian noise injection, thereby combining GDA and NIDA. The proposed FDA consists of these five methods, as shown in Table 3.
Figure 8 shows the flowchart of the FDA. In the experiment, an input image is processed through multiple data augmentations. Each data augmentation method outputs 30 images.
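The sketch below illustrates one possible FDA implementation with torchvision; the parameter ranges of each channel are assumptions for illustration, not the settings in Table 3.

```python
# Five-channel data augmentation: each channel produces 30 images per input,
# so one image yields 5 * 30 = 150 augmented images.
import torch
import torchvision.transforms as T

N_PER_CHANNEL = 30

channels = {
    "translation":      T.RandomAffine(degrees=0, translate=(0.1, 0.1)),
    "scaling":          T.RandomAffine(degrees=0, scale=(0.9, 1.1)),
    "rotation":         T.RandomAffine(degrees=15),
    "horizontal_shear": T.RandomAffine(degrees=0, shear=(-10, 10)),
    "gaussian_noise":   T.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),
}

def fda(image):
    """image: float tensor of shape (1, H, W), e.g., a normalized CT slice."""
    out = []
    for name, tf in channels.items():
        out.extend(tf(image) for _ in range(N_PER_CHANNEL))
    return out

print(len(fda(torch.rand(1, 256, 256))))  # 150
```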
4 Experiment results and discussions
4.1 Five-channel data augmentation
The results of the FDA are shown in Fig. 9, where we can observe the five outputs: translation, scaling, rotation, horizontal shear, and Gaussian noise injection. Figure 9 shows that FDA can increase the variety of the dataset and compensate for the disadvantage of having limited data.
4.2 The results of the LCCNN
We obtained Table 4 after validating the performance of the LCCNN using 10\(\times\)10-fold cross-validation. The results of each indicator are expressed as mean and standard deviation (MSD). The sensitivity is 91.44 ± 1.78, the specificity is 92.12 ± 1.37, the precision is 92.10 ± 1.15, the accuracy is 91.78 ± 0.44, the F1 score is 91.75 ± 0.52, the MCC is 83.60 ± 0.88, and the FMI is 91.76 ± 0.52. Table 4 shows that the experiment performed well: the obtained performance indicators deviate little from their means, with small standard deviations. The results of each run differed only slightly and fluctuated little across the different testing subsets. Sensitivity fluctuated the most, reaching 1.78. Among the metrics, accuracy fluctuated the least while its mean remained high, indicating that the LCCNN classifier is effective.
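For clarity, the sketch below shows how these indicators can be computed from a binary confusion matrix; the MCC and FMI formulas here are the standard definitions, assumed to match the paper's usage.

```python
# Performance indicators from confusion-matrix counts (TP, FP, TN, FN).
import numpy as np

def indicators(tp, fp, tn, fn):
    sen = tp / (tp + fn)                   # sensitivity (recall)
    spc = tn / (tn + fp)                   # specificity
    prc = tp / (tp + fp)                   # precision
    acc = (tp + tn) / (tp + fp + tn + fn)  # accuracy
    f1  = 2 * prc * sen / (prc + sen)      # F1 score
    mcc = (tp * tn - fp * fn) / np.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))  # Matthews corr. coef.
    fmi = np.sqrt(prc * sen)               # Fowlkes-Mallows index
    return sen, spc, prc, acc, f1, mcc, fmi

print(indicators(tp=58, fp=5, tn=59, fn=6))  # hypothetical fold counts
```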
4.3 Convergence plot
Accuracy is one of the essential standards for evaluating the performance of a model. Figure 10 shows how the accuracy changes over the iterations. In Fig. 10, the light blue line shows the actual training results, the dark blue line shows the trend of the smoothed training results, and the black dots indicate the test results. As can be seen from the graph, the accuracy increases slowly before the 200th iteration, rises quickly between the 200th and 400th iterations, and changes more gradually between the 400th and 1800th iterations. After the 1800th iteration, the overall performance levels off and remains at a high value.
4.4 Structure comparison
Although many studies increasingly favor deeper CNN models, having too many layers in a CNN model is not always beneficial. As layers are added, more problems need to be solved, including computational cost, gradient behavior, and the choice of activation function, and new problems may appear. Beyond a certain depth, the improvement in effectiveness is no longer significant. In building the model, we aim to set the most suitable number of layers to help LCCNN achieve the best results.
For this purpose, we set up models with different numbers of layers to test them separately. The final number of layers of LCCNN was determined by comparing the data obtained from the experiments. Table 5 shows the data obtained by running 10\(\times\)10-fold cross-validation with the separate 7-layer and 9-layer model settings. The comparative bar chart is based on Tables 4 and 5.
From Fig. 11, we can see that the 8-layer model performs better than the 7-layer model and slightly better than the 9-layer model. Increasing the number of layers beyond eight brings no improvement but rather a decrease in performance. We therefore conclude that the best results are obtained with 8 layers.
4.5 Comparison of state-of-the-art approaches
To gauge the level of the LCCNN model, we selected eight state-of-the-art approaches to compare against the 10\(\times\)10-fold cross-validation metrics of our proposed LCCNN. The advanced methods include DeCovNet [54], WRE + 3SBBO [55], FSVC [57], COVID-Net [59], 6 L-CNN [60], WE-CSO [63], DLM [64], and GLCM-PSO [65]. The indicators include Sen, Spc, Prc, Acc, F1, MCC, and FMI.
Details are listed in Table 6. We plotted Fig. 12 using the mean values without the fluctuation data. The bar charts help us see the comparison results more visually. The sensitivity of LCCNN is slightly higher than that of FSVC, but its fluctuation is also relatively large. In terms of accuracy, which is more critical, LCCNN is both more accurate and less volatile than DeCovNet, and thus more stable.
4.6 Network complexity comparison
The designers of early classical CNNs such as AlexNet, VGG16, GoogLeNet, ResNet, and MobileNet focused on improving classification accuracy and rarely considered the number of parameters and the memory storage required.
As shown in Table 7 and Fig. 13, this study reports two metrics: the number of learnable parameters and memory storage. These two metrics indicate the complexity of the network structure. Our proposed LCCNN model has the smallest number of learnable parameters and memory storage compared with the other classical models. Remarkably, the number of learnable parameters in LCCNN is only 1.3% of that in VGG16.
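The two metrics can be computed for any PyTorch model as in the sketch below, reusing the LCCNN sketch from Section 3.5; assuming 4-byte float32 weights for the memory estimate is our simplification.

```python
# Network complexity: learnable parameter count and approximate weight storage.
def complexity(model):
    n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    mem_mb = n_params * 4 / 2**20  # float32 -> bytes -> MiB
    return n_params, mem_mb

print(complexity(LCCNN()))  # LCCNN as sketched in Section 3.5
```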
4.7 Distance education app
In offline teaching, it is often difficult for teachers to explain abstract concepts or knowledge clearly. A distance education application is therefore important. For this purpose, we have designed a web-based app based on LCCNN to diagnose COVID-19 CT images. This app assists teachers in providing detailed explanations and helps students better understand the course content.
Figure 14(a) shows the home page of our app. The user can access the web app by clicking on the graphical icon and start using the app.
In the web app interface, as shown in Fig. 14(c), the leftmost images are those waiting to be recognized. In the middle are two larger CT images: the left one is the image being classified, and the right one is the heat map generated from it.

There are two buttons on the rightmost side of the interface. Clicking the topmost 'upload custom image' button allows the user to upload a new unclassified image; the pop-up window is shown in Fig. 14(b). Once the upload is complete, clicking the 'diagnosis' button starts the diagnosis. The label pointed to by the red knob shows the result of the diagnosis.
4.8 Explainability
As the app is aimed at helping teachers to show students the effects of COVID-19 on the lungs, our model needs to be explainable. Figure 15 shows the heat map obtained by our LCCNN model. From left to right in Fig. 15, the heat maps are produced from the first to the third runs.
Observing the positions and areas of the different colors in the images, we find that although the run index varies, the model remains accurate in identifying the problematic areas of the images. The red highlighted area in Run 1 is located to the right of the lesion area. The red area in Run 2 is closer to the actual lesion area than in Run 1. In Run 3, the red highlighted area almost coincides with the actual lesion area.
5 Conclusions
In this paper, we design a distance education app based on an 8-layered LCCNN for COVID-19 recognition. We hope this app will help teachers demonstrate to their students the damage caused by COVID-19 in the lungs.
The distance education app provides a more visual way of teaching that takes advantage of multimedia equipment to give students a better understanding of the disease and the importance of protection. The core component of the app is the 8-layered lightweight, customized CNN for COVID-19 recognition, named LCCNN. It can classify input lung CT images accurately and quickly identify the CT images containing diseased parts of the lung. For data augmentation, we propose five-channel data augmentation (FDA), which is easy to operate and improves the robustness of the model. We evaluated the model's effectiveness through 10\(\times\)10-fold cross-validation. The test results were then compared with eight state-of-the-art image classification methods: DeCovNet, WRE + 3SBBO, FSVC, COVID-Net, 6 L-CNN, WE-CSO, DLM, and GLCM-PSO. The comparison shows that the LCCNN model improves on each evaluation metric compared with these eight methods. The differences in classification accuracy obtained from testing on different test sets were small.
Since our proposed 8-layered LCCNN for COVID-19 recognition is a lightweight deep neural network model, we will develop a standalone app that can be installed and run on mobile phones. Apart from that, although our web-based app works correctly, its interface is relatively plain, and we will beautify it in the future. For the LCCNN model, we will look for other publicly available datasets to train on and improve its recognition accuracy.
Data availability
The datasets analyzed during the current study are available from the corresponding author upon reasonable request.
References
Singh S et al (2021) How an outbreak became a pandemic: a chronological analysis of crucial junctures and international obligations in the early months of the COVID-19 pandemic. Lancet 398(10316):2109–2124
Worobey M et al (2022) The Huanan Seafood Wholesale Market in Wuhan was the early epicenter of the COVID-19 pandemic. Science 377(6609):951–959
Sachs JD et al (2022) The Lancet Commission on lessons for the future from the COVID-19 pandemic. Lancet 400(10359):1224–1280
Aknin LB et al (2022) Mental health during the first year of the COVID-19 pandemic: a review and recommendations for moving forward. Perspect Psychol Sci 17(4):915–936
Samji H et al (2022) Mental health impacts of the COVID-19 pandemic on children and youth–a systematic review. Child Adolesc Mental Health 27(2):173–189
Wang H et al (2022) Estimating excess mortality due to the COVID-19 pandemic: a systematic analysis of COVID-19-related mortality, 2020–21. Lancet 399(10334):1513–1536
Rossen LM et al (2021) Notes from the field: update on excess deaths associated with the COVID-19 pandemic—United States, January 26, 2020–February 27, 2021. Morb Mortal Wkly Rep 70(15):570
Antonelli M et al (2022) Risk of long COVID associated with delta versus omicron variants of SARS-CoV-2. Lancet 399(10343):2263–2264
Rahman FI et al (2022) The “Delta Plus” COVID-19 variant has evolved to become the next potential variant of concern: mutation history and measures of prevention. J Basic Clin Physiol Pharmacol 33(1):109–112
Chenchula S et al (2022) Current evidence on efficacy of COVID-19 booster dose vaccination against the Omicron variant: a systematic review. J Med Virol 94(7):2969–2976
Arbel R et al (2022) Nirmatrelvir use and severe Covid-19 outcomes during the omicron surge. N Engl J Med 387(9):790–798
Mizrahi S et al (2021) How well do they manage a crisis? The government’s effectiveness during the Covid-19 pandemic. Public Adm Rev 81(6):1120–1130
Salem IE et al (2022) A content analysis for government’s and hotels’ response to COVID-19 pandemic in Egypt. Tour Hosp Res 22(1):42–59
Gavrishev A et al (2022) New technological approaches to the organization of the work of medical personnel performing auscultation of patients with COVID-19. Biomed Eng 56(3):211–215
Martínez MM et al (2022) Health outcomes and psychosocial risk exposures among healthcare workers during the first wave of the COVID-19 outbreak. Saf Sci 145:105499
Vanden Bossche D et al (2022) Understanding trustful relationships between community health workers and vulnerable citizens during the COVID-19 pandemic: a realist evaluation. Int J Environ Res Public Health 19(5):2496
Alamsyah N et al (2022) We shall endure: exploring the impact of government information quality and partisanship on citizens’ well-being during the COVID-19 pandemic. Government Inform Q 39(1):101646
Hürlimann O et al (2022) Return to work after hospitalisation for COVID-19 infection. Eur J Intern Med 97:110–112
Garzillo EM et al (2022) Returning to work after the COVID-19 pandemic earthquake: a systematic review. Int J Environ Res Public Health 19(8):4538
Wolters F et al (2020) Multi-center evaluation of cepheid xpert® xpress SARS-CoV-2 point-of-care test during the SARS-CoV-2 pandemic. J Clin Virol 128:104426–104426
Bahl P et al (2022) Airborne or droplet precautions for health workers treating Coronavirus Disease 2019? J Infect Dis 225(9):1561–1568
Guo LY et al (2021) Study on the decay characteristics and transmission risk of respiratory viruses on the surface of objects. Environ Res 194:110716
Mahmoud N et al (2023) Post-COVID-19 syndrome: nature of symptoms and associated factors. J Public Health-Heidelberg. https://doi.org/10.1007/s10389-022-01802-3
Hamidi Z et al (2023) A comprehensive review of COVID-19 symptoms and treatments in the setting of autoimmune diseases. Virol J 20(1):1
Lee J et al (2023) Protracted course of SARS-CoV-2 pneumonia in moderately to severely immunocompromised patients. Clin Exp Med. https://doi.org/10.1007/s10238-022-00984-0
King CS et al (2022) Lung transplantation for patients with COVID-19. Chest 161(1):169–178
Ciofalo A et al (2022) Long-term subjective and objective assessment of smell and taste in COVID-19. Cells 11(5):788
Chudzik M et al (2022) Persisting smell and taste disorders in patients who recovered from SARS-CoV-2 virus infection-data from the Polish PoLoCOV-CVD study. Viruses-Basel 14(8):1763
Al Balushi A et al (2022) COVID-19-Associated mucormycosis: an opportunistic fungal infection. A Case Series and Review. Int J Infect Dis 121:203–210
Irawati ID et al (2022) Self-oxygen regulator system for COVID-19 patients based on body weight, respiration rate, and blood saturation. Electronics 11(9):1380
Zubovic J et al (2022) Smoking patterns during COVID-19: evidence from Serbia. Tob Induc Dis: 20. https://doi.org/10.18332/tid/148169
Gerhards SK et al (2022) Coping with stress during the COVID-19 pandemic in the oldest-old population. Eur J Ageing 19(4):1385–1394
Sy MP et al (2022) Emergency remote teaching for interprofessional education during COVID-19: student experiences. Br J Midwifery 30(1):47–55
Sankar JP et al (2022) Effective blended learning in higher education during COVID-19. Inform Technol Learn Tools 88(2):214–228
Harris L et al (2022) Catering for ‘very different kids’: distance education teachers’ understandings of and strategies for student engagement. Int J Incl Educ 26(8):848–864
Aykan A et al (2022) The integration of a lesson study model into distance STEM education during the covid-19 pandemic: Teachers’ views and practice. Technol Knowl Learn 27(2):609–637
Maatuk AM et al (2022) The COVID-19 pandemic and E-learning: challenges and opportunities from the perspective of students and instructors. J Comput High Educ 34(1):21–38
Dong Y et al (2020) Exploring the structural relationship among teachers’ technostress, technological pedagogical content knowledge (TPACK), computer self-efficacy and school support. Asia-Pacific Educ Res 29(2):147–157
Besser A et al (2022) Adaptability to a sudden transition to online learning during the COVID-19 pandemic: understanding the challenges for students. Scholarsh Teach Learn Psychol 8(2):85
Brown A et al (2022) A conceptual framework to enhance student online learning and engagement in higher education. High Educ Res Dev 41(2):284–299
Sadeghi M (2019) A shift from classroom to distance learning: advantages and limitations. Int J Res Engl Educ 4(1):80–88
Abduraxmanova SA (2022) Individualization of professional education process on the basis of digital technologies. World Bull Social Sci 8:65–67
Akhmedov B (2022) A new approach to teaching information technologies in education. Cent Asian J Educ Comput Sci (CAJECS) 1(2):73–78
Abdullayev AA (2020) System of information and communication technologies in the education. Sci world Int Sci J 2:19–21
Al-Husseini S et al (2021) Transformational leadership and innovation: the mediating role of knowledge sharing amongst higher education faculty. Int J Leadersh Educ 24(5):670–693
Mahendraprabu M, Kumar KS, Mani M, Kumar PS (2021) Open educational resources and their educational practices in higher education. Mukt Shabd J 10(2):527–540
Severino L et al (2021) Using a design thinking Approach for an asynchronous learning platform during COVID-19. IAFOR J Educ 9(2):145–162
Lowry B et al (2022) Merged virtual reality teaching of the fundamentals of laparoscopic surgery: a randomized controlled trial. Surg Endosc 36(9):6368–6376
Zheng Q-S et al (2022) Design and development of virtual simulation teaching resources of “safe electricity” based on unity3D. J Phys: Conf Ser 2173(1):012012
Lin J (2022) The design and application research of foreign trade simulation practice technology platform. Adult Higher Educ 4(2):1–7
Liu S et al (2023) Efficient visual tracking based on fuzzy inference for intelligent transportation systems. IEEE Trans Intell Transp Syst: 1–12. https://doi.org/10.1109/TITS.2022.3232242
Liu S et al (2023) Multi-modal fusion network with complementarity and importance for emotion recognition. Inf Sci 619:679–694
Liu S et al (2022) Human-centered attention‐aware networks for action recognition. Int J Intell Syst 37(12):10968–10987
Wang XG et al (2020) A weakly-supervised Framework for COVID-19 classification and lesion localization from chest CT. IEEE Trans Med Imaging 39(8):2615–2625
Wu X (2020) Diagnosis of COVID-19 by Wavelet Renyi Entropy and three-segment biogeography-based optimization. Int J Comput Intell Syst 13(1):1332–1344
Wang S-H et al (2018) Identification of alcoholism based on Wavelet Renyi Entropy and Three-Segment Encoded Jaya Algorithm. Complexity 2018:3198184
El-kenawy ESM et al (2020) Novel feature selection and voting classifier algorithms for COVID-19 classification in CT images. IEEE Access 8:179317–179335
Got A et al (2020) A guided population archive whale optimization algorithm for solving multiobjective optimization problems. Expert Syst Appl 141:112972
Wang LD et al (2020) COVID-Net: a tailored deep convolutional neural network design for detection of COVID-19 cases from chest X-ray images. Sci Rep 10(1):19549
Hou S (2022) COVID-19 detection via a 6-Layer deep convolutional neural network. Comput Model Eng Sci 130(2):855–869
Jais IKM et al (2019) Adam optimization algorithm for wide and deep neural network. Knowl Eng Data Sci 2(1):41–46
Bahrami M et al (2018) Cat swarm optimization (CSO) algorithm, in Advanced optimization by nature-inspired algorithms. Springer, Berlin, pp 9–18
Wang W (2022) Covid-19 detection by Wavelet Entropy and Cat Swarm optimization. Soc Inf Telecommu Eng 415:479–487
Gafoor SA et al (2022) Deep learning model for detection of COVID-19 utilizing the chest X-ray images. Cogent Eng 9(1):2079221
Wang J et al (2022) COVID-19 diagnosis by Gray-Level Cooccurrence Matrix and PSO. Int J Patient-Centered Healthc 12(1):309118
Marini F et al (2015) Particle swarm optimization (PSO). A tutorial. Chemometr Intell Lab Syst 149:153–165
Zhang Y-D et al (2021) Covid-19 diagnosis via DenseNet and optimization of transfer learning setting. Cognit Comput: 1–17. https://doi.org/10.1007/s12559-020-09776-8
Zhang N et al (2019) Investigation on performance of neural networks using quadratic relative error cost function. IEEE Access 7:106642–106652
Selvaraju RR et al (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: Proceedings of the IEEE international conference on computer vision, p 618–626
Funding
The paper is partially supported by British Heart Foundation Accelerator Award, UK (AA/18/3/34220); Royal Society International Exchanges Cost Share Award, UK (RP202G0230); Hope Foundation for Cancer Research, UK (RM60G0680); Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); Sino-UK Industrial Fund, UK (RP202G0289); Global Challenges Research Fund (GCRF), UK (P202PF11); LIAS Pioneering Partnerships Award, UK (P202ED10); Data Science Enhancement Fund, UK (P202RE237); Fight for Sight, UK (24NN201); Sino-UK Education Fund, UK (OP202006); Biotechnology and Biological Sciences Research Council (BBSRC), UK (RM32G0178B8).
Author information
Contributions
Jiaji Wang: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data Curation, Writing - Original Draft.
Suresh Chandra Satapathy: Methodology, Validation, Investigation, Resources, Writing - Review & Editing, Visualization, Supervision, Project administration.
Shuihua Wang: Conceptualization, Validation, Formal analysis, Data Curation, Writing - Original Draft, Writing - Review & Editing, Supervision.
Yudong Zhang: Conceptualization, Methodology, Software, Formal analysis, Investigation, Resources, Data Curation, Writing - Original Draft, Writing - Review & Editing, Project administration, Funding acquisition.
Ethics declarations
Ethics approval
The ethical check is exempted since we use open-access datasets.
Competing interests
There are no conflicts of interest regarding the submission of this paper.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, J., Satapathy, S.C., Wang, S. et al. LCCNN: a Lightweight Customized CNN-Based Distance Education App for COVID-19 Recognition. Mobile Netw Appl 28, 873–888 (2023). https://doi.org/10.1007/s11036-023-02185-9