1. Introduction
Cybersecurity in modern vehicles has become an active field of research as the evolution of the automotive sector has improved the computing and connectivity capabilities of modern vehicles, which support a large variety of automotive applications for traffic management, maintenance and so on. On the other side, the evolution of vehicles to the “computer on wheels“ concept has also exposed vehicles to the risk of cybersecurity threats as described in various surveys in this topic [
1,
2,
3]. These surveys identify the key threats in the automotive sector and the correspondent mitigation techniques, which can be based on cryptographic solutions or analysis of the in-vehicle networks traffic with Intrusion Detection System (IDS). This study focuses on the design of a novel IDS approach for in-vehicular networks based on the CAN-bus standard.
In the ICT domain, the application of IDSs! (IDSs!) to mitigate cybersecurity attacks is well known and they have been used for more than 30 years. IDSs are usually based on the analysis of the network traffic to highlight anomalies or specific traffic patterns, which may point to an attack [
4]. The main metrics of evaluation of IDSs are the detection accuracy and the time to detect an attack in the shortest time possible so that an appropriate countermeasure can be implemented. In ICT infrastructure, computers and network components (e.g., routers) are usually the main assets to protect from attacks. In modern vehicles, the assets to be protected are sensors (e.g., engine or tyre sensors), actuators (e.g., bracking systems) and the Electronic Control Unit (ECU)s, which are the computing platforms used to control and monitor the engine and transmissions. The various electronic components in the vehicles are connected through various in-vehicle networks like CAN-bus, FlexRay and LIN [
2,
5]. This paper focuses specifically on attacks on the CAN-bus as it is the most widely deployed in-vehicle network standard in the world. A description of the CAN-bus protocol is provided later in this paper.
Contribution of this paper: The implementation of an IDS in in-vehicular networks has been proposed in the literature using different techniques as described in
Section 2. One category of IDS is based on the creation of dictionaries or a model of legitimate/normal CAN-bus traffic (without attacks), against which the traffic with attacks is evaluated. In particular, the frequency of appearance or the entropy of CANIDs (one of the fields of the CAN-bus protocol, which identifies the message) is calculated in sliding windows of CAN-bus traffic. The problem with these approaches is the detection accuracy may not be optimal even if they are computing efficient [
6]. Other approaches are based on the analysis of the sequences of CAN-bus messages, but the selection of the optimal sequences can also be challenging because it depends on the attack implementation [
7]. Recently, various authors have applied Deep Learning (DL) with excellent detection performance at the cost of a significant computing complexity especially with large data set with millions of CAN-bus messages [
7,
8].
As described in the subsequent sections, this paper proposes a combination of these methods by (1) adopting a multi-scale histogram method to build the dictionary of sequences of CANIDs instead of relying only on the frequency of single CANIDs, and (2) using the created dictionaries to generate a feature space on which DL is applied. The advantage and novelty of the proposed approach is based on the combination of the methods to overcome the limitations of the frequency-based single CANIDs approach, meaning it exploits the strength of the CANIDs sequence methods (through the multi-scale histograms) and the power of DL by mitigating the disadvantage of the DL (which requires significant computing effort), because the DL is applied to the reduced feature space rather than all the CAN-bus traffic. To the knowledge of the author, a multi-scale histogram approach for IDSs has not been applied in the literature and combined with DL.
Scope of this paper: The author would like to highlight that the scope of this paper is to propose a new approach for an IDS based on multi-scale histograms and DL, which is _targeted to Denial of Service (DOS) and spoofing attacks as described in the data sets used to evaluate the approach. This paper does not aim to address attacks related to the users like chatbots or deep fakes or other attacks outside the ones defined in the used data sets. The two data sets have been chosen for the following reasons. The first data set is the Car Hacking data set created by the Hacking and Countermeasures Research Lab [
9,
10]. This data set has been extensively used by the research community working on IDSs for in-vehicular networks and it is also used in this paper for benchmark reasons. The data set addresses DOS, Fuzzy and spoofing attacks. The Car Hacking data set was one of the first data sets created for IDSs for in-vehicular networks, but it has some limitations, which have been analyzed in literature [
11]. For this reason, the proposed approach has also been validated on the ROAD dataset [
11,
12], which contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real fuzzing, fabrication and unique advanced attacks, as well as simulated masquerade attacks. Both data sets have been created using real automotive vehicles with CAN-bus protocol implementations. More details are in
Section 3.3 and
Section 3.4.
Structure of this paper: The structure of this paper is as follows:
Section 2 provides an overview on the related work, which is relevant for this study.
Section 3 describes the overall methodology of the proposed IDS approach. A brief description of the CAN-bus protocol is provided in
Section 3.1. The main workflow of the methodology is described in
Section 3.2. One key element of the proposed approach is the architecture of the Convolutional Neural Network (CNN) described in
Section 3.5.
Section 3 also describes the materials used to evaluate the approach in sub-sections: the Car Hacking data set in
Section 3.3 and the ROAD data set in
Section 3.4. This paper uses two recent public data sets of in-vehicle network traffic with both legitimate traffic and attacks. Each public data set has a different set of attacks. Finally,
Section 3.6 identifies the metrics of evaluation and concludes
Section 3.
Section 4 presents the results of the application of the proposed approach on the two different data sets. This section also provides a comparison with the results obtained by other studies presented in the research literature on the same data sets. Finally,
Section 5 concludes this paper and points towards future developments.
2. Related Work
IDS in vehicular networks is based on similar concepts to the IDS of generic communication networks, where an intrusion/attack must be detected with great accuracy and in a short time so that appropriate actions can be taken (e.g., by a network administrator) to block the attack. Research activities in IDSs of vehicular networks are quite recent and they are due to the growing interest in vehicle cybersecurity originating from the reports of various and remote attacks to vehicles as described in [
1,
2,
3,
13].
A classification of IDSs for in-vehicle networks [
6,
7,
8] shows that IDSs can be implemented using signature identifications, machine learning, anomaly detection and other means. The approach proposed in this paper is part of the category of anomaly detection with a sliding window, where the IDS algorithm extracts features from a sliding window on the in-vehicle traffic (i.e., a set of CAN-bus messages). Features can be the distribution of CANID values or features extracted from the payload. A model is created from the legitimate traffic (i.e., network traffic without attacks) and it is used in a subsequent step to detect anomalies, which may correspond to a potential attack. A similar pre-processing step is used in the automotive domain for similar problems as in [
14] to improve the reliability of the data flow. A sliding window approach can be more time-efficient than approaches where each single CAN-bus message is analyzed but a potential weakness is found in the lower discriminating power as discussed in [
15,
16,
17], where entropy measures are used as features. In particular, in [
15], the Shannon entropy is calculated on a sliding window of in-vehicle network traffic and the timing of the messages. The authors in [
16] have shown that an approach based on the counting of the messages is more effective than an approach based on the timing of the messages. For this reason, this paper uses a similar approach to [
16] based on the number of received CAN-bus messages but with different features (i.e., histograms of the CANIDs instead of entropy measures). In [
15,
16], the authors have used the CANID field to calculate the entropy measure and this approach also uses the CANID field only. The performance of different entropy measures including dispersion entropy and Renyi entropy are instead evaluated in [
17] on the Car Hacking data set. A similar approach based on natural-language-processing concepts (i.e., bag of words) and the comparison of dictionaries was also proposed in [
18].
The number of publications proposing IDSs in in-vehicular networks has grown considerably in recent years with different methods and sources of data. The rest of this survey identifies selected works, which focus on the frequency-based or histogram-based approaches and with CANID as source of information as in this paper. In other words, the focus is mostly on CANID-based IDSs.
The authors of [
19] have investigated the use of a sequence of CANIDs as a discriminating feature but with a different goal than the implementation of IDSs because the _target is the prediction of the next CANID. A transfer learning-based technique is used to retrain the IDS using streaming CAN data on a resource-constrained Raspberry Pi device to improve the IDS. The approach in [
19] is based on a specific window sequence of only five CANIDs, while this paper investigates different window sizes. The approach is applied to the same ROAD dataset [
11,
12] also used in this paper. Similar concepts to the approach proposed in this paper are presented in [
20], where the frequency of n-grams of the CANID sequences have been investigated. The use of n-grams is similar to the multi-scale histograms used in this paper. The authors of [
20] use n-grams with
n = 1, 2, 3, 4. The authors acknowledge that a higher value of n would hugely increase the computing complexity of the analysis. On the other side, the authors of [
20] do not exploit the n-gram frequencies to implement the IDS using the feature space concept as proposed in this paper and they do not use classifiers for attack detection. In addition, they only analyze the Car Hacking data set [
9,
10]. An histogram-based IDS for in-vehicular networks was proposed in [
21] for the Car Hacking data set, but with the significant difference that it was applied to the payload data rather than the CANID data. The approach proposed in this paper cannot be applied to the payload data because the dictionary would have a huge size even for the single-scale case (it would be all the possible permutations of the 64-bit data) and it would be unmanageable for the scales of larger orders. In addition, Ref. [
21] uses KNN instead of DL because the authors reach already excellent results with shallow machine learning. Another paper based on the data mining of sequences of the CANIDs, which are used to implement IDS is [
22]. In the training phase, discriminant sequences of CANIDs for attack detection are identified, which are then used in the testing phase. In other words, the training phase is used to build a dictionary to model the normal behaviour of the vehicle, which is a similar concept applied to this paper. The proposed approach in [
22] compares well with frequency- and dictionary-based approaches and the authors discuss the potential limitations including the increasing computational complexity with longer sequences, which is the same finding of this paper. The approach is applied to a data set generated by the authors themselves and not a public data set. Then, it is difficult to compare results. The creation of a multi-scale dictionary for IDS as in this paper was also proposed by the same author in [
23] but to a completely different protocol (MIL-STD-1553) and a significant different coding scheme. In addition, DL was not used for the IDS implementation. Another recent paper proposing an IDS for in-vehicular networks using the statistical characteristics of the attacks is [
24], where a fixed-window approach is used, as in this paper. A model of the legitimate traffic is created and then statistical analysis is used to detect attacks on the basis of a moving threshold. The approach optimizes in an adaptive way the optimal values of the threshold. The approach is applied to the Car Hacking data set as in this paper.
Regarding the application of DL to this context, the authors in [
25] used DL as in this paper to analyze the sequences of CANIDs. In particular, they used Long Short-Term Memory (LSTM) as it is ideally suited to analyze the temporal correlation of sequences. On the other side, the goal of [
25] is different from this paper because it is related to the prediction of the next CANID, and then the algorithm compared the predicted ID with the actual ID. In addition, the results achieved a worse accuracy than other studies in the literature. Another DL approach based on the combination of CNN and LSTM is used in [
26], where the DL algorithm learns the spatio-temporal behavior of legitimate CAN-bus traffic and then detects attacks based on the deviation of message sequences from this behavior. Then, the approach is based on similar concepts even if the sequence selection is based on LSTM. The approach is applied only to the Car Hacking data set rather than two data sets as in this paper. This makes it difficult to understand if it can be generalized.
An approach based on the autoencoders for the Car Hacking data set is proposed in [
27]. As in other papers and this paper as well, a pre-processing step was used to transform the initial data into numerical form via pre-processing. The autoencoders were combined with shallow machine learning algorithms like decision tree and K Nearest Neighbor (KNN).
Another approach used in literature is to convert CANID to an image representation and then transform the IDS implementation with DL to an image-classification problem. This approach is different from the approach proposed in this paper because the frequency of the CANIDs in windows is not used directly to create a feature matrix. The disadvantage of applying DL to the CANIDs as an image representation method is the very high computing effort required, because the DL is applied to CANIDs values directly instead of grouping them in windows. On the other side, these methods usually achieve a very high attack-identification accuracy. For example, this approach was used in [
10], where the CANIDs are transformed directly to binary images using the Car Hacking data set and CNN for supervised learning as in this paper. In another paper, the authors have used auto-encoders for CANIDs transformed to images for semisupervised learning, while this paper focused on supervised learning. Finally, another paper [
28] have combined the concept of recurring plots with CANIDs sequences as input to CNN to implement the IDS. The authors of [
28] have applied the method to the Car Hacking data set. The advantage of this approach is to use a window-based approach (the chosen window size was 128, which is in the same order of magnitude of the window size of 120 messages used in this paper), which decreases the computing complexity for the CNN. On the other side of the coin, the accuracy obtained in [
28] on the Car Hacking data set is slightly worse than [
10], which is more computing intensive.
To summarize the literature review, the approach proposed in this paper combines the two main approaches proposed in the literature using only the CANIDs information: the frequency of appearance of the CANIDs values or sequences (e.g., n-grams) is used to create a histogram feature space (which is basically a dictionary) on which the CNN is used. The proposed approach combines the computing effectiveness of using windows to group CANIDs values with the power of the DL, which is applied to a transformed feature space. A potential disadvantage of this approach is that two main hyper-parameters have to be tuned for the specific data set: the window size and size of the dictionary created by the histograms. On the other side, the methods proposed in this literature review are also dependent on hyper-parameters like the window size.