

Many users implicitly assume that software can only be exploited after it is installed. However, recent supply-chain attacks demonstrate that application integrity must be ensured during installation itself. We introduce SIGL, a new tool for detecting malicious behavior during software installation. SIGL collects traces of system call activity, building a data provenance graph that it analyzes using a novel autoencoder architecture with a graph long short-term memory network (graph LSTM) for the encoder and a standard multilayer perceptron for the decoder. SIGL flags suspicious installations as well as the specific installation-time processes that are likely to be malicious. Using a test corpus of 625 malicious installers containing real-world malware, we demonstrate that SIGL has a detection accuracy of 96%, outperforming similar systems from industry and academia by up to 87% in precision and recall and 45% in accuracy. We also demonstrate that SIGL can pinpoint the processes most likely to have triggered malicious behavior, works on different audit platforms and operating systems, and is robust to training data contamination and adversarial attack. It can be used with application-specific models, even in the presence of new software versions, as well as application-agnostic meta-models that encompass a wide range of applications and installers.

Graph-based classification methods are widely used for security analytics. Roughly speaking, these methods include collective classification and graph neural networks. Attacking a graph-based classification method enables an attacker to evade detection in security analytics. However, existing adversarial machine learning studies have mainly focused on machine learning for non-graph data. Only a few recent studies have touched on adversarial graph-based classification methods, and they focused on graph neural networks, leaving collective classification largely unexplored. We consider an attacker whose goal is to evade detection by manipulating the graph structure. We formulate our attack as a graph-based optimization problem whose solution yields the edges that an attacker needs to manipulate to achieve its attack goal.

However, it is computationally challenging to solve the optimization problem exactly. To address this challenge, we propose several approximation techniques.

We evaluate our attacks and compare them with a recent attack designed for graph neural networks using four graph datasets. Our results show that our attacks can effectively evade graph-based classification methods.
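
To make the idea of a data provenance graph built from system call traces more concrete, here is a minimal sketch in Python. It is not SIGL's implementation: the record layout `(timestamp, process, syscall, object)`, the sample trace, and the helper names `build_provenance_graph` and `descendants` are all assumptions made for illustration. The sketch covers only graph construction and reachability, not the graph LSTM autoencoder.

```python
from collections import defaultdict

# Hypothetical audit records: (timestamp, acting process, system call, object).
# This layout is an assumption for illustration, not SIGL's input format.
TRACE = [
    (1, "setup.exe", "write", "C:/tmp/payload.dll"),
    (2, "setup.exe", "fork", "helper.exe"),
    (3, "helper.exe", "read", "C:/tmp/payload.dll"),
    (4, "helper.exe", "connect", "203.0.113.7:443"),
]


def build_provenance_graph(trace):
    """Return {source: [(target, syscall, timestamp), ...]} with edges along data flow."""
    graph = defaultdict(list)
    for ts, process, syscall, obj in sorted(trace):
        if syscall == "read":
            # A read moves data from the object into the process.
            graph[obj].append((process, syscall, ts))
        else:
            # Writes, forks, and connects move data or control out of the process.
            graph[process].append((obj, syscall, ts))
    return dict(graph)


def descendants(graph, start):
    """Collect every node reachable from `start` (timestamps are ignored here)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for target, _syscall, _ts in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


graph = build_provenance_graph(TRACE)
affected = descendants(graph, "setup.exe")
```

In this toy trace, everything the installer touched, directly or through the forked helper, ends up in `affected`. A real system would also respect event ordering when following edges, which this sketch deliberately omits.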

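
The edge-manipulation attack on graph-based classification described in this document can be illustrated with a toy example. The sketch below is not the authors' formulation or their approximation techniques: it swaps in a simple score-propagation classifier as a stand-in for collective classification and a brute-force greedy search as a stand-in for the optimization. The graph, the prior scores, and the names `propagate`, `toggle`, and `greedy_attack` are all hypothetical.

```python
# Toy undirected graph: "m" and "m2" are malicious nodes, "b1".."b3" are benign.
ADJ = {"m": {"m2"}, "m2": {"m"}, "b1": {"b2"}, "b2": {"b1", "b3"}, "b3": {"b2"}}
# Assumed prior maliciousness scores in [0, 1].
PRIOR = {"m": 0.9, "m2": 0.9, "b1": 0.1, "b2": 0.1, "b3": 0.1}


def propagate(adj, prior, rounds=20, alpha=0.5):
    """Toy collective classifier: blend each prior with the neighbors' mean score."""
    score = dict(prior)
    for _ in range(rounds):
        updated = {}
        for node in prior:
            nbrs = adj[node]
            mean = sum(score[n] for n in nbrs) / len(nbrs) if nbrs else prior[node]
            updated[node] = (1 - alpha) * prior[node] + alpha * mean
        score = updated
    return score


def toggle(adj, u, v):
    """Return a copy of `adj` with the undirected edge (u, v) added or removed."""
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    if v in new[u]:
        new[u].discard(v)
        new[v].discard(u)
    else:
        new[u].add(v)
        new[v].add(u)
    return new


def greedy_attack(adj, prior, target, budget):
    """Toggle up to `budget` edges at `target`, each time taking the best single change."""
    changed = []
    for _ in range(budget):
        best_score = propagate(adj, prior)[target]
        best_edge, best_adj = None, None
        for other in sorted(prior):
            if other == target:
                continue
            candidate = toggle(adj, target, other)
            score = propagate(candidate, prior)[target]
            if score < best_score:
                best_edge, best_adj, best_score = (target, other), candidate, score
        if best_edge is None:
            break  # no single edge change lowers the target's score any further
        adj = best_adj
        changed.append(best_edge)
    return adj, changed


before = propagate(ADJ, PRIOR)["m"]
attacked_adj, changed_edges = greedy_attack(ADJ, PRIOR, "m", budget=2)
after = propagate(attacked_adj, PRIOR)["m"]
```

Each greedy step re-runs the classifier once per candidate edge, which quickly becomes expensive on large graphs. That cost is one way to see why solving such a problem exactly is computationally challenging and why approximations are attractive.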