Projects:2015s1-45 Analysis and Visualisation of Packet Data for Cyber-Security Purposes
About
Aim
The aim of this project was to investigate the usefulness of packet data from Internet-connected networks in the field of cyber-security. The main focus was the use of hybrid IPv4/IPv6 communication for data exfiltration.
Motivation
In the modern age, there is a continuous push for everything to be made available online, and the need for advanced cyber-security techniques that can quickly adapt to new types of attack continues to grow. The collaboration with Estonia stems from the fact that it is highly regarded as a technologically advanced nation; in 2005 it became the first country to allow nationwide online voting [1]. Owing to its geographical location and heavy investment in digital technology, Estonia has also been a target of cyber-crime. In 2007, Estonia was the victim of one of the largest instances of state-sponsored cyber warfare, which affected many national websites including those of the government, banks, ministries and broadcasters [2]. The motivation for this project was to work closely with academics at Tallinn University of Technology (TTU) to investigate these forms of malicious activity and to develop techniques, through research and analysis, for real-time detection of cyber-attacks. The project is expected to provide new ideas and proposed solutions to the research questions that arise.
Background
Due to a rapid expansion in demand for Internet-connected devices, the original 32-bit source and destination address fields of the IPv4 header were insufficient to accommodate this growth. Internet Protocol Version 6 (IPv6) was standardised by the IETF in RFC 2460 [3] and featured “Expanded Addressing Capabilities”. IPv6 allowed for 128-bit addressing, increasing the available address space from approximately 4.3×10^9 to 3.4×10^38 unique addresses. In addition, IPv6 simplified the header format, moving optional information into extension headers. This reduced the processing cost of handling common-case packets. IPv6 also introduced a flow labeling capability, allowing the sender to label a packet as part of a particular traffic flow so that special handling, such as non-default quality of service, could be requested.
However, because IPv6 was intended to replace IPv4, the two protocols cannot communicate directly: a device with only an IPv4 address cannot communicate with a device addressed only via IPv6. A hybrid IPv4/IPv6 device therefore holds an address of each kind, and it is difficult to correlate these addresses as there are no protocols to relate them. The exploitation of this property was the motivating factor behind the development and analysis of Hybrid IPv4/IPv6 data exfiltration attacks.
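The size of the two address spaces can be checked with a few lines of Python. This is only an illustrative sketch using the standard-library `ipaddress` module; the variable names are chosen here for illustration.

```python
import ipaddress

# Count every address in the full IPv4 (32-bit) and IPv6 (128-bit) spaces.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(f"IPv4: 2^32  = {ipv4_total:.1e}")   # 4.3e+09
print(f"IPv6: 2^128 = {ipv6_total:.1e}")   # 3.4e+38
```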
Project Team
Students
- Matthew Sclauzero
- Carmela Panuccio
- Pellegrino Coscia
- Benjamin Cosh
Supervisors
University of Adelaide
- Dr Matthew Sorell
Tallinn University of Technology
- Dr Olaf Maennel
Advisors
Tallinn University of Technology
- Dr Hayretdin Bahsi
NATO Cooperative Cyber Defence Centre of Excellence
- Mauno Pihelgas
- Bernhards Blumbergs
Centre for Defence Communications and Information Networking
- Dr Michael Webb
- Dr Hung Nguyen
Attack Development and Analysis
Exfiltration Attacks
The model for the exfiltration types required communication between two networked devices. This was achieved through the use of two virtual machines running Ubuntu on a single host system. The virtualisation software used was Oracle VirtualBox. Ubuntu was chosen as the operating system as it was the Linux distribution familiar to all members of the project team. The implementation of this environment can be seen below.
Exfiltration Attack Type 1: Hybrid IPv4/IPv6 Over TCP
Exfiltration Attack Type 1, referred to as ‘Exfiltration Type 1’, used Python’s built-in socket library to communicate between hosts using the Transmission Control Protocol (TCP). The intention of this implementation was to create the appearance of an innocuous transmission between two devices through the use of a standard library. The Client initiated TCP connections to the Server over both IPv4 and IPv6. Once connected, the Client broke the file up into segments and transferred them to the Server, alternating between the two simultaneously open IP connections. When the transfer was complete, the connections were closed and the file was reassembled on the Server side. The image below shows an overview of the transfer and the two flowcharts detail the procedure undertaken by both the Client and Server during transmission.
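The segmenting and alternating described above might look roughly like the following sketch. It is not the project's code: the segment size, the 8-byte header (sequence number and payload length) and all function names are assumptions made here for illustration, and the Server-side socket handling is omitted.

```python
import socket
import struct
from typing import List, Tuple

SEGMENT_SIZE = 1024               # assumed; the source does not state a size
HEADER = struct.Struct("!II")     # hypothetical framing: sequence number, length


def split_segments(data: bytes, size: int = SEGMENT_SIZE) -> List[Tuple[int, bytes]]:
    """Break the file contents into numbered segments."""
    return [(seq, data[offset:offset + size])
            for seq, offset in enumerate(range(0, len(data), size))]


def frame(seq: int, payload: bytes) -> bytes:
    """Prefix a segment with its sequence number and length."""
    return HEADER.pack(seq, len(payload)) + payload


def send_hybrid(data: bytes, v4_addr: Tuple[str, int], v6_addr: Tuple[str, int]) -> None:
    """Open one TCP connection per IP version and alternate segments between them."""
    with socket.create_connection(v4_addr) as sock4, \
            socket.create_connection(v6_addr) as sock6:
        channels = (sock4, sock6)
        for seq, payload in split_segments(data):
            # Even-numbered segments travel over IPv4, odd-numbered over IPv6.
            channels[seq % 2].sendall(frame(seq, payload))


def reassemble(segments: List[Tuple[int, bytes]]) -> bytes:
    """Rebuild the file on the Server side from segments received on either connection."""
    ordered = sorted(segments, key=lambda item: item[0])
    return b"".join(payload for _, payload in ordered)
```

A call such as `send_hybrid(file_bytes, ("192.0.2.10", 5000), ("2001:db8::10", 5000))` would perform the transfer; the addresses and port are documentation placeholders. The sequence number lets the Server restore the original order regardless of which connection delivered a segment first.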
Exfiltration Attack Type 2: Hybrid IPv4/IPv6 Over UDP
The second exfiltration attack type used the User Datagram Protocol (UDP) as the transport method to send the file data. The Python-based packet manipulation tool Scapy was used to construct and send the packets.
Scapy is a powerful interactive packet manipulation tool. It is a free and open source application written in Python, and its functions can be imported and used in a custom script.
Once the basics of Scapy had been learnt, experimentation began on how it could be used to implement the proposed attack vector. It became clear that Scapy’s versatile packet filtering tools made it very easy to construct a lightweight server application that could receive the required packets containing file data whilst being completely agnostic to the IP version with which any packet had been sent. The client application was not as simple as the server; however, Scapy made the construction and sending of different IP packets easier. The above code demonstrates a basic messaging application in which a user is prompted to type a message that is then sent to the server via IPv4 or IPv6. Again, at this stage the script was simply a proof of concept that it was possible to use Scapy for easy control over the sending of IPv4 and IPv6 packets.
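A proof of concept along these lines could be sketched as follows. This is an illustration rather than the project's script: the port number and function names are hypothetical, Scapy must be installed, and sending or sniffing packets normally requires administrator privileges.

```python
import itertools
from typing import Iterable, List, Tuple

PORT = 9999  # hypothetical UDP port, not taken from the project


def alternate(messages: Iterable[bytes]) -> List[Tuple[str, bytes]]:
    """Pair each message with the IP version that should carry it."""
    return list(zip(itertools.cycle(("ipv4", "ipv6")), messages))


def build_packet(version: str, payload: bytes, v4_dst: str, v6_dst: str):
    """Build a UDP packet over the requested IP version."""
    # Imported here so the helper above can be used without Scapy installed.
    from scapy.all import IP, IPv6, UDP, Raw
    network = IP(dst=v4_dst) if version == "ipv4" else IPv6(dst=v6_dst)
    return network / UDP(dport=PORT) / Raw(load=payload)


def send_messages(messages: Iterable[bytes], v4_dst: str, v6_dst: str) -> None:
    """Client side: send messages, alternating between IPv4 and IPv6."""
    from scapy.all import send
    for version, payload in alternate(messages):
        send(build_packet(version, payload, v4_dst, v6_dst), verbose=False)


def receive(count: int) -> List[bytes]:
    """Server side: collect UDP payloads regardless of IP version."""
    from scapy.all import Raw, sniff
    payloads: List[bytes] = []

    def handle(packet) -> None:
        if Raw in packet:
            payloads.append(bytes(packet[Raw].load))

    # The capture filter keyword "udp" matches UDP carried over IPv4 or IPv6.
    sniff(filter=f"udp and dst port {PORT}", prn=handle, store=False, count=count)
    return payloads
```

An interactive client would wrap `send_messages` in a loop around `input()`, passing the server's IPv4 and IPv6 addresses. The server needs only the single filter expression, which is what keeps it independent of the IP version.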
Analysis of Data
The end goal of the project is to develop real-time solutions for detection and reconstruction of the hybrid session model. Understanding how the attack is formed is only one part of this process; the other is classifying the attack and developing analysis techniques.
Pattern Matching: Keyword Search
A Keyword Search is a simple Pattern Matching technique in which a data stream is searched to see if it contains a ‘keyword’. This analysis technique aims to determine, from the keywords found in a stream, the type of file that the stream carries. It is also expected that in some cases the Keyword Search could identify which two streams, from a set of streams, are associated with each other.
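A minimal sketch of such a search is given below. The keyword table is an assumption made for illustration: it uses the well-known start and end markers of PDF and PNG files, and the pairing rule (one stream holds the start marker, another holds the end marker) is only one possible way of associating streams.

```python
from typing import Dict, List, Tuple

# Assumed keyword table: (start marker, end marker) per file type.
MARKERS: Dict[str, Tuple[bytes, bytes]] = {
    "pdf": (b"%PDF-", b"%%EOF"),
    "png": (b"\x89PNG\r\n\x1a\n", b"IEND"),
}


def keyword_hits(stream: bytes) -> Dict[str, List[str]]:
    """Report which start/end keywords of each file type occur in a stream."""
    hits: Dict[str, List[str]] = {}
    for file_type, (start, end) in MARKERS.items():
        found = [name for name, keyword in (("start", start), ("end", end))
                 if keyword in stream]
        if found:
            hits[file_type] = found
    return hits


def associate(streams: Dict[str, bytes]) -> List[Tuple[str, str, str]]:
    """Pair a stream holding a start keyword with a different stream holding the end."""
    pairs = []
    for file_type in MARKERS:
        starts = [n for n, s in streams.items()
                  if "start" in keyword_hits(s).get(file_type, [])]
        ends = [n for n, s in streams.items()
                if "end" in keyword_hits(s).get(file_type, [])]
        pairs.extend((file_type, a, b) for a in starts for b in ends if a != b)
    return pairs


streams = {
    "v4": b"%PDF-1.4 first half of the file",
    "v6": b"second half of the file %%EOF",
    "web": b"GET / HTTP/1.1",
}
print(associate(streams))  # [('pdf', 'v4', 'v6')]
```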
Information Theory: Lempel-Ziv 78
Lempel-Ziv 78 (LZ78) is an algorithm that builds a dictionary of previously seen phrases as it compresses data. It is proposed that data samples originating from the same source will have similar dictionaries. Thus, by aggregating the payload of a session and building its dictionary, a dictionary comparison could be used to associate the session with its corresponding IP session. The streams with the highest number of matching dictionary entries can be associated as complementary hybrid sessions.
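The dictionary comparison could be sketched as follows. Only the dictionary-building step of LZ78 is shown (no compressed output is produced), and the similarity measure, a simple count of shared phrases, and the function names are assumptions made here for illustration.

```python
from typing import Dict, Set, Tuple


def lz78_dictionary(data: bytes) -> Set[bytes]:
    """Build the set of phrases LZ78 would add to its dictionary for this data."""
    phrases: Set[bytes] = set()
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in phrases:
            current = candidate      # keep extending a known phrase
        else:
            phrases.add(candidate)   # new phrase: record it and start over
            current = b""
    return phrases


def shared_entries(a: bytes, b: bytes) -> int:
    """Count the dictionary entries that two payloads have in common."""
    return len(lz78_dictionary(a) & lz78_dictionary(b))


def best_match(name: str, streams: Dict[str, bytes]) -> Tuple[str, int]:
    """Return the other stream sharing the most dictionary entries with `name`."""
    scores = {other: shared_entries(streams[name], payload)
              for other, payload in streams.items() if other != name}
    winner = max(scores, key=scores.get)
    return winner, scores[winner]


streams = {"v4": b"abababab", "v6": b"babababa", "other": b"xyzxyzxy"}
print(best_match("v4", streams))  # ('v6', 2)
```

In practice the aggregated session payloads would be far longer than these toy strings, and a normalised score may be preferable to a raw count when the streams differ greatly in length.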
Study Tour
A major portion of the learning and development from this project came from a three-week study tour of Estonia. This tour provided many unique and valuable opportunities that assisted in the development of both the project and the project group members. One of these opportunities was the time spent working toward the project goals with the project’s supervisors and mentors, who are experts in the cyber-security field. The project group was also able to immerse itself in the culture of Estonia and see first-hand how this culture, often in stark contrast to Australia’s, has led to the embracing of computing technologies, and what advantages this has brought to the country and its people.