Virtual Honeypot Framework




A honeypot is a closely monitored network decoy serving several purposes: it can distract adversaries from more valuable machines on a network, provide early warning about new attack and exploitation trends, or allow in-depth examination of adversaries during and after exploitation of a honeypot. Deploying a physical honeypot is often time intensive and expensive as different operating systems require specialized hardware and every honeypot requires its own physical system. This paper presents Honeyd, a framework for virtual honeypots that simulates virtual computer systems at the network level. The simulated computer systems appear to run on unallocated network addresses. To deceive network fingerprinting tools, Honeyd simulates the networking stack of different operating systems and can provide arbitrary routing topologies and services for an arbitrary number of virtual systems. This paper discusses Honeyd’s design and shows how the Honeyd framework helps in many areas of system security, e.g. detecting and disabling worms, distracting adversaries, or preventing the spread of spam email.
Introduction
Internet security is increasing in importance as more and more business is conducted there. Yet, despite decades of research and experience, we are still unable to make secure computer systems or even measure their security.
As a result, exploitation of newly discovered vulnerabilities often catches us by surprise. Exploit automation and massive global scanning for vulnerabilities enable adversaries to compromise computer systems shortly after vulnerabilities become known.
One way to get early warnings of new vulnerabilities is to install and monitor computer systems on a network that we expect to be broken into. Every attempt to contact these systems via the network is suspect. We call such a system a honeypot. If a honeypot is compromised, we study the vulnerability that was used to compromise it. A honeypot may run any operating system and any number of services. The configured services determine the vectors an adversary may choose to compromise the system.
A physical honeypot is a real machine with its own IP address. A virtual honeypot is a simulated machine with modeled behaviors, one of which is the ability to respond to network traffic. Multiple virtual honeypots can be simulated on a single system.
Virtual honeypots are attractive because they require fewer computer systems, which reduces maintenance costs. Using virtual honeypots, it is possible to populate a network with hosts running numerous operating systems. To convince adversaries that a virtual honeypot is running a given operating system, we need to simulate the TCP/IP stack of the target operating system carefully, in order to deceive TCP/IP stack fingerprinting tools like Xprobe or Nmap.
This paper describes the design and implementation of Honeyd, a framework for virtual honeypots that simulates computer systems at the network level. Honeyd supports the IP protocol suite and responds to network requests for its virtual honeypots according to the services that are configured for each virtual honeypot. When sending a response packet, Honeyd’s personality engine makes it match the network behavior of the configured operating system personality.
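As an illustration of responding "according to the services that are configured", the following minimal Python sketch looks up the virtual honeypot bound to a destination address and decides how to answer a TCP connection request. It is not Honeyd's implementation: the `Template` class, the `tcp_response` function, the personality and service names, and the default "reset" action are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Template:
    """Hypothetical description of one virtual honeypot (not Honeyd's data model)."""
    personality: str  # operating system the honeypot should imitate
    tcp_services: Dict[int, str] = field(default_factory=dict)  # port -> service name
    default_tcp_action: str = "reset"  # assumed reply for ports without a service


def tcp_response(bindings: Dict[str, Template], dst_ip: str, dst_port: int) -> Optional[str]:
    """Decide how to answer a TCP connection request sent to dst_ip:dst_port."""
    template = bindings.get(dst_ip)
    if template is None:
        return None  # no virtual honeypot is bound to this address
    service = template.tcp_services.get(dst_port)
    if service is not None:
        return f"{template.personality}: hand connection to {service}"
    return f"{template.personality}: {template.default_tcp_action}"


# Two virtual honeypots on otherwise unallocated addresses (illustrative values).
bindings = {
    "10.0.0.1": Template("example-os-a", {80: "web"}),
    "10.0.0.2": Template("example-os-b", {25: "smtp"}, default_tcp_action="drop"),
}
print(tcp_response(bindings, "10.0.0.1", 80))
print(tcp_response(bindings, "10.0.0.2", 80))
```

In this sketch the personality is only carried along as a label; shaping the actual response packet to match the configured operating system is the job of the personality engine described above.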
To simulate real networks, Honeyd creates virtual networks that consist of arbitrary routing topologies with configurable link characteristics such as latency and packet loss. When network mapping tools like traceroute are used to probe the virtual network, they discover only the topologies simulated by Honeyd.
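To make the notion of configurable link characteristics concrete, the following Python sketch models a path through a virtual topology as a sequence of links, each with a latency and a loss probability. The `Link` class and the `traverse` function are hypothetical names introduced for this example; the sketch shows one simple way such characteristics could be applied to a packet and does not describe how Honeyd implements them.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Link:
    """Hypothetical virtual link between two simulated routers."""
    latency_ms: float  # delay added by this link
    loss: float        # probability in [0, 1] that the link drops a packet


def traverse(path: Sequence[Link], rng: random.Random) -> Optional[float]:
    """Return the accumulated latency in ms, or None if any link drops the packet."""
    total = 0.0
    for link in path:
        if rng.random() < link.loss:
            return None  # packet lost on this link
        total += link.latency_ms
    return total


# A two-hop path: a fast lossless link followed by a slower, lossy one.
path = [Link(latency_ms=10.0, loss=0.0), Link(latency_ms=45.0, loss=0.1)]
rng = random.Random(1)
delays = [traverse(path, rng) for _ in range(1000)]
delivered = [d for d in delays if d is not None]
print(len(delivered), "of", len(delays), "packets delivered")
```

A probing tool that measures round-trip times across such a path would observe the sum of the per-link delays and occasional missing replies, which is the behavior a simulated topology needs to reproduce.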
Our performance evaluation of Honeyd shows that a 1.1 GHz Pentium III can support 30 MBit/s aggregate bandwidth and that it can sustain over two thousand TCP transactions per second. The experimental evaluation of Honeyd verifies that fingerprinting tools are deceived by the simulated systems and shows that our virtual network topologies seem realistic to network mapping tools.
To demonstrate the power of the Honeyd framework, we show how it can be used in many areas of system security. For example, Honeyd can help with detecting and disabling worms, distracting adversaries, or preventing the spread of spam email.
The rest of this paper is organized as follows. Section presents background information on honeypots. In Section , we discuss the design and implementation of Honeyd. Section presents an evaluation of the Honeyd framework in which we analyze the performance of Honeyd and verify that fingerprinting and network mapping tools are deceived into reporting the specified system configurations. We describe how Honeyd can help to improve system security in Section and present related work in Section . We summarize and conclude in Section .
Honeypots
This section presents background information on honeypots and our terminology. We provide motivation for their use by comparing honeypots to network intrusion detection systems (NIDS). The amount of useful information provided by NIDS is decreasing in the face of ever more sophisticated evasion techniques [21,28] and an increasing number of protocols that employ encryption to protect network traffic from eavesdroppers. NIDS also suffer from high false positive rates that decrease their usefulness even further. Honeypots can help with some of these problems.
A honeypot is a closely monitored computing resource that we intend to be probed, attacked, or compromised. The value of a honeypot is determined by the information that we can obtain from it. Monitoring the data that enters and leaves a honeypot lets us gather information that is not available to NIDS. For example, we can log the keystrokes of an interactive session even if encryption is used to protect the network traffic. To detect malicious behavior, NIDS require signatures of known attacks and often fail to detect compromises that were unknown at the time they were deployed. On the other hand, honeypots can detect vulnerabilities that are not yet understood. For example, we can detect compromise by observing network traffic leaving the honeypot even if the means of the exploit has never been seen before.
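The last observation can be sketched in a few lines of Python. Assuming we have flow records that state which address opened each connection, any connection initiated by a honeypot address is a sign of compromise, and no attack signature is needed. The `Flow` record and the `suspicious_outbound` function are hypothetical names for this illustration, and the addresses are example values.

```python
from typing import Iterable, List, NamedTuple, Set


class Flow(NamedTuple):
    """Hypothetical flow record: who opened the connection and to whom."""
    initiator: str
    responder: str


def suspicious_outbound(flows: Iterable[Flow], honeypots: Set[str]) -> List[Flow]:
    """Return the flows that were opened by a honeypot address."""
    return [flow for flow in flows if flow.initiator in honeypots]


honeypots = {"10.0.0.1", "10.0.0.2"}
flows = [
    Flow("192.0.2.7", "10.0.0.1"),     # inbound probe: suspect, but expected
    Flow("10.0.0.1", "198.51.100.9"),  # honeypot opens a connection: likely compromised
]
for flow in suspicious_outbound(flows, honeypots):
    print("outbound connection from honeypot:", flow.initiator, "->", flow.responder)
```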
Because a honeypot has no production value, any attempt to contact it is suspicious. Consequently, forensic analysis of data collected from honeypots is less likely to lead to false positives than data collected by NIDS.
Honeypots can run any operating system and any number of services. The configured services determine the vectors available to an adversary for compromising or probing the system. A high-interaction honeypot simulates all aspects of an operating system. A low-interaction honeypot simulates only some parts, for example the network stack [24]. A high-interaction honeypot can be compromised completely, allowing an adversary to gain full access to the system and use it to launch further network attacks. In contrast, low-interaction honeypots simulate only services that cannot be exploited to get complete access to the honeypot. Low-interaction honeypots are more limited, but they are useful to gather information at a higher level, e.g., learn about network probes or worm activity. They can also be used to analyze spammers or for active countermeasures against worms; see Section .
We also differentiate between physical and virtual honeypots. A physical honeypot is a real machine on the network with its own IP address. A virtual honeypot is simulated by another machine that responds to network traffic sent to the virtual honeypot.
When gathering information about network attacks or probes, the number of deployed honeypots influences the amount and accuracy of the collected data. A good example is measuring the activity of HTTP-based worms. We can identify these worms only after they complete a TCP handshake and send their payload. However, most of their connection requests will go unanswered because they contact randomly chosen IP addresses. A honeypot that is configured to function as a web server can capture the worm payload. The more honeypots we deploy, the more likely it is that one of them is contacted by a worm.
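The effect of the number of honeypots can be estimated with a short calculation. The Python sketch below assumes, purely for illustration, that a worm chooses each target independently and uniformly at random from the whole IPv4 address space; real worms may scan differently. Under that assumption, the probability that at least one of k probes reaches one of n honeypots is 1 - (1 - n/2^32)^k. The function name `contact_probability` is hypothetical.

```python
import math


def contact_probability(num_honeypots: int, num_probes: int,
                        address_space: int = 2 ** 32) -> float:
    """Probability that at least one uniformly random probe hits a honeypot."""
    if not 0 <= num_honeypots <= address_space:
        raise ValueError("num_honeypots must lie within the address space")
    if num_probes < 0:
        raise ValueError("num_probes must be non-negative")
    p = num_honeypots / address_space  # chance that a single probe hits a honeypot
    if p >= 1.0:
        return 1.0 if num_probes > 0 else 0.0
    # 1 - (1 - p)^k, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(num_probes * math.log1p(-p))


# Compare a single honeypot with a /24 and a /16 of virtual honeypots.
for n in (1, 256, 65536):
    print(n, "honeypots:", contact_probability(n, num_probes=100_000))
```

Because the per-probe hit probability is so small, the chance of observing a given worm grows almost linearly with the number of monitored addresses, which is why covering large address ranges matters.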
Physical honeypots are often high-interaction, allowing the system to be compromised completely, and they are expensive to install and maintain. For large address spaces, it is impractical or impossible to deploy a physical honeypot for each IP address. In that case, we need to deploy virtual honeypots.
