Introduction

WEF is an implementation of automatic drive-by-download detection in a virtualized environment, developed by Thomas Müller, Benjamin Mack and Mehmet Arziman, three students from the Hochschule der Medien (HdM), Stuttgart, during the 2006 summer term. WEF can be used as an active HoneyNet, with a complete virtualization architecture underneath that allows rollbacks of compromised virtual machines.

Contacting WEF

If you have any further questions, thoughts or ideas, feel free to contact us.

Abstract

Much has been written about security vulnerabilities in Microsoft Internet Explorer and Mozilla Firefox. Some of these security threats are designed to execute malicious code in the browser. Known as remote code execution attacks, these threats typically exploit buffer overflows in an application. They are not limited to browsers: almost all services and applications that are part of the internet, or that use it as a communication platform, are affected.

We focus on internet browsers here because of two key problems. First of all, browsers are the primary user interface to the World Wide Web. Because the rendering engine transforms hypertext into a visual presentation for humans, every part of a web page has to be interpreted and further processed by the browser, which leads to a complex and error-prone architecture, especially with regard to mobile code (JavaScript, Java, ActiveX, XUL, etc.). Secondly, the browser is arguably the most frequently used program in the family of potentially vulnerable software. In contrast to server-side software, a browser is often used by non-technical users, many of whom neither understand the risks nor know possible countermeasures. Even experts are often exposed to the risk of an attack.

In view of this, our goal was to develop a system that automatically detects and identifies malicious websites.

In addition, this system would also be able to serve as a platform for other security and sandbox tests. One use case is automatically analyzing various kinds of malware in a secure and easily maintainable virtualized environment.

Introduction

We began by discussing some important questions and project requirements:

  • How should we define the expression “malicious” for our project?
  • What options are available for detecting malicious web content?
  • What design requirements are needed for an adequate system?

    Our project defines as “malicious” any web page that downloads, installs and executes malicious software (viruses, worms, Trojan horses, keyloggers, etc.) on a client. We concentrate on malicious software that installs without any user interaction, making it hard to identify even for advanced users (drive-by downloads).

    We limited our focus to web pages that actually exploit security bugs in the browser. At this stage, our objective was not to deal with web pages that trick the user or offer infected software as downloads. However, we have considered how we could add this functionality at a later date.

    To find a way to detect malicious websites, you have to put yourself in the position of an attacker. What are an attacker’s goals, and how can he or she achieve them? An attacker wants to compromise a user’s computer; to do this, he or she needs to change the state of the PC in some way. For instance, consider a typical scenario:

  • The attacker executes his or her own code (shellcode) with the help of a buffer overflow in the browser.
  • Since the functionality that fits into the small amount of code delivered through the buffer overflow is very limited, the attacker usually tries to download more code from the web and run it.


This small application is often called a “dropper” or “downloader”, since it downloads the actual malware onto the system and registers it, typically via registry entries, so that it is started on every subsequent system boot.
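In practice, this persistence step usually amounts to writing a value under one of the Windows autostart registry keys. The following sketch is not part of WEF; it is a minimal illustration that assumes a Windows client and uses Python's standard winreg module to enumerate the two most common “Run” keys, so that entries added by a dropper become visible.

    import winreg

    # Typical autostart locations a dropper writes to so that its payload
    # is started again on every boot (not an exhaustive list).
    RUN_KEYS = [
        (winreg.HKEY_LOCAL_MACHINE, r"Software\Microsoft\Windows\CurrentVersion\Run"),
        (winreg.HKEY_CURRENT_USER, r"Software\Microsoft\Windows\CurrentVersion\Run"),
    ]

    def list_autostart_entries():
        """Return (key path, value name, command line) for every autostart entry."""
        entries = []
        for hive, path in RUN_KEYS:
            try:
                key = winreg.OpenKey(hive, path)
            except OSError:
                continue  # key does not exist in this hive
            index = 0
            while True:
                try:
                    name, command, _type = winreg.EnumValue(key, index)
                except OSError:
                    break  # no more values under this key
                entries.append((path, name, command))
                index += 1
            winreg.CloseKey(key)
        return entries

    if __name__ == "__main__":
        for path, name, command in list_autostart_entries():
            print(path + "\\" + name + " -> " + str(command))

Comparing such a listing before and after a visit to a suspicious page is often already enough to expose a dropper of this kind.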

To discover such changes to a system, there are at least two different options:

  • Intrusion detection: determine the state of the system before and after a visit to a suspicious internet page and compare the two results. A list of all relevant files and registry entries, together with their checksums, can serve as the “state” of the system; new or modified files can then be detected easily (a sketch follows after this list). The key difficulties with this technique are the long delay between infection and detection as well as poor performance and scalability.
  • Rootkit technique: detect changes by modifying the operating system so that the relevant system calls can be monitored and evaluated. Such modifications are not officially supported by the operating system and require deeper interaction with the kernel. Because this procedure is also commonly used by rootkits, it is typically referred to as a rootkit technique itself.

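As an illustration of the first option, the following minimal sketch (hypothetical example code, not taken from WEF) represents the “state” of a directory tree as a mapping from file path to MD5 checksum and compares two such snapshots. A real implementation would also have to cover registry entries and the whole system drive, which is exactly where the delay, performance and scalability problems mentioned above come from.

    import hashlib
    import os

    def snapshot(root):
        """Map every readable file below `root` to the MD5 checksum of its content."""
        state = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    with open(path, "rb") as f:
                        state[path] = hashlib.md5(f.read()).hexdigest()
                except OSError:
                    pass  # locked or unreadable files are skipped in this sketch
        return state

    def diff(before, after):
        """Return files that were added and files whose content changed."""
        added = [p for p in after if p not in before]
        modified = [p for p in after if p in before and after[p] != before[p]]
        return added, modified

    # Usage: snapshot the system, let the browser visit the suspicious page,
    # snapshot again and compare -- slow, which is the main drawback noted above.
    # before = snapshot("C:\\")
    # ... visit the suspicious web page ...
    # after = snapshot("C:\\")
    # print(diff(before, after))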

We decided to use the rootkit technique due to its performance advantages.
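The core idea of the chosen technique is interception: the original system call is preserved, a monitoring wrapper is put in its place, and each call is recorded before being forwarded, so the monitored code notices nothing. Purely as a language-level analogy, and not as a description of WEF's actual kernel-level implementation, the following Python sketch hooks the built-in open function and records write attempts.

    import builtins

    _original_open = builtins.open   # keep a reference to the real call
    observed_writes = []             # log of files opened for writing

    def monitored_open(file, mode="r", *args, **kwargs):
        """Record write attempts, then delegate to the original function."""
        if any(flag in mode for flag in ("w", "a", "x", "+")):
            observed_writes.append(str(file))
        return _original_open(file, mode, *args, **kwargs)

    builtins.open = monitored_open   # install the hook

    # Any code that runs afterwards is monitored transparently:
    with open("demo.txt", "w") as f:
        f.write("hello")
    print(observed_writes)           # ['demo.txt']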

In addition, to be usable as a research platform, the system needs to satisfy the following requirements:

  • It should work automatically, requiring as little user interaction as possible.
  • It should be possible to control the system remotely, such as with a web interface.
  • It should be scalable and extensible.
  • It should be secure, with components to ensure the system itself cannot be infected by malicious websites.