In this story, we will learn what malware is and how malware analysis works. It is aimed at complete beginners who have never tried malware analysis before. If that’s you, this article can help you take your first step in this field.
What is Malware?
Malware (short for malicious software) is any program designed to harm a host, or to act on it, without the consent of the user.
Although we tend to use the words virus and malware interchangeably, they are not the same thing. Malware is a broad term that includes, in addition to Viruses, other types of malicious programs like Worms and Trojans.
Types of Malware
Malware can take many forms and comes in many variations. To keep this post short, I have listed only the most common malware types that you should know about.
- Virus: A Virus attaches itself to other files or programs and requires human intervention to propagate to other machines. Think of this intervention as a user running an infected program downloaded from a website or opening an infected attachment from a phishing email.
- Worm: Unlike Viruses, Worms do not need the help of humans to move to other machines. They can spread easily and can infect a high number of machines in a short amount of time.
- Trojan: These appear to be normal programs that have a legitimate function, like a game or a utility program. But underneath the innocent-looking user interface, a Trojan performs malicious tasks without the user being aware.
- Spyware: This type of malware gathers data about the user and sends it to a third party.
- Keylogger: This is a special type of spyware. It specializes in recording the keystrokes made by the user.
- Ransomware: This type of malware has become more common in the last decade. When a piece of Ransomware infects a machine, it encrypts the files stored on it. It then asks the user for a ransom in exchange for the decryption key. Well-known examples of Ransomware are WannaCry and Locky.
It is common for malware to be classified as more than one type. For example, you can have Ransomware that is at the same time a Virus and a Trojan.
What is Malware Analysis?
Malware Analysis is the field of examining malware samples to try to extract valuable information about their origin, behavior, and impact.
The person who conducts these activities is called a malware analyst. Malware analysts are generally involved in digital forensics and incident response, and they play a major role in helping organizations recover from malware infections.
More skilled malware analysts can also help organizations avoid getting infected in the first place. They are knowledgeable about the threat landscape and follow global malware trends. This allows them to anticipate which malware is most likely to hit the organization in the near future.
Malware Analysis Techniques
Now that we know what malware is, and what malware analysts do, it is time to explore some of the techniques of malware analysis.
There are many ways to approach malware, and the techniques presented here are only the tip of the iceberg. I have included a list of resources at the end of this article that you can refer to for more in-depth learning.
Static Analysis
You don’t have to execute a piece of malware to analyze it. By performing what is known as Static Analysis, you can get some valuable information simply by examining the static information associated with the file.
Here are some examples of valuable information that we can extract using static analysis.
1- File Headers
Depending on the target operating system, an executable malware file usually comes in one of two formats: Portable Executable (PE) or Executable and Linkable Format (ELF). The latter is used in Linux, whereas the former is the standard format used by Windows executable files.
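As a rough illustration, the two formats can be told apart by the first bytes of the file: ELF files start with the byte 0x7F followed by "ELF", and PE files start with a DOS header marked "MZ". The sketch below is only a first-pass guess, and the function name is made up for this example.

```python
def guess_executable_format(data: bytes) -> str:
    """Guess the executable format from the leading bytes of a file."""
    if data.startswith(b"\x7fELF"):
        return "ELF"
    if data.startswith(b"MZ"):
        # "MZ" marks the DOS header that every PE file begins with.
        # Plain DOS executables also start this way, so treat it as a hint.
        return "PE"
    return "unknown"
```

You would typically call it on the first few bytes read from a file opened in binary mode.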
Since malware targets Windows more often than Linux, you will encounter PE-based malware files more often than their ELF-based counterparts.
It would therefore be more rewarding to learn about the PE format first and to understand how you can retrieve useful information by examining certain sections of the file.
For example, by examining the PE header, you can find out which functions the malware imports from other libraries, or at what memory address program execution starts.
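To make this more concrete, here is a minimal sketch that reads a few header fields, including the entry point address, using only Python's standard library. The function name and the returned dictionary keys are my own choices for illustration. It does not walk the import table, which takes more work and is usually left to a dedicated PE parsing tool.

```python
import struct


def parse_pe_header(data: bytes) -> dict:
    """Read a few basic fields from the headers of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")

    # Offset 0x3C of the DOS header stores where the PE header begins.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("not a PE file: missing PE signature")

    # The 20-byte COFF file header follows the 4-byte PE signature.
    (machine, section_count, timestamp, _symbols_ptr, _symbol_count,
     _optional_size, _characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)

    # The optional header follows; AddressOfEntryPoint is at offset 16 in it.
    optional_offset = pe_offset + 24
    (magic,) = struct.unpack_from("<H", data, optional_offset)
    (entry_point,) = struct.unpack_from("<I", data, optional_offset + 16)

    return {
        "machine": machine,              # 0x14c = x86, 0x8664 = x64
        "sections": section_count,
        "timestamp": timestamp,          # link time, seconds since 1970
        "is_64bit": magic == 0x20B,      # 0x10b = PE32, 0x20b = PE32+
        "entry_point_rva": entry_point,  # relative to the image base
    }
```

The entry point is a relative virtual address: the loader adds it to the image base to get the address where execution actually starts.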
2- Hashes
A hash is a fixed-length string computed from an input. No matter the size of the input, the hash value always has the same length. With a well-designed hash function, it is also practically unique: two different inputs are extremely unlikely to produce the same value.
A hash is used to check the integrity of files. If the content of a file changes, its hash value changes as well.
By calculating the hash value of a file, we can check whether it is known malware by searching for that hash in a malware database such as VirusTotal.
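Here is a small sketch of how a file's hash could be computed with Python's standard hashlib module. SHA-256 is my choice for the example, and the function name is made up. The file is read in chunks so that large samples do not have to fit in memory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hash of a file as a hexadecimal string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting 64-character string is what you would paste into the search box of a malware database.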
3- Strings
Strings is a tool that you can use to extract readable text from a program file. It does this by searching for runs of consecutive printable characters, such as ASCII text, that reach a minimum length.
Very often, you will find interesting things using this tool, such as a hidden message or a domain name.
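The idea behind the tool can be sketched in a few lines of Python. This is not the real tool, only an approximation that looks for printable ASCII runs, and the default minimum length of 4 is an assumption made for the example.

```python
import re


def extract_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Return runs of printable ASCII characters found in binary data."""
    # Bytes 0x20 to 0x7E are the printable ASCII characters.
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]
```

For instance, the domain name in `b"\x00\x01hello\x00ab\x00evil.example.com\xff"` is recovered, while the two-character run "ab" is dropped for being too short.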
4- Code Analysis
Programs are executed as a series of machine instructions, each identified by an opcode (operation code). These binary instructions are generally displayed in hexadecimal. Computers can interpret them directly, but they are very hard for humans to read.
Disassembly is the process of translating these opcodes back into Assembly code. Although Assembly isn’t an easy language either, it is much more approachable than raw opcodes.
By performing disassembly, a malware analyst can peek into the instructions of the malware to understand what it does, where the malicious portions of the program are, and what hidden information they can retrieve.
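To give a feel for what a disassembler does, here is a toy sketch that translates a handful of 32-bit x86 opcodes into Assembly mnemonics. It knows only a few simple instructions and is in no way a replacement for a real disassembler, which has to handle the full instruction set. All names in it are my own.

```python
REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
SINGLE_BYTE = {0x90: "nop", 0xC3: "ret", 0xCC: "int3"}


def disassemble(code: bytes) -> list[str]:
    """Translate a tiny subset of 32-bit x86 machine code into Assembly."""
    lines = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if opcode in SINGLE_BYTE:
            lines.append(SINGLE_BYTE[opcode])
            i += 1
        elif 0x50 <= opcode <= 0x57:
            # push r32: the register is encoded in the low three bits.
            lines.append(f"push {REGISTERS[opcode - 0x50]}")
            i += 1
        elif 0x58 <= opcode <= 0x5F:
            lines.append(f"pop {REGISTERS[opcode - 0x58]}")
            i += 1
        elif 0xB8 <= opcode <= 0xBF:
            # mov r32, imm32: a 4-byte little-endian value follows the opcode.
            if i + 5 > len(code):
                raise ValueError("truncated mov instruction")
            value = int.from_bytes(code[i + 1:i + 5], "little")
            lines.append(f"mov {REGISTERS[opcode - 0xB8]}, {value:#x}")
            i += 5
        else:
            # Anything this sketch does not recognize is shown as a raw byte.
            lines.append(f"db {opcode:#04x}")
            i += 1
    return lines
```

Feeding it the bytes `55 B8 2A 00 00 00 5D C3` gives `push ebp`, `mov eax, 0x2a`, `pop ebp`, `ret`, which is far easier to follow than the hexadecimal.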
Another way to reverse engineer malware is to go one step further and use a Decompiler instead of a Disassembler. While a Disassembler outputs assembly code, a Decompiler attempts to reconstruct source code in a high-level language, which is friendlier and easier for humans to understand. The result is an approximation of the original source, not an exact copy.
Dynamic Analysis
Dynamic Analysis consists of executing the malware and examining its behavior while it is running.
This method is less safe than static analysis because you are willingly infecting a machine. It is good practice to perform it in a sandbox environment, such as a virtual machine, or even better, on a completely separate physical machine isolated from any network.
A debugger is a powerful tool that any malware analyst should know how to use. It allows you to follow the flow of the program as it executes, and provides useful features that give you better control over the execution of a program.
For example, you can set breakpoints on certain instructions where you want the execution to pause. You can also examine the contents of registers and specific memory addresses, and even better, you can modify their values while the program is running.
These are just small examples of the possibilities that a debugger provides.
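The breakpoint idea can be illustrated without touching real malware. The sketch below uses Python's sys.settrace hook to "pause" at a chosen line of a harmless Python function and record its local variables. This is only an analogy: debugging a real sample is done with a native debugger working on machine code, and both function names here are hypothetical.

```python
import sys


def run_with_breakpoint(func, line_offset, *args):
    """Run func(*args) and snapshot its locals whenever the chosen line is reached.

    line_offset is counted from the "def" line of func (which is offset 0).
    """
    code = func.__code__
    target_line = code.co_firstlineno + line_offset
    snapshots = []

    def local_trace(frame, event, arg):
        # A "line" event fires just before that line is executed.
        if event == "line" and frame.f_lineno == target_line:
            snapshots.append(dict(frame.f_locals))  # shallow copy of the locals
        return local_trace

    def global_trace(frame, event, arg):
        # Only follow frames that belong to the function under inspection.
        return local_trace if frame.f_code is code else None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, snapshots


def xor_decode(data, key):       # offset 0
    out = []                     # offset 1
    for byte in data:            # offset 2
        out.append(byte ^ key)   # offset 3
    return bytes(out)            # offset 4
```

Calling `run_with_breakpoint(xor_decode, 4, b"\x01\x02", 0xFF)` stops just before the return statement and captures `data`, `key`, `byte`, and `out`, much like inspecting registers and memory at a breakpoint.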
For Further Learning
This section provides a list of good resources that can help you on your journey to learn malware analysis:
- Introduction To Malware Analysis (by Lenny Zeltser, SANS Institute)
- Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software
- NIST SP 800-83: Guide to Malware Incident Prevention and Handling for Desktops and Laptops
- theZoo: a live malware repository (these are real malware samples, so make sure you only run them in a safe environment)