Techniques for Detecting Malware & Malicious Ad Code
How is malware in the advertising industry, known as malvertising, detected? What techniques and methods do professionals use to identify and isolate malicious code? What are some best practices for defending against malvertising attacks?
Detecting malvertising requires expertise and a keen eye that can review reams of data. The expert needs to strike a balance between maximum detection speed and minimum memory usage. There are a few main methods malware researchers utilize:
Signature analysis was one of the first methods used to detect malware, and it remains in use. The malware researcher scans and analyzes feeds of suspicious files (received from a particular company or a third-party source), looking for certain pieces of code or data known as "signatures." Code that repeats across files, or a file that matches a known signature, is a red flag, and the expert marks the file as suspicious.
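A minimal sketch of signature matching, assuming signatures are plain byte patterns searched for anywhere in a file. The signature names and patterns below are made up for illustration and do not come from any real signature database.

```python
# Hypothetical signature database: name -> byte pattern (illustrative only).
SIGNATURES = {
    "demo-forced-redirect": b"window.location='http://evil.example",
    "demo-eval-unescape": b"eval(unescape(",
}


def scan(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


sample = b"<script>eval(unescape('%61%6c%65%72%74'))</script>"
print(scan(sample))  # prints ['demo-eval-unescape']
```

Every new signature adds an entry to the dictionary, which is why a database built this way keeps growing.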
Checksum analysis is a modification of signature analysis, based on calculating CRC (Cyclic Redundancy Check) checksums. It was developed to compensate for the main disadvantages of the signature method: an incredibly large database and frequent false alarms.
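The idea can be sketched with Python's standard `zlib.crc32`. The assumption here is that the database stores only a 32-bit checksum of a fragment at a fixed offset, rather than the fragment itself; the offset, length, and sample bytes are hypothetical.

```python
import zlib

# Assumed fragment location inside the file (illustrative values).
FRAGMENT_OFFSET = 8
FRAGMENT_LENGTH = 16


def fragment_crc(data, offset=FRAGMENT_OFFSET, length=FRAGMENT_LENGTH):
    """CRC-32 of the fragment data[offset:offset + length]."""
    return zlib.crc32(data[offset:offset + length])


# A made-up "known bad" sample; only its 4-byte checksum is kept.
known_bad_sample = b"HEADER__" + b"malicious-payload" + b"trailer"
KNOWN_BAD_CRCS = {fragment_crc(known_bad_sample)}


def is_known_bad(data):
    """True if the file is long enough and its fragment checksum is listed."""
    if len(data) < FRAGMENT_OFFSET + FRAGMENT_LENGTH:
        return False
    return fragment_crc(data) in KNOWN_BAD_CRCS
```

Storing one integer per entry keeps the database small, and checking the fragment at a fixed position is one way such a scheme could reduce accidental matches elsewhere in a file.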
To circumvent these identification tactics, hackers often make their malicious ad campaigns polymorphic, which makes them more difficult to detect. The "body" of a polymorphic virus changes itself during replication, so it contains no constant search strings. However fast security teams work to identify a signature, this kind of malware has no constant fragment of virus-specific code to find. (Typically, polymorphism is achieved either by encrypting the main code of the virus with non-constant keys and random sets of decryption commands, or by changing the executable virus code itself.) Since variable code has no signature, other techniques must be used to detect it:
By using elements within the encrypted body of the virus, the researcher can "take" the encryption key out of the equation and obtain static code. The signature, or mask, is then revealed in that static code.
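One way to see how a key can be taken out of the equation is the simplest possible case, assumed here purely for illustration: the body is XOR-encrypted with a single constant byte. XOR-ing each encrypted byte with its neighbor cancels the key, leaving a sequence that is the same for every key and can be matched like an ordinary signature. Real polymorphic engines use more involved schemes.

```python
def xor_encrypt(data, key):
    """Encrypt (or decrypt) data with a single-byte XOR key."""
    return bytes(b ^ key for b in data)


def reduced_mask(data):
    """XOR each byte with its successor.

    (p[i] ^ k) ^ (p[i+1] ^ k) == p[i] ^ p[i+1], so a constant key cancels out.
    """
    return bytes(a ^ b for a, b in zip(data, data[1:]))


def contains_mask(sample, mask):
    """True if the key-independent mask occurs anywhere in the sample."""
    return mask in reduced_mask(sample)


body = b"virus-specific code"   # stand-in for the real virus body
mask = reduced_mask(body)       # computed once, valid for any key

infected = b"\x90\x90" + xor_encrypt(body, 0x77) + b"\x00"
print(contains_mask(infected, mask))  # prints True
```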
Known plaintext cryptanalysis
This method uses a system of equations to decode an encrypted virus body, much like the classical cryptographic problem of decoding an encoded text without the keys (with a couple of differences). The system first reconstructs the keys and the algorithm of the decrypting program, then decodes the encrypted virus body by applying this algorithm to the encoded fragment.
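As a hedged illustration, assume the body is encrypted with a repeating XOR key of known length, and that a fragment of the plaintext is known at a known offset. Each known byte then gives one equation, ciphertext = plaintext XOR key byte, which can be solved directly for the key. The helper names and sample data are hypothetical.

```python
def xor_crypt(data, key):
    """Encrypt or decrypt data with a repeating XOR key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def recover_key(ciphertext, known_plain, offset, key_len):
    """Solve c[i] = p[i] ^ key[i % key_len] for the key bytes.

    known_plain is the plaintext expected at ciphertext[offset:]; it must
    cover at least key_len bytes so every key position gets an equation.
    """
    if len(known_plain) < key_len:
        raise ValueError("known plaintext shorter than the key")
    key = bytearray(key_len)
    for i in range(key_len):
        pos = offset + i
        key[pos % key_len] = ciphertext[pos] ^ known_plain[i]
    return bytes(key)


body = b"MZ-stub:payload that changes per infection"
secret_key = b"\x13\x37\xc0\xde"
encrypted = xor_crypt(body, secret_key)

# The analyst knows that "stub:" appears at offset 3 of the decrypted body.
key = recover_key(encrypted, b"stub:", offset=3, key_len=4)
decoded = xor_crypt(encrypted, key)
```

Once the key is reconstructed, the decoded body is static again and can be checked with ordinary signature matching.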
The system can also analyze how frequently particular processor commands are used, and use this statistical information to decide whether the file is infected.
The malware researcher will also scan and analyze reams of data looking for suspicious activity and behavior. Here the researcher looks for code that is served in a suspicious pattern, for example to a thousand people in the space of five minutes. The researcher would note this and inspect further.
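A rough sketch of a frequency-based decision, assuming the file has already been disassembled into a list of opcode mnemonics. The baseline profile, the distance measure, and the threshold are all invented for illustration; a real system would derive them from large sets of clean and infected files.

```python
from collections import Counter


def opcode_frequencies(opcodes):
    """Relative frequency of each opcode in a non-empty sequence."""
    counts = Counter(opcodes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no opcodes to analyze")
    return {op: n / total for op, n in counts.items()}


def distance(freq_a, freq_b):
    """Half the sum of absolute frequency differences (0 = identical, 1 = disjoint)."""
    ops = set(freq_a) | set(freq_b)
    return 0.5 * sum(abs(freq_a.get(op, 0.0) - freq_b.get(op, 0.0)) for op in ops)


# Hypothetical profile of "typical" clean code.
BASELINE = opcode_frequencies(["mov"] * 50 + ["call"] * 20 + ["push"] * 20 + ["xor"] * 10)
THRESHOLD = 0.3  # assumed cut-off


def looks_suspicious(opcodes):
    """Flag code whose opcode profile is far from the baseline."""
    return distance(opcode_frequencies(opcodes), BASELINE) > THRESHOLD
```

In this toy profile, code dominated by `xor` and `loop` instructions, as a decryption loop might be, lands far from the baseline and gets flagged.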
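The "thousand people in five minutes" example can be sketched as a sliding-window counter over an impression log. The log format, a time-sorted sequence of (timestamp in seconds, creative id) pairs, is an assumption, and the limits simply mirror the example in the text rather than any recommended setting.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300    # five minutes
MAX_IMPRESSIONS = 1000  # a thousand servings


def find_bursts(events, window=WINDOW_SECONDS, limit=MAX_IMPRESSIONS):
    """Return creative ids served at least `limit` times within `window` seconds.

    `events` must be sorted by timestamp.
    """
    recent = defaultdict(deque)  # creative id -> timestamps inside the window
    flagged = set()
    for timestamp, creative in events:
        stamps = recent[creative]
        stamps.append(timestamp)
        # Drop impressions that fell out of the window.
        while timestamp - stamps[0] >= window:
            stamps.popleft()
        if len(stamps) >= limit:
            flagged.add(creative)
    return flagged
```

A flagged creative is not proof of malice, only a cue for the researcher to inspect it further.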
Confirming Suspicions: Phase 2
Once the anti-malvertising expert has identified code that is deemed suspicious, there are a couple of ways to confirm the suspicion. First, there are hubs of information where major security companies list the malicious code they have detected. These lists are a powerful resource for every security expert. Malware researchers can access them and run lookups for the suspicious code. If the code is already listed, the suspicion is confirmed.
If the suspicious code is not listed in the main hub, the researcher will use a technique called "emulation," which executes the file in a "virtual environment." The system emulates not only processor opcodes (operation codes), but also operating system calls. This mimicry allows the researcher to determine whether the code is indeed malicious.
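A lookup of this kind is often done by hash rather than by comparing whole files. The sketch below assumes the shared list has been downloaded into a local set of SHA-256 digests; the list contents are invented here, and no particular vendor's service or API is implied.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical local copy of a shared list of known-malicious samples.
known_malicious = {
    sha256_hex(b"previously reported malicious creative"),
}


def is_confirmed(sample, known=known_malicious):
    """True if the sample's digest already appears in the shared list."""
    return sha256_hex(sample) in known
```

An exact-hash lookup only confirms samples that are byte-for-byte identical to a listed one, which is one reason a second confirmation technique is needed.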
An interesting note is that when an emulator is used, the actions of every command must be constantly controlled. The researcher must prevent the program from executing its malicious intent.
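To make the idea of a controlled emulator concrete, here is a toy interpreter for an invented four-instruction language. It is not a real CPU or browser emulator: the instruction set and call names are hypothetical. It shows the two controls described above: every step is counted against a budget, and "system calls" are recorded instead of being executed.

```python
def emulate(program, max_steps=1000):
    """Run a toy program and return the system calls it attempted.

    Instructions are tuples: ("set", reg, value), ("add", reg, value),
    ("jmp", target_index), ("syscall", name, *args).
    """
    registers = {}
    intercepted = []
    pc = 0     # index of the next instruction
    steps = 0  # executed instructions, capped by max_steps
    while pc < len(program) and steps < max_steps:
        op, *args = program[pc]
        steps += 1
        if op == "set":
            registers[args[0]] = args[1]
        elif op == "add":
            registers[args[0]] = registers.get(args[0], 0) + args[1]
        elif op == "jmp":
            pc = args[0]
            continue
        elif op == "syscall":
            # Record the call; never perform it for real.
            intercepted.append((args[0], tuple(args[1:])))
        else:
            raise ValueError("unknown instruction: %r" % (op,))
        pc += 1
    return intercepted


DANGEROUS_CALLS = {"write_file", "open_url"}  # assumed policy


def is_malicious(program):
    """Verdict based on which calls the program tried to make."""
    return any(name in DANGEROUS_CALLS for name, _ in emulate(program))
```

The step budget is what keeps a deliberately endless loop from stalling the analysis, and intercepting calls is what prevents the sample from carrying out its intent.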
In practice, the researcher is looking to detect the malicious code as efficiently as possible. This boils down to choosing whichever method can be implemented with maximum speed and minimum memory usage.