YARA

GitHub - VirusTotal/yara: The pattern matching swiss knife

YARA is a pattern-matching tool used by malware analysts and incident responders to identify and classify malicious files. Analysts write rules that describe specific characteristics of malware, such as unique strings, byte sequences, or structural traits, and YARA matches those rules against files, memory, or network artifacts. Often described as a “pattern-matching engine for malware,” it is widely used in digital forensics, threat hunting, and malware research to detect both known threats and variants of previously analyzed samples. Because rules describe what a sample contains rather than its file name or hash, a well-written rule can still match after a sample has been renamed or slightly modified. Defenders can write their own rules, shaping what counts as “malicious” for their environment and threat model. Many YARA rules have also been written and shared by defenders who faced similar threats; these community rules can be reused, adapted, and improved to strengthen detection against evolving adversaries.

USE CASES

Post-incident analysis: Verify whether malware identified on one compromised host exists elsewhere in the environment.
Threat hunting: Proactively search systems and endpoints for indicators of known or related malware families.
Intelligence-based scans: Use shared YARA rules from other defenders to detect new or emerging indicators of compromise.
Memory analysis: Scan memory dumps and active processes to identify malicious code that may not exist on disk.

ADVANTAGES

Speed: Rapidly scans large volumes of files, memory, or systems to identify suspicious artifacts.
Flexibility: Detects simple text strings, binary patterns, and complex logical conditions.
Control: Allows analysts to precisely define what they consider malicious behavior.
Shareability: Rules can be shared, reused, and improved by other defenders across teams and organizations.
Visibility: Helps correlate scattered indicators into a clear, coherent picture of an attack.

YARA RULE KEY ELEMENTS

Metadata: Descriptive information about the rule, such as the author, creation date, and intended purpose.
Strings: The indicators YARA searches for, including text strings, byte sequences, or regular expressions associated with suspicious content.
- These represent the signatures of malicious activity: fragments of text, bytes, or patterns that can reveal the presence of malicious code.
  - Text strings are the simplest and most commonly used type; the STRING TYPES section describes them and their modifiers.
Conditions: The logical statements that determine when a rule triggers, combining strings, counts, or other parameters into a final match decision.

rule TBFC_KingMalhare_Trace
{
    meta:
        author = "Defender of SOC-mas"
        description = "Detects traces of King Malhare’s malware"
        date = "2025-10-10"
    strings:
        $s1 = "rundll32.exe" fullword ascii
        $s2 = "msvcrt.dll" fullword wide
        $url1 = /http:\/\/.*malhare.*/ nocase
    condition:
        any of them
}

STRING TYPES

TEXT STRINGS

These are the simplest and most commonly used elements in YARA rules. They represent words or short text fragments that may appear in a file, script, or memory region. By default, YARA treats text strings as ASCII and case-sensitive, but their behavior can be customized using string modifiers—small keywords added to the string definition that control how the match is performed.

rule TBFC_KingMalhare_Trace
{
    strings:
        $TBFC_string = "Christmas"

    condition:
        $TBFC_string 
}

Text string modifiers make matching more flexible by adding options such as case-insensitivity, wide-character support, XOR, or Base64 encoding, which helps detect variations or simple obfuscation in malicious content.

CASE-INSENSITIVE STRINGS: NOCASE

By default, YARA matches text exactly as written. Adding the nocase modifier makes the match ignore letter casing.

strings:
    $xmas = "Christmas" nocase

WIDE CHARACTER STRINGS: WIDE ASCII

Many Windows executables store strings using two-byte UTF-16LE (Unicode) encoding. The wide modifier in YARA instructs the engine to search for this format, while ascii enforces a single-byte ASCII search. Both modifiers can be used together to match either encoding.

strings:
    $xmas = "Christmas" wide ascii
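
To make the two encodings concrete, here is a minimal Python sketch using only the standard library. It imitates the byte sequences the ascii and wide modifiers look for and is not YARA's own implementation; the sample buffer is a made-up example.

```python
# Byte forms that the `ascii` and `wide` modifiers look for (illustrative only).
text = "Christmas"

ascii_form = text.encode("ascii")      # one byte per character
wide_form = text.encode("utf-16-le")   # each character followed by a 0x00 byte

# Hypothetical buffer: the string as it might be stored inside a Windows binary.
sample = b"\x00\x01" + wide_form + b"\xff"

print(ascii_form.hex(" "))
print(wide_form.hex(" "))
print(ascii_form in sample)   # the single-byte form is not in this buffer
print(wide_form in sample)    # the two-byte form is
```

A rule with only ascii would miss this buffer, which is why the two modifiers are often combined.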

XOR STRINGS: XOR

When the xor modifier is used, YARA automatically tests all possible single-byte XOR keys against the string, helping reveal strings that attackers attempted to conceal using simple XOR obfuscation.

strings:
    $hidden = "Malhare" xor
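
The idea behind single-byte XOR matching can be sketched in a few lines of Python. This is an illustration of the technique under simple assumptions, not how YARA implements it; the helper names, the key 0x5A, and the sample buffer are all hypothetical.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def find_xor_key(needle: bytes, haystack: bytes):
    """Return the first single-byte key under which needle appears, else None."""
    for key in range(256):
        if xor_bytes(needle, key) in haystack:
            return key
    return None


# Hypothetical sample: "Malhare" obfuscated with key 0x5A, surrounded by filler.
sample = b"\x90\x90" + xor_bytes(b"Malhare", 0x5A) + b"\x90\x90"

key = find_xor_key(b"Malhare", sample)
print(hex(key))
```

Trying 256 keys per string is cheap, but it only covers single-byte keys; multi-byte or rolling XOR keys need a different approach.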

BASE64 STRINGS: BASE64, BASE64WIDE

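Some malware encodes payloads or commands using Base64. The base64 modifier makes YARA search for the Base64-encoded forms of the string (the encoded text differs depending on where the string falls within the encoded data), so the pattern is detected even when it is hidden in encoded form. The base64wide modifier does the same when the encoded text is stored as wide characters.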

strings:
    $b64 = "SOC-mas" base64
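
Why one string has several encoded forms can be shown with a short Python sketch. Base64 works on 3-byte groups, so the encoded text depends on whether the string starts at byte 0, 1, or 2 of a group. The function below is an assumption-laden illustration of that idea (the name base64_variants and the sample command are hypothetical), not YARA's code.

```python
import base64


def base64_variants(s: bytes) -> list:
    """Return the stable Base64 fragment of s for each of the three alignments."""
    variants = []
    for offset in range(3):
        encoded = base64.b64encode(b"\x00" * offset + s)
        total = offset + len(s)
        # Leading characters that also carry bits of the preceding bytes.
        skip_front = (0, 2, 3)[offset]
        # Characters fully determined by s; later ones depend on following bytes.
        stable_end = (total // 3) * 4 + (0, 1, 2)[total % 3]
        variants.append(encoded[skip_front:stable_end])
    return variants


variants = base64_variants(b"SOC-mas")

# Hypothetical encoded command with the string at an arbitrary position.
encoded = base64.b64encode(b"echo SOC-mas is coming")
print(variants)
print(any(v in encoded for v in variants))
```

Searching for all three fragments finds the string wherever it sits inside the encoded data, without decoding anything.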

HEXADECIMAL STRINGS

Hex strings allow YARA to search for specific byte patterns written in hexadecimal notation. This is especially useful for detecting malware artifacts such as file headers, shellcode, or binary signatures that cannot be reliably represented as plain text.

rule TBFC_Malhare_HexDetect
{
    strings:
        $mz = { 4D 5A 90 00 }   // MZ header of a Windows executable
        $hex_string = { E3 41 ?? C8 A? FB }   // ?? = any byte, A? = any low nibble (illustrative bytes)

    condition:
        $mz and $hex_string
}
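
Wildcard matching in hex strings can be pictured as a value and a mask per byte. The Python sketch below is a simplified model of that idea, assuming only full-byte (??) and nibble (A? or ?B) wildcards; the function names and the sample bytes are hypothetical, and jumps or alternatives are not covered.

```python
def parse_hex_pattern(pattern: str) -> list:
    """Turn 'E3 41 ?? C8 A?' into a list of (value, mask) pairs, one per byte."""
    pairs = []
    for token in pattern.split():
        value = mask = 0
        for ch in token:
            value <<= 4
            mask <<= 4
            if ch != "?":
                value |= int(ch, 16)   # raises ValueError for non-hex digits
                mask |= 0xF
        pairs.append((value, mask))
    return pairs


def find_hex_pattern(pattern: str, data: bytes) -> int:
    """Return the first offset where the pattern matches, or -1."""
    pairs = parse_hex_pattern(pattern)
    for start in range(len(data) - len(pairs) + 1):
        if all((data[start + i] & mask) == value
               for i, (value, mask) in enumerate(pairs)):
            return start
    return -1


# Hypothetical sample: an MZ header, padding, then the fragment of interest.
data = bytes.fromhex("4d5a9000") + b"\x00" * 4 + bytes.fromhex("e3417fc8a1fb")

print(find_hex_pattern("4D 5A 90 00", data))
print(find_hex_pattern("E3 41 ?? C8 A? FB", data))
```

Because only the digits 0-9 and A-F are valid, a token such as G? is rejected by the parser instead of being treated as a wildcard.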

REGULAR EXPRESSION STRINGS

Regular expressions allow defenders to define flexible search patterns that can match multiple variations of the same malicious string. They are especially useful for detecting URLs, encoded commands, or filenames that share a common structure but vary slightly between samples. Regex strings are powerful but should be used carefully; they can match a wide range of data and may slow down scans if written too broadly.

rule TBFC_Malhare_RegexDetect
{
    strings:
        $url = /http:\/\/.*malhare.*/ nocase
        $cmd = /powershell.*-enc\s+[A-Za-z0-9+\/=]+/ nocase

    condition:
        $url and $cmd
}
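
Before putting a regex into a rule, it can help to try it against known-good and known-bad samples. The Python sketch below does that with the standard re module; Python's regex dialect is close to but not the same as YARA's, and both sample strings (including the domain) are made up for illustration.

```python
import re

# Python equivalents of the two patterns; "/" needs no escaping in Python's re.
url = re.compile(rb"http://.*malhare.*", re.IGNORECASE)
cmd = re.compile(rb"powershell.*-enc\s+[A-Za-z0-9+/=]+", re.IGNORECASE)

# Hypothetical samples: one suspicious command line and one harmless one.
samples = {
    "dropper": b"PowerShell.exe -NoP -enc SQBFAFgA http://cdn.MALHARE.example/x",
    "benign": b"powershell Get-Process; visit https://example.org",
}

# Mirror the rule's condition: both patterns must match.
results = {
    name: bool(url.search(data)) and bool(cmd.search(data))
    for name, data in samples.items()
}
print(results)
```

Checking a benign sample alongside the suspicious one is a quick way to notice when a pattern such as .* has become too broad.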

CONDITION SECTION

The condition section is the heart of every YARA rule. It defines when a rule should trigger by evaluating the results of all string checks and other rule components. Think of it as the final decision point—the moment when YARA determines whether the scanned file or memory matches the rule.

MATCH A SIMPLE STRING

This is the simplest condition: the rule triggers if one specific string is found.

condition:
    $xmas

MATCH ANY STRING

When multiple strings are defined, the rule can be configured to trigger as soon as any one of them is found. This approach is useful for detecting early signs of compromise; even a single matching clue can be enough to raise attention.

condition:
    any of them

MATCH ALL STRINGS

To make the rule stricter, defenders can require that all defined strings appear together. This approach reduces false positives; YARA will only flag a file if every indicator matches.

condition:
    all of them

WITH LOGIC: AND, OR, NOT

Logical operators let defenders combine multiple checks into one condition.

condition:
    ($s1 or $s2) and not $benign
    
 * the rule will trigger if either $s1 or $s2 is found, but not $benign. In other 
   words: detect suspicious code, but ignore harmless system files.

WITH COMPARISONS:

YARA can check file properties, not just contents. For example, defenders can detect files that are unusually small or large, a common trick used by threat actors to disguise payloads.

condition:
    any of them and (filesize < 700KB)
 * the rule will trigger only when one of the strings matches and the file size is 
   smaller than 700KB
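
The condition forms above reduce to ordinary boolean logic over "which strings matched" plus file properties. The Python sketch below models that under simple assumptions; the evaluate function and the example inputs are hypothetical and only mirror the four conditions shown in this section.

```python
KB = 1024  # YARA's KB suffix multiplies the number by 1024


def evaluate(matches: dict, filesize: int) -> dict:
    """Evaluate the example conditions for one scanned file.

    matches maps a string identifier to True if it was found in the file.
    """
    return {
        "any of them": any(matches.values()),
        "all of them": all(matches.values()),
        "($s1 or $s2) and not $benign":
            (matches["$s1"] or matches["$s2"]) and not matches["$benign"],
        "any of them and (filesize < 700KB)":
            any(matches.values()) and filesize < 700 * KB,
    }


# Hypothetical scan result: only $s1 was found in a 500 KB file.
verdicts = evaluate({"$s1": True, "$s2": False, "$benign": False}, 500 * KB)
for condition, triggered in verdicts.items():
    print(condition, "->", triggered)
```

Stricter conditions trade recall for precision: here "all of them" stays false because only one of the three strings was found.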

EXAMPLE

PS C:\> notepad icedid_starter.yar

rule TBFC_Simple_MZ_Detect
{
    meta:
        author = "TBFC SOC L2"
        description = "IcedID Rule"
        date = "2025-10-10"
        confidence = "low"

    strings:
        $mz   = { 4D 5A }                        // "MZ" header (PE file)
        $hex1 = { 48 8B ?? ?? 48 89 }            // malicious binary fragment
        $s1   = "malhare" nocase                 // story / IOC string

    condition:
        all of them and filesize < 10485760     // < 10MB size
}

PS C:\> yara -r icedid_starter.yar C:\
 TBFC_Simple_MZ_Detect  C:\Users\WarevilleElf\AppData\Roaming\TBFC_Presents\malhare_gift_loader.exe

 * -r - Allows YARA to scan directories recursively and follow symlinks
 * -s - Prints the strings found within files that match the rule
 
 * this YARA command recursively scans the C:\ drive using the icedid_starter.yar 
   rule file and can optionally display the matching strings when using -s.
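
To see what the recursive scan and the rule's condition amount to, here is a rough Python emulation using only the standard library. It is a teaching sketch, not a replacement for YARA: the function names and the temporary test files are hypothetical, and unlike the -r behavior noted above, os.walk does not follow directory symlinks by default.

```python
import os
import re
import tempfile

MAX_SIZE = 10485760  # same limit as the rule's filesize check
HEX1 = re.compile(rb"\x48\x8b..\x48\x89", re.DOTALL)  # { 48 8B ?? ?? 48 89 }


def matches_rule(data: bytes) -> bool:
    """Mimic `all of them and filesize < 10485760` for the three strings."""
    return (
        b"MZ" in data                      # $mz (anywhere in the file, as written)
        and HEX1.search(data) is not None  # $hex1
        and b"malhare" in data.lower()     # $s1 with nocase
        and len(data) < MAX_SIZE
    )


def scan(root: str) -> list:
    """Walk root recursively and return the paths whose contents match."""
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as handle:
                    # Reading MAX_SIZE bytes is enough: a full read means the
                    # file is at or over the limit and fails the size check.
                    data = handle.read(MAX_SIZE)
            except OSError:
                continue  # unreadable files are skipped
            if matches_rule(data):
                hits.append(path)
    return hits


# Hypothetical test tree: one file with all three indicators, one with only one.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "TBFC_Presents")
    os.makedirs(nested)
    with open(os.path.join(nested, "loader.bin"), "wb") as f:
        f.write(b"MZ\x90\x00" + b"\x48\x8b\x05\x10\x48\x89" + b"MalHare")
    with open(os.path.join(root, "notes.txt"), "wb") as f:
        f.write(b"malhare was here")
    hits = scan(root)
    print([os.path.relpath(p, root) for p in hits])
```

Only the file that satisfies every part of the condition is reported, which matches the behavior of `all of them` in the rule.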

