Log4j Analysis - What to Do Before the Next Big One

Over the past month, security teams have been scrambling to deal with the fallout from the Log4Shell vulnerability (CVE-2021-44228) announced in early December. Between blocking exploitation attempts and trying to determine vulnerable assets, it had already been a long winter for defenders. This vulnerability is particularly challenging as the Apache Log4j library has been used within so many different applications worldwide that it created an unusually large surface area for security teams to identify and defend. Now that the initial shock of the vulnerability is over, we wanted to answer some questions received during the exploit surge and identify a few preventative strategies that might help during future outbreaks.

What does scanning for Log4J look like now?

GreyNoise-log4j-chart-data-December-January
Figure 1: Log4j-related activity from December 10, 2021, to Jan 12, 2022. ‘Attributable’ activity describes individuals or organizations that voluntarily provided self-attribution while scanning for Log4j

As of January 2022, a month after initial CVE announcement, GreyNoise still observes a significant volume of traffic related to the Log4j vulnerability. This traffic is primarily composed of generic JNDI string exploit attempts with known obfuscations.

One of the interesting patterns we saw during the first few days of the Log4j “scan-and-exploit” outbreak was a huge surge in benign actors scanning for the vulnerability. The chart above shows Log4j-related activity broken down by scanners who provided attribution (generally benign scanning done by security firms, researchers, and academics) compared to non-attributed scanning (generally, malicious scanning by threat actors).

A huge part of the surge in scanning activity during the first days of the outbreak can be attributed to benign actors. Within the security community, there is significant discussion about the appropriateness of this scanning volume, as security teams further struggled with the alert volumes generated by this traffic during an emergent situation. It’s controversial enough that some in the security community are advocating blocking these types of scans.

Should I block the IPs that are scanning?

That depends. GreyNoise tracks internet noise caused by IPs scanning the entire internet, and classifies them as malicious, unknown, or benign based on their behavior and identity. For example, security vendors that scan the internet to identify vulnerable systems who voluntarily provide self-attribution are generally classified as benign. Other IP addresses that do opportunistic or unsolicited scanning, vuln checking, or exploitation are generally classified as malicious.

Note that organizations are not obligated to allow scanning of their network perimeter, regardless of GreyNoise classification. The value added by allowing or not blocking any IP seen by GreyNoise will vary depending on an organization’s threat model and security posture. The intended purpose of most benign traffic observed by GreyNoise is often to provide context, awareness, and added value to the IT and InfoSec community. However, any significant volume of unsolicited traffic, even that classified as benign by GreyNoise, may result in SOC alert fatigue and dangerous distraction during an active attack.

Does the GreyNoise tag capture the newest versions/latest associated vulnerabilities?

Mostly. The GreyNoise Log4J tag utilizes the presence of a JNDI format string within a packet’s body to tag IPs. The tag focuses on the core cause of the Log4j vulnerability, common to all the CVEs related to Log4j (CVE-2021-44228, CVE-2021-45046, CVE-2021-45105, CVE-2021-44832). As a result, the GreyNoise tag has no false positives and provides substantial coverage for relevant CVEs.

However, GreyNoise researchers have observed at least two examples of attempted Log4j exploits where the malicious string was base64 encoded in an application-specific parameter, allowing it to circumvent the GreyNoise tag.

See the following for more details: https://gist.github.com/nathanqthai/197b6084a05690fdebf96ed34ae84305#base64-encoded-into-parameter

Can I get payload data? Pcap?

Not usually. GreyNoise does not currently provide raw sensor data for operational security purposes, although we may do so in the future. The GreyNoise Visualizer and APIs do expose select User-agents and URI paths.

That said, due to the high variance of payloads observed at the peak of Log4j activity in December 2021, GreyNoise researchers elected to curate and publish a unified list of payload examples:

https://gist.github.com/nathanqthai/197b6084a05690fdebf96ed34ae84305#base64-encoded-into-parameter

What’s next?

Application-specific attacks leveraging Log4j vulnerabilities. This Apache Log4j vulnerability has been extremely challenging due to the ubiquity of the logging library's use. CVE-2021-44228 had an enormous impact and drew significant attention to how the Log4j library was used within applications worldwide. This attention resulted in several follow-on CVEs that bypassed the initial patch and used varied attack vectors (CVE-2021-45046, CVE-2021-45105, CVE-2021-44832). Log4j-related exploit activity may evolve as security researchers continue to scrutinize the library and its usage across various applications. For example, application-specific vulnerabilities like those discovered in H2 Database Console and VMware may become more prevalent. (https://portswigger.net/daily-swig/researchers-discover-log4j-like-flaw-in-h2-database-console, https://www.vmware.com/security/advisories/VMSA-2021-0028.html) At this time, GreyNoise has not observed any notable trends or upticks regarding application-specific Log4j payloads.

There are more servers on the internet than there is IPv4 space to assign each of these servers a unique address. In the case of the HTTP protocol, hundreds of servers may share a singular IP address and only be reachable when a specific host header is set as part of the connection request. Scoping out this much larger section of the internet in relation to Log4j is a non-trivial task that remains to be fully explored. It is also one of the reasons the cyber defense search engine “Onyphe” opted against scanning the entire internet for vulnerabilities related to Log4j and instead opted for a more targeted approach.

Stay tuned to GreyNoise to help identify exploit outbreaks

While things are not as bad as they were in December 2021, we do not envision Log4j scanners and attackers disappearing anytime soon. At GreyNoise, our goal is to help identify these kinds of outbreaks as fast as we possibly can in order to give security teams the time and breathing space they need to get their defenses in place.

You are always welcome to use the GreyNoise product to help you separate internet noise from threats as an unauthenticated user on our site. For additional functionality and IP search capacity, create your own GreyNoise Community (free) account today.