Blog

Harness the Power of GreyNoise Integrations to Enhance Your Cybersecurity Posture

GreyNoise is a powerful cybersecurity solution that provides valuable context on internet-wide scan and attack data. By collecting and analyzing this data, we help organizations distinguish between targeted attacks and background noise, reducing false positives and improving security operations efficiency and overall security outcomes for every organization that uses our Visualizer or API. Today, we'll explore the GreyNoise integrations universe, discuss how these extensions can benefit every category of security tool and service, and explain why both vendor flexibility and community support are essential.

How Can All Cyber Tools/Solutions Benefit from GreyNoise API Integrations?

Cyber tools and solutions of every kind can greatly benefit from integrating with the GreyNoise API, even at the community tier. Here are a few ways that these tools can leverage GreyNoise data:

  1. Enrich Security Events: By integrating the GreyNoise API into security monitoring tools, users can gain additional context about security events, helping them prioritize threats and respond more effectively.
  2. Augment Commercial and OSINT Data: Commercial and open-source intelligence (OSINT) tools can benefit from GreyNoise data by providing users with additional insights into IP addresses and scanning activities, ultimately improving the quality of their intelligence.
  3. Enhance Vulnerability Management: By incorporating GreyNoise data into vulnerability management tools, users can better understand the risk associated with specific vulnerabilities and make more informed decisions about mitigation strategies.
  4. Optimize Incident Response: GreyNoise API integration with incident response tools can help streamline the investigation process by providing valuable context on potentially malicious activities, enabling faster and more accurate response efforts.
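As a rough illustration of the enrichment idea in the list above, the sketch below attaches context to alerts and assigns a triage priority. It does not call any real API: `LOOKUP` is a hypothetical stand-in for lookup results, and the field names (`noise`, `riot`, `classification`) and the priority policy are assumptions made for this example, so check them against the GreyNoise API documentation before relying on them.

```python
# Hypothetical stand-in for per-IP lookup results; the field names are
# assumptions for this sketch, not a documented response schema.
LOOKUP = {
    "198.51.100.7": {"noise": True, "riot": False, "classification": "malicious"},
    "203.0.113.9": {"noise": True, "riot": False, "classification": "benign"},
    "192.0.2.44": {"noise": False, "riot": True, "classification": "benign"},
}


def enrich_alert(alert: dict, lookup: dict) -> dict:
    """Return a copy of the alert with context and a triage priority attached."""
    context = lookup.get(alert["src_ip"])
    enriched = dict(alert)
    enriched["context"] = context
    if context is None:
        # Never seen as internet background noise: possibly targeted.
        enriched["priority"] = "high"
    elif context["riot"] or context["classification"] == "benign":
        # Known business service or benign scanner.
        enriched["priority"] = "low"
    else:
        # Opportunistic scanning or attack traffic seen internet-wide.
        enriched["priority"] = "medium"
    return enriched


alerts = [
    {"src_ip": "198.51.100.7", "rule": "ssh-bruteforce"},
    {"src_ip": "10.1.2.3", "rule": "ssh-bruteforce"},
]
for alert in alerts:
    print(enrich_alert(alert, LOOKUP))
```

The policy here (unknown source means look first) is only one possible choice; a real integration would tune it to its own alert sources.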
The Solar Systems Of The GreyNoise Integrations Universe
The visual is just meant to communicate group presence. There is no significance to the order of the "planets".
I mean, I still think Pluto is the best planet (and, it 100% is a planet), even though it's way out there.

The GreyNoise Integrations Universe

The GreyNoise integrations universe is vast and designed to support a variety of security tools and platforms. This includes Security Information and Event Management (SIEM), Extended Detection and Response (XDR), Security Orchestration, Automation, and Response (SOAR), Threat Intelligence Platforms (TIP), and Analyst Tools/Open-Source Intelligence (OSINT). You can see just how vast it is for yourself! By integrating GreyNoise into each of these solution areas, organizations can enrich their security alerts, enhance visibility, streamline operations, and make far more informed decisions.

The Importance of Vendor Flexibility

Vendor flexibility is crucial in the cybersecurity landscape. By supporting a wide range of security tools and platforms, GreyNoise empowers organizations to tailor their security ecosystem to meet their unique needs. According to buzzword-laden analysts and experts alike, adopting an open and flexible approach to security integrations enables organizations to leverage best-of-breed solutions, maximize the value of their existing tools, and enhance overall security posture. What that truly means, though, is that GreyNoise meets you where you are at, regardless of budget cycle. When that three- or five-year contract is up on that SIEM, rest assured that it’s more than likely GreyNoise will work with that fancy new tool you spotted at RSA. GreyNoise's commitment to vendor flexibility ensures seamless integration with your preferred tools, ultimately improving your organization's security capabilities.

Maximize the Benefits of GreyNoise Integrations

Integrating GreyNoise across your entire environment is highly beneficial. By incorporating GreyNoise data into all the acronyms you own — SIEM, XDR, SOAR, TIP, and OSINT — you can enrich your security alerts with valuable context, reduce false positives, improve incident response times, and centralize threat intelligence data. By leveraging GreyNoise's context across your systems and services, you can gain a more comprehensive understanding of your security landscape and make better-informed decisions to protect your organization.

Our API can help you home in on anomalies, cast a wider net when it comes to identifying and blocking malicious sources, and provide actionable context on today’s “internet weather”.

GreyNoise Community: A Hub for Open-Source Integrations

GreyNoise has a vibrant and thriving community of practitioners, with members that collaborate on developing open-source integrations with GreyNoise. These community-driven integrations enhance the functionality of a plethora of Free and Open-Source Software (FOSS) cyber tools, demonstrating the versatility and value of GreyNoise data.

Members of the GreyNoise Community invest their time and effort into integrating GreyNoise into open-source projects for several reasons:

  1. Improved Security Analysis: GreyNoise data adds valuable context to security events, allowing FOSS tools to provide more accurate and actionable insights.
  2. Reduced False Positives: By incorporating GreyNoise data, open-source projects can more effectively filter out internet background noise, reducing false positives and helping analysts focus on real threats.
  3. Knowledge Sharing and Collaboration: The GreyNoise Community encourages collaboration and knowledge sharing, fostering innovation and driving continuous improvement in open-source cybersecurity tools.
  4. Enhanced Threat Intelligence: Integrating GreyNoise into FOSS cyber tools can enrich threat intelligence data, empowering users to make better-informed decisions about potential threats.

If you're developing or using FOSS cyber tools, consider integrating the GreyNoise API to unlock the full potential of your security solutions. Join the GreyNoise Community and collaborate with like-minded individuals to improve the cybersecurity landscape and make the internet a safer place for everyone.

Integrate GreyNoise Today!

Are you ready to harness the power of GreyNoise? Sign up for a free GreyNoise account and start exploring the benefits of GreyNoise integrations. Experience firsthand how GreyNoise data can enhance your security operations and threat intelligence. For organizations looking to unlock the full potential of GreyNoise, navigate to your GreyNoise account section to sign up for a free enterprise trial and begin integrating GreyNoise into your security ecosystem today. Don't miss out on the opportunity to strengthen your organization's defenses with the power of GreyNoise.

Get Started With GreyNoise for Free

Work Smarter, Not Harder: How to Upgrade Your Threat Intel Program in 2023

Cyber threats are constantly evolving, and organizations need to stay on top of the latest techniques and tools to protect themselves against attacks. One of the most critical aspects of this is having an effective threat intel program in place. But how do you upgrade your program to keep up with the ever-changing threat landscape? Our answer: start looking for patterns in attack telemetry.

David Bianco’s ‘Pyramid of Pain’ illustrates the relationship between the types of indicators you might use to detect an adversary's activities and how much pain it will cause them when you are able to deny those indicators to them. Organizations can better identify and defend against threats by moving from simple indicators like domains, hashes, and IPs to harder-to-change indicators such as TTPs. While gaining this additional insight can take more time, it lets defenders do more to detect and prevent future attacks.

The Pyramid of Pain | Source: David Bianco

GreyNoise data is awesome, but in order to move from IPs -> TTPs, we have built new features to help you upgrade your Threat Intel program (thanks to the Pyramid of Pain)!

IP Similarity

It is now easier than ever to fingerprint attacker infrastructure. This new feature clusters activity based on similar behavior, like similar HASSH and JA3 fingerprints, rDNS, user agents, and ports scanned. Based on the results from IP Similarity, you can hunt within your own network to proactively find other related malicious activity.

GreyNoise IP Similarity Dashboard comparing HASSH Fingerprints of two IPs 71.6.199[.]23 and 89.248.172[.]16


IP Timeline

The IP Timeline displays a particular IP address's activity, as seen by GreyNoise sensors, over the past thirty days. By checking our timeline graph, you can see when an IP interacts with our sensors. This chronological data helps CTI teams identify whether an attacker is using an automated process or the scan/attack process is manual.

GreyNoise IP Timeline view for 41.65.223[.]220
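One way to reason about "automated versus manual" from timeline data is to look at how regular the gaps between sightings are. The sketch below is only an illustrative heuristic, not how GreyNoise classifies activity: the function names and the 0.1 threshold are assumptions chosen for the example.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def interval_regularity(timestamps: list[datetime]) -> float:
    """Coefficient of variation of the gaps between events (0 = perfectly regular)."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        raise ValueError("need at least three events spread over time")
    return pstdev(gaps) / mean(gaps)


def looks_automated(timestamps: list[datetime], threshold: float = 0.1) -> bool:
    """Treat very evenly spaced activity as likely scheduled or automated."""
    return interval_regularity(timestamps) < threshold


start = datetime(2023, 3, 1)
every_six_hours = [start + timedelta(hours=6 * i) for i in range(8)]
irregular = [start + timedelta(hours=h) for h in (0, 1, 31, 36)]

print(looks_automated(every_six_hours))
print(looks_automated(irregular))
```

Evenly spaced sightings score close to zero, while bursty, hand-driven activity produces a much larger value.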

Understanding how adversaries operate and adopting a defined strategy to detect and remediate can lead to a more effective threat intelligence program. GreyNoise can be used to easily enrich threat feeds to gain deeper insight into how attacker infrastructure is being used and quickly understand what services, devices, and vulnerabilities they want to leverage as part of their campaign.

If you are interested in learning more about any of these new features, request a demo.


Feature Preview: April Fools - The Making of GhostieBot

Co-Authors include: Austin Price, Jen Dooley

Introducing GhostieBot

For April Fools this year, the GreyNoise team created GhostieBot, an Artificial Unintelligence bot serving you all the answers you didn’t need. 

We had a lot of fun creating it and thought it was a good example of the ideation, design, development, and release process at GreyNoise. Here we hope to walk you through that process so you can understand a little better how we work.

The Process

Ideation

We knew we wanted to have a fun April Fools joke this year, but everyone was already working on a ton of projects to make GreyNoise a more useful product. We decided to take a group of volunteers and just grab an hour here or there to work the problem.

GreyNoise April Fools Tributes

Our first stop was a Figma brainstorming session. Just set up some space for everyone to jot down ideas, start a 5-10 minute timer, play some smooth jazz, and go to work.

Brainstorming

After all our ideas were gathered, we discussed them and added +1s to the ideas we liked. Since the world has been taken over by chatbots and large language models, we ultimately ended up with a GreyNoise chatbot that we could use to make jokes and expose some of the other ideas from the brainstorming session that weren’t big enough for their own show. Though you never know, the Internet Weather Report from the brainstorming session might pop up sometime…

Mocks

Next up, we had to create some mocks for what we wanted the GhostieBot interface to look like. Chatbots and messaging interfaces, in general, have some pretty established patterns. To keep things as simple and quick as possible, we leaned heavily on our design system and went with a standard chat/messaging layout. There were a few new elements in the design, like the message bubbles and Ghostie avatar, that we needed to create. We also needed to make sure it was responsive and handled small and large screens well. Altogether, these were pretty simple items, and we were able to have the finished mockups ready in under an hour.

Sourcing Responses

Since our chatbot is not a real chatbot, we had to actually come up with the responses we wanted, arguably a tougher task than creating a real chatbot. Luckily, we have a ton of nerds on staff who like terrible jokes. After spinning up a quick Notion page, we were able to crowdsource some ideas.

Bad joke central

Making it real

Then it was time to make it all real. We took the mock-ups, created a new page, and started building. We compiled the list of questions and formatted them for display, then built out the basic structure of a chat interface. Once that was set up, we added a few nice-to-haves:

  • “Enter” to submit instead of having to click “Submit”
  • Scroll offscreen gradient to add visual cues
  • Improved message timing so it felt like you were actually chatting with someone instead of instant replies.
  • “Ghosty is typing…” message based on response length
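The typing indicator in the last two items can be approximated by scaling a delay with the length of the reply. This is a minimal sketch in Python rather than the code GhostieBot actually uses; the function name, typing speed, and bounds are all assumptions for illustration.

```python
def typing_delay_seconds(
    response: str,
    chars_per_second: float = 40.0,
    minimum: float = 0.5,
    maximum: float = 4.0,
) -> float:
    """How long to show the typing indicator before revealing the reply."""
    raw = len(response) / chars_per_second
    # Clamp so short replies still feel typed and long ones do not stall.
    return max(minimum, min(maximum, raw))


print(typing_delay_seconds("Boo."))
print(typing_delay_seconds("x" * 80))
```

Clamping keeps one-word answers from appearing instantly and keeps long jokes from leaving the user waiting.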

Once the interface was completed, we hid the Chat behind a feature flag as well as set a date window for the chat to be available to the public. This allowed us to test the chat before it went live.
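The gating described above combines two conditions: a feature flag for internal testers and a date window for everyone else. A hedged sketch of that logic follows; the function name and the specific dates are hypothetical, and the real implementation may differ.

```python
from datetime import date


def chat_is_visible(
    today: date, flag_enabled: bool, window_start: date, window_end: date
) -> bool:
    """Flagged testers always see the chat; the public only inside the window."""
    return flag_enabled or (window_start <= today <= window_end)


# Hypothetical public window for illustration.
START = date(2023, 4, 1)
END = date(2023, 4, 2)

print(chat_is_visible(date(2023, 3, 28), True, START, END))   # tester before launch
print(chat_is_visible(date(2023, 3, 28), False, START, END))  # public before launch
```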

And while we went with a more informative page for 127.0.0.1, y’all almost ended up with:

Alternate GreyNoise Localhost Details

Recap

We had a ton of fun putting this all together, and we hope you enjoyed it too. To keep in touch with GreyNoise as we figure out how to build an amazing product for the cyber security community, sign up for a free account (https://viz.greynoise.io/signup), join our Slack community (greynoiseintel.slack.com) and follow us on Twitter (https://twitter.com/GreyNoiseIO). We also have a couple of positions open (https://www.greynoise.io/careers#Current-Openings).


GreyNoise Voluntary Product Accessibility Template

At GreyNoise, we're excited to announce that our Voluntary Product Accessibility Template (VPAT) is now available. We believe that everyone should have equal access to our product and service, regardless of their disabilities or abilities. By providing a document that evaluates our product's accessibility for people with disabilities, we are taking a step forward in ensuring that our product meets the needs of all users. We are committed to creating an environment that is inclusive and accessible to everyone, and we believe that our VPAT is an essential part of this initiative.

What is a VPAT?

VPAT stands for Voluntary Product Accessibility Template, which is a document that outlines how accessible a product or service is to individuals with disabilities. It provides information on how well the product or service conforms to the Web Content Accessibility Guidelines (WCAG) and other accessibility standards. It's an important tool for ensuring that everyone, regardless of their abilities, can use and benefit from our product and service.

What does a VPAT contain?

A VPAT is a detailed report on how well a product or service conforms to accessibility guidelines such as Section 508 of the Rehabilitation Act in the United States. It typically contains information on the product's conformance to accessibility standards, including how it complies with various criteria related to accessibility, such as keyboard accessibility, color contrast, and assistive technology compatibility. Additionally, the VPAT provides details on any known limitations or barriers that may exist for users with disabilities and any plans for future development or improvement.

Why is a VPAT important?

Accessibility is a fundamental human right, and it's crucial that our product and service are designed with everyone in mind. People with disabilities make up a significant portion of the population and deserve equal access to information and services. A VPAT is a valuable tool for organizations to demonstrate their commitment to creating and providing accessible products and services, as well as fulfilling legal obligations. By completing a VPAT, we're ensuring that GreyNoise is accessible to as many people as possible.

Why is accessibility important?

Accessibility is important because it ensures that everyone, regardless of their abilities or disabilities, can access and use our platform. In the United States, approximately 61 million adults have a disability*, representing a significant portion of the population. By making our platform accessible, we're opening up our product and service to a much broader audience, leading to increased engagement, more meaningful interactions, and, ultimately, better outcomes for everyone.

In addition, accessibility can lead to better user experiences. People with disabilities may face significant challenges when accessing websites or online tools not designed with their needs in mind. By making our platform accessible, we're reducing these barriers and making it easier for everyone to use our product and service. 

What's next for GreyNoise's accessibility efforts?

At GreyNoise, we're committed to continuous improvement. We're constantly looking for ways to make our platform more accessible and inclusive. In addition to providing a VPAT, we're also working on other accessibility initiatives, such as improving our keyboard navigation, adding alternative text to images, and ensuring that we meet accessibility standards.

We believe that accessibility is an essential part of our platform, and we're committed to making our tools and services accessible to everyone. By providing a VPAT, we demonstrate our commitment to accessibility and inclusivity, which can lead to a better experience and outcomes for everyone. We look forward to continuing our accessibility efforts and making GreyNoise a platform everyone can use and enjoy. 

Reference:

*https://www.inclusivecitymaker.com/disability-statistics-in-the-us/#:~:text=As%20stated%2C%20according%20to%20the,or%201%20in%204%20adults.

https://www.section508.gov/sell/vpat/


How we built IP Similarity

We briefly introduced IP Similarity previously, but now we want to dive deep and show how we made this idea a reality. 

The Goal

The first goal of IP Similarity is to encode a GreyNoise record as a numerical feature vector. This is just an array of numbers that somehow represent all of the data we have in a GreyNoise record.

Figure 1: Record to Feature Vector

This representation is extremely useful for machine learning and any numerical analysis. From this point we can quantitatively measure how far away two records are, cluster groups of records together, and build all sorts of classifiers. This is the ground floor basis for applying machine learning to GreyNoise data.

The Nitty Gritty

But getting there is hard. Our records contain a vast amount of unstructured and semi-structured textual data. User-Agents can be nearly anything you want, from

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)

to

Anarchy99

Web paths can be as simple as

/

or complicated like

/${(#a=@org.apache.commons.io.IOUtils@toString(@java.lang.Runtime@getRuntime().exec("whoami").getInputStream(),"utf-8")).(@com.opensymphony.webwork.ServletActionContext@getResponse().setHeader("X-Cmd-Response",#a))}/

Ports can be any or all of the 65,535 available values. The list goes on.

In order to turn this complex multi-modal data into a fixed-size numerical feature vector, we employ a few tricks, primarily tokenization and “the hashing trick”.

Books could be (and have been) written on tokenization, but for our purposes we can implement a simple regex.

tokens = re.sub(r'[^\w\s]', ' ', text)

This matches everything except word characters (letters, digits, and underscores) and whitespace, and replaces each match with a space. We can then lowercase the result and split it on whitespace. This turns our

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)

into the list of items

['mozilla', '5', '0', 'x11', 'linux', 'x86_64', 'applewebkit', '537', '36', 'khtml', 'like', 'gecko']

Once we have more consistent tokens, we can put them all into a fixed bucket with the hashing trick. This works as follows:

  1. Create a zero vector the size you want. E.g. a size 16 vector would be [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0].
  2. Take your text, hash it, and modulo it to the size of your vector.
import hashlib
bucket_size = 16
text = 'mozilla'
hash_index = int(hashlib.sha1(text.encode("utf-8")).hexdigest(), 16) % (bucket_size)
  3. Insert a 1 (or some other value as you choose, perhaps scaled based on the number of items you’re indexing) into the hash_index position. So ‘mozilla’ would get inserted into the 9 position of the vector, resulting in [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0].
  4. Continue with all of the items you want to hash into that vector.
  5. Note: For our use case, we are scaling the value inserted into the vector based on the number of items we are indexing. If two are put in the same position, they are added together.
  6. Finally, the string of tokens ['mozilla', '5', '0', 'x11', 'linux', 'x86_64', 'applewebkit', '537', '36', 'khtml', 'like', 'gecko'] would get hashed to [0, 0.0833, 0.0833, 0.1667, 0.1667, 0.0833, 0, 0.0833, 0, 0.0833, 0.0833, 0.0833, 0.0833, 0, 0, 0]
  7. For higher fidelity, you can increase the bucket size from 16 to a larger number.
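Putting the tokenization regex and the hashing steps together, a minimal end-to-end sketch might look like the following. The function names are ours, and scaling each token by 1/len(tokens) is an assumption about how "scaled based on the number of items" works, chosen because it matches the 0.0833 (1/12) values in the example vector above.

```python
import hashlib
import re


def tokenize(text: str) -> list[str]:
    """Replace punctuation with spaces, lowercase, and split on whitespace."""
    return re.sub(r"[^\w\s]", " ", text).lower().split()


def hash_index(token: str, bucket_size: int) -> int:
    """Map a token to a stable bucket using SHA-1 modulo the vector size."""
    digest = hashlib.sha1(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % bucket_size


def hash_tokens(tokens: list[str], bucket_size: int = 16) -> list[float]:
    """Hash tokens into a fixed-size vector; each token adds 1/len(tokens)."""
    vector = [0.0] * bucket_size
    if not tokens:
        return vector
    weight = 1.0 / len(tokens)
    for token in tokens:
        # Tokens that collide in the same bucket are added together.
        vector[hash_index(token, bucket_size)] += weight
    return vector


user_agent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko)"
tokens = tokenize(user_agent)
vector = hash_tokens(tokens)
print(tokens)
print([round(value, 4) for value in vector])
```

Because SHA-1 is deterministic, the same token always lands in the same bucket, which is what makes vectors from different records comparable.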

Now that we have a method to turn unbounded text into a fixed numerical vector, we can do this with many more of our fields and concatenate the results, along with boolean variables (e.g. is this IP coming from a VPN? T/F), to create one long feature vector to represent each record. Success!

Figure 2: Base Feature Vector
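To make the concatenation step concrete, here is a small sketch that hashes two text fields and appends two boolean flags. The record layout (`user_agent`, `web_paths`, `vpn`, `tor`) and the bucket sizes are hypothetical choices for illustration, not the actual GreyNoise schema.

```python
import hashlib
import re


def hash_field(text: str, bucket_size: int) -> list[float]:
    """Tokenize a text field and hash it into a fixed-size vector."""
    tokens = re.sub(r"[^\w\s]", " ", text).lower().split()
    vector = [0.0] * bucket_size
    for token in tokens:
        digest = hashlib.sha1(token.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % bucket_size] += 1.0 / len(tokens)
    return vector


def build_feature_vector(record: dict) -> list[float]:
    """Concatenate hashed text fields with boolean flags encoded as 0/1."""
    flags = [1.0 if record[name] else 0.0 for name in ("vpn", "tor")]
    return (
        hash_field(record["user_agent"], 16)
        + hash_field(" ".join(record["web_paths"]), 16)
        + flags
    )


# Hypothetical record for illustration only.
record = {
    "user_agent": "Anarchy99",
    "web_paths": ["/", "/login.php"],
    "vpn": True,
    "tor": False,
}
features = build_feature_vector(record)
print(len(features))
```

Every record produces a vector of the same length regardless of how many tokens its fields contain, which is the property later comparisons depend on.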

But weight, there’s more!

Not all features have equal importance, so we need to create weights so some features have more significance than others in the analysis. 

For IP Similarity we are using a combination of relatively static IP centric features, things we can derive just from knowing what IP the traffic is coming from or their connection metadata, and more dynamic behavioral features, things we see inside the traffic from that IP. These features are:

IP Centric

  • VPN
  • Tor
  • rDNS
  • OS
  • JA3 Hash
  • HASSH

Behavioral

  • Bot
  • Spoofable
  • Web Paths
  • User-Agents
  • Mass scanner
  • Ports

Features like JA3 can be less important while features like Web Paths can really show good similarity between records.

We are curating an ever growing collection of pairs of GreyNoise records that we think are good matches and bad matches. With these, we can randomly go through our collection, compare the feature vectors for the records and adjust the weights to make those matches (or non-matches) better and better. This creates a weight vector that we can use to adjust our base feature vector.

Figure 3: Weight Vector

The Final Vector

We take our GreyNoise record, extract the features we want to use, apply the hashing trick or other numerical logic, apply our weights, and we are left with a final vector that is ready to be used in comparison and machine learning. For example:

Figure 4: Final Feature Vector Calculation
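As a sketch of the weighting step, the example below multiplies each feature by its weight. Element-wise multiplication is our assumption about what "apply our weights" means, and the numbers are made up for illustration.

```python
def apply_weights(features: list[float], weights: list[float]) -> list[float]:
    """Scale each feature by its weight to get the final vector."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors must be the same length")
    return [feature * weight for feature, weight in zip(features, weights)]


base = [1.0, 0.5, 0.0]      # hypothetical base feature vector
weights = [2.0, 0.5, 3.0]   # hypothetical learned weights
final = apply_weights(base, weights)
print(final)
```

A weight above 1 makes differences in that feature count for more in later distance calculations, and a weight below 1 makes them count for less.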

The Results

With this new representation we can do a lot of ML, but our first use case is IP Similarity, which answers the following question:

Given an IP address and all that GreyNoise knows about it, show me all other IPs GreyNoise has seen that have similar characteristics and behaviors.

To do this we compare two feature vectors by calculating the L2 norm of their difference. Just like in geometry where you use the Pythagorean theorem, a² + b² = c² or c = sqrt(a² + b²), the L2 norm just extends that to a larger space, so it is simply a measure of how far two points/vectors are from each other. If the L2 norm is small, the feature vectors are close and thus very similar. If it is large, the feature vectors are far from each other and thus dissimilar.
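The distance calculation can be sketched in a few lines. The ranking helper and the sample vectors below are illustrative assumptions (the IPs are documentation addresses), not GreyNoise's production search.

```python
import math


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must be the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(target: list[float], candidates: dict, limit: int = 3) -> list[str]:
    """Return candidate keys ordered from smallest to largest distance."""
    scored = sorted((l2_distance(target, vec), key) for key, vec in candidates.items())
    return [key for _, key in scored[:limit]]


print(l2_distance([0.0, 0.0], [3.0, 4.0]))

# Hypothetical feature vectors keyed by IP.
candidates = {
    "192.0.2.1": [0.9, 0.1, 0.0],
    "192.0.2.2": [0.0, 0.0, 1.0],
    "192.0.2.3": [1.0, 0.0, 0.0],
}
print(most_similar([1.0, 0.0, 0.0], candidates, limit=2))
```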

We put all of this feature vector information into ElasticSearch alongside our GreyNoise records and voilà, we can now find any GreyNoise records that are similar to any other. Some of the use cases are:

We can take a single IP from our friends at Shodan.io, https://viz.greynoise.io/ip-similarity/89.248.172.16, and return 21 (at the time of writing) other IPs from Shodan.

Figure 5: IP Similarity of 89.248.172.16 as shown in GreyNoise.

And we can compare the IPs side by side to find out why they were scored as similar.

Figure 6: IP Similarity Details 

While we have an Actor tag for Shodan which allows us to see that all of these are correct, IP Similarity would have picked these out even if they were not tagged by GreyNoise.

We can take an IP tagged with NETGEAR DGN COMMAND EXECUTION, https://viz.greynoise.io/ip-similarity/182.126.118.174, and return many other IPs that could be part of that attack.

Figure 7: IP Similarity of 182.126.118.174 as shown in GreyNoise. 

We can see they share OS, Ports, Web Paths, and rDNS.

We can take an IP from another prolific scanner like ReCyber, https://viz.greynoise.io/ip-similarity/89.248.165.64, and return a large number of IPs, many from ReCyber, but others that simply act like ReCyber.

Figure 8: IP Similarity of 89.248.165.64 as shown in GreyNoise. 

The End

Ultimately, we hope this tool is insanely useful to you and you’ve developed a better understanding of how it works under the hood. Be on the lookout for more features, machine learning applications, and explanations! To try IP Similarity for yourself, sign up for a free trial or request a demo to learn more.

(*Create a free GreyNoise account to begin your enterprise trial. Activation button is on your Account Plan Details page.)


OpenAI, MinIO, And Why You Should Always Use docker-cli-scan To Keep Your Supply chAIn Clean

OpenAI ChatGPT has recently released a new feature that allows for plugins to fetch live data from various providers. This feature has been designed with "safety as a core design principle", which means that the OpenAI team has taken steps to ensure that the data being accessed is secure and private.

However, there are some concerns about the security of the example code provided by OpenAI for developers who want to integrate their plugins with the new feature. Specifically, the code examples utilize a docker image for MinIO RELEASE.2022-03-17. This version of MinIO is vulnerable to CVE-2023-28432, which is a security vulnerability resulting in information disclosure of all environment variables, including MINIO_SECRET_KEY and MINIO_ROOT_PASSWORD.

While we have no information suggesting that any specific actor is targeting ChatGPT example instances, we have observed this vulnerability being actively exploited in the wild. When attackers attempt mass-identification and mass-exploitation of vulnerable services, “everything” is in scope, including any deployed ChatGPT plugins that utilize this outdated version of MinIO.

To avoid any potential data breaches, it is recommended that users upgrade to a patched version of MinIO (RELEASE.2023-03-20T20-16-18Z) and integrate security tooling such as docker-cli-scan or use GitHub’s built-in monitoring for supply chain vulnerabilities, which already contains a record referencing this vulnerability.
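For a quick check of pinned image tags, MinIO release tags embed a timestamp that can be compared against the patched release named above. The sketch below is a simplification under stated assumptions: it only tests whether a tag is older than the patched release, the helper names are ours, and it is no substitute for a real scanner or the vendor advisory's exact affected range.

```python
from datetime import datetime

PATCHED_RELEASE = "RELEASE.2023-03-20T20-16-18Z"


def release_time(tag: str) -> datetime:
    """Parse a MinIO release tag; date-only tags are treated as midnight."""
    stamp = tag.removeprefix("RELEASE.")
    for fmt in ("%Y-%m-%dT%H-%M-%SZ", "%Y-%m-%d"):
        try:
            return datetime.strptime(stamp, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognized MinIO release tag: {tag!r}")


def predates_patch(tag: str) -> bool:
    """True when the tag is older than the patched release."""
    return release_time(tag) < release_time(PATCHED_RELEASE)


print(predates_patch("RELEASE.2022-03-17"))
print(predates_patch(PATCHED_RELEASE))
```

A check like this could run in CI against the tags found in compose files, flagging anything that predates the fix for review.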

GreyNoise has posted an issue to the affected OpenAI GitHub project to help ensure this weakness gets addressed as soon as possible.

While the new feature released by OpenAI is a valuable tool for developers who want to access live data from various providers in their ChatGPT integration, security should remain a core design principle.

Text of a chat with ChatGPT where we ask it about the vulnerability explained in the post.

Pancakes Con: Cyber + Interests = The Best Con (ever?!)

If you’re looking for an extremely wholesome conference to attend, look no further than PancakesCon. This conference requires speakers to talk about 2 things: (1) a brief talk about any cybersecurity topic and (2) a brief talk about something which is not IT-related. As you might imagine, this leads to some great talks! Some of our favorites from this year included:

Collaboration Required: Threat Intelligence Sharing & Coop Board Games (Grace Chi)

Grace led us through why sharing in CTI matters (spoiler alert: it's how we all win) and some of the best coop games (Spirit Island, anyone?!) around to play with family and friends.

Playing with Exploits in Metasploit & Playing with Ideas via Clowning (Tina Coleman)

Extremely talented performer Tina live-demoed Metasploit shenanigans (bless the demo gods) and shared her experience as a semi-professional Clown!

Dry Cup: A Primer on Cybersecurity in China (and also Baijiu) (Jonathan Reiter).

Jonathan gave an incredible and nuanced background on what cybersecurity looks like in China (with some Mandarin to boot). We also got educated on the customs of Baijiu, a Chinese fermented grain alcohol best shared with friends.

We were also lucky enough to talk a little bit about our hobbies and cyber investigations.

OSINT & Oreos: Using OSINT to uncover a network of “House Hunters” (while trying to make homemade oreos)

3 years ago on a Sunday afternoon, I was baking when I got a call from an out-of-state number inquiring to buy a house I owned. Except, I don't own any house. Since then, I have received dozens of calls inquiring about the property and I have always wondered - why me? Who is the owner? Why do people want this house so badly? I got to explore this and more, all while sharing my best tips for weekend baking with Oreos. (or uh generic chocolate wafer cookies). (Why Oreos? It rhymed!) 

Homemade Oreos! Source: My own two hands (and recipe from Serious Eats)

Dio9sys: Fancy Nails and Fancy Rails: how to make “smart” NFC fingernails and the Trans Travel Guide to Amtrak

Brianna has a keen eye for fashion (and hardware) and has combined this into the ultimate manicure - “smart” NFC nails. Bri took nail “tip taps” to a whole new level with a true DIY demo. Beyond that, we got a firsthand fan-girl account of the American rail system - the true way to get your fill of adventure in a safe and comfortable environment!

NFC Nails Complete - Source: Twitter

A major thank you to Lesley Carhart, and the entire PancakesCon organizing committee & volunteers - you made the internet feel a little cozier this weekend! 

Don’t Let Your Team Drown in Netflow

Whether you’re working with netflow data collected from your own devices, flow logs from a cloud provider, or purchasing data from netflow providers, you may find it challenging to get immediate value out of it. Not only is there a vast amount of data to hunt through, but it can be challenging to fully understand what is happening based on netflow logs alone. While there are plenty of benefits to collecting and analyzing flow data, these challenges can make it difficult to use the data day-to-day to support investigations. 

In order to start using netflow effectively in an investigation, it’s important to have a good understanding of the network and an established baseline of activity. This makes it easier to distinguish between normal traffic on the network and anomalous traffic that should trigger alerts to your team. Beyond defining these baselines, netflow data often needs to be correlated with additional sources such as alerts from the firewall and threat intelligence data to better understand particular flows and further establish patterns and relationships in the connections.

Even with these baselines defined and alerts being created using correlated data, oftentimes users still need to hunt through a massive amount of flow data to identify malicious activity that might have been missed. Outside of looking at deviations from a baseline, it can be challenging to determine where to start investigating.

This is where GreyNoise can help! Filtering opportunistic and mass scan activity with data gathered from GreyNoise’s sensor network fast-forwards the process, allowing analysts and threat hunters to identify suspicious activity and find targeted threats that might have been missed by other detections. GreyNoise also provides information on infrastructure used by common business services, which can be used to filter egress netflow traffic and hunt for malicious activity leaving the network.

Taking this one step further, let’s look at organizations purchasing commercial netflow data from different sources. Oftentimes these groups combine this data with internet scanning services like Censys or Shodan in order to identify C2 beacons. While taking a proactive approach like this can help identify infrastructure and compromised systems earlier, there is still a lot of data to sift through. Instead, let’s use GreyNoise as a first pass filter to remove IP addresses that are saturating these sources and focus the investigation to find the signal in the noise.

Using GreyNoise Analysis to enrich all IPs from a netflow with one click

While the volume of data may present a challenge, flow data also does not always contain all of the information needed to act. Oftentimes this data needs to be correlated with other sources of data to have a true understanding of the activity observed. Using VPC flow logs as an example, instead of just filtering out the noise, users may also want to identify malicious IP addresses accessing their assets. In this case, enriching the data with GreyNoise provides insight into how these IPs operate and highlights access attempts that may violate defined policies.
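Enrichment, as opposed to filtering, keeps every flow and attaches context to it. The sketch below assumes a hypothetical `context` mapping from IP to a classification string and a hypothetical set of ports that policy says should not be reachable; neither comes from a real flow log format or API response.

```python
def enrich(flows: list[dict], context: dict[str, str]) -> list[dict]:
    """Attach a classification to each flow, defaulting to 'unseen'."""
    return [
        {**flow, "classification": context.get(flow["src"], "unseen")}
        for flow in flows
    ]


def policy_violations(enriched: list[dict], protected_ports: set[int]) -> list[dict]:
    """Flows from malicious sources that reached a protected port."""
    return [
        flow for flow in enriched
        if flow["classification"] == "malicious" and flow["dport"] in protected_ports
    ]


# Hypothetical flow log entries and lookup results.
vpc_flows = [
    {"src": "198.51.100.20", "dport": 22},
    {"src": "203.0.113.44", "dport": 443},
]
context = {"198.51.100.20": "malicious"}

enriched = enrich(vpc_flows, context)
violations = policy_violations(enriched, protected_ports={22, 3389})
print(violations)
```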

Netflow data provides a solid option for understanding who is doing what on the network but comes with an operational challenge based on the volume of traffic and the lack of details in the data. In order to address these challenges:

  • Filter the data with GreyNoise to remove flows that teams don’t need to focus on and speed up investigation times.
  • Enrich IPs with GreyNoise to better understand threats and build detections to support SOC and IR teams.

If you are interested in learning more about how to operationalize netflow data with GreyNoise, request a demo.

Get Started With GreyNoise for Free

Debugging D-Link: Emulating firmware and hacking hardware

GreyNoise doesn’t often need detailed firmware analysis. If it’s happening on the internet, we already see it. However, when we do need to investigate vulnerabilities in embedded devices, things can get very complicated, very quickly if no information is publicly available. It can be fun and insightful to learn these skills for the rare cases where we need them.

In late October 2022, we became aware of CVE-2022-41140, a buffer overflow and remote code execution vulnerability in D-Link routers, which D-Link had been notified of on February 17th. Noting the months-long turnaround time, we decided this was a good chance to perform a learning and discovery exercise.

On March 13th, 2023 we became aware of CVE-2023-24762, a command injection vulnerability in D-Link DIR-867 devices. This recent CVE spurred us to share some of our internal documentation regarding a research spike into D-Link devices.

This blog aims to explain the process of gaining a foothold in firmware or a physical device for vulnerability research and achieving a debuggable interface. While existing Proof-Of-Concept code for (yet another) D-Link vulnerability CVE-2022-1262 is utilized within this document, as well as strong hints at suspect areas of code, don’t expect to find any new ready-to-fire exploits buried in the contents below.

What Vulnerability?

D-Link was notified of CVE-2022-41140, a buffer overflow vulnerability on February 17th, 2022. By November 15th, 2022, no additional information was available, which sparked an investigation into discovering available hints about the nature of the vulnerability. While this accurately speaks to the current state of public vulnerability tracking, we start off our investigation with a simple search on Google for the CVE and find two relevant links:

  1. https://www.zerodayinitiative.com/advisories/ZDI-CAN-13796/
  2. https://supportannouncement.us.dlink.com/announcement/publication.aspx?name=SAP10291

While the Zero Day Initiative lists the vulnerability as

(…) flaw exists within the lighttpd service, which listens on TCP port 80 by default. The issue results from the lack of proper validation of the length of user-supplied data prior to copying it to a fixed-length stack-based buffer.

the D-Link Technical Support page provides more detailed information

(…) a 3rd party security research team reported Buffer Overflow & RCE vulnerabilities in the Lighttpd software library utilized in DIR-867, DIR-878, and DIR-882/DIR-882-US router firmware.

A stack-based buffer overflow in the prog.cgi binary in D-Link DIR-867. A crafted HTTP request can cause the program to use strcat() to create a overly long string on a 512-byte stack buffer. Authentication is not required to exploit this vulnerability.

Additionally, the D-Link support page provides a table of the Affected Models

Model      | Affected FW             | Fixed FW             | Last Updated
DIR-867    | v1.30B07 & Below        | Under Development    | 03/04/2022
DIR-878    | v1.30B08-Hotfix & Below | v1.30b08_Beta_Hotfix | 04/01/2022
DIR-882-US | v1.30B06-Hotfix & Below | Under Development    | 03/04/2022

From this information, we can derive that the vulnerability is triggered by an HTTP request to TCP port 80, which will hit the lighttpd service and route to the prog.cgi binary resulting in an overflow on a 512-byte stack buffer.

We can also derive that the vulnerability can be patched/mitigated on some hardware models, but not others.

How to trigger the vulnerability?

The D-Link support pages provide links to download firmware images for the DIR-878, including base firmware versions like v1.30B08 as well as security advisement firmware versions like v1.30B08 Hotfix_04b.

Knowing that we can access the firmware images before/after the security patch for CVE-2022-41140, we will attempt the following steps:

Obtain copies of prog.cgi

We start by downloading a known vulnerable version of the firmware for a model that also offers a patched version. We download DIR-878_REVA_FIRMWARE_v1.30B08.zip and extract the firmware image DIR_878_FW1.30B08.bin.

We run the file command to quickly determine if it’s a commonly known file type. Unfortunately, this returns generic information.

Next, we use a more specialized tool, binwalk, which assists in searching binary images for embedded files and executable code. Again, this produces no results.

A handy feature of binwalk is the -E, --entropy command line flags, which allow you to measure the entropy or “randomness” of a file.

As an example, here is an entropy graph of 1024 bytes of Lorem ipsum:

And here is an entropy graph of DIR_878_FW1.30B08.bin

As you can see, the entropy of our firmware image is very high. Typically, this is indicative that a file is in a compressed archive format or is encrypted. Since neither file nor binwalk identified it as a compressed archive format, it’s reasonable to assume that it may be encrypted.
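For readers who want to see what is being measured, here is a minimal sketch of a per-block Shannon entropy calculation in Python. It reports bits per byte (0.0 to 8.0); binwalk's graph may use a different scale or block size, so treat the numbers as comparable in shape rather than in value.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte, from 0.0 to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def block_entropy(data: bytes, block_size: int = 1024) -> list[float]:
    """Entropy of each consecutive block, which is what an entropy graph plots."""
    return [
        shannon_entropy(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


low = shannon_entropy(b"A" * 1024)        # a single repeated byte
high = shannon_entropy(bytes(range(256)))  # every byte value exactly once
print(low, high)
```

Plain text sits well below the maximum, while compressed or encrypted data stays close to it across every block.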

If you believe a file is encrypted, it’s always a good idea to take a peek at the bytes at the start of the file, just in case there’s an identifiable file header:

At the start of the file is a 4-byte sequence that maps to the ASCII characters “SHRS”.
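A small hexdump-style helper is enough to spot a header like this. In the sketch below only the first four bytes (“SHRS”) come from the firmware discussed here; the remaining sample bytes are made up for illustration.

```python
def hexdump_head(data: bytes, length: int = 16) -> str:
    """Render the first bytes as hex plus printable ASCII, like a hexdump line."""
    head = data[:length]
    hex_part = " ".join(f"{byte:02x}" for byte in head)
    ascii_part = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in head)
    return f"{hex_part}  |{ascii_part}|"


def has_magic(data: bytes, magic: bytes = b"SHRS") -> bool:
    """True if the data starts with the given magic bytes."""
    return data.startswith(magic)


sample = b"SHRS\x00\x01"  # bytes after the magic are placeholders
line = hexdump_head(sample)
print(line)
```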

A quick Google search for “SHRS firmware” turns up relevant results, indicating that we’re on the right track.

  1. https://github.com/0xricksanchez/dlink-decrypt/blob/master/dlink-dec.py
  2. https://0x00sec.org/t/breaking-the-d-link-dir3060-firmware-encryption-recon-part-1/21943
  3. https://0x00sec.org/t/breaking-the-d-link-dir3060-firmware-encryption-static-analysis-of-the-decryption-routine-part-2-1/22099

After a bit of reading, we can determine that D-Link does indeed encrypt some of their firmware, which is identifiable by the “SHRS” header. The blogs linked above go into depth on how the authors obtained a copy of the imgdecrypt binary, reverse engineered it to determine how to decrypt the firmware, and produced the relevant Python script.

Since we will be dealing with encryption again later in this blog, we won't go into depth on this specific layer of encryption. Our firmware can be decrypted with:

Taking our decrypted firmware and running it through binwalk again we can see that some file signatures are recognized.

Since file signatures were recognized, we can recursively extract them by using the -e, --extract, and -M, --matryoshka, command line flags.

This creates nested folders for each extracted layer of the file, ultimately resulting in a cpio-root folder containing the root filesystem for the firmware.

The desired prog.cgi file is located exactly where those familiar with *nix directory structures would expect it to be. However, for completeness, the file can be located by name using:

Now we have a copy of the entire root filesystem, including prog.cgi.

Repeating the same steps on the patched firmware sets us up for the next step.

Patch Diffing

In the previous step, we obtained an unpatched and patched copy of prog.cgi. We’ll rename them prog_old.cgi and prog_new.cgi, respectively, to help keep track.

BinDiff is a comparison tool for binary files that helps vulnerability researchers and engineers quickly find differences and similarities in disassembled code.

For this blog, we’ll be using Binary Ninja with the BinDiff Viewer Plugin. Roughly comparable free alternatives exist, such as Ghidra and its plugins.

Following the relevant plugin steps to generate a bindiff, we open old/new and begin to look for functions whose similarity score is high but below 1.00, indicating that a small change such as a patch may have been made.

Uses of strcat()

Using our list of similar (but not exact duplicate!) functions, we work our way down the list, looking for uses of strcat() that have changed between old/new. In this example, the main function:

Old

New

Here we can see that the old binary used strcat() and the new binary has a different set of logic.

The strcat() function concatenates the destination string and the source string, and the result is stored in the destination string.

A quick check of the destination var_20c shows that its size is 0x200, or 512 bytes. For a sanity check, we can list all uses of strcat() throughout the binary.
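The arithmetic behind the overflow is simple: strcat() writes the source string plus a terminating NUL byte after whatever is already in the destination, with no bounds check. The sketch below models that for a 0x200-byte buffer; the string lengths are arbitrary examples, not values taken from prog.cgi.

```python
BUF_SIZE = 0x200  # 512 bytes, the size of var_20c per the disassembly


def strcat_overflow(dest: bytes, src: bytes, buf_size: int = BUF_SIZE) -> int:
    """Number of bytes strcat(dest, src) would write past the end of the buffer."""
    needed = len(dest) + len(src) + 1  # +1 for the terminating NUL
    return max(0, needed - buf_size)


fits = strcat_overflow(b"A" * 100, b"B" * 411)      # 100 + 411 + 1 = 512
spills = strcat_overflow(b"A" * 100, b"B" * 412)    # 100 + 412 + 1 = 513
print(fits, spills)
```

Anything past the last byte of the buffer lands on adjacent stack data, which is why the length of attacker-controlled input matters here.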

There are 22 uses of strcat(). After reviewing them, only the usage within main operates on a 512-byte buffer.

We now have a reasonable candidate for the location of the vulnerability.

Debugging with Emulation

Now that we have a reasonable candidate for a vulnerable code path, the next step is to start determining what conditions are required to actually reach the vulnerable code path. While wiser minds may be able to determine these conditions without needing a debugger, it’s always a safe bet to make getting a debugging interface a priority.

We want to run the necessary components and attach a debugging interface to a running program.

First, we need to determine the attributes of the file we would like to emulate. The file command we used earlier can be used to identify important information about the architecture the binary is meant to run on.
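The two facts we care about, the instruction set and the byte order, live in the first 20 bytes of the ELF header. As a rough sketch of what file is reading, the Python below decodes those fields; the sample header is synthetic and is not copied from prog.cgi.

```python
import struct

# A few e_machine values from the ELF specification.
EM_NAMES = {3: "x86", 8: "MIPS", 40: "ARM", 62: "x86-64", 183: "AArch64"}


def describe_elf(header: bytes) -> dict:
    """Decode word size, byte order, and machine type from an ELF header."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    bits = {1: 32, 2: 64}.get(header[4])               # EI_CLASS
    endian = {1: "little", 2: "big"}.get(header[5])    # EI_DATA
    if bits is None or endian is None:
        raise ValueError("unrecognized ELF class or data encoding")
    fmt = "<H" if endian == "little" else ">H"
    (machine,) = struct.unpack_from(fmt, header, 18)   # e_machine
    return {
        "bits": bits,
        "endian": endian,
        "machine": EM_NAMES.get(machine, f"unknown ({machine})"),
    }


# Synthetic header: 32-bit, little-endian, executable, MIPS.
sample_header = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 8)
info = describe_elf(sample_header)
print(info)
```

A 32-bit little-endian MIPS result is what points toward a mipsel emulator rather than a big-endian mips one.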

QEMU’s user-mode emulation is an easy way to run binaries built for another architecture but the same operating system as the host. In this case, we want qemu-mipsel-static, which is provided by the qemu-user-static package.

However, we need to know what to run.

There are init scripts that run when a system boots, and we can find the relevant one in /etc_ro/rcS:

It’s best to start at the top and work your way down and Google things where applicable.

  1. Filesystems are mounted
  2. /var/run folder is created if it doesn’t exist
  3. A script to create device (/dev) links is run
  4. The Message Of The Day (motd) is written to the device console
  5. A binary to manage reading/writing to non-volatile random-access memory (nvram) is started in the background
  6. A binary init_system is run with the start command
  7. A telnet daemon is started

/var/log folder is created if it doesn’t exist

Understanding the functionality of the init_system binary is elementary:

If init_system start is run, it checks for the presence of /var/run/nvramd.pid. If the pid file is not found, it enters a loop printing lighttpd: waiting for nvram_daemon. If the pid file is found, it branches into the following logic.

nvram is initialized and then closed. sub_400e50 starts a number of .cgi binaries from /etc_ro/lighttpd/www/cgi-bin/, and finally the lighttpd web server is started with:

Using a combination of chroot and qemu-mipsel-static we can minimally and directly launch the lighttpd web service like so:

This results in an error of:

(network.c.747) SSL: Private key does not match the certificate public key, reason: error:02001002:system library:fopen:No such file or directory /var/private/lighttpd.pem

By simply commenting out the SSL related lines in /etc_ro/lighttpd/lighttpd.conf config file, we can just run the web service in HTTP mode exclusively and bypass the error.

Upon further review of the config, we can observe that the lighttpd web service is running in fastcgi mode and HTTP requests to the path /HNAP1/ are routed to be handled by prog.cgi.

If we navigate to our emulated system in a web browser, we can see that a page is loaded, and numerous UI assets load successfully, but the page is blank due to a malformed XML response from the /HNAP1 endpoint.

The root cause of the malformed XML response is due to the default values for nvram not being set. I spent a large amount of time trying to fix this by using LD_PRELOAD tricks and eventually ended up ordering a physical DIR-867 model (guaranteed vulnerable, no patch available) in frustration.

By the time the physical router was about to be delivered, I had a mostly working proxy for calls to functions from libnvram-0.9.28.so, at which point I remembered that the vulnerability was Pre-Authentication. I was trying to fix something that was part of the login flow which, I thought, was necessary.

After taking a bit of time to find a different endpoint to sanity check myself, I found that most of the other pre-auth functions of prog.cgi respond without issue. They are missing default values which would have been stored in nvram, but do not result in errors.

For our purposes, this is enough to work with and proceed forward. Getting a debugger attached by invoking prog.cgi in QEMU with the -g flag starts a GDB connection on port 1234.

Debugging With Physical Device

As stated earlier, I purchased a used physical model DIR-867 router, which is guaranteed to be vulnerable as no patches are available.

After opening the box, I began the setup process and set a device admin password of Password1 and set updates to “manual”.

After completing the setup steps, the router reboots.

Most importantly, I figured out how to reset the router using the button on the back and re-do the setup steps again to make sure nothing that I’d set so far has persisted across a reset.

Now that the router is set up in its most basic state, I do a quick scan for open ports.

Much to my chagrin, there is no 23/tcp open telnet result, despite the telnetd service appearing in the /etc_ro/rcS init scripts I’d found during emulation. I’ll need to find another way to get an interactive interface on the router to run a debugger.
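When a full port scanner is not at hand, a single TCP connect attempt answers the same question for one port. The sketch below is a plain connect check in Python, not a replacement for a real scan; the demo probes a throwaway listener on localhost rather than a router.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


# Demo against a temporary local listener on an ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
open_result = tcp_port_open("127.0.0.1", demo_port)
listener.close()
print(open_result)
```

Against the router, the call would be `tcp_port_open(router_ip, 23)`, where `router_ip` is whatever address the device has on your network.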

At this point, physically opening the router up and trying to find a UART interface would probably be the quickest path to success. However, as I wasn’t in any particular rush, I decided to try to figure out how to just re-enable the telnet interface since I know from extracting the firmware that the telnetd binary already exists in the firmware.

Running a recursive grep on our extracted firmware shows that “telnet” shows up in many binary files, as well as what appears to be factory and default settings shipped with the device.

Note the telnetEnabled=0. This probably explains why telnet isn’t running. It also seems to indicate that it’s a setting.
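The same search can be done without grep. The sketch below walks a directory tree and reports files containing a byte string, skipping symlinks and special files of the kind an extracted root filesystem tends to contain. The demo tree and its file names are invented for illustration.

```python
import os
import tempfile


def grep_tree(root: str, needle: bytes) -> list[str]:
    """Return paths of regular files under root whose contents include needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and device nodes
            try:
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file, move on
    return sorted(hits)


# Demo on a throwaway tree with made-up file names.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "etc_ro"))
    with open(os.path.join(root, "etc_ro", "defaults.dat"), "wb") as handle:
        handle.write(b"ssid=demo\ntelnetEnabled=0\n")
    with open(os.path.join(root, "etc_ro", "motd"), "wb") as handle:
        handle.write(b"welcome\n")
    matches = [os.path.relpath(p, root) for p in grep_tree(root, b"telnet")]

print(matches)
```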

While poking around earlier looking for command injection, I located the “System” menu, which allows exporting/importing settings. If we’re lucky, telnetEnabled is a hidden setting we can just flip on and re-import.

Clicking “Save Settings To Local Hard Drive” results in downloading a 5.9kB config.bin file.

We use binwalk to check for known file formats.

It looks like we have a SEAMA firmware header and the config is encrypted. Again, we turn to Google and search for the starting bytes of the file 0x5EA3A417 which returns a very useful C header file that defines the structure of a SEAMA file.
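To make the header check concrete, here is a sketch of parsing the start of a SEAMA file in Python. The field layout (a 32-bit magic, a 16-bit reserved field, a 16-bit metadata size, and a 32-bit image size, all big-endian) is an assumption based on publicly available SEAMA header definitions; verify it against the header file you find before relying on it. The sample values are invented.

```python
import struct

SEAMA_MAGIC = 0x5EA3A417


def parse_seama_header(blob: bytes) -> dict:
    """Parse the assumed 12-byte SEAMA header at the start of blob."""
    if len(blob) < 12:
        raise ValueError("too short for a SEAMA header")
    magic, _reserved, meta_size, image_size = struct.unpack_from(">IHHI", blob, 0)
    if magic != SEAMA_MAGIC:
        raise ValueError(f"unexpected magic 0x{magic:08X}")
    return {"meta_size": meta_size, "image_size": image_size}


# Synthetic header with made-up sizes.
sample = struct.pack(">IHHI", SEAMA_MAGIC, 0, 32, 5900)
header = parse_seama_header(sample)
print(header)
```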

In the same folder on GitHub there’s a corresponding .c file for a command line tool to unpack a SEAMA file, but a quick review does not show any usage of OpenSSL. This likely means D-Link is doing some additional layer of encryption on top of SEAMA, and we’re better off doing some more static analysis on the firmware itself and using the GitHub repo for sanity checking ourselves.

Reopening prog.cgi in Binary Ninja and searching for usages of the string “config.bin”, we see that it’s used in a single section of code at sub_42ad78.

Taking a closer look at sub_42ad78 we see the following flow graph:

At a high level:

  1. /tmp/config_2g and /tmp/config_5g are written to a manifest file /tmp/sysupgrade.conffiles
  2. The config files are put into a .tar.gz archive with tar czf "-" -T /tmp/sysupgrade.conffiles 2>/dev/null > /var/backup_tmp.tar.gz
  3. sub_42b2f4 reads model_name and hw_version from nvram and returns a string of “<model_name>_<hw_version>”
  4. The /var/backup_tmp.tar.gz file from step 2 is passed through a command mkconfig
  5. The resulting file is returned to be downloaded by the end-user

Taking a closer look at the call with mkconfig:

snprintf(&var_14c, 0x100, "mkconfig -a enca -m %s -i %s -o %s", &var_18c, "/var/backup_tmp.tar.gz", "/var/backup.tar.gz", 0x4f1530)

This results in &var_14c containing mkconfig -a enca -m DIR-867_A1 -i /var/backup_tmp.tar.gz -o /var/backup.tar.gz

Now that we know the command being run to generate the encrypted config.bin, we take a look at /bin/mkconfig to determine what those flags do. We can just run it in QEMU without any arguments to view the help message.

As the description states, it can encapsulate or de-encapsulate a config. However, it’s unclear where the suspected encryption comes into play. Looking at the available flags, a reasonable assumption is that the -m flag may feed some sort of key derivation function. Remember that the -m flag is the model_name and hw_version. If the model and hardware version are used for key derivation, this would prevent someone from uploading a config from a different D-Link router model and potentially breaking their device.

We can confirm this by taking a peek at the enca function of mkconfig in Binary Ninja:

Indeed, we see the usage of openssl as well as a new, but fully expected binary seama.

In the first relevant part of the program flow, the -m flag (DIR-867_A1) is used in sub_400e30

Then the logic enters a loop that writes the MD5 hash as a hex string to &buffer

The OpenSSL command is as follows:

If we had preferred not to disassemble the function to figure out how the encryption key was generated, we could have simply added the -E "QEMU_STRACE=1" flag when running mkconfig and the resulting key would have shown in the strace output.

Command:

qemu-mipsel-static -E "QEMU_STRACE=1" /bin/sh -c "mkconfig -a de-enca -m DIR-867_A1 -i config.bin -o config.dec"

Strace output:

39 execve("/bin/sh",{"sh","-c","openssl enc -e -aes-256-ecb -k 81F9A6E40BDEC26DB67FE53A555D0E8E -in config.dec -out config.dec.enc >/dev/null 2>&1",NULL})

As expected, 81F9A6E40BDEC26DB67FE53A555D0E8E is the hex string representation of the MD5 hash of “DIR-867_A1”.
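The key derivation is easy to reproduce outside the firmware. A minimal sketch, assuming the input is exactly the ASCII bytes of the -m value with no trailing newline or salt (compare the result against the -k value in the strace output to confirm):

```python
import hashlib


def derive_key(model_and_hw: str) -> str:
    """Uppercase hex MD5 of the model/hardware string, as passed to openssl -k."""
    return hashlib.md5(model_and_hw.encode("ascii")).hexdigest().upper()


key = derive_key("DIR-867_A1")
print(key)
```

Because the key depends only on the model name and hardware revision, every device of the same model and revision shares it.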

Knowing this is true, we can make a simple shell script to recreate this logic and patch in Telnet support:

  1. Use mkconfig to de-encapsulate (Unpack SEAMA firmware, Decrypt image)
  2. Extract the Gzip’d Tar archive
  3. Replace telnetEnabled=0 with telnetEnabled=1 in /tmp/config_2g
  4. Write /tmp/config_2g and /tmp/config_5g to a manifest
  5. Tar and Gzip the files in the manifest
  6. Use mkconfig to encapsulate (Encrypt image, Pack SEAMA)

The result being telnetpatched.bin, which should be a valid settings file for us to upload and enable telnet.
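Steps 2 through 5 do not need the router's tooling at all. The Python sketch below rewrites one member of a .tar.gz archive in memory and leaves every other member untouched; steps 1 and 6 still require mkconfig (for example under QEMU) and are not shown. The member names and sample contents are assumptions for illustration, so check the names in your own de-encapsulated archive.

```python
import io
import tarfile


def patch_archive(archive: bytes, member: str, old: bytes, new: bytes) -> bytes:
    """Return a new .tar.gz with old replaced by new inside one member."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as src, \
            tarfile.open(fileobj=out, mode="w:gz") as dst:
        for info in src.getmembers():
            if not info.isfile():
                dst.addfile(info)  # directories, links: copy the entry as-is
                continue
            data = src.extractfile(info).read()
            if info.name == member:
                data = data.replace(old, new)
                info.size = len(data)
            dst.addfile(info, io.BytesIO(data))
    return out.getvalue()


def make_sample() -> bytes:
    """Build a stand-in settings archive with made-up contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, body in (
            ("tmp/config_2g", b"ssid=demo\ntelnetEnabled=0\n"),
            ("tmp/config_5g", b"ssid=demo5\n"),
        ):
            info = tarfile.TarInfo(name)
            info.size = len(body)
            tar.addfile(info, io.BytesIO(body))
    return buf.getvalue()


patched = patch_archive(
    make_sample(), "tmp/config_2g", b"telnetEnabled=0", b"telnetEnabled=1"
)
with tarfile.open(fileobj=io.BytesIO(patched), mode="r:gz") as tar:
    patched_names = tar.getnames()
    patched_body = tar.extractfile("tmp/config_2g").read()
print(patched_body)
```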

Indeed, another nmap scan shows the desired results of an open telnet port.

Unfortunately, when trying to connect, we are instantly prompted for authentication.

After a quick peek at the disassembly of prog.cgi, we can see that the password is set to the value we provided originally, Password1 + @twsz2018.

A guess that the username is admin allows us to log in successfully with a password of Password1@twsz2018.

While we have successfully logged in over telnet, we are dropped into a limited shell with only a select number of commands available to run. We cannot directly use this shell to load a gdb server and attach it to prog.cgi.

Here we will cheat a bit and leverage CVE-2022-1262, a command injection vulnerability in the protest binary that is available to us in the limited shell. Using the Proof-Of-Concept exploit included in this writeup from Tenable, we spawn another telnetd instance running on port 1337 and running as root.

From here, we can get a hint about the version of Linux headers the firmware was built with by running:

Finally, we can either cross-compile a mips32el uclibc GDB server against Linux headers 3.10.14+ ourselves by using something like crosstool-NG … or we can download a pre-built toolchain matching our criteria from https://toolchains.bootlin.com/

This allows us to transfer a gdb server to /tmp on the physical router and attach gdb to prog.cgi for remote debugging purposes.

Conclusion

Learning how to gain a debuggable interface with both emulated and real D-Link devices is a valuable skill for anyone interested in vulnerability research and network security. By using emulated devices, you can experiment and test in a safe environment before attempting changes to real hardware. The ability to debug real devices can help you identify and fix issues, as well as determine how firmware operates “in the real world”. While it may seem daunting at first, with the right tools and resources, anyone can learn these skills. With this knowledge, you can enhance your understanding of network security and device development, and apply these concepts to future projects, such as writing network signatures for malicious traffic.

GreyNoise tags for command injection CVE-2023-24762 and stack-based buffer overflow CVE-2022-41140 are live and available to all users for tracking related activity:

A week in the life of a GreyNoise Sensor: It's all about the tags

To ensure we have as much visibility into activity on the internet as possible, we regularly deploy new sensors in different “geographical” network locations. We’ve selected two sensors for a short “week in the life” series to give practitioners a glimpse into what activity new internet nodes see. This series should help organizations understand the opportunistic and random activity that awaits newly deployed services, plus help folks understand just how little time you have to ensure internet-facing systems are made safe and resilient.

We initially took a look at what the sources of "benign" traffic are slinging your way. Today, we're going to look at the opposite end of the spectrum. We stripped away all the benign sources from the same dataset and focused on incoming tagged traffic from malicious or unknown sources. After all, these detection rules are the heart and soul of the GreyNoise platform, and are also what our customers and community members depend upon to keep their organizations safe.

The "not-so-benign" perspective

If we hearken back to our previous episode, it took over an hour for even the best-of-the-best of the "benigns" to discover our freshly deployed nodes. This makes sense, since there aren't too many legitimate organizations conducting scans, and they do not have infinite resources. Sure, they could likely spare some change to scan more frequently, but they really don't need to.

In contrast, we tagged 8,697 incoming IP addresses in that first week, and the first packet of possible ill-intent appeared ten seconds after the nodes were fully armed and operational. However, the first tagged event — RDP Alternative Port Crawler — was seen three hours later. The difference between those two events lies in one of our core promises: our tags are 100% reliable. We don't just take every IP address hitting our unannounced sensor nodes and shove it into a list of indicators of compromise (IoCs).

Tagged Malicious Traffic Started Coming In As Soon As The Sensors Were Functional

217,852 total malicious/unknown events encountered during the ~7.8 day sampling period.

The above chart is the raw, non-benign connection data to those sensors for that week. You're likely wondering what those spikes are. We did too!

Let's look at some summary data by tallies of:

  • autonomous system organization (aso) — the name of the network the connections came from
  • geolocated country
  • destination port
  • source IPv4
  • total connections
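Tallies like these are a one-liner per field once events are in a uniform shape. A minimal sketch with Python's `collections.Counter`, using invented event records whose field names (`aso`, `country`, `dport`, `src`) are assumptions rather than the actual sensor schema:

```python
from collections import Counter


def tally(events: list[dict], field: str) -> Counter:
    """Count how many events share each value of the given field."""
    return Counter(event[field] for event in events)


# Hypothetical events with documentation-reserved addresses.
events = [
    {"aso": "Example Net A", "country": "TW", "dport": 445, "src": "198.51.100.1"},
    {"aso": "Example Net A", "country": "TW", "dport": 445, "src": "198.51.100.2"},
    {"aso": "Example Net B", "country": "US", "dport": 23, "src": "203.0.113.5"},
]

by_port = tally(events, "dport")
by_aso = tally(events, "aso")
unique_sources = len(tally(events, "src"))
total_connections = len(events)
print(by_port.most_common(1), by_aso.most_common(1), unique_sources, total_connections)
```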

The Four Largest "Spike" Hours Had Mostly Similar Characteristics

The August 28th malicious traffic spike (for an hour) focused mainly on SMB exploits, and originated from the "Data Communication Business Group" autonomous system in Taiwan. It is odd that we saw so little other activity, and that the port volume was an order of magnitude less. There are any number of reasons for this. Given where these sensors are (which we're not disclosing), it could have been a day of deliberate country network isolation. Or, it could just mean that the botnet herders were super-focused on SMB.

Over the course of those seven days, non-benign nodes hit over thirteen-thousand ports, and you can likely guess which ones made the top of the list.

We Saw The Usual Suspects Rise To The Top Of 13,576 Ports

Port numbers associated with Telnet, RDP, SSH, and SMB were, by far, the most common.

Since we called out one autonomous system, we should be fair and call them all out.

If You've Ever Stared At An IPv4 IoC List, You Definitely Recognize These Folks

These are very common autonomous systems to see in malicious attack logs. What should concern you, however, is that "high reputation" sources such as Linode, OVH SAS, Google LLC, and Microsoft Corporation are all in the visible treemap cells. Even if you don't consider them as high reputation sources, you cannot permanently block communication to/from them. These are all hosting providers, and we see other hosting providers with decent IP reputation also hosting malicious traffic sources. The hourly updated nature of our API-downloadable block lists, combined with the scheduled roll off of IP addresses when they stop sporting malicious activity after a while, means you can make your organization safe from these IPs while they are trying to do harm.

It's all about the tags

What you, and we, truly care about is the tagged, non-benign traffic.

The Tagged Traffic Distribution Takes A Familiar Shape

Of the 115 identified tags, those associated with RDP, SMB, and MS SQL attacks topped the list, along with scrapers looking for useful information. If you still haven't removed RDP from your perimeter, please stop reading and take the opportunity to do so now.

Head on over to the Observable notebook that houses all these charts and data for a more interactive version of the chart and data tables.

You can see the aforementioned yellow SMBv1 Crawler August 28th spike right after "12 PM".

Key Takeaways

When you deploy a new internet-facing system, you have only seconds until unwanted traffic comes knocking on your door. After that, there's a constant drumbeat, and sometimes even an entire off-key orchestra, of unsought after:

  • benign traffic attempting to maintain an inventory of internet-connected devices and services;
  • malicious traffic with a laundry list of goals to achieve; and,
  • unknown traffic that could be something we and the rest of the cybersecurity community have not yet identified as malicious, but is definitely something your apps and other services might not be able to handle.

You can sign up for a GreyNoise account to start exploring the tags and networks identified in this post, and check out some of our freshly minted new features such as IP Similarity — which lets you hunt for bad actors exhibiting behavior similar to ones we've tagged — and, IP Timeline — where you can see what sources have been up to, and how their behavior has changed over time.

Get Started For Free

Cribl + GreyNoise: Solving Common Struggles for SOC Analysts and Security Engineers

Yesterday, our founder & CEO, Andrew Morris, got to join Ed Bailey from Cribl for a live stream conversation discussing how to help SOC analysts overcome common struggles and improve security detections. Over the years, we’ve built a great relationship with Cribl and truly believe in our “Better Together” message. The Cribl + GreyNoise integration is available now, so if you want to learn more about it, let us know.

Check out the full live stream below:

The Highlights

GreyNoise Released the Triple Threat 

During the conversation, Andrew mentions our new product features. We put out a series of blogs and a press release last week if you want to learn more. 

Why SOC Analysts Are Struggling

You can feel Andrew’s excitement when Ed poses this question. Here is how Andrew broke it down:

  1. The internet is extremely noisy.
  2. The SOC is being asked to "do more with less."
  3. False positives are wasting their time.

In addition, Ed explains that 30% of your detections are things that just don’t matter. With better data & context (like GreyNoise) you can finally ignore the noise. This prevents wasting hours and hours analyzing alerts and events that don’t matter.  

All Logs Are NOT Created Equal

Some security teams struggle to determine which logs matter, or assume that storing logs and processing data is all or nothing. Those with years of experience in the SOC know this isn’t true. Not only do different event types have different analytical value, but logs from certain places also matter more than others. So, how do you scale this knowledge?

Stop Chasing Ghosts

So what does GreyNoise do? We help our customers understand the alerts and events that DON’T matter. It’s kind of the opposite of a typical threat intel feed. By eliminating the noise you can focus on what really matters.

Cribl + GreyNoise Are “Better Together”

  1. GreyNoise solves the problem of what log content matters and what is noise
  2. Cribl allows you to use that GreyNoise insight to funnel and store your logs in a way that optimizes for better detections, lower bills and faster decisions that result in a more secure organization.

Follow Andrew and GreyNoise on Twitter

Big thanks to Ed Bailey and the Cribl team for letting us join. Hopefully you found this information interesting and insightful. If you want to learn more about our Cribl integration, contact us.

Try GreyNoise For Free

7 benefits of a GreyNoise paid plan

(And when you should stick with our free version)

Giving back to the cyber security community will always be a key part of the GreyNoise mission, so our free plan isn’t going anywhere. 

But there are a lot of benefits to a paid plan that may not be immediately obvious (benefits other than subsidizing Andrew’s tweets). Let’s dig into the top 7 reasons you should upgrade, and when a paid plan might not be a good fit for you. 

Increased Search limits

The most obvious reason for a user to upgrade from free to paid is the expanded Search limits. Search is at the core of our product: it’s the first thing users see when they land on our Visualizer and the primary way most users interact with GreyNoise. Asking GreyNoise for data on an IP, a CVE, a tag, or a trend all counts as a Search.

The Free limits are designed for hobbyists, independent users, or for someone who is just starting to explore the GreyNoise ecosystem. They aren’t high enough to tap into automation or large scale data enrichment. So if you’re a team who wants to save time by throwing out all the hay to get to the needles faster, a paid plan will give you the volume you need. 

Increased limits on alerts and dynamic blocklists

In addition to Search, all paid customers get increased limits for alerts and dynamic blocklists - the other two features that make up the core of the GreyNoise product. 

Alerts let you configure an email notification that will trigger any time the response to a GreyNoise Query Language (GNQL) query changes. Use them to identify compromised devices on your network (or a vendor or third-party supplier’s network), or get a heads up when attackers start exploiting a vulnerability in the wild. 
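
To make the "response changes" idea concrete, here is a minimal sketch of alert-style change detection. The query string uses GNQL's `field:value` form, but the exact fields you would use depend on your use case. The IP range and both result sets are made-up stand-ins, and the helper names are hypothetical.

```python
# Sketch of alert-style change detection on GNQL query results.
# The result sets are hard-coded stand-ins for what a saved query might
# return on two consecutive runs; nothing here calls the GreyNoise API.

def network_query(cidr: str) -> str:
    """Build a GNQL-style query for malicious activity from an IP range."""
    return f"ip:{cidr} classification:malicious"


def diff_results(previous: set, current: set) -> dict:
    """Report which IPs appeared or disappeared between two runs."""
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


query = network_query("203.0.113.0/24")  # documentation range, not a real network
yesterday = {"203.0.113.5"}
today = {"203.0.113.5", "203.0.113.77"}
change = diff_results(yesterday, today)
print(query)
print(change)
```

A non-empty `added` list is the kind of change that would be worth an email.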

Dynamic blocklists give you a full list of IPs associated with a GreyNoise tag. The list is updated hourly, and can be plugged directly into your next gen firewall to keep your perimeter safe. When the next Log4j happens, you can use a dynamic blocklist to buy your team needed time to patch. 
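
Before loading any externally sourced list into a firewall, it is worth validating it. The sketch below assumes the blocklist arrives as plain text with one IP per line. That format is an assumption for illustration, so check the format your firewall integration actually consumes.

```python
import ipaddress

# Sketch: validate and de-duplicate a one-IP-per-line blocklist.
# The plain-text format is an assumption made for this example.

def parse_blocklist(text: str) -> list:
    """Return valid, de-duplicated IPs in their original order."""
    seen = set()
    result = []
    for line in text.splitlines():
        candidate = line.strip()
        if not candidate or candidate.startswith("#"):
            continue  # skip blank lines and comments
        try:
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue  # skip anything that is not a valid IP address
        if ip not in seen:
            seen.add(ip)
            result.append(ip)
    return result


sample = "198.51.100.7\n198.51.100.7\nnot-an-ip\n\n203.0.113.9\n"
blocklist = parse_blocklist(sample)
print(blocklist)
```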

A more robust API with full IP context

The Community API has one endpoint. It takes an IP and returns a simple response with some basic information about that IP: whether the IP is in the GreyNoise Noise or RIOT datasets, what its classification is, and a link to the Visualizer.
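
A minimal sketch of how a tool might interpret that kind of response is shown below. The key names (`ip`, `noise`, `riot`, `classification`, `link`) are modeled on the description above and should be treated as assumptions. The sample values are placeholders, and no API call is made.

```python
# Sketch: turn a Community-style response into a one-line summary.
# Key names are assumptions based on the fields described in the text;
# the sample dict is a placeholder, not real GreyNoise output.

def summarize(resp: dict) -> str:
    ip = resp.get("ip", "unknown IP")
    if resp.get("riot"):
        return f"{ip}: common business service (RIOT)"
    if resp.get("noise"):
        label = resp.get("classification", "unknown")
        return f"{ip}: internet scanner, classification={label}"
    return f"{ip}: not observed by GreyNoise"


sample = {
    "ip": "192.0.2.1",
    "noise": True,
    "riot": False,
    "classification": "malicious",
    "link": "https://example.com/ip/192.0.2.1",  # placeholder link
}
summary = summarize(sample)
print(summary)
```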

For some teams this is enough. But for teams that are stretched thin and need comprehensive answers fast, rely on automation to make their lives easier, or want to run more complex queries (like pulling back a list of IPs tagged with a specific CVE), access to our Enterprise API is a must. 

Our Enterprise API has 16 endpoints. Enterprise API users can: 

  • Get abbreviated IP info or full context (on up to 1,000 IPs in a single request)
  • Run full GNQL queries 
  • Get tag metadata
  • Check an IP against our RIOT dataset (a list of IPs known to be associated with common vendors)
  • Access our IP Similarity endpoints 
  • Access our IP Timeline endpoints
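
Because a single request covers at most 1,000 IPs, a client enriching a larger list has to split it into batches. Here is a minimal sketch of that step. The `batch` helper is hypothetical and only does the chunking, not the request itself.

```python
# Sketch: split a large IP list into batches that respect the
# 1,000-IPs-per-request limit mentioned above.

MAX_IPS_PER_REQUEST = 1000

def batch(ips: list, size: int = MAX_IPS_PER_REQUEST):
    """Yield consecutive slices of at most `size` IPs."""
    for start in range(0, len(ips), size):
        yield ips[start:start + size]


# 2,500 made-up private addresses, just to exercise the helper.
ips = [f"10.0.{i // 256}.{i % 256}" for i in range(2500)]
batches = list(batch(ips))
print([len(b) for b in batches])
```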

A full list of our Enterprise API endpoints

An example of a full context response. The response for this IP is 132 lines and includes full metadata, ports scanned, web paths and user agents, and fingerprints.

More integrations, and expanded integration options

We have 31 integrations that support using a Community API key. But these integrations are limited by what the Community API can return, so if you need full context on the IP (which most users do) you'll have to click into the Visualizer to get a full picture.

Full integrations, including integrations with some of the most popular security tools like Splunk, QRadar, Logstash, and Recorded Future, are available to paid customers. You’ll also want the higher Search limits that come with our paid plans to maximize our integrations and automate more of your work.

For a full list of integrations check out the GreyNoise docs.

Exclusive paid features

While limited access to the core GreyNoise features will always be available to free users, there are a handful of features that are only available to Paid customers, with more being added this year. These features can be used for enhanced enrichment, threat hunting, and protecting your perimeter from mass exploitation. 

Premium data fields

Premium data fields like our IP Destination fields tell users even more about what an IP is doing on the internet, and open up new Search queries. IP Destination specifically tells users which countries our sensors observed an IP scanning, and lets users narrow down their searches to geo-targeted traffic. All paid plans come with IP Destination. 

Export

Getting data out of the visualizer (or accessing full context in the API) is restricted to paying customers. You can export GNQL search results or analysis results to a CSV or JSON file. 

IP Timeline

IP Timeline lets security analysts and threat hunters look back at an IP’s behavior over time. Answer the question, “what was this IP doing 7 days ago?” Useful if you find an IP in your logs and want to know what it was doing the day it hit your system. 

Use our IP Timeline feature to understand how an IP’s behavior changes over time. 

IP Similarity

IP Similarity helps users identify potential actors and infrastructure associated with an IP you’re investigating. All users can see how many IPs GreyNoise has identified as similar to a given IP, but only paying customers can access the list of similar IPs and a breakdown of the factors that determine similarity. 

Use our IP Similarity feature to identify potential actors or infrastructure amongst internet scan data.

Feeds

Feeds are a useful way to enrich your existing data without blowing up your Search limit, or to narrow down a search into a big dataset. 

Enterprise support

Paying customers get direct access to our Customer Success team who have a deep knowledge of the GreyNoise product, integrations, and customer use cases. While we will always do our best to support all of our users, our Customer Success team goes above and beyond for our customers with onboarding, training sessions, and quarterly check-ins.

Convenience

One of the most important benefits of a paid plan is convenience, and our best customers get this. You’re strapped for time. You’re trying to keep up with changing tactics from the bad guys, training and hiring good analysts, and the latest demands from the rest of your org. Any time you can save has real value. 

GreyNoise has proven time and time again that we save our paying customers time and generate a pretty significant ROI.

Upgrade to a paid plan if…

There’s a lot of value unlocked when you move from a Free GreyNoise account to one of our paid plans. You probably want to upgrade if you fit into one or more of these buckets.

  • You have a mid-size or large team heavily leveraging the visualizer for manual alert triage 
  • You want to build automations or enrichments leveraging GreyNoise data into your workflows
  • You’re using Splunk, QRadar, Panther or one of the other tools supported by our paid integrations
  • You want an easy way to monitor large blocks of IP space for malicious behavior
  • You want to block IPs scanning for emerging threats from touching your perimeter entirely
  • You are doing advanced threat hunting 
  • You want to leverage GreyNoise data in your product or service
  • You need enterprise support
  • You value saving your analysts time anywhere you can

A GreyNoise paid plan isn’t necessary for everyone, we get that. Don’t worry about a paid plan right now if you fit into one of these buckets:

Your team has a manageable number of alerts

Look, if you’re a team of 1 or 2 analysts, and you can easily handle all of the perimeter-related alerts in your SIEM then you’re probably not going to get a ton of benefit from a GreyNoise subscription right now. You can always create a free account, and use it as needed when you have questions about a strange IP or hot, new CVE. 

You outsource your security program

If most or all of your security program is managed by MSSP/MDR partners, they can leverage GreyNoise on your behalf to provide you a better service more cost-efficiently. Some of our rockstar MSSP/MDR partners include:

You’re a student, academic, or independent researcher

If you’re a student, academic, or an independent researcher using GreyNoise for non-commercial purposes then you may actually qualify for our VIP program. VIP users get access to all of the same features and benefits as paying customers, at no cost. 

Check out our pricing to learn more

In the end it’s up to you. GreyNoise isn’t going to be a good fit for everyone, and that’s okay. You can always use GreyNoise for free, and reach out later when your security team has grown. But if these benefits resonated with you, then consider reaching out to our sales team. They’re here to help, not be pushy, and know a ton about GreyNoise. 

Introducing IP Timeline

When running across an unknown IP address in the logs, the first move might be to check the IP address’s reputation through a number of services.  This check is useful for the immediate task at hand, but what if you could see not only reputation reports but see, at a granular level, when and what is causing this reputation?  That’s where GreyNoise comes in. 

Alongside the common fields on a GreyNoise IP address page in the Visualizer (which include relevant DNS information, destination sites, and other data), GreyNoise now has a feature called the IP Timeline. The IP Timeline displays a particular IP address’s activity, as seen by GreyNoise sensors, over the past thirty days.  Let’s take a look at an IP address and explore this tool further.

Getting Started – IP address page

When an IP address is entered into the GreyNoise search box, if GreyNoise has observed scan activity from the IP, you will receive an IP Address page detailing data and related tags:

In this example (IP address 41.65.223.220), an opportunistic scanner appears to be crawling for SMBv1 endpoints and trying to brute force MSSQL servers.

Within the fields displayed are which ports this address scans, any associated fingerprints and what kinds of web requests the IP is known to make. In this case, GreyNoise does not have a lot of fingerprint and request data. So, how can we know for sure that this IP address is still active and malicious?

This is where the IP Timeline feature comes into play.  Next to the summary of the IP address, there’s a tab labeled ‘timeline’.  Let’s click that and see what we find:

Voila!  The page has gone from an overview of the IP address to discrete data points showing when, exactly, GreyNoise has noticed activity.  Consecutive days of the same activity are connected by a line.

Each data point GreyNoise has for the IP address is a field along the Y axis, and each day on which GreyNoise observed it is a position along the X axis. You can see the respective fields and dates along the left and top sides of the graph.  In this example, you can see that on the 1st of January there is SMBv1 crawling observed.  Then, on the 2nd, there are MSSQL brute force attempts. The SMBv1 crawling has an unknown intention so it’s listed as white, while the MSSQL brute force attempts are highlighted in red as they are tagged as malicious activity.

How is this useful?

This graph can be used for more than just a quick check on an IP address.  For example: you are running an MSSQL server and found this IP address in your logs. Seeing somebody trying to brute force your server can be a nerve-wracking experience!  However, by checking this graph you can see that this address attempts to brute force every 8 days on the dot, implying an automated process. That’s still not great, but it’s less scary than a concerted human effort. From there, you could make the call to block at the firewall or you could make sure your passwords aren’t on any well-known word lists and continue to observe the IP address.
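
The "every 8 days on the dot" observation can also be checked programmatically. The sketch below takes the days on which an activity was observed and reports the interval if it is perfectly regular. The dates are made up for illustration, and `activity_interval` is a hypothetical helper rather than part of any GreyNoise tooling.

```python
from datetime import date

# Sketch: detect a fixed recurrence interval in observed activity days.
# Dates below are invented to mirror the 8-day pattern described above.

def activity_interval(days: list):
    """Return the gap in days if activity recurs at a fixed interval, else None."""
    ordered = sorted(set(days))
    if len(ordered) < 3:
        return None  # too few observations to call it a pattern
    gaps = {(later - earlier).days for earlier, later in zip(ordered, ordered[1:])}
    return gaps.pop() if len(gaps) == 1 else None


brute_force_days = [
    date(2023, 1, 2),
    date(2023, 1, 10),
    date(2023, 1, 18),
    date(2023, 1, 26),
]
interval = activity_interval(brute_force_days)
print(interval)
```

A perfectly regular interval suggests a scheduled job. Real data may have jitter, so a tolerance on the gaps could be a sensible extension.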

Additionally, this graph can come in handy if you are looking for behavior patterns on an IP address.  In this example, we only see two cases where the IP address crawls SMBv1 and then attempts to brute force the next day. If this were consistent, though, it might indicate how the operator decides which hosts to try to brute force. You could then use that information to pivot into checking your SMB logs for anything suspicious.
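
A "crawl one day, brute force the next" sequence can be counted with a small helper like the one below. Both sets of dates are hypothetical, chosen so that the sequence appears exactly twice.

```python
from datetime import date, timedelta

# Sketch: find days where one activity was followed by another the next day.
# The observation dates are invented for illustration.

def followed_next_day(first: set, second: set) -> list:
    """Days on which `first` activity was followed by `second` one day later."""
    return sorted(d for d in first if d + timedelta(days=1) in second)


smb_crawl_days = {date(2023, 1, 1), date(2023, 1, 9), date(2023, 1, 20)}
mssql_brute_days = {date(2023, 1, 2), date(2023, 1, 10), date(2023, 1, 18)}
sequences = followed_next_day(smb_crawl_days, mssql_brute_days)
print(sequences)
```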

Final Takeaways

GreyNoise is always looking for new ways to bring as much value as possible.  IP Timeline data is only one part of a much bigger ecosystem you can integrate into your processes and investigations. Try it out yourself by signing up for our *enterprise trial or contact us to schedule a more in depth demo.

(*Create a free GreyNoise account to begin your enterprise trial. Activation button is on your Account Plan Details page.)

Get Started With GreyNoise For Free

Introducing IP Similarity

Why we created the IP similarity feature

While we at GreyNoise have been collecting, analyzing, and labeling internet background noise, we have come to identify patterns among scanners and background noise traffic. Often we’ll see a group of IPs that have the same User-Agent or are sending payloads to the same web path, even though they are coming from different geo-locations. Or, we might see a group that uses the same OS and scanned all the same ports, but they have different rDNS lookups. Or any other combination of very similar behaviors with slight differences that show some version of distributed or obfuscated coordination.

With our new IP Similarity feature, we hope to enable anyone to easily sniff out these groups without having an analyst pore over all the raw data to find combinations of similar and dissimilar information. Stay tuned for an in-depth blog covering how we made this unique capability a reality, but for now, here’s a quick snapshot of what the feature does and the use cases it addresses.

The GreyNoise dataset

GreyNoise has a very rich dataset with a ton of features. For IP Similarity we are using a combination of relatively static IP-centric features, things we can derive just from knowing what IP the traffic is coming from or their connection metadata, and more dynamic behavioral features, things we see inside the traffic from that IP. These features are:

IP Centric 

  • VPN
  • Tor
  • rDNS
  • OS
  • JA3 Hash
  • HASSH

Behavioral 

  • Bot
  • Spoofable
  • Web Paths
  • User-Agents
  • Mass scanner
  • Ports

Of note, for this analysis we do not use GreyNoise-defined Tags, Actors, or Malicious/Benign/Unknown status, as these would bias our results based on our own derived information.
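
To give a feel for how set-valued behavioral features can be compared, here is a toy sketch that uses Jaccard overlap averaged across a few features. This is not how GreyNoise computes IP Similarity. It is a deliberately simple stand-in, and the feature keys and sample values are assumptions.

```python
# Toy sketch: average Jaccard overlap across a few behavioral features.
# A simple stand-in technique, not GreyNoise's actual method.

def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0  # design choice: two empty sets count as identical
    return len(a & b) / len(a | b)


def similarity(ip_a: dict, ip_b: dict,
               features=("ports", "web_paths", "user_agents")) -> float:
    """Mean Jaccard score over the named features."""
    scores = [jaccard(set(ip_a.get(f, ())), set(ip_b.get(f, ()))) for f in features]
    return sum(scores) / len(scores)


host_a = {"ports": {22, 80, 443}, "web_paths": {"/"}, "user_agents": {"example-agent"}}
host_b = {"ports": {80, 443}, "web_paths": {"/"}, "user_agents": {"example-agent"}}
score = similarity(host_a, host_b)
print(round(score, 3))
```

Leaving out derived labels such as tags or classification, as described above, keeps a comparison like this from simply echoing earlier conclusions.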

What GreyNoise analysis can show you

The output of the IP Similarity feature has been pretty phenomenal, which is why we’re so excited to preview it. 

We can take a single IP from our friends at Shodan.io, https://viz.greynoise.io/ip-similarity/89.248.172.16, and return 19 (at the time of writing) other IPs from Shodan.

Figure 5: IP Similarity of 89.248.172.16  as shown in GreyNoise. 

And we can compare the IPs side by side to find out why they were scored as similar.

Figure 6: IP Similarity Details 

While we have an Actor tag for Shodan which allows us to see that all of these are correct, IP Similarity would have picked these out even if they were not tagged by GreyNoise.

Key use cases 

As with any machine learning application, the results of IP Similarity will need to be verified by an aware observer, but this new feature holds a lot of promise for allowing GreyNoise users to automatically find new and interesting things related to their investigations. In fact, we see some immediate use cases for IP Similarity to help accelerate and close investigations faster, with increased accuracy, and provide required justifications before acting on the intelligence. For example:  

  • For cyber threat intelligence analysts: Use IP Similarity to generate a list of IP addresses that are similar to a target IP address identified/associated with specific malicious activity. This could entail generating a list of IPs similar to an IP observed executing a brute force attack.

  • For threat hunters: Use IP Similarity to identify a list of IP addresses similar to your “hypothesis” IP address. You can then search for them inside your network (perhaps within your Splunk SIEM or leveraging NetFlow data). The goal here would be to proactively find compromises from related threat actors. 
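
The threat hunting step above boils down to matching a list of similar IPs against your own traffic records. The sketch below assumes flow records are dictionaries with `src_ip` and `dst_ip` keys. Both the record shape and the sample data are hypothetical.

```python
# Sketch: find flow records that involve any IP from a "similar IPs" list.
# Record shape (src_ip / dst_ip keys) is an assumption for illustration.

def find_matches(similar_ips: list, flow_records: list) -> list:
    """Return records whose source or destination is in the similar-IP set."""
    wanted = set(similar_ips)
    return [
        record for record in flow_records
        if record["src_ip"] in wanted or record["dst_ip"] in wanted
    ]


similar_ips = ["198.51.100.23", "203.0.113.40"]
flows = [
    {"src_ip": "10.1.1.5", "dst_ip": "198.51.100.23", "dst_port": 443},
    {"src_ip": "10.1.1.6", "dst_ip": "192.0.2.99", "dst_port": 53},
    {"src_ip": "203.0.113.40", "dst_ip": "10.1.1.7", "dst_port": 1433},
]
hits = find_matches(similar_ips, flows)
print(len(hits))
```

Outbound connections to a similar IP, like the first record here, are often the more interesting hits, since they may point to a compromised internal host.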

Get Access to IP Similarity

For more on how you can use IP Similarity in your investigations, check out our recent blog from Nick Roy covering use cases of IP Similarity. You can also read more about IP Similarity in our documentation. IP Similarity is available as an add-on to our paid GreyNoise packages and to all VIP users. If you’re interested in testing these features, sign up for a free trial account today!*

(*Create a free GreyNoise account to begin your enterprise trial. Activation button is on your Account Plan Details page.)

Try GreyNoise for free

Streamline your Threat Detections with Elastic and Tines

Our friends over at Elastic published a great piece on how to set up Distributed Alerting with Elastic Stack - and gave a shout out to us! As big proponents of SOC efficiency and sleep-filled nights for Incident Response teams, using GreyNoise data in your process with Elastic Stack and Tines can (1) prevent false positive alerts from ever reaching your SOC teams and (2) help provide additional context when you do receive an alert. 

Elastic and GreyNoise workflow. Source: Elastic

Our sensors are placed around the world to passively capture network traffic and give you details on what IPs are scanning the internet on a daily basis. We then tag the behavior we’re seeing with additional details and deliver that context to our users to help you understand what traffic is benign and what traffic is attempting to exploit everyone (or just you). 

GreyNoise data on 89.248.172.16, currently operated by Shodan.io. Source: GreyNoise

Beyond our Noise dataset as described above, we also provide data on common Internet services (Google DNS, Apple, CDNs, etc.), known as RIOT. 

GreyNoise data on 8.8.8.8 (Google DNS). Source: GreyNoise

By adding GreyNoise data (Noise and/or RIOT) into your Distributed Alerting workflow, we can help enrich logs with full context data on IPs seen by our sensors. As our friends at Tines highlighted, “Simply put, if an IP is classified in RIOT and has a trustworthy confidence level, the impact of the alert is very likely to be minimal, and it can be set as a low priority or closed immediately. Suppose the IP has a somewhat trustworthy confidence level. In that case, the priority could be raised, and/or additional risk factors could be added in to arrive at a confident automated determination of the status of the alert before any analyst review.”
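
The decision logic in that quote can be sketched in a few lines. The `riot` and `trust` keys and the trust labels below are placeholders chosen to mirror the wording of the quote. They are not the field names or values of any particular API or integration.

```python
# Sketch of the triage logic described in the quote above.
# Key names and trust labels are placeholders, not real API fields.

def triage(enrichment: dict) -> str:
    """Map RIOT-style enrichment to an alert priority."""
    if not enrichment.get("riot"):
        return "review"  # not a known common service: leave for an analyst
    trust = enrichment.get("trust")
    if trust == "trustworthy":
        return "low"  # minimal impact: deprioritize or close
    if trust == "somewhat_trustworthy":
        return "medium"  # raise priority, combine with other risk factors
    return "review"


alerts = [
    {"riot": True, "trust": "trustworthy"},
    {"riot": True, "trust": "somewhat_trustworthy"},
    {"riot": False},
]
priorities = [triage(a) for a in alerts]
print(priorities)
```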

Automated Workflow Example. Source: Tines

GreyNoise has both Elastic and Tines integrations available to our paid users, which we’re happy to demonstrate further. Whether you’re trying to tune your alerting system or digging deeper in an investigation, we hope you check out our data further. 

Start an enterprise Trial today.*

(*Create a free GreyNoise account to begin your enterprise trial. Activation button is on your Account Plan Details page.)

Get Started With GreyNoise for Free

GreyNoise Analysis Of A Quartet of Exchange Remote Code Execution Vulnerabilities: CVE-2023-21529; CVE-2023-21706; CVE-2023-21707; CVE-2023-21710

Microsoft’s Patch Tuesday (Valentine’s Edition) released information on four remote code execution vulnerabilities in Microsoft Exchange, impacting the following versions:

  • Exchange Server 2019
  • Exchange Server 2016 
  • Exchange Server 2013

Attackers must have functional authentication to attempt exploitation. If they are successful, they may be able to execute code on the Exchange server as SYSTEM, a mighty Windows account.

Exchange remote code execution vulnerabilities have a bit of a pattern in their history. This history is notable due to authentication being a requirement for exploitation of these newly announced vulnerabilities.

CVE-2023-21529, CVE-2023-21706, and CVE-2023-21707 resemble CVE-2022-41082, which GreyNoise covered back in September 2022, in that they all require authentication to achieve remote code execution. Readers may know those September 2022 vulnerabilities under the “ProxyNotShell” moniker, since an accompanying Server-Side Request Forgery (SSRF) vulnerability was leveraged to bypass the authentication constraint. “As per our last email,” we noted this historical pattern of Exchange exploitation in prior blogs and have tracked recent related activity under the Exchange ProxyNotShell Vuln Check tag, which sees regular activity.

Shadowserver, a nonprofit organization which proactively scans the internet and notifies organizations and regional emergency response centers of outstanding exposed vulnerabilities, noted that there were over 87,000 Exchange instances vulnerable to CVE-2023-21529 (the most likely vulnerability entry point of the four new weaknesses). 

As of the publishing date of this post, there are no known, public proof-of-concept exploits for these new Exchange vulnerabilities. Unless attackers are attempting to bypass web application firewall signatures that protect against the previous server-side request forgery (SSRF) weakness, it is unlikely we will see any attempts to mass exploit these new weaknesses any time soon. Furthermore, determined attackers have been more stealthy when it comes to attacking self-hosted Exchange servers, amassing solid IP address and domain inventories of these systems, and retargeting them directly for new campaigns.

GreyNoise does not have a tag for any of the four new Exchange vulnerabilities but is continuing to watch for emergent proof-of-concept code and is monitoring activity across the multi-thousand node sensor network for anomalous Exchange exploitation. Specifically, we are keeping a keen eye on any activity related to an SSRF bypass or Exchange credential brute-force meant to meet the authentication constraints needed by an attacker to leverage these vulnerabilities.

GreyNoise researchers will update this post if and when new information becomes available.

Given the likely targeted nature of new, malicious Exchange exploit campaigns, you may be interested in how GreyNoise can help you identify targeted attacks, so you can focus on what matters to your organization.

Don’t have a GreyNoise account? Sign-up for a free account.

Fingerprinting Attackers With IP Similarity

One of the things that I was really excited about when I joined GreyNoise was the amount of data gathered by our sensor network and how it could be used in other ways outside of the traditional SOC efficiency use case. I spend a lot of time working with GreyNoise data and, inevitably, find some thread to pull at that leads me down a rabbit hole. Our sensors see hosts that are scanning for known CVEs and misconfigurations and everything in between, but with the introduction of IP Similarity, we’re now able to group IPs by how they are operating, providing intelligence on botnets or even infrastructure used by adversaries. 

Prior to this feature, users had to look through GreyNoise data and extract this manually. Now, using a combination of ports, requests, and fingerprint data, the new IP Similarity feature does the work for you. In preparation for a meeting the other day, I was looking through some of the ICS tags published on GreyNoise and reading up on the Tridium NiagaraAX Fox ICS Scanner tag. It is always interesting to see who’s scanning for these devices, and to look at the results on Shodan/Censys, since they typically provide details about the building the devices are located in.

GreyNoise Unknown IP

Digging into the results, there were a couple of IPs that stood out immediately, especially the ones that were classified as unknown. If you’re not familiar with GreyNoise, these are IPs that we’ve observed scanning the internet but haven’t seen any malicious activity from. There were a number of IPs that were scanning for ICS tags, and several in particular seemed to be opportunistically scanning for really any kind of SCADA device exposed to the internet. In this case, there is a JA3 fingerprint that we can pivot on, but the hash 19e29534fd49dd27d09234e639c4057e returns over 7,000 results.

That’s still a lot of information to parse through, and we could use a combination of requests and tags, but by using IP similarity and filtering down to a confidence score of 95% or greater, we can quickly find 8 other IPs that have been active in the last month all performing similar reconnaissance. This now gives us a way to start further investigating infrastructure and better understand what these IPs are targeting. 
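
Filtering down to a 95% confidence score is a one-liner once you have the results in hand. The sketch below assumes each result is a dictionary with `ip` and `score` keys, with the score expressed between 0 and 1. The result shape and the sample values are hypothetical.

```python
# Sketch: keep only similarity results at or above a score threshold.
# The result shape (ip / score keys) and values are assumptions.

def high_confidence(results: list, threshold: float = 0.95) -> list:
    """Return results with score >= threshold, highest score first."""
    kept = [r for r in results if r["score"] >= threshold]
    return sorted(kept, key=lambda r: r["score"], reverse=True)


results = [
    {"ip": "198.51.100.1", "score": 0.97},
    {"ip": "198.51.100.2", "score": 0.91},
    {"ip": "198.51.100.3", "score": 0.99},
    {"ip": "198.51.100.4", "score": 0.95},
]
shortlist = [r["ip"] for r in high_confidence(results)]
print(shortlist)
```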

Coming from the SOAR world, we were always looking for sources that could be used to make better decisions when automating responses or threat hunting. IP similarity now makes it even easier to identify common infrastructure and botnets in order to further hunt based on the data. It also makes it easy to create a blocklist since we know that these IPs have all recently had very similar scan behaviors. 

I’m really excited about the different ways people are going to be using IP similarity and the benefits they will get from this new feature. Try IP Similarity for free with our enterprise trial** or contact us for a demo.

(**Create a free GreyNoise account to begin your enterprise trial. Activation button is on your Account Plan Details page.)

The Real Time and Money Savings of Using GreyNoise

Figuring out if a security product is right for you is hard. Beyond the technical problem it solves, you have to make a business case for why those with purchasing power in your company should buy your favorite security tool vs. putting the money to another use. Most of the time, the rationale is “Gartner says it’s cool” or check out this testimonial that is definitely not from the company’s CEO’s cousin. We wanted to take a more data-based approach, which is the inspiration for creating our ROI calculator.

To do this, we surveyed our customers, the people that pay us real-life dollars to use GreyNoise, about how our products have been used in their day-to-day work. We asked about their company: how big it is, what sort of security team they have, what sort of work they do, and the number and average time of investigations. We also asked about GreyNoise’s impact, how often it is helpful in an investigation/project, how much time it saves, threats found, IP coverage, efficiency gain, etc.

A screenshot of the new GreyNoise ROI Calculator.

From aggregating this data and segmenting it by the size of the company and type of work, we’re able to use real customer insights to give you an expectation of what GreyNoise’s value to your company could be. We know you love GreyNoise, and we hope this proves helpful when advocating to get the tooling you need to do your job effectively!
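
For readers who want to sanity-check the arithmetic behind a time-savings estimate, here is a back-of-the-envelope sketch. It is not the formula used by the ROI calculator. The function is hypothetical, and every number below is a made-up input that reflects neither survey results nor pricing.

```python
# Back-of-the-envelope sketch of a time-savings ROI estimate.
# Not the ROI calculator's formula; all inputs are hypothetical.

def annual_savings(investigations: int, share_helped: float,
                   hours_saved_each: float, hourly_cost: float) -> float:
    """Analyst cost avoided per year from faster investigations."""
    return investigations * share_helped * hours_saved_each * hourly_cost


def roi(savings: float, tool_cost: float) -> float:
    """Net return per unit of money spent on the tool."""
    return (savings - tool_cost) / tool_cost


savings = annual_savings(
    investigations=2000,    # hypothetical investigations per year
    share_helped=0.5,       # hypothetical share where the tool helps
    hours_saved_each=0.25,  # hypothetical hours saved per helped investigation
    hourly_cost=60.0,       # hypothetical loaded analyst cost per hour
)
estimate = roi(savings, tool_cost=10000.0)  # hypothetical annual tool cost
print(savings, estimate)
```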

Try our new ROI Calculator and see how much GreyNoise could save your organization.

Not yet familiar with GreyNoise? We collect, analyze and label data on IPs that scan the internet and saturate your security tools with noise. This unique perspective helps analysts spend less time on irrelevant or harmless activity and more time on targeted and emerging threats. Sign up for our free plan to see for yourself!
