The cybersecurity digerati spend an inordinate amount of time focusing on the concept of “biggest” when it comes to cybersecurity threats. While there is some merit to such quantification, the concept itself can be difficult to generalize, since every organization has unique characteristics that give it a fairly distinct threat profile, risk tolerance, and exposure.
We can, however, break down some of the broader topics from Black Hat and DEF CON 2023 and pull out the recurring themes that would cause some consternation for CISOs, CIOs, CEOs, and board members (since many of them are now on the hook when cyber things go south).
Meltdown, Spectre, and L1 Terminal Fault (Foreshadow) may be behind us, but modern processor architectures seem to be in a permanent state of vulnerability. Downfall is yet another one in this line of low-level flaws that require significant effort to mitigate, as said mitigations usually require some downtime and also some inherent risk in the patch processes themselves.
Fixing these vulnerabilities also may cause significant performance degradation, which may force organizations to incur extra spend to meet pre-projected capacity requirements.
C-suite folks are left with a gnarly, tangled risk assessment process that has to consider the likelihood and frequency of projected attacks and also the potential impact on various compliance requirements if they choose not to patch/mitigate.
This is a major distraction from delivering on core business functions, and we’re likely to see more of these types of vulnerabilities in the future, especially with the scramble to acquire GPUs. CVE-2022-31606, CVE-2021-1074, and CVE-2021-1118 are already known vulnerabilities in GPUs, and the rush to meet AI headfirst may see a parallel set of headaches on the horizon in any systems that are performing advanced ML/AI processing.
It’s no secret that an ever-increasing number of organizations are moving some or many workflows to cloud environments, joining the ranks who have blazed the trail before them. There was supposed to be some inherent level of trust in cloud providers to take security seriously, so all an organization had to do was ensure they didn’t mess up their configs or expose vulnerable services. Sadly, that has not been the case for some time now.
The specifics of what were presented at or before Hacker Summer Camp in this space really aren’t as important as the theme itself: you can no longer even remotely have any baseline level of assurance that the cloud environments you are adopting are taking security measures seriously.
This puts C-suite folks in a precarious position. While some cloud plans end up going over budget, there are many cloud use cases that do help organizations save time, money, and people resources. Yet, when you are put at serious risk due to negligence on the part of a cloud provider you have the potential of incurring significant costs for triage, incident response, and potentially data breach penalties.
2023 has made it pretty clear that “In Cloud, We Trust” is unlikely to ever be a motto again (if it ever was). Organizations now have extra complexity both up-front (as they bake in extra security measures and potential incident costs into new endeavors) and also as they handle the distraction of retrofitting a more defensive security posture onto systems that were likely more secure when they were hosted back in the “owned data center” days.
There has been enough discussion about “AI insecurity” ever since just around this time last year, so I can keep this relatively brief.
The large language/generative models (LLM/GPTs) we seem to be stuck working with were all trained with no thought for safety — either in the results one gets when using them, or for how easy it is to cause them to reveal information they shouldn’t.
They also come with an equivalent to the “cloud” problem mentioned above, since most organizations lack the skill and resources necessary to bring AI fully in-house.
This is a big topic of discussion when I talk to CISOs in my lectures at CMU. The AI gold rush is causing organizations to incur significant, real risk, and there are almost no tools, processes, guides, or bulk expertise to help teams figure out ways to keep their data, systems, and humans safe from AI exposures.
This is yet one more distraction and focus grabber that makes it very difficult to just get “normal” cybersecurity practices done. Unless the bottom falls out of generative AI as it has with the web3/crypto fad that came before it, the C-suite will have to dedicate what little time and resources they have to corralling and shaping AI use across the organization.
There were many talks about vulnerabilities in general at both Hacker Summer Camp and RSA this year. But I don’t think any talk conveyed the brutal reality of what it is like to perform the thankless task of vulnerability management within even a moderately sized organization.
That calendar view has a colored square every time there’s been a CISA Known Exploited Vulnerability release since CISA began curating their catalog. Apart from the regular mega “Patch Tuesday” organizations have to deal with, they also have to contend with nigh immediate response to each new update, even if that’s only a triage event. There is little time in-between updates, and very common technologies/products make their way to the list on-the-regular.
In the six weeks before Black Hat, there was at least one major vulnerability in a core component of most enterprise IT stacks every week, with rapid and devastating malicious attacks following close behind each release.
This is an untenable situation for most organizations, even “resource rich” ones.
Hundreds (one estimate, today, said “thousands”) of organizations have been devastatingly hurt by MOVEit exploits, Citrix admins likely cannot sleep at night anymore, and even security teams have had to face an onslaught of patches for technology they’re supposed to be using to keep their organizations safe.
We rarely talk about this because it’s a hard problem to solve and causes real, lasting damage. It’s far “cooler” to talk about that “EPIC vulnerability” some clever researcher found in isolation. But, when they’re disclosed back-to-back, as a few security vendors did before Black Hat, it quickly moves from “cool” to a cold, hard annoyance.
More work needs to be done at and outside Summer Camp to help figure out ways to enable defenders to keep their own shops safe without dealing with the IT equivalent of a weekly hurricane sweeping across their asset landscapes.
“Recurring theme” is just a fancy way of saying we’re repeating the same negative patterns of the past and making little to no headway, or — to put it another way — we’re in a “two steps forward; three steps back” operational model as we work to overcome each new challenge.
However, all is not doom and gloom, and there are ways to strive for more positive outcomes.
Fundamentally, organizations must take a proactive and pragmatic approach to enhance their security posture.
For CPU vulnerabilities, investigate tools that can help detect and mitigate risks, and have a plan to rapidly patch and potentially downgrade performance if needed. Cloud providers should be evaluated closely, with redundancy and controls to limit damage from potential exposure. AI and generative models require robust testing, monitoring, and human oversight to prevent harmful outcomes.
Most crucially, vulnerability management programs require sufficient staffing, automation, and executive buy-in. Prioritization aligned to business risk can help focus limited resources. Communication and collaboration with vendors, regulators, and peer organizations could also move the needle on systemic issues.
While hacker conventions highlight scary scenarios, security leaders who take balanced action can still fulfill their mission to protect their organizations. With vision, realism, and tenacity, progress is possible even in the face of ongoing challenges.
Remember, GreyNoise has your back when it comes to vulnerability intelligence. We’re here to help you keep up with the latest CVEs, assist you in triaging a barrage of IoCs, and provide you with the essential details necessary to make sense of the ever-changing vulnerability landscape.
The Managed Security Service Provider (MSSP) and Managed Detection and Response (MDR) markets continue to face significant challenges in handling a large number of security alerts and vulnerabilities across multiple client environments. While this task is made even more difficult by the shortage of cybersecurity professionals in our industry, it is critical to note that the ideal solution isn’t adding more hands on deck. It's leveraging innovation and technology that amplifies the capabilities of existing teams.
MSSPs & MDRs require solutions that enable them to provide top-notch services to their clients while balancing already thin profit margins, all while ensuring they prevent analyst burnout. They need ways to quickly identify and respond to threats with confidence, without compromising on efficiency or service quality.
At GreyNoise, we understand the importance of every second in your margin-driven business. That's why we save you time, resources, and money – all while helping you expand your customer base. We gather, analyze, and categorize data on IPs that mass scan the internet and saturate security tools with noise. This allows analysts to spend less time on irrelevant or harmless activity and more time on targeted and emerging threats.
Here are just a few of the ways GreyNoise is helping our MSSP & MDR customers:
GreyNoise scales in a way your analysts can’t. But don’t just hear it from us – see how leading MSSP Hurricane Labs is reducing costs while growing their customer base with GreyNoise.
"Any single analyst can handle, say, 20 alerts per day. But a product like GreyNoise can triage alerts for every one of our customers. So as we add more customers, GreyNoise scales in a way a person can’t.”
-Director of Managed Services, Hurricane Labs
Want to learn more about how GreyNoise can help your MSSP & MDR? Schedule a demo with a GreyNoise expert.
As we roll through the summer, GreyNoise is back from its July two-week shutdown with a batch of fresh improvements, including 63 new tags and exciting new data insights for our customers to explore in our Labs API. We’ve also updated our integrations to add support for our IP Similarity and Timeline features for our Palo Alto customers.
We’re excited to announce the availability of our Labs API. The Labs Beta API is a data source derived from the GreyNoise sensors and platform specifically designed to uncover insights our users may find intriguing and to facilitate exciting data explorations related to emerging threats. These APIs are in beta today; however, we welcome feedback that will improve the quality of our data and suggestions on how we can add them to our product. Here are some of the datasets you can explore today:
Access the top 10% of possible Command and Control (C2) IP addresses, ranked by their pervasiveness, observed by GreyNoise over the previous 24 hours. Use this query to identify second-stage IP addresses that might be involved in malicious activities following the reconnaissance and initial access stages.
Access the top 1% of HTTP requests, ranked by their pervasiveness, observed by GreyNoise over the last seven days. Gain insights into the background radiation of the internet, exploring the patterns and trends of HTTP requests.
Access the top 1% of IPs searched in GreyNoise, ordered by the number of users observed searching over the last 7 days. Understand commonalities in how users search within GreyNoise, gaining insights into popular IPs and their associated activities. This query uses a minimum number of IP submissions and users to build consensus before an IP can be considered available in this dataset.
Access the top 1% of IPs by their noise score for the last 7 days. This score is determined by comparing the pervasiveness of the number of sensors and countries that observed packets from the IP, the request rate, and the diversity of payloads and ports for which the packets were observed. This query is intended to help rank the top noise makers compared to the quiet single-hit scanners.
We’ve added a “Create Alert” button in the Action panel on the Tag details page to make it easy to create an alert. GreyNoise users can use this to monitor scanning activity directly from the Tags page, informing them of any new IPs scanning for tags they are interested in.
There is now a Copy/Search button in fields on the IP details page. The previous behavior did not allow users to copy the values in the fields.
You can access the Copy/Search buttons by hovering over fields such as Ports Scanned, Country, OS in the IP Details pages.
Previously, the Analysis Feature only accepted inputs up to 2MB. We've increased this to 4MB, so that customers can submit larger files without getting an error.
We updated our Palo Alto XSOAR support to include our IP Similarity and IP Timeline features, allowing users to easily find similar IP addresses, or review GreyNoise’s classification history on an IP.
To learn more about using the XSOAR Demisto enhancements for IP Similarity and Timeline, you can check out our documentation.
In June & July, GreyNoise added 63 new tags:
56 malicious activity tags
2 benign actor tags
5 unknown tags
All GreyNoise users can monitor scanning activity we’ve seen for a tag by creating an alert informing them of any new IPs scanning for tags they are interested in.
Don't have a GreyNoise account? Sign-up for free.
(See below for the most recent update: 2023-08-03)
Citrix recently disclosed a single critical remote code execution (RCE) vulnerability, CVE-2023-3519, affecting NetScaler ADC and NetScaler Gateway (formerly known as Citrix ADC and Citrix Gateway). This vulnerability has a CVSS score of 9.8, making it a critical-severity issue.
GreyNoise has a tag — Citrix ADC/NetScaler CVE-2023-3519 RCE Attempt — that organizations can use to proactively defend against sources of known exploitation.
Over the past several days, numerous organizations have contributed their pieces of the puzzle, both publicly and privately. While the most recent Citrix Security Advisory identifies CVE-2023-3519 as the only vulnerability resulting in unauthenticated remote code execution, there are at least two vulnerabilities that were patched during the most recent version upgrade.
Through analysis by Rapid7 and Assetnote, a memory corruption vulnerability was discovered in the ns_aaa_saml_parse_authn_request function, which handles Security Assertion Markup Language (SAML) and can be reached through HTTP POST requests to “/saml/login”. This vulnerability has been demonstrated to corrupt memory and cause program crashes, but it is unknown whether it can be leveraged for remote code execution at this time.
Through analysis by Bishop Fox’s Capabilities Development team together with GreyNoise, a memory corruption vulnerability was identified in the ns_aaa_gwtest_get_event_and_target_names function. This function can be reached through HTTP GET requests to “/gwtest/formssso”. This vulnerability was demonstrated to be capable of stack corruption leading to remote code execution, and was further corroborated by Assetnote’s Part 2 analysis.
Through analysis from Mandiant, some indicators of compromise (IoCs) and post-exploitation activity are now known. As part of their IoCs, they shared that an HTTP POST request was used in initial exploitation, along with HTTP payloads containing “pwd;pwd;pwd;pwd;pwd;”, which may be useful for writing detection signatures.
On July 28th, GreyNoise began observing activity (https://viz.greynoise.io/tag/citrix-adc-netscaler-cve-2023-3519-rce-attempt?days=30) for CVE-2023-3519 in which the attacker attempted to leverage the vulnerability for memory corruption. An initial analysis of the observed payloads indicates that the attacker first sends a payload containing 262 `A`s, which would crash the Citrix NetScaler `nsppe` program. They follow up with two variants using URL-encoded values that appear to be attempts to remotely execute the command `/bin/sh -c reboot`, which would result in a full reboot of the system. However, the attacker may not be aware of the CPU endianness of vulnerable systems: the payloads they send would corrupt memory and crash the `nsppe` program, but would not achieve the remote code execution they expected.
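As a rough illustration of how the marker string above could feed a detection signature, here is a minimal Python sketch. The `method` and `body` arguments are hypothetical stand-ins for whatever fields your proxy or WAF logs expose, and a real signature would need tuning against your own traffic.

```python
from urllib.parse import unquote

# Marker string quoted above as appearing in exploitation payloads.
IOC_MARKER = "pwd;pwd;pwd;pwd;pwd;"


def matches_reported_ioc(method: str, body: str) -> bool:
    """Return True when a logged request is a POST whose body carries the marker.

    `method` and `body` are hypothetical log fields; adapt them to your schema.
    """
    if method.upper() != "POST":
        return False
    # Check the raw body and a URL-decoded copy, since the marker may be encoded.
    return IOC_MARKER in body or IOC_MARKER in unquote(body)
```

A match is only a lead for an analyst to follow up on; it does not by itself confirm a compromise.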
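For defenders who want to look for similar probes in their own logs, the sketch below flags requests to the “/gwtest/formssso” path whose query string contains a very long run of one repeated character. The 200-character threshold is an assumption chosen to sit below the 262 repeated characters described above, and the function does not depend on any particular parameter name, since none is given here.

```python
import re
from urllib.parse import unquote, urlsplit

GWTEST_PATH = "/gwtest/formssso"  # path named in the analysis above
MIN_RUN = 200  # assumed threshold, below the 262-character run described above

# One character followed by at least MIN_RUN - 1 repeats of itself.
_LONG_RUN = re.compile(r"(.)\1{%d,}" % (MIN_RUN - 1), re.DOTALL)


def looks_like_overflow_probe(request_target: str) -> bool:
    """Heuristic: request to the gwtest path with an abnormally long repeated run."""
    parts = urlsplit(request_target)
    if parts.path.rstrip("/").lower() != GWTEST_PATH:
        return False
    # Decode first so URL-encoded variants are caught as well.
    query = unquote(parts.query)
    return _LONG_RUN.search(query) is not None
```

This is a log-hunting heuristic only, not a replacement for patching or for the GreyNoise tag.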
The observed payloads are provided below for completeness.
During our latest webinar, Proactive Defense Made Easy: Leveraging GreyNoise in Your SOAR Playbooks, we discussed some everyday use cases for GreyNoise with SOAR platforms. The main goals of using GreyNoise with SOAR platforms are to quickly identify opportunistic attacks, to get better insight into how infrastructure is being used, and to enrich alerts with RIOT data on IPs associated with common business services.
Using GreyNoise to identify opportunistic scanning gives a SOAR playbook the context to decide whether to investigate further or move more quickly to block IPs. Adding these checks to an investigation playbook provides data on scan activity and on any vulnerabilities observed being exploited.
RIOT data also provides quick data for an investigation. Many services integrated into an investigation playbook will provide details for when something is malicious but often don't provide details on known-good services. Everyone wants the confidence to take action with their automation but may not have the insight needed. Additionally, no one wants to be wrong about this decision. RIOT adds this information to a playbook to assist with decision-making.
GreyNoise can be used in common SOAR use cases to provide better context to phishing playbooks and investigations, and to give teams more confidence when blocking IPs. The power of GreyNoise, alongside other intelligence tools like Recorded Future, VirusTotal, Tines, and Splunk, is nothing short of astonishing (see our full list of integrations). I hope the insights shared during the webinar inspired you to explore these tools further and optimize your cybersecurity investigations. Sign in/up for GreyNoise to explore our data for free.
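The decision logic described above can be sketched as a small pure function. The field names `noise`, `riot`, and `classification` are assumptions about the shape of the enrichment data your integration returns, and the three outcomes are hypothetical playbook branches rather than anything prescribed by a specific SOAR product.

```python
from typing import Mapping


def triage_ip(enrichment: Mapping[str, object]) -> str:
    """Map an IP enrichment record to a playbook branch.

    Returns "block", "deprioritize", or "investigate" (hypothetical branch names).
    """
    riot = bool(enrichment.get("riot", False))
    noise = bool(enrichment.get("noise", False))
    classification = str(enrichment.get("classification", "unknown")).lower()

    if riot:
        # Known business service; conflicting signals still go to a human.
        return "investigate" if classification == "malicious" else "deprioritize"
    if noise and classification == "malicious":
        return "block"
    if noise and classification == "benign":
        return "deprioritize"
    # Unknown or unseen IPs are the ones most likely to be targeted activity.
    return "investigate"
```

For example, `triage_ip({"noise": True, "riot": False, "classification": "malicious"})` takes the "block" branch, while a record with no GreyNoise data at all falls through to "investigate".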
AI/ML and cybersecurity go together like peanut butter and bananas. You might not think it’s a fit, but it can work out great if you’re into it.
I recently did a talk with Centripetal and wanted to share some highlights as well as the entire video below. This covers a few themes, such as: “how has ML been used in cybersecurity in the past”, “what are the problems with it”, “why we need to use it”, “how to use it responsibly”, and “what to do with all these GPTs”.
If you’re interested in watching it in full, here is the talk.
One of the first use cases for ML in security was spam filtering in early email clients in the late 90s. This was a simple bag of words + a naive Bayes model approach, but has gotten much more complicated over time.
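To make the bag of words plus naive Bayes idea concrete, here is a toy sketch in plain Python. The training messages are made up for illustration, and the whitespace tokenizer and add-one smoothing are simplifying assumptions; this is not a reconstruction of any particular email client's filter.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Bag of words: lowercase, split on whitespace, ignore word order.
    return text.lower().split()


class NaiveBayesSpamFilter:
    LABELS = ("spam", "ham")

    def __init__(self) -> None:
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        self.doc_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def predict(self, text: str) -> str:
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in self.LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Smoothed log prior plus add-one smoothed log likelihood per word.
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch at noon tomorrow", "ham")
```

With this tiny training set, `spam_filter.predict("claim your free money")` leans toward spam because "claim", "free", and "money" only ever appeared in spam messages.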
More recently, ML has been used to build malware detection models. Almost all anti-malware processors in VirusTotal have some ML component.
It has also been used in outlier detection (determining spikes in logs/alerts/traffic) and in rule or workflow generation.
However, it’s not all sunshine, roses, and solved problems. ML has some trust issues, especially when it comes to cybersecurity. Models are never perfect and can create False Negatives and False Positives.
False Negatives are when we do not detect something as bad when it is indeed bad—it’s a miss. This has obvious problems of allowing something malicious to act on your system/network without your knowledge.
False Positives are when we call a non-malicious thing bad. This can be just as big of an issue, as it creates unnecessary alerts, leading to alert fatigue, and ultimately leading to ignored alerts which allows actual malicious activity to slip through the cracks.
Cybersecurity has a very low tolerance for both types of errors, and therein lies the issue. ML solutions have to be very, very good at detection without creating too much noise. They also have to provide context for why the ML tool made its determination.
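A quick worked example shows why both error types matter at security scale. The counts below are invented for illustration: a detector that catches 90% of malicious events and wrongly flags under 1% of benign ones still produces alerts that are mostly false positives when malicious events are rare.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Summarize a confusion matrix (true/false positives and negatives)."""
    return {
        # Share of raised alerts that were actually malicious.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of malicious events that were caught (1 - miss rate).
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of benign events that were wrongly flagged.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical day: 100 malicious events hidden among 99,900 benign ones.
metrics = detection_metrics(tp=90, fp=900, tn=99_000, fn=10)
```

Here recall is 0.9 and the false positive rate is about 0.9%, yet precision is only about 9%: roughly ten out of every eleven alerts an analyst opens are noise.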
It might seem like a pain to use complicated tools like ML/AI, but the brutal truth is that we have to. There is too much data to work through. GreyNoise sees over 2 million unique HTTP requests a day, and that’s just one protocol.
Plus, bad actors aren’t slowing down. Verizon’s DBIR recorded 16k incidents and 5k data breaches last year, and that is merely what is reported. There are ~1,000 Known Exploited Vulnerabilities (CISA) floating around (side note: GreyNoise has tags for almost all of them).
There is no getting around it, we need to use ML/AI technology to handle the load of information and allow us to become better at defense.
Here I hope to give some practical advice on developing ML/AI tools. It really comes down to two main deliverables: Confidence and Context.
By “Confidence” I don’t mean the ROC score of your model or the confusion matrix results. I mean a score you can produce for every detection/outlier/analysis that you find. For numerous ML applications, a decent analog is given right out of the box: the [0.0, 1.0] score produced from a classification model, the number of standard deviations off the norm, or the percent likelihood of an event happening. These all work well, and you can provide the understanding on how to interpret them.
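As one example of an out-of-the-box confidence score, the "number of standard deviations off the norm" can be computed in a few lines. This is a minimal sketch that assumes a simple numeric history (for instance, daily alert counts) and uses the population standard deviation; the sample values are made up.

```python
import statistics


def zscore(history: list[float], value: float) -> float:
    """How many standard deviations `value` sits from the mean of `history`."""
    mean = statistics.fmean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # A perfectly flat history: any deviation at all is unbounded.
        return 0.0 if value == mean else float("inf")
    return (value - mean) / stdev


# Hypothetical daily counts with mean 5 and population standard deviation 2.
daily_counts = [2, 4, 4, 4, 5, 5, 7, 9]
spike_score = zscore(daily_counts, 11)
```

A reader can interpret the result directly: the new value of 11 sits three standard deviations above the norm, which is easier to act on than a bare "outlier" label.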
Every so often, you have to create your own metric. When we created IP Similarity, we had a similarity score that was intuitive, but there was a problem. When we’re dealing with incomplete or low information on an IP (e.g., we only know the port scanned and a single web path), then we could have very high similarity scores. But, they could be a little bit garbage since they were making very generic matches. We needed to combine the similarity score and another score that showed how much information we had on a sample to provide confidence in our results.
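One simple way to combine the two scores is to scale similarity by how much information was available. The sketch below is purely illustrative: the `fields_present / fields_total` coverage measure and the multiplication are assumptions, not the formula actually used for IP Similarity.

```python
def combined_confidence(similarity: float, fields_present: int, fields_total: int) -> float:
    """Scale a similarity score by how much of the sample's information is known.

    Hypothetical formula for illustration only.
    """
    if fields_total <= 0:
        raise ValueError("fields_total must be positive")
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must be in [0.0, 1.0]")
    coverage = min(max(fields_present / fields_total, 0.0), 1.0)
    return similarity * coverage


# A near-perfect match on almost no data vs. a good match on rich data.
sparse_match = combined_confidence(0.95, fields_present=2, fields_total=20)
rich_match = combined_confidence(0.80, fields_present=18, fields_total=20)
```

Under these assumptions the sparse match drops to roughly 0.1 while the rich match stays around 0.7, so generic matches no longer float to the top.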
Next, “Context”. This is just basic “show your work”. A scarily increasing number of ML/AI models are seen as black boxes. That’s…not great. We want to provide as much material that went into the decision and any other data that might be helpful for a human to look at when reviewing the result.
To put it simply, build a report based on the question words:
Finally, since GPTs are so hot right now, I aim to give some simple advice on how to use them best if you decide to integrate them into your workflow.
Artificial Intelligence and Machine Learning can provide extreme value to your product and workflows, but they are not trivial to introduce. With some care and simple guidelines, you can implement these in a way that helps your users without creating additional burden or ambiguity.
We're cooking up some interesting projects using AI and ML at GreyNoise. Sign in/up to see IP Similarity, NoiseGPT and our other Labs projects (https://api.labs.greynoise.io/1/docs/#definition-NoiseGPT), and get notified of Early Access for what's coming down the pipeline!
GreyNoise observed a significant spike in attacker activity the day CISA added CVE-2023-24489 to their Known Exploited Vulnerabilities Catalog:
Citrix ShareFile, a popular cloud-based file-sharing application, has recently been found to have a critical vulnerability, CVE-2023-24489, which allows unauthenticated arbitrary file upload and remote code execution (RCE). In this blog post, we will discuss the details of this vulnerability, how attackers can exploit it, and how you can protect your organization from potential attacks.
GreyNoise now has a tag for CVE-2023-24489, allowing us to track exploit activity related to this vulnerability. If you use Citrix ShareFile, make sure to apply the latest security updates as soon as possible to patch this critical RCE flaw.
CVE-2023-24489 is a cryptographic bug in Citrix ShareFile’s Storage Zones Controller, a .NET web application running under IIS. This vulnerability allows unauthenticated attackers to upload arbitrary files, leading to remote code execution. The vulnerability has been assigned a CVSS score of 9.8, indicating its critical severity.
Attackers can exploit this vulnerability by taking advantage of errors in ShareFile’s handling of cryptographic operations. The application uses AES encryption with CBC mode and PKCS7 padding but does not correctly validate decrypted data. This oversight allows attackers to generate valid padding and execute their attack, leading to unauthenticated arbitrary file upload and remote code execution.
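To see why padding alone is a weak check, consider a strict PKCS7 unpadding routine. This is a generic sketch, not ShareFile's code: it shows that even correct padding validation only proves the last bytes follow the padding pattern, and says nothing about whether the decrypted data is authentic.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    # Always adds between 1 and block_size bytes, each equal to the pad length.
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    if not padded or len(padded) % block_size != 0:
        raise ValueError("input is not a whole number of blocks")
    pad_len = padded[-1]
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


# Any block that happens to end in 0x01 unpads cleanly, whatever precedes it.
arbitrary_block = bytes(15) + b"\x01"
accepted = pkcs7_unpad(arbitrary_block)
```

Because an attacker can keep modifying ciphertext until the decrypted result ends in a valid padding pattern, applications generally need an integrity check (such as a MAC or an authenticated encryption mode) on top of padding validation before trusting decrypted input.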
Researchers at Assetnote dissected the vulnerability and published the first proof-of-concept (PoC) for this CVE. Other PoCs for this have been released on GitHub, increasing the likelihood of attackers leveraging this vulnerability in their attacks and further demonstrating the severity of the issue.
As of the publishing timestamp of this post, GreyNoise has observed IPs attempting to exploit this vulnerability. Two had never been seen by GreyNoise before this activity:
Citrix has released a security update addressing the ShareFile vulnerability. Users are advised to apply the update to protect their systems from potential attacks. The fixed version of the customer-managed ShareFile storage zones controller is ShareFile storage zones controller 5.11.24 and later versions. The latest version of ShareFile storage zones controller is available from the following location: https://www.citrix.com/downloads/sharefile/product-software/sharefile-storagezones-controller-511.html.
Leverage GreyNoise’s hourly updated data on scanning and exploit activities to stay ahead of opportunistic attackers. Our threat intelligence platform allows you to identify noise, reduce false positives, and focus on genuine threats. Sign up for GreyNoise Intelligence today and gain the edge in protecting your systems against vulnerabilities like CVE-2023-24489.
GreyNoise detection engineers have released tags for the following vulnerabilities:
CVE-2023-29298 is an Improper Access Control vulnerability affecting Adobe ColdFusion versions 2018u16 (and earlier), 2021u6 (and earlier), and 2023.0.0.330468 (and earlier). This vulnerability could result in a security feature bypass, allowing an attacker to access the administration CFM and CFC endpoints without user interaction. The vulnerability has a CVSS 3.x base score of 7.5, indicating high severity.
CVE-2023-29300 is a Deserialization of Untrusted Data vulnerability impacting Adobe ColdFusion versions 2018u16 (and earlier), 2021u6 (and earlier), and 2023.0.0.330468 (and earlier). This vulnerability could result in arbitrary code execution without user interaction. The vulnerability has a CVSS 3.x base score of 9.8, indicating critical severity.
CVE-2023-3519 is an unauthenticated remote code execution (RCE) vulnerability impacting several versions of Citrix ADC and Citrix Gateway. This vulnerability allows a malicious actor to execute arbitrary code on affected appliances. It may also serve as an initial access vector for ransomware and other types of malicious campaigns. GreyNoise would like to thank the Capability Development team at Bishop Fox for collaborating with us to track this emerging threat. They have an excellent, detailed write-up for folks interested in more details.
All three vulnerabilities are listed in CISA's Known Exploited Vulnerabilities Catalog, meaning they have been observed being exploited in the wild and pose significant risks to organizations. Organizations should prioritize remediation efforts for these vulnerabilities to reduce the likelihood of compromise by known threat actors.
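When prioritizing remediation, it can help to turn base scores into the standard CVSS v3.x qualitative bands. The sketch below uses the scores quoted in this document (7.5 and 9.8 for the ColdFusion issues, and the 9.8 cited for CVE-2023-3519 in its disclosure); sorting by score alone is a simplifying assumption, since real prioritization would also weigh exposure and business impact.

```python
def cvss_v3_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Base scores as quoted in this document.
scores = {
    "CVE-2023-29298": 7.5,
    "CVE-2023-29300": 9.8,
    "CVE-2023-3519": 9.8,
}

# Highest score first; ties keep their original order.
patch_order = sorted(scores, key=scores.get, reverse=True)
```

Since all three are already in the Known Exploited Vulnerabilities Catalog, the ordering mostly helps decide what to do first rather than what to do at all.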
Organizations are strongly encouraged to use GreyNoise’s hourly updated threat intelligence data to block IP addresses that are seen exploiting these vulnerabilities. By leveraging GreyNoise's tags and alerts, organizations can enhance their security posture and protect their systems from potential exploitation attempts while allowing their operations teams time to apply patches or mitigations.
In today's world, where networks generate an overwhelming amount of data, security analysts often find themselves struggling to separate the real threats from the noise. Their days are spent in a constant reactive mode, leaving little room for proactive measures due to limited time and resources. In this blog post, we'll delve into how GreyNoise empowers security analysts and transforms their daily work by cutting through the noise and providing invaluable insights.
GreyNoise is a powerful threat intelligence platform designed to assist security analysts in identifying noise and minimizing false positives. By meticulously collecting and analyzing internet-wide scan and attack data, GreyNoise equips security teams with contextual information about the threats they encounter. With its ability to filter out noise and shed light on the sources of attacks, GreyNoise empowers security analysts to focus their efforts on genuine threats.
Security Operation Centers (SOCs) are often inundated with an overwhelming barrage of security alerts. However, it's disheartening to discover that a significant portion of these alerts, often exceeding 50%, are nothing more than false positives or irrelevant internet noise. One exasperated GreyNoise customer even lamented, "Stop chasing ghosts!" (This is why you will see our little “ghostie” icon in many places on our website and in our product.) GreyNoise comes to the rescue by enabling SOC teams to filter out known benign and noisy alerts originating from SIEM and SOAR systems. This empowers analysts to laser-focus on targeted and malicious activities that truly demand attention.
GreyNoise takes threat intelligence to new heights by providing security analysts with valuable context surrounding the sources of attacks. Through thorough analysis of internet-wide scan and attack data, GreyNoise identifies patterns and offers insights into the tactics, techniques, and procedures (TTPs) employed by attackers. Armed with this knowledge, security analysts gain a deeper understanding of the threats they face, enabling them to devise more effective strategies to mitigate risks and safeguard their organizations.
GreyNoise provides an early warning system for vulnerabilities being actively exploited in the wild, plus dynamic IP blocklists that security teams can use during their window of exposure. Now you can swiftly identify trending internet attacks focused on specific vulnerabilities and CVEs, efficiently triage alerts based on malicious, benign, or targeted IP classifications, and take proactive measures to block and hunt down IP addresses opportunistically exploiting a particular vulnerability. By leveraging these comprehensive features, security teams gain an edge in staying ahead of threats and bolstering their defenses against mass exploitation attacks.
Security analysts grapple with numerous challenges in their day-to-day work, including the overwhelming volume of network data and the complexity of evolving threats. However, GreyNoise emerges as a formidable ally, providing context about attack sources, reducing false positives, and bolstering incident response capabilities. By harnessing the power of GreyNoise, security analysts can direct their attention to genuine threats and ensure their organizations remain resilient against cyber threats.
Take the first step and explore our data for free to experience the transformative power of GreyNoise firsthand.
Computers don’t understand words. They don’t understand verbs, nouns, prepositions, or even adjectives. They kinda understand conjunctions (and/or/not), but not really. Computers understand numbers.
To make computers do things with words, you have to make them numbers. Welcome to the wild world of text embedding!
In this blog I want to teach you about text embedding, why it’s useful, and a couple ways to do it yourself to make your pet project just a little bit better or get a new idea off the ground. I’ll also describe how we’re using it at GreyNoise.
With LLMs in the mix, modern use cases of text embedding are all over the place.
In all of these, we’re encoding a piece of text into a numerical vector in order to do basic machine learning tasks against it, such as nearest neighbors or classification. If you’ve been paying attention in class, this is just feature engineering, but it’s unsupervised and on unstructured data, which has previously been a really hard problem.
Lots of large models offer sentence-level embedding APIs. One of the most popular is OpenAI’s (https://platform.openai.com/docs/guides/embeddings). It doesn’t cost a ton, probably under $100 for most data sets, but you’re dependent on the latency of external API calls and the whims of another company. Plus, since it’s a GPT model, the embedding is based on the encoding of the last word in your text (with the cumulative words before it), which doesn’t feel as cohesive as what I’m going to suggest next. (This is foreshadowing a BERT vs GPT discussion.)
GPT stands for Generative Pre-trained Transformer. It is built to predict the next word in a sequence, and then the next word, and then the next word. In contrast, BERT stands for Bidirectional Encoder Representations from Transformers. It is built to predict any word within a set piece of text.
Both use Transformers, so the main difference between them is where they mask data during training. During the training process a piece of the text is masked, or obscured, from the model and the model is asked to predict it. When the model gets it right, hooray! When the model gets it wrong, booo! These outcomes either reinforce or change the weights of the neural network to hopefully better predict in the future.
GPT models only mask the last word in a sequence. They are trying to learn and predict that word. This makes them generative. If your sequence is “The cat jumped” it might predict “over”. Then your sequence would be “The cat jumped over” and it might predict “the”, then “dog”, etc.
BERT models mask random words in the sequence, so they are taking the entire sequence and trying to figure out the word based on what came before and after (bidirectional!!!). For this reason, I believe they are better for text embedding. Note that the biggest GPT models are orders of magnitude bigger than the biggest BERT models because there is more money in generation than encoding/translation, so it is possible GPT-4 does a better job at generic sentence encoding than a home-grown BERT. But let’s all collectively stick it to the man and build our own. It’s easy.
If your data is perhaps not just basic English text data, building your own encoder and model might be the right decision. For GreyNoise, we have a ton of HTTP payloads that don’t exactly follow typical English language syntax. For this reason, we decided to build our own payload model and wanted to share the knowledge.
There are two parts to an LLM, the same parts you’ll see in Hugging Face models (https://huggingface.co/models) and everywhere else: a tokenizer and a model.
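To make the masking difference concrete, here is a minimal sketch in plain Python. It is a toy illustration only: the function names are made up for this post, and real BERT training masks a fraction of the tokens in each sequence rather than exactly one word.

```python
import random

MASK = "[MASK]"

def causal_examples(words):
    """GPT-style: predict each word from only the words before it."""
    return [(words[:i], words[i]) for i in range(1, len(words))]

def masked_example(words, rng):
    """BERT-style: hide one random word; context on both sides stays visible."""
    i = rng.randrange(len(words))
    masked = words[:i] + [MASK] + words[i + 1:]
    return masked, words[i]

words = "The cat jumped over the dog".split()

# Every GPT-style example only ever sees the left-hand context.
for context, target in causal_examples(words):
    print(" ".join(context), "->", target)

# The BERT-style example sees the words on both sides of the gap.
masked, target = masked_example(words, random.Random(0))
print(" ".join(masked), "->", target)
```

The GPT-style examples never see anything to the right of the target, while the BERT-style example keeps the full sentence around the gap, which is the intuition behind preferring it for embeddings.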
The tokenizer takes your input text and translates it to a base numerical representation. You can train a tokenizer to learn vocabulary directly from your dataset or use the one attached to a pre-trained model. If you are training a model from scratch you might as well train a tokenizer (it takes minutes), but if you are using a pre-trained model you should stick with the one attached.
Tokens are approximately words, but if a word is over 4-5 characters it might get broken up. “Fire” and “fly” could each be one token, but “firefly” would be broken into 2 tokens. This is why you might often hear that tokens are “about ¾ of a word”; it’s an average ratio of words to tokens. Once you have a tokenizer, it can translate a text into integers representing each token’s index in the tokenizer’s vocabulary.
“The cat jumped over” -> 456, 234, 452, 8003
Later, supposing we have a model, if you have the output 8003, 456, 234, 452 (I reordered on purpose) you could translate that back to “over the cat jumped”.
The tokenizer is the translation between a numeric representation and a word (or partial word) representation.
With a tokenizer, we can pass numerical data to a model, get numerical data out, and then decode that back into text.
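Here is a toy tokenizer to show the round trip. It is a sketch under loud assumptions: the vocabulary is hypothetical (the ids simply mirror the example above), and the greedy longest-match split stands in for the learned subword algorithms that real tokenizers use.

```python
# Hypothetical vocabulary; the ids mirror the example in the text.
VOCAB = {"the": 456, "cat": 234, "jumped": 452, "over": 8003, "fire": 17, "fly": 18}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}

def split_word(word):
    """Greedy longest-match split of one word into known vocabulary pieces."""
    pieces, start = [], 0
    while start < len(word):
        for end in range(len(word), start, -1):
            if word[start:end] in VOCAB:
                pieces.append(word[start:end])
                start = end
                break
        else:
            raise ValueError(f"no vocabulary piece matches {word[start:]!r}")
    return pieces

def encode(text):
    """Text -> list of integer token ids (lowercased, split on whitespace)."""
    return [VOCAB[piece] for word in text.lower().split() for piece in split_word(word)]

def decode(ids):
    """List of integer token ids -> text, one space between pieces."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in ids)

print(encode("The cat jumped over"))
print(decode([8003, 456, 234, 452]))
print(encode("firefly"))
```

Note that this toy decoder puts a space between every piece, so “firefly” would come back as two words. Real tokenizers keep track of which pieces continue a word so they can glue them back together.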
We could discuss the models, but others have done that before (https://huggingface.co/blog/bert-101). All of these LLMs are beasts. They have basic (kinda) components, but they have a lot of them, which makes for hundreds of millions to billions of parameters. For 98% of people, you want to know what it does, the pitfalls, and how to use it without knowing the inner workings of how transformer layers are connected to embedding, softmax, and other layers. We’re going to leave that to another discussion. We’ll focus on what it takes to train and get a usable output.
The models can be initialized with basic configs and trained with easy prompts, thanks to the folks at Hugging Face (you the real MVP!). For this we are going to use a RoBERTa model (https://huggingface.co/docs/transformers/model_doc/roberta). You could use a pre-trained model and fine-tune it. However, we’re just going to use the config and build the whole model from random/scratch. A very similar workflow is usable if you want to use a pre-trained model and tokenizer though. I promise I won’t judge.
Import or copy the code from the model training gist.
Create your own list of text you want to train the encoder and model on. It should be at least 100k samples.
If you have created your data set as `str_data` and set a model path as a folder where you want to save the model and tokenizer, you can just do:
This will create the tokenizer and model. The tokenizer is usable at this state. The model is just random untrained garbage though.
When you’re ready for the first train, get into the habit of loading the tokenizer and model you created and then training it. These saved states are what people call “checkpoints”.
When you want to retrain or further train, which at this point is also called fine-tuning, just load it up and go again with new or the same data. Nobody is your boss and nobody really knows what is best right here.
Note: You’re going to want to use GPUs for training. Google Colab and Huggingface Notebooks have some free options. All in, this particular model will require 9-10GB of GPU memory, easily attainable by commodity hardware.
Large Language Models do not have a great list of sanity checks. Ironically, most benchmarks are against other LLMs. For embeddings, we can do a little better to work toward your personal model. When you take two samples that you think are similar and run them through the model to get the embeddings, you can calculate how far apart they are with either cosine or Euclidean distance. This gives you a sanity check of whether your model is performing as expected or is just off the rails.
For Euclidean distance use:
For cosine distance use:
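As a minimal sketch of both distances, here is a pure-Python version that treats each embedding as a plain list of floats. The function names are mine for illustration, not from the training gist, and with real embeddings you would likely reach for a numerical library instead.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 1 means orthogonal."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Two vectors pointing the same way: far apart in Euclidean terms,
# but essentially zero cosine distance.
print(euclidean_distance([1.0, 2.0], [2.0, 4.0]))
print(cosine_distance([1.0, 2.0], [2.0, 4.0]))
```

Cosine distance ignores vector length and only compares direction, which is often what you want for embeddings. Euclidean distance is sensitive to magnitude, so the two checks can disagree.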
We’re early adopters of LLM tech at GreyNoise, but it is hard to put it in the hands of users responsibly. We basically don’t want to F up. We have an upcoming feature called NoiseGPT that takes natural language text and turns it into GNQL queries. Gone are the days of learning a new syntax just to figure out what the hell is going on.
We also have an in-development feature called Sift, a way to tease out the new traffic on the internet and describe it for users. This would take the hundreds of thousands of HTTP payloads we see every day, reduce them to the ~15 new and relevant ones, and describe what they are doing. EAP coming on that soon.
Plus, if you think of any great ideas we should be doing, please hit us up. We have a community Slack and my email is below. We want to hear from you.
With these tips I hope you’re able to create your own LLM for your projects or at least appreciate those that do. If you have any questions please feel free to reach out to email@example.com, give GreyNoise a try (https://viz.greynoise.io/), and look out for features using these techniques in the very near future.
On June 7, 2023, VMware released an advisory for CVE-2023-20887, a command injection vulnerability in VMware Aria Operations for Networks (formerly vRealize Network Insight) with a critical CVSS severity score of 9.8. A proof of concept for this exploit was released on June 13, 2023 by SinSinology.
The primary takeaway is:
“VMWare Aria Operations Networks is vulnerable to command injection when accepting user input through the Apache Thrift RPC interface. This vulnerability allows a remote unauthenticated attacker to execute arbitrary commands on the underlying operating system as the root user.” – SinSinology
This issue can be resolved by updating to the latest version. Further information can be found here: https://www.vmware.com/security/advisories/VMSA-2023-0012.html
At the time of writing, we have observed attempted mass-scanning activity using the proof-of-concept code mentioned above to launch a reverse shell, which connects back to an attacker-controlled server in order to receive further commands. Ongoing activity related to this vulnerability can be tracked via the relevant GreyNoise tag below.
If you’ve ever seen a GreyNoise presentation by me, it’s more than likely that at some point I pulled up my Splunk instance to show what I would consider to be a few clever dashboards and searches. Apart from the impromptu searches that I may write (which may not be great), there are some powerful and practical ways you can leverage GreyNoise data inside your Splunk environment right now.
With the latest version of the GreyNoise app for Splunk (v2.2.0), you can now keep the last 24 hours of data local to your Splunk instance with feeds. Plus, it’s easier than ever to filter out noise from large datasets. Instead of relying on API lookups, the data can be referenced locally first to quickly remove opportunistic and benign IPs when hunting through your data.
A good dashboard can turn a bad day into a great one.
I always joke that data isn’t real until it’s displayed on a map, but there's some truth to it! Having a quick overview of your data visually makes it easier to piece together an understanding of the scan activity landscape.
Using custom commands, you can pull out information on internet traffic you can safely and confidently ignore (things we classify as ‘benign’ or IPs from the RIOT dataset) and particular pieces of information you may want to investigate further. Everything left over will include the IPs that are not in GreyNoise, which could indicate more targeted attacks, and IPs we classify as ‘unknown’.
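Outside of Splunk, the same triage logic can be sketched in a few lines of Python. Everything here is hypothetical: the `triage` function, the dictionary standing in for a local feed, the set standing in for RIOT, and the documentation-range IP addresses. It is not how the Splunk app's custom commands are implemented.

```python
def triage(ips, feed, riot):
    """Split IPs into buckets using local GreyNoise-style lookups.

    feed: dict mapping IP -> classification ('benign', 'malicious', 'unknown')
    riot: set of IPs belonging to common business services
    """
    buckets = {"ignore": [], "review": [], "leftover": []}
    for ip in ips:
        classification = feed.get(ip)
        if ip in riot or classification == "benign":
            buckets["ignore"].append(ip)
        elif classification == "malicious":
            buckets["review"].append(ip)
        else:
            # Classified 'unknown', or never seen by the feed at all.
            buckets["leftover"].append(ip)
    return buckets

# Made-up sample data using reserved documentation address ranges.
sample_feed = {
    "192.0.2.10": "benign",
    "198.51.100.7": "malicious",
    "203.0.113.5": "unknown",
}
sample_riot = {"192.0.2.53"}
log_ips = ["192.0.2.10", "192.0.2.53", "198.51.100.7", "203.0.113.5", "203.0.113.99"]

for bucket, members in triage(log_ips, sample_feed, sample_riot).items():
    print(bucket, members)
```

The "leftover" bucket is the interesting one: it holds the addresses that are either unclassified or absent from the feed, which is where more targeted activity would show up.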
You can find more information about our classifications and how to apply GreyNoise data to your analysis in our documentation: https://docs.greynoise.io/docs/applying-greynoise-data-to-your-analysis
Paired with information from your firewall imported into Splunk, GreyNoise data leveraged in a dashboard can show vulnerabilities that ‘unknown’ IPs are specifically looking for. Combining this knowledge with your current vulnerability scans can help you quickly identify if someone is interested in vulnerabilities specific to your attack surface.
We talk a lot about filtering out opportunistic traffic and enriching data based on GreyNoise, but let’s not sleep on the RIOT dataset. If you’re not familiar with RIOT, it’s a collection of ~50 million IP addresses that are associated with common business services.
What does this let you do with your data in Splunk? There are a lot of ways that people are applying this dataset in their searches and hunting. Ryan Kovar wrote a great blog post about using wire data with Splunk (https://www.splunk.com/en_us/blog/security/wire-data-huh-what-is-it-good-for-absolutely-everything-say-it-again-now.html), and while legitimate services can be abused (Hello T1567!), they can also make up a significant portion of the traffic being searched. RIOT makes it easy to do a first pass that removes any outbound traffic to those services, which makes it easier to find potentially interesting traffic.
More on our RIOT dataset here: https://docs.greynoise.io/docs/riot-data
If you find this information useful then join me on June 15th for a live webinar where I’ll cover the Splunk integration in detail. Also, if you are going to .conf 23 we will be there as well! Swing by booth 103 or set up a meeting with us here.
GreyNoise today announced that it achieved SOC 2 Type 2 compliance in accordance with American Institute of Certified Public Accountants (AICPA) standards for Systems and Organizational Controls (SOC). Achieving SOC 2 compliance with an unqualified opinion serves as third-party industry validation that companies provide best-in-class enterprise-level security for their customers’ data.
SOC2 is a difficult undertaking, especially if you do not have dedicated compliance or security resources who will contribute to creating the policies and implementing the changes. If you take one thing away from this post, let it be this: hire for Systems Administrator and IT operations roles before you think you need them because it will be too late by the time you do need them. Systems Administration tech debt and work is an exponential curve; the longer you go without them, the harder it becomes to fix. Aside from the struggle of collecting evidence through screenshots and questionnaires, both systems administration and engineering cycles will be required to meet the framework standards and controls.
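SOC2 is broken out into five pillars (the AICPA Trust Services Criteria): security, availability, processing integrity, confidentiality, and privacy.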
Approaching the controls one-by-one can be a daunting task. We found it was more manageable to divide the process into general phases, the last of which is the audit itself.
Our advice here is to not go it alone. From evidence collection and auditor documentation delivery to infrastructure and compliance control scanning, there are myriad different vendors which make every step of the process easier. Take time choosing the auditor that is right for you. Some are very “by the book” and others will be more lenient on “acceptable risk” controls.
You will need platforms for a lot of controls - including SAST, vulnerability scanning, asset tracking/management, version control, and more. For the most part, free open-source software exists for each step along the way. We found it best to mix and match, opting for paid platforms where open-source implementation was going to take too much engineering time away from other ongoing projects. For example, we chose gosec and tfsec for some language-specific SAST scanning, Cloudflare’s Flan for internal vulnerability scanning, and Grokability Snipe-IT for asset management over GitHub Advanced Security licenses, Tenable Nessus, ServiceNow ServiceDesk, or Oomnitza. The latter are perfectly useful products, but it’s important to decide what you want to pay for versus what you can run yourself for free. The value any company puts on each function or service a platform provides, compared to its cost or the time value of money, will be different.
The two direct SOC2-specific platform choices are the auditor and the compliance automation platform. SOC2 is significantly more difficult without a compliance automation platform - we estimate using such a platform saves over a hundred hours of work.
Auditors: Check which audit firm was used when you collect your SOC2 and SOC3 reports from your vendors. Turn that list into your potential auditor review list, and make a decision for an audit firm based on your meetings and due diligence with those firms. GreyNoise went with Prescient Assurance. They have a security arm that can provide your third-party penetration test, which is optional for SOC2, for a bundle discount.
Compliance Automation: Auditors will need access to a mountain of evidence in the form of read-only access to your environment, screenshots, and questionnaire answers. This is made easier with a compliance automation platform. Whereas an audit firm may not have a process in place for provisioning roles for their access, compliance platforms do, and they make it easy to both roll out and roll back. GreyNoise decided to use SecureFrame as their pricing, offering, and overall feature set were more directly suited to our needs. Some other popular options include Drata, Vanta, HyperProof, Anecdotes, and Tugboat Logic.
Implement, document, and be able to explain the following eight “heavy-hitters”.
Set up Okta, Google Cloud Identity, OneLogin, Azure Active Directory, or Auth0. The choice here depends on what technology you are already using for business productivity. If you are already using Office 365, then Azure Active Directory is the easy choice. If you are already using Google Workspace, then Google Cloud Identity may be the best option. When an employee logs into anything, they would ideally use their work credentials as much as possible. Enforce multi-factor authentication everywhere. Ditch single-user access and access keys and switch to “AssumeRole” if you are leveraging AWS, GCP, or Azure. In our environment, we added SAML tokens to each user in Google Workspace allowing them to assume a role (Read Only, Billing, Administrator, etc.) in the corresponding AWS accounts.
Set a secure password preference order:
Leverage an organization-wide password manager like 1Password or Bitwarden, with separate “vaults” for departments and roles. Use something with automatic detection of weak or reused passwords, and enforcement of strong password policies.
Implement approval processes for your pull requests. Don’t limit them to a manual review by engineering management or leadership. Include automated unit, integration, and end-to-end tests, along with code scanning, to ensure builds are passing and security policies/controls are green. Diagram out the overall process, like this:
You will need different environments - such as development, staging, and production. Deployments move across each, and are tested in each before actual implementation in the production environment. Ideally, changes to these environments would be tracked and dictated by GitHub, GitLab, BitBucket, or some other code version control platform.
A SIEM is not a requirement for SOC2, but extensive logging capabilities with alerting are. If there is a resource or storage essential to the operation of your product or business, access and audit logs for the resource should be easily retrieved and reviewed.
If an employee logs in to a resource from Washington, DC and then logs in from Seattle, WA, a few moments later from a different device, you need to know about it immediately through logging or block that second login altogether. If 100GB of data is downloaded from an S3 bucket when the daily average is 10GB, alarm bells should go off. Establish what “normal” is, and have a process in place to regularly review anomalous activity or anything outside of that normal bound.
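Both checks can be sketched with nothing but the standard library. Treat the numbers as assumptions to tune, not recommendations: the 900 km/h speed ceiling, the 3x volume multiplier, the function names, and the city coordinates are all illustrative choices of mine.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumed ceiling, roughly airliner cruise speed

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_impossible_travel(login_a, login_b, max_kmh=MAX_PLAUSIBLE_KMH):
    """Each login is (datetime, lat, lon). True if the implied speed is implausible."""
    (t1, lat1, lon1), (t2, lat2, lon2) = login_a, login_b
    hours = abs((t2 - t1).total_seconds()) / 3600
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if hours == 0:
        return distance > 0
    return distance / hours > max_kmh

def is_volume_anomaly(observed_gb, daily_average_gb, factor=3.0):
    """True if today's transfer volume exceeds an assumed multiple of the baseline."""
    return observed_gb > factor * daily_average_gb

washington_dc = (datetime(2023, 6, 1, 9, 0), 38.9072, -77.0369)
seattle = (datetime(2023, 6, 1, 9, 5), 47.6062, -122.3321)

print(is_impossible_travel(washington_dc, seattle))
print(is_volume_anomaly(observed_gb=100, daily_average_gb=10))
```

A fixed multiplier is the crudest possible baseline. In practice you would probably want something that accounts for day-of-week patterns and known batch jobs, and VPN or mobile carrier egress points will make naive geolocation checks noisy.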
Collecting logs will help you in post-incident response situations. Regularly reviewing and alerting on those logs will help you to avoid post-incident response situations.
Have a reproducible process in place for spinning up infrastructure resources. This can be implemented with Infrastructure as Code and configuration management tools like Salt, Ansible, Terraform, Chef, Puppet, or CloudFormation.
SOC2 will be significantly more painful if infrastructure in your environment is created manually by the engineering or IT team without an approval process or automation. GreyNoise infrastructure is entirely in Terraform and Salt. This way, approval and automation are shared with the CI/CD and pull request pipeline. If a process already exists that can be leveraged, it will save time.
The general idea here is that you should do as much as possible NOT in the web console for something like AWS, Azure, or vCenter. Take note of any actions you perform in the web console - this is your automation list.
Install an MDM platform on all company-owned desktops, laptops, and phones: any device which will access the internal systems of your product or customer data. Roll out the “compliance” packs for SOC2 to enforce things like password complexity, disk encryption, and software update cadence.
This is a crowded space, often undergoing expansion and consolidation. Fleetsmith was a great macOS and iOS MDM tool. Apple acquired the company and quickly removed all capability to install third-party (non-Apple and non-App Store) apps. Apple killed the product two years after the acquisition. The gold standard for macOS and iOS seems to be JAMF/JAMF Pro.
GreyNoise ended up splitting MDM platforms - one for Mac and one for Windows/Linux. It is a difficult choice to make between a broader platform that covers three Linux distributions, Windows, and macOS but delivers only a percentage of what you need, and two or three platforms that each cover almost all of what you need.
A lot of time will be spent on scoring vendor risk based on their operational reliance and the data they access or contain. Part of SOC2 requires collecting compliance reports from these vendors (SOC2, SOC3, ISO 27001, etc.) and reviewing them annually. A comprehensive list of vendors is an important one to keep up to date for both compliance and cost control reasons.
In developing this list, GreyNoise found a handful of vendors we were still paying but either not using or whose service/functionality was duplicated by another platform. Ultimately, SOC2 required us to enumerate our vendors and generate a Software Bill of Materials (SBOM), which led to cost savings by eliminating or consolidating redundant platforms.
An understandably broad topic, but for SOC2 specifically you should be scanning for:
Each finding should have a rating from informational to critical, and each rating should have a time-to-resolution SLA which dictates how quickly you must respond to and remediate it. There are some free solutions which offer compliance control monitoring, such as Steampipe compliance packs for AWS. GreyNoise decided to partner with SecureFrame to streamline the monitoring of these controls and to give auditors quick and secure access to our documentation and evidence. A compliance automation vendor is strongly recommended for time and sanity’s sake.
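The rating-to-SLA mapping is simple enough to sketch in code. The day counts below are hypothetical placeholders, not GreyNoise's actual SLAs or anything mandated by SOC2; set them from your own policy.

```python
from datetime import date, timedelta

# Hypothetical SLA windows in days; None means no remediation deadline.
SLA_DAYS = {
    "critical": 7,
    "high": 30,
    "medium": 90,
    "low": 180,
    "informational": None,
}

def remediation_due(severity, found_on):
    """Return the date a finding must be remediated by, or None if untracked."""
    days = SLA_DAYS[severity]
    return None if days is None else found_on + timedelta(days=days)

def is_overdue(severity, found_on, today):
    """True if the finding has passed its remediation deadline."""
    due = remediation_due(severity, found_on)
    return due is not None and today > due

found = date(2023, 6, 1)
print(remediation_due("critical", found))
print(is_overdue("critical", found, today=date(2023, 6, 9)))
```

Keeping the mapping in one place means the same table can drive ticket due dates, dashboards, and the evidence you hand to auditors.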
SOC2 includes some business operational aspects which will encompass a few different departments or teams in your organization. The following are some examples required for SOC2:
Many compliance automation platforms include auto-generated policies which require slight tweaking and adjustments to pass the “policy” controls. Invest time in either writing your own or significantly building on the automated policy output from your compliance platform. There are plenty of great security companies who publicly publish their policies (https://tailscale.com/security-policies/) which you can build on and adapt to your needs. GreyNoise will also publish our policies in the near future.
Failing controls and tests will pop up after rolling out the compliance automation platform. The time to resolve these controls varies significantly, so expect this phase to take the longest. In our experience, the controls that took the longest to flip from red to green were “all data encrypted in transit” and “all data encrypted at rest”.
You will want to resolve these tests until at least 90% are green before kicking off the audit itself. Work with your team to bucket the failing controls, and turn them into issues or projects to be assigned. You can even provide screenshot evidence of these projects and issues as proof of your organization’s incident tracking from discovery to resolution for the SOC2 audit.
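If your compliance platform lets you export control status, the readiness check and bucketing can be sketched like this. The record shape (`name`, `area`, `passing`) and the sample controls are made up for illustration and do not reflect any particular platform's export format.

```python
from collections import defaultdict

def readiness(controls, threshold=0.90):
    """Return (pass_rate, ready) for a list of control records."""
    passing = sum(1 for control in controls if control["passing"])
    rate = passing / len(controls)
    return rate, rate >= threshold

def bucket_failures(controls):
    """Group the names of failing controls by area so they can become projects."""
    buckets = defaultdict(list)
    for control in controls:
        if not control["passing"]:
            buckets[control["area"]].append(control["name"])
    return dict(buckets)

# Made-up sample data: 8 passing controls and 2 failing ones.
controls = [{"name": f"control-{i}", "area": "access", "passing": True} for i in range(8)]
controls.append({"name": "encrypt-in-transit", "area": "encryption", "passing": False})
controls.append({"name": "encrypt-at-rest", "area": "encryption", "passing": False})

rate, ready = readiness(controls)
print(f"{rate:.0%} green, ready for audit: {ready}")
print(bucket_failures(controls))
```

Each bucket maps naturally onto an issue or project to assign, which doubles as tracking evidence for the audit.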
This is the phase which will likely take the most time, money, and effort from your team. Unless you “shifted left” right out of the gate and began developing on day one with a security mindset baked in, plan to dedicate a few weeks or a couple of months to remediating failing controls.
Part of the phase also includes screenshot and evidence gathering. SecureFrame helped GreyNoise to easily organize this evidence and gave us an easy way for auditors to access it. This may take several days or weeks to complete and you will wind up with hundreds of screenshots, documents, templates, and examples.
One thing to note is that you will never see a failed SOC2 report or audit. You either get a report or not. If you fail to get a report, you can always try again when you are better positioned. Failure means you get to try again until you succeed. Success means you still need to do it again next year.
From project kickoff to completion, SOC2 took GreyNoise about 18 months for the first time. Recertification, which needs to be completed annually, will take us about four months moving forward.
The time to complete SOC2 accreditation can be greatly reduced by dedicating more resources to the implementation and maintenance of compliance. The shortest amount of time we imagine possible for first-time SOC2 accreditation is six months.
Keep in mind that you will be reperforming the audit exactly one year after you receive the accreditation. You may decide to add some other compliance certifications, such as ISO 27001. As time goes on and your company grows, compliance becomes harder and will require a dedicated team.
The audit process is broken down into two phases, Type 1 and Type 2. Type 1 is a short audit period, usually a couple of days, and Type 2 is longer, usually between 60 and 90 days.
Type 1 means you meet the audit criteria at a single point in time; Type 2 means you maintain compliance with those same criteria over a period of several months. In other words, Type 1 is meeting the compliance standard, and Type 2 is maintaining that compliance standard with any changes over time.
Here are some of our opinions, takeaways, and advice:
The way your organization approaches SOC2 compliance can be the easy way or the hard way. The easy way is to treat compliance like a checkbox and do the minimum to pass the audit. The hard way is to take the input and output from the framework and make significant changes to processes to bake in security as a priority early on for everyone. For those serious about security, the hard choice is easy to make.
May brought more product enhancements to user workflows, data coverage… and of course, more interesting tags! Twenty-four to be exact, as we continue to improve our product to help our customers monitor emerging threats and identify benign actors. We improved our sensor coverage to include coverage in the country of Ghana, plus we made some helpful improvements to our bulk analysis, RIOT dataset, and APIs.
The Bulk Analysis function in the GreyNoise Visualizer has been improved so that users can now export unidentified IPs via CSV and JSON.
This improvement helps analysts more easily identify the ‘interesting’ IPs in a bulk dataset that they are analyzing (IPs identified by GreyNoise are known common scanners or common business services; IPs that are UNKNOWN in GreyNoise could represent a targeted threat or something that requires additional investigation).
To access this feature, go to the GreyNoise Analysis page and analyze a file or dataset containing IP addresses.
Two fields have been added to the metadata returned via Bulk Data, IP Context API, and GNQL API that will help users determine baselines or rates of activity:
We are now tracking Qualys scanner IP addresses in our RIOT database of common business services, so that customers can whitelist this activity (should they wish to) or contextualize this activity when seen in their security logs.
RIOT identifies IPs from known benign services and organizations that commonly cause false positives in network security and threat intelligence products. The collection of IPs in RIOT is continually curated and verified to provide accurate results.
The GreyNoise App for Splunk has been updated to include a new Feed component, which allows users to ingest the GreyNoise indicator feed into Splunk to be used for high-volume log enrichment. Additionally, new dashboards and commands have been added to support the IP Similarity and IP Timeline tools.
ThreatQ has released new GreyNoise Actions for the Orchestrator platform which allow for IP Similarity, RIOT, and Quick lookups against the GreyNoise API. These updates can be downloaded from the ThreatQ Marketplace.
In May, GreyNoise added 24 new tags:
20 malicious activity tags
3 benign actor tags
1 unknown tag
All GreyNoise users can monitor scanning activity we’ve seen for a tag by creating an alert informing them of any new IPs scanning for tags they are interested in.
On Monday, May 1, 2023, CISA added CVE-2021-45046, CVE-2023-21839, and CVE-2023-1389 to the Known Exploited Vulnerabilities (KEV) list. For all three CVEs, GreyNoise users had visibility into which IPs were attempting mass exploitation prior to their addition to the KEV list. GreyNoise tags allow organizations to monitor and prioritize the handling of alerts regarding benign and, in this case, malicious IPs.
At GreyNoise we recognize the value of partnership and intelligence sharing when it comes to protecting internet citizens. Today the GreyNoise Labs team wants to give a shoutout to Trinity Cyber.
On May 31st, 2023 Progress issued a security notice to users of MOVEit Transfer regarding a vulnerability that allows for escalated privileges and potential unauthorized access to the environment. CVE-2023-34362 was assigned to this vulnerability on June 2, 2023.
We’ve added additional sensor coverage for the following countries:
You can view which IPs are seen scanning sensors in certain countries from our IP details page, or use `destination_country:"<country_name>"` in GNQL to find IPs that have hit those regions. Destination country search is available in all commercial plans for GreyNoise and to our community VIP users.
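If you build GNQL queries programmatically, a small helper keeps the quoting straight. This is a sketch under assumptions: the helper name is made up, combining terms with a space and the `classification:` term follow GNQL's Lucene-style syntax as I understand it, and the snippet only builds and URL-encodes the query string. It deliberately does not call any API.

```python
from urllib.parse import urlencode

def destination_country_query(country_name, classification=None):
    """Build a GNQL query string for IPs that hit sensors in a given country."""
    parts = [f'destination_country:"{country_name}"']
    if classification:
        # Assumed additional GNQL term; drop it if you only want the country filter.
        parts.append(f"classification:{classification}")
    return " ".join(parts)

query = destination_country_query("Ghana", classification="malicious")
print(query)

# URL-encode the query so it can be passed as a request parameter.
print(urlencode({"query": query}))
```

You can paste the printed query straight into the Visualizer search box to check it before wiring it into anything automated.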
Introducing the Labs API Playground, a powerful tool designed to provide users quick access to data and an Early-Access/Beta API experience. Whether you’re a seasoned GreyNoise user (welcome back!) or just starting your journey (welcome aboard!), this playground will enable you to explore and interact with our data in new ways.
To enter the Labs API Playground, visit https://api.labs.greynoise.io. It’s that easy! Who can access this playground? Anyone with a GreyNoise Account! And if you’re already a proud member of our GreyNoise community, you’re just a few clicks away from unlocking this API experience. Don't have a GreyNoise account yet? Create your free account here.
Before you dive straight into the Labs API, let’s familiarize ourselves with a few essential guidelines:
Now that you’re familiar with the playground’s rules and feedback process, it’s time to get hyped about the exciting world of Labs API queries! Here are a few notable ones to get you started:
The Labs API Playground is intended to drive exploration and data-driven insights. Let curiosity guide you as you uncover hidden patterns, emerging threats, and connections within the vast landscape of internet noise.
On May 31st, 2023 Progress issued a security notice to users of MOVEit Transfer regarding a vulnerability that allows for escalated privileges and potential unauthorized access to the environment. CVE-2023-34362 was assigned to this vulnerability on June 2, 2023. The MOVEit Transfer tag can be viewed here.
Progress’ security notice advises users to review their systems for unauthorized access for “at least the past 30 days”. However, GreyNoise has observed scanning activity for the login page of MOVEit Transfer, located at /human.aspx, as early as March 3rd, 2023. While we have not observed activity directly related to exploitation, all five IPs we have observed attempting to discover the location of MOVEit installations were marked as “Malicious” by GreyNoise for prior activities.
Based on the scanning activity we have observed, it is our recommendation that users of MOVEit Transfer should extend the time window for their review of potentially malicious activity to at least 90 days.
The primary artifact, observed through publicly available information, is the presence of a webshell named human2.aspx. This is a post-exploitation file artifact that is written to the filesystem by a malicious actor allowing them to execute arbitrary commands.
GreyNoise is observing scanning activity looking to identify the presence of the human2.aspx webshell dropped as part of the post-exploitation activity.
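One way to act on the 90-day recommendation is to sweep web server logs for requests to the two paths named above. The following is a minimal sketch, assuming logs in W3C extended format with a `#Fields:` directive and the `date`, `cs-uri-stem`, and `c-ip` columns. The function name, the sample log lines, and the documentation-range IP addresses are invented for illustration and are not observed data.

```python
from datetime import date, datetime, timedelta

# Paths named in the text: the scanned login page and the webshell artifact.
SUSPECT_PATHS = {"/human.aspx", "/human2.aspx"}


def find_suspect_requests(log_lines, today, window_days=90):
    """Return (date, client_ip, path) tuples for suspect paths in the window."""
    cutoff = today - timedelta(days=window_days)
    fields = []
    hits = []
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # W3C extended logs declare their column order in a directive.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        try:
            seen = datetime.strptime(row.get("date", ""), "%Y-%m-%d").date()
        except ValueError:
            continue  # skip rows without a parseable date
        path = row.get("cs-uri-stem", "").lower()
        if path in SUSPECT_PATHS and seen >= cutoff:
            hits.append((seen, row.get("c-ip", "?"), path))
    return hits


# Fabricated sample data, for demonstration only.
SAMPLE_LOG = """#Fields: date time cs-method cs-uri-stem c-ip sc-status
2023-03-03 10:15:00 GET /human.aspx 203.0.113.7 200
2023-05-28 02:01:44 GET /human2.aspx 198.51.100.23 404
2023-01-10 08:00:00 GET /human.aspx 203.0.113.9 200
2023-05-30 12:00:00 GET /index.html 192.0.2.5 200
""".splitlines()

for seen, ip, path in find_suspect_requests(SAMPLE_LOG, today=date(2023, 5, 31)):
    print(seen, ip, path)
```

With the sample data, a 30-day window measured from May 31 would surface only the `/human2.aspx` request, while the 90-day window also catches the March 3 request to `/human.aspx`. A hit on `/human.aspx` alone is not evidence of compromise, since it is the legitimate login page; the client IP still needs to be checked against known scanners.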
While the specific details of the initial exploitation vector are largely unknown at this time, we would like to provide the following items and details to our customers and community:
Last but not least, a big thank you to the GreyNoise community for alerting us to this activity early on.
Threat hunters spend a significant portion of their time searching through security logs looking for specific Indicators of Compromise (IoCs) or patterns of activity/behavior that indicate compromise. This work comes with some specific challenges:
To enhance threat hunting and address some of these pain points, organizations can use tools like GreyNoise in conjunction with a SIEM or SOAR platform. Doing so helps teams identify and investigate potential threats quickly, get more out of their existing tools, and filter through data sources faster. By understanding how infrastructure is being used, which vulnerabilities are being leveraged, and what scanning patterns look like, threat hunters can gain valuable context on how adversaries operate and improve their response to threats.
Recently, we held a webinar on this topic, where we discussed how organizations are using specific techniques in their day-to-day operations. To gain perspective on how you can streamline your threat hunting process, sign up and download the webinar today to learn:
GreyNoise is built on a strong foundation of mutual respect from our community. While we love doing swag drops on Twitter (or maybe Bluesky - anyone have an invite?), we wanted to recognize community members that go above and beyond.
Enter the GreyNoise Ambassador Program! We couldn’t think of a better way to celebrate our users' constant support, spirit of collaboration, and mentorship within our community. I’m here to answer all your burning questions about the program and how you can apply!
Ambassadors are pillars of the GreyNoise Community. This program celebrates their efforts to support community growth and accessibility, focusing on three key elements:
Ambassadors are folks who have dedicated time and resources to bettering GreyNoise, whether through continuous feedback, bug reports, integrations, conference talks, or simply a deep dedication to reducing Internet Noise.
If you are on the fence about being an ambassador, let us tell you about the perks you get:
In exchange for being our Ambassador, we ask that you do one or more of the following:
Your term as an Ambassador will last a year, and when Spring 2024 rolls around, you will be asked to reapply.
If this all sounds good to you, we ask that you fill out this application. We will evaluate applications until the end of May and send notice to our Ambassadors in early June!
If you have any questions, don’t hesitate to reach out to the Community team.