Andrew Morris, founder of GreyNoise, joins Dennis Fisher to talk about the unique origins of the company and the security case for removing all of the background noise from the Internet to find what really matters.
Dennis Fisher: Alright. Welcome back to the Decipher Podcast. I'm super excited today and my guest is Andrew Morris from GreyNoise. Andrew, how's it going, man?
Andrew Morris: Hey, Dennis, how are you?
Dennis Fisher: I'm hanging in there. I'm surviving man. That's the best you can do.
Andrew Morris: Yeah, exactly. Thanks so much for having me on. I'm super excited to be here.
Dennis Fisher: Yeah, it's my pleasure. I should have asked you earlier, I don't know why I didn't. I want to talk about how you came up with the idea for GreyNoise. And you know, maybe for the listeners who haven't used it, possibly explain how it works, what it does, that kind of thing. But where did the initial idea for this come from?
Andrew Morris: So the initial idea for GreyNoise came up for me...there's like a lot, it's a long story. And some of the details I literally cannot get into, but I'll do my best to. So originally, as a pet project in maybe 2013, I set up a bunch of honeypots on the internet, there was this hosting provider (in retrospect, it was probably a Ponzi scheme) that allowed you to spend like $10 or $50 and buy a VPS for life. You can very quickly back into realizing that just can't work. So I bought, like, 10 x $10 VPSs for life or whatever, and I set up a bunch of honeypots. And I set up these honeypots in 10 different data centers around the internet. And I'd never set up a honeypot before.
And so I monitored the honeypots. I set up a few different types of tech: there was Kippo, which is an SSH honeypot, and there were a number of other ones I used. And I looked at who logged into them, and who tried to brute force those honeypots. And it was cool. I mean, I started getting attacks immediately. And I was like, wow, the bad guys are coming for me. And I was looking at all this data. And then at some point, I added like the 10th honeypot and I was like, "Man, this is a lot of data." So instead of logging in and checking the data and all these things, I'm going to put it into a central place and I'm going to look at the data from one place. So I streamed all the data into one place, I streamed it into Splunk. And I'm looking at the data. And I was like, "Man, look at all these bad guys. This is crazy."
And then I noticed, kind of out of nowhere, that a lot of the IP addresses that were attacking these hosts on the internet were...there was a lot of overlap. There are a lot of IPs that were attacking all of them, right? And mind you, these hosts, these VPSs (which stands for "Virtual Private Server" for those who are unfamiliar; it's like a server that you rent on the internet), they're all in completely different data centers in different countries around the world, and very different locations geographically. And I was like, "That's super interesting. I'm seeing the same 1, 2, 3, 4, 5, 10, 100 IP addresses attacking all these things. That's crazy." And this was right around the time that the company Norse was becoming kind of more and more popular. Norse, the cyber threat intel company with a map. Now they're obviously the butt of a lot of jokes. But it was right around the same time that they were doing stuff. And it was interesting.
And so I remember thinking about it like, "This is threat intel, right? I found where the bad guys are, I found the bad guys. And so that's threat intel, right?" And after I had a lot of conversations and I talked to a lot of people, some switch flipped in my head where I was like, "Wait, this isn't the stuff that people should be worried about. This is the stuff that's hitting everybody. This is just internet background noise. This is anti-threat intelligence." If you freak out about any of these things attacking you, then you're in for a really bad time because it's just a barrage constantly.
Then I started iterating that same sort of pattern of like, "Well, let's do that with different kinds of data. Let's put this over here, let's collect this kind of data." And I noticed something extremely interesting when I overlaid what I was seeing from these honeypots on the internet on top of an actual network that has actual business users. And what I found is that it creates this noise-canceling effect, where you basically have what's hitting everybody on the internet (totally opportunistically) and you subtract that out from what is hitting somebody's network. And what you're left with is only the things that are hitting that network specifically, both legitimate regular business users, but also targeted attacks (or more targeted attacks). So I built this thing and I presented it at ShmooCon in 2014 or so. And it was basically like, "Hey, look at all this stuff I do." It was called "No Budget Threat Intel." It was a stupid name. But it was a fun talk about this system of honeypots.
And there was a person in the audience who basically saw my talk and said, "Okay, you understand this problem better than we do for a project that we're working on. Come and build that out for a secret sneaky customer (that I can't really talk about). And let's solve this problem for a sneaky customer (that we can't really talk about) who has a problem of trying to figure out if things are hitting them specifically, or if they're hitting everybody on the entire internet."
And that was when someone asked the question, what is the expected amount of scan traffic that any host on the internet should see? And that question is so hard to answer and is so vast and so massive. And the implications of properly answering that question took me down this massive rabbit hole that I'm still going down with GreyNoise right now. Until finally, iterated, iterated, iterated, I built a company out of just this one problem. I started GreyNoise to solve this one problem for the world, and everything has kind of come from that. And we've got different use cases now for the data other than the initial one. We've got lots of different types of users and things that I would have never thought of before, and scale that we could have never achieved before. But that's functionally where the idea came from.
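The noise-canceling effect Andrew describes is, at its simplest, a set difference: take the sources seen on one network and subtract the sources that honeypots see everywhere. The sketch below is only an illustration of that idea, not GreyNoise's actual implementation; the function name and the sample IP addresses (taken from documentation-reserved ranges) are assumptions made for the example.

```python
def subtract_background_noise(network_sources, background_sources):
    """Return the source IPs seen on a network that are NOT part of
    the internet-wide background noise observed by honeypots."""
    return set(network_sources) - set(background_sources)


# Hypothetical data: IPs that hit honeypots in many data centers.
honeypot_ips = {"198.51.100.7", "203.0.113.9", "192.0.2.44"}

# Hypothetical data: IPs that hit one organization's perimeter.
network_ips = ["198.51.100.7", "203.0.113.200", "192.0.2.44", "203.0.113.200"]

# What remains is traffic aimed at this network specifically
# (legitimate users as well as more targeted attacks).
remaining = subtract_background_noise(network_ips, honeypot_ips)
print(sorted(remaining))
```

Whatever is left after the subtraction still has to be investigated; the subtraction only removes what is known to be hitting everyone.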
Dennis Fisher: So when you set up those honeypots, you said it was a side project, you were just kind of screwing around. So until you gave that talk at ShmooCon and somebody came up to you and was like, "Hey, this is a real thing," had you thought, "Oh, maybe there's a business in this"? Or was that kind of the spark?
Andrew Morris: No, not even a little bit. That would have been really cool. Without getting too deep and personal, that just isn't what really motivated me at the time. I just wanted to do cool stuff, I wanted to like solve problems and do cool stuff. So even if somebody came and said, "Hey, I'll give you like a bunch of money. But you have to solve a different problem." I would have said, "Well, but I don't want to solve a different problem." I just found this one really cool. And so I wasn't even thinking about the business aspect of it. I only ever started thinking about the business aspect of anything and making money once it became apparent to me that the only way to properly have control of building this thing in the exact way that I think it needs to happen was to build a company around it. And to build a company around something, you have to think about money.
Dennis Fisher: Yeah, unfortunately, that's true.
Andrew Morris: We live in a capitalist society, and that's just how it goes, right? I don't make the rules.
Dennis Fisher: So what were you doing at the time, what was your job when you had just set up the honeypots? What were you doing at that point?
Andrew Morris: I got kind of fired from a job. Not fired, fired. I was on a staff augmentation role for a customer, and they decided they wanted to hire me. And so I was like, "Oh, sick! I'm going to join this company full time." And then, right at the last second, the offer was rescinded – after I'd already quit my other job. So I didn't actually get fired. I got un-hired. It felt like getting fired, but I was un-hired. And I was like, "Oh, man, this sucks." And so I had kind of like a week or two of not working, kind of my mind going a little crazy in between me like starting up my last job again. And so I was like, "Well, you know, I'm looking for something to do." And this just seemed like a fun thing to do. I've never really told anybody that before. But honest to God, I had some downtime between jobs and I was looking for a cool thing to occupy my time. And that was what was going on.
Dennis Fisher: Honestly, that's where some of the coolest ideas come from. Not just in security, or even technology, but in life in general – if you just have some time to think about things that maybe you're vaguely aware of, but you have too much other shit going on...
Andrew Morris: That's exactly right. So without getting into this too terribly much, it's like when you meditate, and you clear up your mind. You clear it out and give yourself all this new space to be creative and to think about new things. And so sometimes taking a step away, taking a step back, from your slog or the grind or the knife fight that is the day to day, it really gives you some flexibility. And unfortunately, it's the kind of thing that not everybody has the opportunity or the option to do (because sometimes you just got to slog, right?). And in this case, with what I was describing, I didn't choose to take two weeks to just think about cool stuff and what I wanted to do, right? I had this thing that I was going to do with my job, and then all of a sudden it was kind of like, "Wow, OK, I've got two weeks to kill. I've got to figure out something to do in the meantime."
Dennis Fisher: I mean, you could have just gone and played Call of Duty for two weeks...
Andrew Morris: Which, to be fair, they're not mutually exclusive – I probably played a lot of Call of Duty during that time as well...
Dennis Fisher: Fair. So once you kind of wrapped your head around, like, "OK, I'm going to build this thing, like it's going to be a thing." How close to what GreyNoise is now was that original vision? What's the delta between what you originally produced and where it is now?
Andrew Morris: That's a great question. So it's gone through so many different iterations. I would say the GreyNoise right now is the exact embodiment of what the vision was several years ago before it got bigger and cooler. I was imagining Shodan, but the opposite, OK? So I was imagining the same layout, similar workflow, similar usability, similar freemium, similar feel; except, obviously the data that we collect is the exact opposite of the data that Shodan collects, right? We don't scan the internet, we listen to the internet. But laying out the data is similar. I've always been a huge fan of Shodan. So it was like, borrow from those who do good stuff, borrow from those who have figured it out. And so I borrowed a lot from Shodan. I know John and I've told him that many times. I've borrowed a lot from you over the years and I really appreciate it.
So now I would say our web interface is really beautiful. I feel really strongly about elegance, beauty, aesthetics. Products should feel good to use, you should feel cool using them, they should have a good feel. And so we're right there now. The thing is that, now that we're in the position that we're in – we have points of presence in dozens of countries and hundreds of data centers around the world; we have thousands of users; we have all these customers. And so now, there are so many more things for us to do. And the vision has gotten a lot bigger. And so where we're going is not at all where I envisioned a few years ago. We've kind of already surpassed that, which is exciting for me.
Dennis Fisher: Yeah, it must be. No matter how big something gets, you can kind of look back and think like, "Oh, man, this all came from this one little thing, this little idea. I had this goofy talk I gave at ShmooCon, you know, eight years ago (or whatever it was)." So how exactly do people use it? What are the most common use cases for your customers right now?
Andrew Morris: Yeah, absolutely. So our elevator pitch that we give to people when we're talking to customers, or potential customers, is:
Every security operations center is too busy. One of the reasons they're too busy is they have way too many alerts. Some of those alerts don't matter very much because they're generated by completely pointless, opportunistic, internet-wide scan and attack traffic that's not even a little bit targeted towards them. We'll tell you which alerts are generated by that 20, 30, 40% (of traffic) so that you can focus on the alerts that really matter to you. Like noise-cancelling headphones for your security operations center, for your security products.
That's our two-second sales pitch for what we tell our enterprise customers. But the nerd, fundamental version of the most common use cases starts with answering, "What is GreyNoise?"
We run a gigantic network of collector sensors (kind of like honeypots) in all these different countries. We collect data in a bunch of places. We analyze that data. And we make that data available in a web interface, APIs, and security integrations.
What can you use GreyNoise to do? Really three main use cases. The first one answers the question, "Is this thing hitting everybody, or is it just hitting me?" I just saw this thing hit my network, and it looks weird, and it raised an alert or it did a thing. I'm going to look it up in GreyNoise, and if it comes back in GreyNoise, that means that it's hitting everybody on the entire internet, not just you, it's not a targeted attack. It also means that we've probably analyzed the actual behavior of what that thing was doing – maybe what it was looking for, what it was scanning for, what that IP address was targeting, some metadata about it, any tags. And so we can also tell you, that weird thing is an Apache Struts vulnerability check, that weird thing is actually a Mirai spreading mechanism. So use case number one, "Is this thing hitting me specifically, or is it hitting everybody on the entire internet?"
Use case number two is, "Show me where compromised devices are." As a byproduct of all the data that we collect, we know where a massive amount of compromised devices are on the internet, hundreds of thousands every day. So we can tell you, "Hey, these 300,000 IP addresses were compromised in the last day." And that's useful to people because we can tell you if something that you have is compromised. We use our alerts feature to do that: if you pop in the CIDR blocks that belong to your network and we see anything that pops there, we'll immediately email you. And if you have some kind of anti-abuse, anti-fraud, enrichment pipeline, we'll tell you if your customers or your users are compromised. And then we'll also maybe give you information that you can use to block things that are definitely bad, like beyond a shadow of a doubt bad. If that's your jam, if that's how your network works, right?
And then the third most common use case, and this is the one that you and I have interacted on before, is this identification of emerging threats and which vulnerabilities are being opportunistically exploited, and from where. So everybody has this question when a new vulnerability drops: a new vulnerability is out, a new CVE's out and everybody's terrified. And it's a vulnerability in a piece of software that typically maybe sits on the internet, like a WAF, or a web server, or an FTP server, or an Exchange Server, or like, you know, an OWA server, right? Yes, something like that. And everybody has the exact same thought: "Is this thing actually being exploited in the wild? If so, from where? And has it been weaponized and operationalized by any of these botnets yet, which are just blasting the internet trying to exploit people with this thing?" And so for this third use case, what we do is basically our engineering and research teams will put something together on our side so that we're able to get some visibility. We'll make something that looks like the thing that's vulnerable, and we'll instrument the crap out of it. And then we will look to see what happens. What do the scanners and the crawlers do when they find it? Do we find anybody who's checking for the existence of the vulnerability? Do we find anybody that's opportunistically exploiting the vulnerability? Do we find anybody that's doing that at scale? Anybody that's doing that in multiple places?
Now, the issue is, with that third use case, it is very hard to productize because it's complex; you can't always do it, you can't do it consistently, and it requires a lot of work to get it right. So we mostly just do it for the marketing. We'll just do all this stuff for free, and we'll tell everybody about it. We do license that data to people, etc. But you can't predict that third use case, it's complicated, and it's hard to productize. But those are the three use cases, that last one being, "Is anybody exploiting this vulnerability? If so, from where are they doing it? Where are people exploiting BlueKeep from? Where are people exploiting Shellshock from? Who is scanning the internet for this OWA vulnerability? Who's running vulnerability checks at scale for this new OWA vulnerability?" We can answer a lot of those questions.
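The CIDR-based alerting in use case number two can be pictured as checking each IP address observed behaving like a compromised device against the network ranges an organization registers. This is a minimal sketch using Python's standard `ipaddress` module; the function name and sample data are hypothetical, and it says nothing about how GreyNoise's alerts feature is actually built.

```python
import ipaddress


def find_compromised_in_networks(compromised_ips, cidr_blocks):
    """Return the compromised IPs that fall inside any registered CIDR block."""
    networks = [ipaddress.ip_network(block) for block in cidr_blocks]
    hits = []
    for ip_text in compromised_ips:
        ip = ipaddress.ip_address(ip_text)
        # Membership test: is this address inside any of the ranges?
        if any(ip in network for network in networks):
            hits.append(ip_text)
    return hits


# Hypothetical data: addresses seen acting compromised, and one
# organization's registered ranges (documentation-reserved prefixes).
seen_compromised = ["203.0.113.5", "198.51.100.9", "192.0.2.130"]
my_blocks = ["203.0.113.0/24", "192.0.2.128/25"]

alerts = find_compromised_in_networks(seen_compromised, my_blocks)
print(alerts)
```

In a real pipeline each hit would trigger a notification, such as the email Andrew mentions; here the hits are simply returned as a list.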
Dennis Fisher: So for example, the Exchange stuff that came out last week, what kind of work goes into saying, "OK, we need to do our magic on the back end, because we know everybody's going to call us to be like, 'Hey, what's going on with the Exchange bugs?'"
Andrew Morris: So the question is, "What's the work that goes into that?" So I can already feel my investors and board members hearing this and saying, "Andrew, please stop saying so much about how you do what you do." So I don't want to get into it a terrible amount. But the short answer is that there's a little bit of protocol implementation that goes into it. Sometimes you have to write code, sometimes we just have to move around existing code that we already have. Some things just require more work than others. Some protocols are more complex than others, some vulnerabilities are more complex to try to emulate a device that is affected. It varies wildly. And it goes back to tooling, it goes back to a lot of stuff like this. So there's a lot of work that goes into this. Sometimes it's easy, sometimes it's complicated.
We're still a relatively young company, we've been around for about three and a half years now. We only started hiring people two years ago. So we're still getting our ducks in order on turning this into a machine. But a good amount of work goes into it. And honestly, the other crazy thing is that sometimes, for various different reasons, GreyNoise just can't be valuable with certain kinds of vulnerabilities, architecturally or because they affect something that's not necessarily sitting on the perimeter. Or maybe because there's some prerequisites that are required, such that we aren't going to be able to do anything about a given vulnerability. All of these things are the reasons why we've been very hesitant, or I would say deliberate, about when we want to productize this last thing as an offering because it's hard to do.
Dennis Fisher: It sounds really hard to me, I'm not even sure how you go about it, honestly, and it seems like you would need quite a bit of manpower to get that done.
Andrew Morris: We do. We've got patents on this stuff. If you want to dig into it more, you can rip into all of our patents and figure out what we do. It's all public. But the long and the short is, you're not fooling a human, you're fooling a scanner, you're fooling a crawler. So you need to present just enough information to them to be able to fool the scanner, fool the crawler into doing kind of the next thing. You need to do that quickly, and you need to do that at scale. And that's it.
Dennis Fisher: You mentioned earlier that a large part of the stuff that hits, say, a given enterprise in a given day, is stuff that they don't really need to be all that concerned about – it's mass scan stuff, it's opportunistic, it's not targeted. I know this is going to vary widely by organization, but in general, how much of the attack activity that organizations see is something they don't need to worry about, activity they can just filter out with GreyNoise?
Andrew Morris: That's such a great question. So I'm going to try to break that down as best I possibly can. What amount of attack traffic that organizations see can they just filter out? I don't ever advise that our users filter out anything, because you don't want to drop data – you can never get it back, and if the tool is wrong, then you're in a bad position. So you may want to put things in a lower priority on the queue, you may want to deprioritize things, you may want to put certain events in cheaper storage, you may want to do a number of different things. But I'd say don't filter anything. The answer is, it depends. It depends on a lot of different factors. But on average, depending on where they're enriching things with GreyNoise, just about any organization that we work with is going to find a hit rate of at least 20% on the alerts that are making it to a security analyst through all of their other automation. So that's after the firewall, after whatever other automation the team has already built workflows for, etc. At least 20% of the things that are going to make it to the SOC are going to be contextualized by GreyNoise.
Now again, does that mean that they can just forget about all of them? No. It means that we want to turn that one-hour investigation into a five-second investigation. Like, this thing was hitting everybody on the entire internet. And then the explainability part comes in, like, "OK, why did it trip the alert? Oh, it's looking for this kind of technology. And we don't actually run that technology. So like, cool, done. I don't need to look at this thing anymore." There are a lot of factors that come into play there. Now here's the really interesting part. On the other side of this, let's just say, "How many of the opportunistic attacks that an organization sees at their perimeter are things that GreyNoise can contextualize?" Not necessarily what makes it into the SOC and makes it up to an analyst, but just the amount of things that hit an organization. That's over 90%. I'm not even making this up. The amount of internet-wide background noise, between tiny stuff like scans, more complex things like attacks, it's absolutely overwhelming. It's so much noisier than anybody realizes. The data that GreyNoise looks at overlaps with such a massive amount of what customers are seeing on their enterprise networks, their corporate networks' perimeters. It's absolutely mind boggling.
Dennis Fisher: I've had conversations with people that work in SOCs many times over the years, and they all have kind of that 1,000-yard stare after a while. If they've worked in a SOC for long enough, they just have like PTSD from, like, "I have nine monitors in front of me showing all these pew pew maps and all these alerts." The contextualization part of it, and reducing the noise, must be such a huge thing for them, like, "OK, I might need to worry about this, but not yet. Maybe tomorrow."
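Andrew's advice to deprioritize rather than filter can be sketched as annotating every alert and reordering the queue, without ever dropping one. The example below is a hedged illustration only: the alert fields, the `known_noise` lookup table, and the tag text are all hypothetical stand-ins for whatever enrichment source a team uses.

```python
def prioritize_alerts(alerts, known_noise):
    """Annotate each alert with context and move known internet-wide
    noise to the back of the queue. No alert is ever discarded."""
    annotated = []
    for alert in alerts:
        context = known_noise.get(alert["source_ip"])
        priority = "low" if context else "high"
        annotated.append({**alert, "priority": priority, "context": context})
    # Stable sort: high-priority alerts first, original order otherwise.
    return sorted(annotated, key=lambda a: a["priority"] != "high")


# Hypothetical lookup table: source IP -> what that IP is known to be doing.
known_noise = {"198.51.100.7": "internet-wide scanner (example tag)"}

# Hypothetical alert queue.
alerts = [
    {"id": 1, "source_ip": "198.51.100.7"},
    {"id": 2, "source_ip": "203.0.113.200"},
]

queue = prioritize_alerts(alerts, known_noise)
for item in queue:
    print(item["id"], item["priority"], item["context"])
```

The alert with no known context stays at the front of the queue, which matches the point made later in the conversation that the absence of a match is itself a signal worth investigating.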
Andrew Morris: Look, there are a lot of ways to think about GreyNoise and to think about this problem. But at the end of the day, security people don't want more things to worry about. I promise you, they don't. The people in the SOC are overworked, they are exhausted, it is a very hard job. They are getting alerts from a zillion security products, they are desensitized to a massive amount of those alerts being absolutely useless, bullshit wastes of time. And they are frustrated. And the reason that people love GreyNoise is that we don't come in and say, "You know what? Buy our product. Security, solved!" We don't do that. We're just saying that this is a problem, and we're going to make this problem incrementally better for you. And as we get better at it, we're going to get more and more confidence, and more and more data, to be able to view fewer and fewer things that really matter. To really whittle away more and more, to reduce that time to verdict of, "this thing doesn't matter". Not the time to verdict of, "this is a cataclysmic bad thing". That's not really what we focus on. We focus on getting you to the conclusion of, "this isn't really a big deal" as quickly as humanly possible, because of the exact reason that you were just describing before – the people who work in SOCs are exhausted. And the security industry is not making it any better, we're just making it worse. And so we're trying to do the exact opposite of what every security company, or every product company in the SOC has ever done. And that's just give more context, more explainability, and try to guide people towards, "These are the reasons why this thing wouldn't matter. Move on to the other thing. Oh, that alert? We don't know anything about it. You should investigate that."
Dennis Fisher: That's important. The part of telling people what you don't know is just as important as saying, "Here's the things that we do know." Right?
Andrew Morris: Negative ground truth, that's a data point in and of itself. When you look something up in GreyNoise and it's hammering your network, and it's hammering your whole perimeter, and you look something up in GreyNoise, and there's nothing there. That's important. That means it's not hitting the entire internet, it's only hitting you. You should be concerned about that thing.
Dennis Fisher: It's like the security equivalent of Down for Everyone or Just Me.
Andrew Morris: Is everybody else seeing this thing, or am I the only one? And that's something that nobody was able to answer. This is so funny. When people ask, "What is GreyNoise's competition?" I'm always like, "GreyNoise's competition is a mailing list and a group chat that you're in with your 40 friends who work in SOCs, where you say, 'Hey, are you guys seeing this too? Are you guys seeing this too? I'm seeing this.'" And that is a bad user experience. That is not a good process, but it's the only one we've got right now. And that's one of the things that we're really trying to change with GreyNoise. And the cool part is, you don't even have to become a customer to use this thing. Use our web interface. It's free. Eventually, when you use it, it's going to prompt you to log in and create an account, so we can at least know who you are. But it's free. You don't have to give us money for this. Use the web interface and look stuff up. Anytime you're seeing an IP address doing something weird on your network, or you're investigating, look it up in the GreyNoise web interface.
And we're releasing a community API that's free and unauthenticated. It's just got a little bit less data. You don't even have to have an account for that, you can just look stuff up. So look stuff up with that. You're going to get value out of using this thing without giving us a dime. And if you find that after using it for a long time, you get sick of copying and pasting stuff into that web interface, and you really wish that you had this thing in your security products, then you can talk to us, and we'll have a conversation about selling you some stuff. But before that, just use this thing. It's free, it's great, I promise you it's going to make your day a little bit better if you work in the SOC.
Dennis Fisher: So how does the Enterprise product work, if it's not just the web interface? How is it integrated into the security workflow?
Andrew Morris: That's a great question. So the workflows vary from customer to customer, and there's some details that get lost, but at a high level, this is kind of how it looks and feels for everybody. You have this web interface, it's this free thing, anybody can copy and paste an IP address straight into it, or even dump a log file into it, and will enrich all that data, we just won't let you export it back out. And so you can do that with the web interface. Now the business model is, at some point for you to really get value out of GreyNoise, like get actual, like big-dollar-figure value out of it, you have to build automation around it. And so in order to build automation, you need to use our APIs, and you have to put us in your existing products. You need to put in a GreyNoise integration in your SIEM, you need to put a GreyNoise integration into your threat intelligence platform (maybe), you need to put GreyNoise into your SOAR platform, you need to put GreyNoise into the products that your analysts are already using. Or your data pipeline, your enrichment pipeline, that's what you have to do. And so in that case, there's a few different kinds of workflows.
One is just to enrich everything that hits our organization or that goes through our SIEM or whatever, enrich everything, and just tell us what GreyNoise was saying about it at that time. And let us do analytics on that after the fact. We have a lot of customers that do that. But it's expensive, we'll charge you a lot of money if you want to do that – not as much as a lot of the other security product companies, but still quite a bit. And then there's other less intrusive, less advanced...as part of any given security analyst's workflow, a ticket gets raised, an alert gets raised, an investigation gets open. Just either in an automated way to look up the source IP address the generated that alert against GreyNoise, or even just use the command line or use your run book of scripts that you have run up or use your SOAR, use whatever that you have set up to do a lookup against that thing, and report back the results. There's a good chance that it's a known benign internet scanner that's not even like a bad one, like, "Hey look, Shodan just added this vulnerability check and it tripped your perimeter, your IDS, into thinking that somebody was exploiting you, but it's not it's just a security company that's checking everybody on the internet. Those are the high-level workflows that we see most often.
Dennis Fisher: OK. And you mentioned earlier that you guys can tell, like, "Hey, there's a bunch of machines compromised with, say, the Exchange bugs or whatever?" How are you making those checks? How can you tell that, say, this OWA server zone?
Andrew Morris: Oh, that's a great question. So basically, the shortest answer that I've got for you is that this is the entire job of our research team. To look at raw data, to first write rules and make sense of that data, and second to apply accurate metadata that is expert analysts' observations or conclusions from the data that we're seeing. And so, what that means is that our analysts, what they do is they'll look at our data and they'll say, this raw traffic that we saw in GreyNoise, on our collectors, sensors, this raw data means this. This means that this is somebody that is specifically probing for this vulnerability or attempting to exploit this vulnerability, maybe the OWA vulnerability, right? So then they first say, "Let's make sense of the data. This is what this thing means. This is what this thing is." And then the next step is all of the kind of like, "Well, what does that mean?" Does that mean it's bad? Would only a worm do that? Or would a researcher do that? Or would a legitimate security company do that? Or whatever. What are all of the different categorizations and all the other inherited metadata about that, that are going to ultimately end up flowing up to a conclusion of what we determine the maliciousness of that IP address to be?
We don't use any machine learning on this right now. It's just analysts' work – "If you do this, that means this. And that means you're bad." That's it. It's actually that simple. But we operate at such a large scale that even super simple things like that mean that our false-positive rate is extremely low, because we're not doing any wildly advanced data categorization techniques. It's just one foot in front of the other. And the scale of data that we see is so massive that it yields thousands and thousands and thousands of compromised devices every day.
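The "if you do this, that means this" approach, with human-written rules applied mechanically, might look something like the sketch below. The rule names, intents, and regular expressions are invented for illustration and are not real GreyNoise tags or signatures; the roll-up logic (any malicious tag wins) is likewise an assumption.

```python
import re

# Hypothetical tag rules: (tag name, intent, pattern matched against the raw request).
# These patterns are illustrative only and are not real GreyNoise signatures.
RULES = [
    ("OWA Probe", "unknown", re.compile(r"^GET /owa/", re.IGNORECASE)),
    ("Path Traversal Attempt", "malicious", re.compile(r"\.\./\.\./")),
]


def tag_request(raw_request):
    """Return every (tag, intent) pair whose pattern matches the raw request."""
    return [(name, intent) for name, intent, pattern in RULES if pattern.search(raw_request)]


def classify(tags):
    """Roll tags up to one verdict: a single malicious tag is enough."""
    if any(intent == "malicious" for _, intent in tags):
        return "malicious"
    return "unknown"
```

Because each rule is a plain, deterministic match, every verdict can be traced back to the specific tag that produced it, which is what keeps the result explainable.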
Dennis Fisher: OK, so you can tell like, "This is what a compromised OWA server looks like, this is how it acts..."
Andrew Morris: From our perspective.
Dennis Fisher: Right, from your perspective.
Andrew Morris: Yes. So we're not going to opine on like, "Hey, GreyNoise, can you tell me if that server over there is compromised?" Because we'll say, "Well, we've never seen anything from it, first of all. And second of all, if we had seen something from it, let's see, it did this, this, this...and that means these are the conclusions that we're willing to come to based on the data that we've seen in our collector networks. This is what we see, this is what we know." And we try to stick to the facts, and we try to make things as explainable as possible, because we don't want to have this black box voodoo, kind of, "What does it mean, when GreyNoise says this thing?" We want to be right, every time.
Dennis Fisher: Especially when you're dealing with something like that. If you're telling people we know that your email servers are owned, or we know that this network is compromised, that's not an accusation. It's not something you want to say lightly.
Andrew Morris: Exactly. And trust is a reservoir that's built over time. And so we really want to have people consistently look things up in GreyNoise, for GreyNoise to consistently deliver value to them and say, "Hey, this is what we know about it." And for GreyNoise to be consistently correct, so that people can trust us more and they can increasingly offload certain parts of their workflow more and more onto automation that uses GreyNoise data. And our research team is fanatical about getting it right. If it means that we can tag another 10% of things that we're seeing, but we open up a margin of error, where sometimes we're going to be wrong, we won't do it. And we do that every single time. We've always done that from the very beginning. And so if we say like, "Well, if we open up this tag, we open the aperture on this a little bit, we'll maybe capture some more stuff, because this tag's a little bit brittle, and it's a little whatever. But it means that we might false positive if somebody does this thing." Hard no. We will not do it.
Dennis Fisher: What a weird attitude.
Andrew Morris: When your only job is to provide ground truth of some kind, you need to be right every time. And we're not always right. We do make mistakes. We'll mis-tag things sometimes, we'll think something is some actor and it's actually another. So we'll get details wrong, and we always correct them. And if necessary, we'll even let our customers know that we've just corrected this thing. We'll correct the record; we'll go out and do it. It happens very rarely. But because our tagging is done by an automated system, it's not done by a human – the logic is written by humans, but the actual application of all the analytics is done by machines – it's really important that we have incredibly high standards on this.
Dennis Fisher: Yeah, I'm with you on that. I can see what you mean, you could be tempted to be like, "Well, look, let's just open this a little bit, and we'll get more stuff, and it'll be great..."
Andrew Morris: Well, as a journalist, you know that it's always tempting to sensationalize. You always are like, "Man, if I change this one word, or if I say this one thing, I know that a lot of people would get way excited about it." But you're like, "I just can't, my journalistic integrity is on the line if I do that." And for me it's the same thing. GreyNoise's reputation is on the line every time we think about writing an analytic or conclusion and publishing that out to people. And that's why we're very, very pedantic about the way that we deliver insights. It's also a fun fact: It's one of the reasons why the GreyNoise Twitter account doesn't tweet with regularity. Because if we start making people feel like, "Well, we got to put something out this week, we've got to put something out this month," then that's going to make us start to compromise on what we think is interesting, and what we think is useful for the user to know. And it's going to start making us feel like, "God, we gotta get something together, where's something, where's something?"
But that's not how we do it. That's never going to be how we do it. We're only going to stick to things that we know to be true. And we're going to tell users about that. And if nothing happens for three months that we think is of the level that we should tell our users and our customers, we won't tweet a damn thing. And that's just how it's always going to be. I'm sure we're going to have wrap ups, you know, weekly "this is what we saw, blah, blah, blah...." But when we're really saying like, "This is the thing that we're seeing, and you guys need to care about it," we really take that seriously.
Dennis Fisher: So if I see you hire a social media team, I should start to get worried.
Andrew Morris: I absolutely wouldn't. I think that we've got so much great stuff to work off of that we're going to be able to market a lot of the stuff that we've done already. But I would be concerned if all of a sudden GreyNoise is having hair on fire occurrences every day. Like, come on, there aren't...well, lately there have been hair on fire occurrences almost every day, or something close to it. But yeah, if things are always on fire, then things are never on fire.
Dennis Fisher: That's absolutely true. Are there any non-security applications for the data that you guys see and collect?
Andrew Morris: I'm asked that question a lot. The answer is mostly no. But if there are, I just haven't seen them yet. There are one or two, and I'll talk about them. But I haven't seen them yet. I'm also just a security guy. So I have a hard time envisioning use cases for our data that I haven't thought of. It's just so esoteric, and the data is so niche and so specific, that it mostly makes sense that most of the applicability is going to be from a security perspective. The two things that immediately jumped out to me, though, are these: One, there is a massive operational spend that goes into storing logs that are generated by Internet-wide opportunistic scan and attack traffic, sometimes just for compliance reasons. It's going to sound insane. But I'm going to tell you right now, there's a lot of organizations out there that capture firewall logs of everything that's happening on their firewalls, and they have to store it for compliance. And that is a massive amount of data with very little intelligence or analytical value to the analysts. And a lot of that stuff is, I cannot stress this enough, meaningless. And so we know the difference between what is and isn't meaningless. We're working with a partner on this right now, who handles all the routing. We just do the data, they do the routing. We will basically say like, "Hey, put anything that matches these criteria, put this into, like, Glacier, right? Now, everything else should go into Splunk. But put this into Glacier, because you're going to spend a lot of money storing a lot of useless data." So that's kind of more of like an ops thing, a network ops thing, a sysadmin thing.
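The routing idea described here, sending known background noise to cheap cold storage and everything else to the SIEM, could be sketched as below. The tier names, the `src_ip` field, and the `noise_ips` set are assumptions made for the example; the actual matching criteria and the partner's routing layer are not described in the interview.

```python
def route_log(entry, noise_ips):
    """Pick a storage tier for one firewall log entry (tier names are hypothetical)."""
    # Traffic from known opportunistic scanners is kept for compliance only.
    return "cold_archive" if entry["src_ip"] in noise_ips else "siem"


def split_logs(entries, noise_ips):
    """Group log entries by destination tier, preserving their original order."""
    routed = {"cold_archive": [], "siem": []}
    for entry in entries:
        routed[route_log(entry, noise_ips)].append(entry)
    return routed
```

Nothing is discarded in this sketch: every entry is still stored, which matters when the logs are retained for compliance reasons.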
The only other thing is, when the internet goes off somewhere, all of the internet background noise stops there. And so there is some applicability of using GreyNoise in the capacity of trying to identify problem areas of the internet or even just pipeline issues in the internet. I remember when Iran shut off their internet not long ago. And either they shut it off, or it was shut off, I actually don't recall, but it was about a year and some change ago. And all of a sudden, we went from a regular cadence of seeing 5,000 or 10,000 distinct IPs a day coming out of IP space that's geographically located in Iran to zero. And we're like, "Oh, the internet's off." And so there are some other use cases, but I'm reaching pretty far when I try to think of them.
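One simple way to spot that kind of drop is to compare the latest daily count of distinct source IPs for a region against a recent baseline. The sketch below is an assumption about how this could be done, not a description of anything GreyNoise runs; the seven-day window and the 10% threshold are arbitrary illustrative choices.

```python
from statistics import median


def looks_like_outage(daily_counts, window=7, threshold=0.1):
    """Return True if the latest daily count fell below `threshold` times
    the median of the preceding `window` days."""
    if len(daily_counts) < window + 1:
        return False  # not enough history to form a baseline
    baseline = median(daily_counts[-(window + 1):-1])
    return baseline > 0 and daily_counts[-1] < threshold * baseline
```

Using the median rather than the mean keeps one unusually busy or quiet day in the history from skewing the baseline.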
Dennis Fisher: OK. I just wondered, because it's such a unique approach and way of looking at the problem. It's just removing all of the crap that you don't need to think about.
Andrew Morris: That's exactly right. It's not sexy. We're not usually finding APTs, it's not like a cool, sexy thing. It's like insurance. It's like eating your veggies. You just have to do it; if you don't do it, it's going to suck. And so you just have to do it. It's just work. And we're just trying to make it easier and better for you. But yeah, it is a very different approach. Most security companies try to save you money by scaring the crap out of you about the big costly breach, and then claiming to prevent the big costly breach. And that's not what we do. We save you money by functionally helping you be more efficient in your investigations and come to the "this doesn't matter" conclusion as quickly as possible. And that's where we help, we make your people more efficient, make it easier for them to get through the queue of tickets. Which will then free them up to find the big bad breach, which is like, "Cool, now you can go and do that." But we help with the basics.
Dennis Fisher: So, what's next? What's on the drawing board for stuff you'd like to tackle with GreyNoise?
Andrew Morris: So many things. The big things that are coming up are we're releasing an unauthenticated community API that anybody can use. It's like our enterprise API, but just with a little bit less data. You can see ~5 different fields. And you can look things up as much as you want, integrate it into all your products. Use it. It's fantastic. So the community API we're going to be releasing, I just saw the pull request for it this morning. So we're going to release that, I don't know, probably early next week. After that, we've got GreyNoise RIOT right now, which I'm super excited about. This is in beta, but functionally what we're doing is, we're taking the same concept of telling the analyst what not to worry about by enumerating as many safe areas of the internet as we possibly can. So forget about honeypots, forget about collecting data, collecting sensor data, etc. We are literally finding where all the IP spaces of every benign service ever are: every single Windows update server, every CDN, every SaaS product API, every social media app, every legitimate mail server. All of the things that we can find that in and of themselves are not exploiting you. Yeah, maybe they can be used in a complex attack, and we know that – we know exactly where it breaks down. But just identifying those things, and giving the user the ability to say, "What would my network look like, if I just didn't look at that? What would my SIEM look like if I didn't see all of these Akamai CDN logs, and I didn't see all this Cloudflare, and I didn't see like all of my Office 365 logs, and like all this stuff like that? What would it look like?" So just giving the analysts the ability to do that.
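The "what would my SIEM look like if I didn't see that" question can be approximated by filtering events against a list of known-benign address ranges. The sketch below uses Python's standard `ipaddress` module; the `BENIGN_NETWORKS` ranges are documentation placeholders, not real CDN or SaaS ranges, and the `dest_ip` field name is an assumption.

```python
import ipaddress

# Placeholder ranges standing in for known-benign services (CDNs, update servers, etc.).
# These are reserved documentation networks, not real provider ranges.
BENIGN_NETWORKS = [
    ipaddress.ip_network(cidr) for cidr in ("192.0.2.0/24", "198.51.100.0/25")
]


def is_known_benign(ip):
    """Return True if the address falls inside any listed benign network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in network for network in BENIGN_NETWORKS)


def filter_events(events):
    """Drop events whose destination is a known-benign service."""
    return [event for event in events if not is_known_benign(event["dest_ip"])]
```

A linear scan over the network list is fine for a handful of ranges; a list covering every benign service on the internet would need a faster lookup structure.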
I hope nobody gets mad at me internally for doing this. But we're open sourcing our collector, our honeypot, we're opening it up so that people can use it and program it themselves. We're doing that this year, I don't know when, but we're going to open source that. So people can literally use our collector to build their own GreyNoises if they want to, but it would probably just make more sense to use ours. They're going to be able to use our honeypot and program it, and it's super interesting. So I'm excited about that. We're about to go live with a new website, we're going live with new public pricing, and transparency is insanely important to me. So making sure that people understand that we're not nickel and diming anybody. This is the price, this is what everybody gets. And so we're going to be doing public pricing again. We're going to be doing more interesting geographical trending later in the year, so that we're going to tell you when interesting things are only happening in certain places. We're going to be revamping our analysis page, and we're going to be revamping our alerts feature.
And then we're reaching for – it's really, really ambitious – but we've got a number of other things that we want to do. There's no way we're going to get all this stuff done this year, but that's what we're shooting for right now. It's a lot of stuff. The only other thing to add is that we've implemented this really cool bitmap technology that allows you to do enrichments against GreyNoise on a tiny data file, as opposed to having to hit all of our APIs, like murdering our APIs. So all of GreyNoise enterprise customers are going to get an absolutely insane performance improvement very soon. Those are the major things.
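The interview does not say how the bitmap file is built, so the following is only a generic illustration of why a bitmap makes local enrichment cheap: one bit per address, so membership is a single array lookup with no API call. The `PrefixBitmap` class is hypothetical and covers a single IPv4 network to keep the example small.

```python
import ipaddress


class PrefixBitmap:
    """One bit per address in a single IPv4 network (illustrative sketch only)."""

    def __init__(self, cidr):
        self.network = ipaddress.ip_network(cidr)
        self.base = int(self.network.network_address)
        # Round up to whole bytes: a /24 needs 256 bits, i.e. 32 bytes.
        self.bits = bytearray((self.network.num_addresses + 7) // 8)

    def _offset(self, ip):
        """Return the bit index for an address, or None if it is outside the network."""
        addr = ipaddress.ip_address(ip)
        if addr not in self.network:
            return None
        return int(addr) - self.base

    def add(self, ip):
        offset = self._offset(ip)
        if offset is None:
            raise ValueError(f"{ip} is outside {self.network}")
        self.bits[offset // 8] |= 1 << (offset % 8)

    def __contains__(self, ip):
        offset = self._offset(ip)
        return offset is not None and bool(self.bits[offset // 8] & (1 << (offset % 8)))
```

A flat bitmap over the whole IPv4 space would take 2^32 bits, or 512 MiB, so a genuinely tiny data file would need some form of compression or sparse encoding on top of this idea.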
I want to solve internet background noise for people, I want to solve honeypotting for people, and I want to solve the "Does this thing really matter, or is this thing like not a big deal?" problem for people. Those are the problems that we really want to solve for people. And I think that there's a lot of work that goes into doing that the right way. Figuring out the right order of operations, while balancing revenue expectations vs. the right clip of progress and quality of product, and where we focus our time and energy, is challenging. But it's really exciting, and I'm really excited for this year. I'm really excited just to get some of these new features out to everybody and start getting everybody's thoughts and feedback on it.
Dennis Fisher: Yeah, that's a pretty ambitious list of stuff to get done. But considering this came from a weekend hobby project to where it is now, it's pretty rad.
Andrew Morris: It's changed my life completely. For better or worse. I'm kind of the GreyNoise guy now. I couldn't be happier about that. This is just a crazy hard problem, and I'm learning so much. And I'm really, really enjoying doing it. The coolest thing about my job now is that we're solving the problem, which is awesome. We're making a name for ourselves, which is awesome. We're punching way above our weight against all of these massive security companies, and we're providing something that none of them can or know how to provide. It's a mind-blowing list of users and customers that use and trust us and our products and our data. And the coolest thing for me is that I've hired so many people that are like so much smarter than me. Which is just insane.
Dennis Fisher: That's the way to do it, though.
Andrew Morris: Yeah, it's crazy. I'm so lucky, I'm so fortunate for this. Every day I've got all these people who are just so smart. And I'm just like, "God, I'm so lucky to be here." Yeah, I'm super excited about it. And it is really ambitious. But it's also just awesome getting invited onto stuff like this to talk about it. It feels really good.
Dennis Fisher: Yeah, well, I'm happy for you, man. It's such a cool idea and such a cool story of how it came to be, from one little idea and all the way up to this. I love those David vs. Goliath stories and creative people doing things saying, "Well, why isn't somebody else doing this? Well, I guess I'll do it."
Andrew Morris: Because nobody's going to do it unless you do it, right?
Dennis Fisher: It's a big problem that needs to be solved, let me see if I can solve it. That's the kind of stuff that I love. It's cool.
Andrew Morris: This is the hardest problem I've ever tried to solve in my entire life. And I'm still like, however many years into trying to solve it. But yeah, it's super interesting.
Dennis Fisher: Thanks so much for coming on, Andrew. This was a lot of fun, and hopefully we get to do it again.
Andrew Morris: This has been fantastic. I've had such a great time talking to you. Please, anytime you are willing to do this again, I will be back in a second. So thanks for having me.
Dennis Fisher: Absolutely, man. Take care.
Andrew Morris: Alright, you too.