The Art of Recon: Strategies for Modern Asset Discovery
Episode Summary
Today, we explore the world of asset discovery and reconnaissance, particularly how these practices have evolved over time. Historically, discussions around reconnaissance have been overly simplistic and tool-centric, often focusing solely on the latest tools rather than the underlying principles and methodologies.
Join us as we break down our approach to reconnaissance into five key elements: breadth, depth, context, amplification, and focus. We discuss the importance of understanding the attack surface holistically and how to effectively map it out in a modern context.
Learn why breadth is crucial for discovering all assets related to an organization, how depth allows for a deeper understanding of those assets, and the significance of context in enhancing your reconnaissance efforts. We also touch on amplification techniques that can help you uncover hidden vulnerabilities and the importance of applying an offensive mindset to your reconnaissance work.
Whether you're a seasoned security professional or just starting in the field, this episode offers valuable insights and practical advice to enhance your reconnaissance skills and improve your overall security posture. Discover how to think beyond tools and embrace a more strategic approach to asset discovery!
Fundamentally change how you secure your attack surface. Assetnote's industry-leading Attack Surface Management Platform gives security teams continuous insight and control over their ever-evolving exposure.
For more details about Assetnote's Attack Surface Management Platform, visit https://assetnote.io/
Transcript
MG: Today, we're going to talk about asset discovery and reconnaissance. So this is something that we talk about a lot, obviously, internally, it's a big part of what we do. But we have a lot of kind of interesting thoughts on this, maybe somewhat opinionated thoughts on thinking about this more as a practice, right? I think historically, you've seen the conversation about asset discovery and recon be very simplistic, right? Compared to how we look at it. And I think this might be an interesting topic to dive into. And so when I say historically simplistic, you know, I'll be at conferences and it kind of is very tool centric, right? It's not so much focused on the approach or the understanding, sort of a framework or idea of how to approach, you know, reconnaissance and asset discovery, particularly in a modern sense. And so, you know, when I say it's tool focused, you'll see it where it's, you know, here's this tool. This is really great. You know, you provide a word list. It's usually centered, you know, mostly around brute forcing. Right. And then the next year it's like, actually this tool is more optimized and you can do way more, faster. And then the next year it's, now this new tool is even better, right? You should be using this tool instead. And in a way, I mean, that's good, right? Like speed and scale is something that we've spoken about in other episodes. That is a real focus of ours, but I think it's kind of missing something, right? It's missing, I guess, the breadth and depth that you would expect when you are looking at trying to map something out more completely and more holistically. So yeah, I wanted to dive into this topic a little bit more today. I think it's an interesting one. I think people get a lot out of it.
So maybe, you know, we can kind of start a little bit with that idea around those sort of historical approaches, and then go into, you know, perhaps how we think about it and how we would recommend thinking about, you know, mapping out an attack surface in a modern sense.
Shubs: Yeah. And, and, you know, when you talk about some of these historical approaches and even the research that's been presented or the presentations that have happened around reconnaissance, a lot of the time, I feel that a lot of people that come into, um, the industry, they often ask, like the first question they ask is what is reconnaissance? Like how do you do reconnaissance? And, um, usually the industry answers by you just use all these tools. And that's usually how it goes. And then everyone's like kind of doing it, but they're not really understanding that it's more than just using the tools. And this is the part that I find really interesting because a lot of people, even after using these tools for a very long time, don't really understand what reconnaissance entails, what it is, how to do it, the elements of it, the theory behind it, and how to get the outcomes you want. And that's like, really, I feel we should be trying to focus more on the theoretical sides of reconnaissance, the parts which form the basis of the philosophy on how you discover things that are valuable from a reconnaissance perspective. And it sucks because reconnaissance is such a broad topic that everyone will just think that there's just so much to it and it's overwhelming and I don't understand it. But I think that, you know, over the years, I think one of the things that we've really started to understand is that you can break it down into some common elements. Right. So, um, yeah, I, I think, you know, the traditional approaches, um, I have been aware of over the years, it's kind of transformed. I think in the 2000s and onwards, everyone was using Nmap and running Nmap scripts. And, you know, you had, like, the HTTP title Nmap script or detecting versions and things like that. And you were focusing on traditional networks. And that's changed, right?
Like now every company is not just as static as some IP ranges you find on BGP and you just scan it with Nmap and you get all the results. Everything's changed now from a reconnaissance perspective.
MG: Yeah, it's kind of interesting when you talk about, a lot of people do ask about reconnaissance and how do I discover assets and how do I get better? And then the answer is all these tools. And then they're surprised when they find out that their output is exactly the same as everybody else's. And they're like, why am I not finding extra stuff? And you do raise a good point around that history a little bit because, and I think this goes to the fundamental idea of why you want to think about this more as an approach or a framework of thinking about it rather than the specific elements, because it changes over time, right? We've spoken about this concept a lot where security doesn't exist in a vacuum, right? It's a reflection of the current practices from a broader information technology landscape. And so if you think back in the day, everybody had data centers, they had a range that they owned, and every asset that they had was in that range. So something like Nmap-based, port scanning-based reconnaissance kind of makes sense. Now, that's not even close to being true. You know, most people are cloud native. That's a shared IP space. These IPs are way more ephemeral these days. And so, you know, thinking about it purely from a tools-based perspective is kind of a limited view because you're not then thinking about it in terms of the objective and how do you get to that objective and the specific techniques and technical elements that will evolve over time as the practices of companies evolve when it comes to their infrastructure and their applications and things like that. So centering in on that, you know, maybe starting at a high level, you know, what are the things that we would sort of broadly think about if we would define kind of an approach, right?
If we were to think about, you know, at a high level where we're trying to map out an organization's attack surface sort of completely, you know, what are the kind of core high-level elements that we would sort of, you know, bring into that approach?
Shubs: Yeah. And I, and obviously this is something that still gets refined and we've been working through it for the last six years. But when, when we've spoken about reconnaissance and how we've approached it, we've typically looked at, um, I would say around, uh, four elements or five elements, I would say, um, the first being breadth and like, we will go through each of these in this podcast, but breadth, depth, context, amplification, and focus. And it kind of sounds weird because we don't actually see many presentations that cover this sort of stuff, or we don't see things in the industry that cover this sort of stuff, but actually each one of these components is quite valuable. And if you understand what each one of these components entails, then you can actually be much, much better at reconnaissance and you can focus on the right things and get the outcomes that you're looking for. Because, you know, one of the things that we really focus on with reconnaissance is that it's outcome driven. We want to find that critical vulnerability. We want to be able to discover things that no one else has discovered. We want to be able to see where the weak spots are in an attack surface. And in order to do that, you have to actually do all of these elements quite well. But yeah, maybe we can dive into the first topic, breadth.
MG: Yeah, and I think it's a good approach because it's more conceptual in nature, right? The idea of breadth, like as an example, just to pick one, you know, what you might do to sort of fulfill that objective, right? Like you mentioned, might evolve over time, you know, there might be new things that you think about, that you come up with, particularly as you start to dive in and understand, you know, particular attack surfaces that relate to that concept. And so because the mindset now is a little bit different in terms of it's less about running a tool and getting output and then just optimizing the tools, it's more around how can I get more breadth? How can I get more depth? How can I improve the context here? What amplification techniques can I use? And that gets you thinking about the technologies in use, how the practices even that an organization might use commonly, you know, in terms of developing or maintaining the infrastructure or developing applications. And so that allows you to kind of extend and grow with it and evolve rather than just thinking in a really constrained manner. But yeah, maybe we start there. You know, let's start to explore this concept of breadth and why breadth is important. I mean, it sounds obvious, right? It's just about getting as much as you can, right?
Shubs: For sure. But we might even have listeners who may not understand what we're talking about when you say breadth. But in breadth, it's all about finding as many things across an attack surface that relate to the organization from a discovery perspective, like going wide. So that may be discovering all the subdomains, that may be discovering all the different endpoints. It might be discovering all the different technologies in use. It might be discovering these sort of things. So breadth, I guess, the way I look at it, is increasing the size of the data that you have, so that you can look into all of these things individually. So that would be the depth component, but breadth would be initially getting that data, which will then let you go deep into each one of these different things. So in order to even do any elements of depth, you need to cover breadth first, because you need to have a good understanding of what's out there. And that's really what I mean by breadth is, okay, like there's a company that I want to break into, or I want to find a security issue in, or we want to use our attack surface management platform on. We need to know everything that exists as a part of this, every entity that exists that we can focus on and we can go deep into. And, um, just on a side note, one of the things that I've found, um, with a lot of hackers and even a lot of the hackers I work with today is, you know, people, um, can get good at like one or two of these elements, but not all of them sometimes. And often, um, you'll see a lot of people that, uh, are really good at reconnaissance, or just generally security testing, and they have a specialty in one area. Like, oh, my job is I go really deep into an application, I understand every functionality, and I do that sort of stuff.
But then they're missing the breadth component, which is, okay, do you even know all of the different applications that are on this attack surface?
MG: And how well do you know them? Have you covered everything that you can go deep into with your specialized skill set? Yeah, exactly.
Shubs: And yeah, that I think is a key reason why some of the hackers that focus really heavily on reconnaissance are so successful is because they try to combine these elements. But yeah, when I talk about breadth, I'm talking about maximizing the attack surface that you are able to look into. Yeah.
MG: Yeah, and, you know, it's an obvious starting point, right? Because I think, you know, ultimately the rest of this stuff kind of flows from the breadth, right? And, you know, and I guess traditional approaches, you know, to some extent I'd focus primarily on breadth, right? That's sort of where it kind of tends to stop, right? You think about, you know, brute forcing, you know, with word lists, right? That's common, you know, in terms of, you know, subdomain discovery as an example. There's internet wide scanning as well. That's another sort of area. And so I think, you know, one of the things that maybe want to dive into just a little bit is thinking about, you know, maybe solidifying, you know, some ideas around breadth in terms of, you know, some techniques and some things that make sense for breadth that perhaps people aren't necessarily considering as much.
Shubs: Yeah. On that, really, there's several different ways to get that breadth, but really it's that discovery element that we're trying to get in that breadth. It's like, let's find every single element about this company that we can discover from an external perspective that we could look into deeper and understand. There is a huge element, and I know we are going to talk about this more later, but there is a huge element that during this discovery process, we need to understand what different elements are we trying to extract from this data. For example, in the breadth process, you might discover a very large list of subdomains from whatever tool that you've used, from passive data, the internet wide scan data, whatever. Part of breadth is understanding what are all the technologies running on all of these assets? What are the titles? What are the content lengths and status codes and so on and so forth. And really breadth should feed into this idea of pattern recognition. Because I sometimes think that reconnaissance is just one big game of pattern recognition, where we just have to look at all this data in front of us and understand what's standing out, what's interesting, where's your instinct taking you. Um, but yeah, um, specific examples of breadth. I think we've kind of covered some of them where it's like the subdomain enumeration, the internet wide scanning, the analysis of different, um, different key points like titles, status codes and content lengths and things like that. Um, but. You know, I think the breadth part, um, as we've mentioned is really trying to get as much coverage across an attack surface from a going wide perspective, everything a company might own on the external internet.
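Illustration (not from the episode): the "pattern recognition" idea over titles, status codes and content lengths can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions. The `Probe` record shape, the function names and the sample hosts are all hypothetical, and real data would need fuzzier matching than exact equality.

```python
from collections import defaultdict
from typing import NamedTuple


class Probe(NamedTuple):
    """One HTTP probe result for a discovered host (hypothetical record shape)."""
    host: str
    status: int
    title: str
    content_length: int


def group_by_fingerprint(probes):
    """Group hosts that returned the same (status, title, content length)."""
    groups = defaultdict(list)
    for p in probes:
        groups[(p.status, p.title, p.content_length)].append(p.host)
    return dict(groups)


def rare_hosts(probes, max_group_size=1):
    """Return hosts whose fingerprint is shared by at most max_group_size hosts."""
    rare = []
    for hosts in group_by_fingerprint(probes).values():
        if len(hosts) <= max_group_size:
            rare.extend(hosts)
    return sorted(rare)


# Made-up sample data: three hosts serve the same default page, one stands out.
SAMPLE = [
    Probe("www.example.com", 200, "Welcome", 5120),
    Probe("shop.example.com", 200, "Welcome", 5120),
    Probe("blog.example.com", 200, "Welcome", 5120),
    Probe("jenkins-old.example.com", 403, "Dashboard [Jenkins]", 793),
]
```

Calling `rare_hosts(SAMPLE)` surfaces the one host whose response does not look like the rest, which is the kind of outlier a person would then look at more closely.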
MG: Yeah. And it's kind of interesting because when we talk about, oh, well, like status codes and what's running on the assets and things like that, it kind of seems like it might bleed a little bit into depth, but it is a distinct concept and it's around, you know, getting that information to further expand. And I think there's, you know, room to be a little bit more nuanced with some of this, and some interesting things that don't get spoken about a lot, right? You know, one is to understand how they're doing things, right? And then once you understand how they're doing things, applying that to get greater breadth. And there's more simplistic techniques that people have sort of spoken about, right? If you're targeting an organization, they're often a business, right? And so, you know, understanding the sort of, I guess, overlay of the business, you know, what subsidiaries do they have? What acquisitions have they made? What divestitures have they made? Looking at financial reports and other sort of financial information can give you a good insight into getting more breadth. But I think there's one area that maybe is a little bit more nuanced when we talk about breadth. And it goes to this point that I think you're making, which is around understanding what's going on, right? So, for example, you know, internet-wide scanning is often seen as a good source of this, and it is. It is a good source for breadth, right? Because, you know, you're covering things internet-wide. But I think there's a couple of nuances. And, you know, one is, you know, most internet-wide scanning is IP-centric, right? And so, you know, that presents a whole bunch of challenges. You know, one, you know, the more obvious stuff being, you know, these days, IPs aren't as distinct or attributable to an individual organization as they used to be, generally speaking, with the mass adoption of cloud, right?
And so subdomain data becomes a little bit more king when we're thinking about breadth. But there is this kind of concept that we do talk about, at least internally, which is the blind side of internet-wide scanning, that I think people don't really understand fully and don't have a good handle on from a contextual perspective. And what I'm trying to get to with this point is that you may think by looking at internet-wide scanning or internet-wide data sources that you've got a lot of breadth, right? You've got a holistic understanding of, you know, the assets that an organization has. But when you start to dive in a little bit deeper and just think about the context, you actually don't in a lot of ways, right? So maybe, I know we probably want to dive into this concept a little bit more, but maybe can you talk a little bit about, you know, how we talk about this sort of blind side of internet-wide scanning? Yeah.
Shubs: And to be honest, I think that it's not going to be long with the way that the internet is going, that internet-wide scanning will be seen as a somewhat legacy way of doing things eventually. And I'm not saying it's not a good additive data source and it doesn't have good information. In many cases, it does. But if we look at any realistic attack surfaces of any large organization these days, they are not running everything like they were in the year 2000. And not everything is, as you mentioned, everything is now cloud-centric, but not just cloud-centric. Now everything is heavily gated behind web application firewalls and CDNs. And those are probably the two biggest contributors to the blind side of internet-wide scanning. Um, and look, this is something that I think isn't spoken about. And I think it's not something that people have, uh, really understood from a reconnaissance perspective when they start looking at internet wide scanning data. And that is primarily because I think, you know, people are just relying on it too heavily as a data source and not thinking about where is the attack surface that I'm missing here. And so what I mean by all of this is, let's say, for example, you've got an organization that is using Akamai WAF or whatever WAF really at the end of the day. Now, there's some ways to usually attribute it to their organization, like the SSL subject organization. But at the same time, all of these Akamai IP addresses are not going to actually respond with an application level header. Like the application itself is not serving you anything. You're seeing a response from Akamai that says something like invalid URL or something like that. Similarly, with Cloudflare, you don't actually have an understanding of what the application is because you're not sending the correct SNI when you're connecting via TLS. And that's basically Cloudflare has shared IP space. 
So they've got an IP that may be hosting hundreds of different sites, but it determines where to route you at the end of the day based on the SNI header, based on the host header, which ultimately gives you the application's content that you're looking for. And really with all these internet wide scanning technologies, they are not able to provide the correct SNI header. They're not able to provide the correct host header in most cases, which means that when you're looking at internet wide scanning data, you get like thousands of results, which are probably real, like real attack surface for a large organization, which are just like Akamai's default response or Cloudflare's default response. And that's not the application. And I find that really funny because certainly as a hacker, when I see that, I'm like, oh, I actually, like internet wide scanning data is not good enough. Like I don't have everything.
MG: Like I know for a fact.
Shubs: You're missing the breadth. Missing the breadth. And this is a big part. And I think, you know, one of the things that we've really focused on at Assetnote is this idea of, you know, subdomains are a first class citizen and it's something we care about very heavily because of all of these reasons. One, we can provide the correct host header. Two, we can provide the correct SNI when connecting via TLS. And because of this, we are actually routed to the actual underlying application, which is really, really important for breadth. So now we understand what technologies are there. We understand what the application is. We understand all the different elements of it. And that's going to inform us in the depth.
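Illustration (not from the episode): the point about the Host header and SNI can be made concrete with Python's standard `ssl` and `socket` modules. This is a minimal sketch, not how Assetnote's platform or any scanner is implemented. The function names are made up for the example, and `fetch_via_ip` performs a real network request, so it should only be pointed at hosts you are authorised to test.

```python
import socket
import ssl


def build_request(hostname, path="/"):
    """Build a minimal HTTP/1.1 GET request carrying the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_via_ip(ip, hostname, path="/", port=443, timeout=10.0):
    """Connect to a specific IP but present `hostname` as both SNI and Host.

    Returns the raw response bytes (status line, headers and body).
    """
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        # server_hostname sets the SNI value in the TLS handshake and is
        # also the name the certificate is validated against.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            tls.sendall(build_request(hostname, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

An IP-centric scan has no hostname to pass as `server_hostname` or to put in the Host header, which is why a shared CDN or WAF address tends to answer with its default page rather than the application behind it.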
MG: And we can look for further patterns, like you mentioned earlier. Yeah, and I think it's definitely kind of interesting, and it kind of underscores this point that we're trying to make, right? If you think about the concept and you think about the objective, then you approach things in a different way. You don't stop at like, well, internet-wide scanning data, whether you're doing it yourself or whether you're consuming it from somebody else's source, right? That's complete, I'm done. Because you think the term itself is like, it's wide. That's getting away from tools-based thinking to sort of objective and conceptual thinking. And so that's where your mind goes to when you see all these Akamai IPs, right? Then you're like, well, that's not good enough. I need to go further, I'm not getting, I'm not meeting my objective. And so I think it kind of underscores that point. But then sort of moving on, right, so we've touched on breadth, and we have touched on an element of, I guess, the sort of content of what's running on there. Not so much content, but the technologies that are in use and understanding that to get a better sense from a breadth perspective. But I think that kind of starts to lead a little bit into the depth side. So do you want to talk about depth a little bit more at a high level and we can kind of go into what we mean there?
Shubs: Yeah. And, um, yeah, as you've said, um, the breadth side directly leads into depth and, and when I talk about depth, uh, I can give some examples. So let's say we've now got a list of, um, assets we know about, we know the attributes of these assets. We know what technologies are there. Now we're going to go really deep into something. And what I mean by that is, and this methodology can differ on which target and which company it is and what they're running and how they're operating. But let's use a few examples. Let's say it's a really modern organization that decides that we're going to deploy a new React micro application or whatever every second week or something in a new subdomain. Okay, it's great that my breadth has detected that this new subdomain exists, there's a new React application, and so on and so forth. But the depth element is actually where you discover the way that this application is operating, what is exposed in this application specifically, and how to get there, and what you can do with it. So let's say we did discover a React GraphQL application or something. Depth would be looking at the JavaScript to understand what are the queries and mutations for this GraphQL application. What are the inputs? What are the default variables? What are the permissions? What are the roles? How do I access this? How do I get to the point where I'm testing each and every one of these individual parts of the application or at least understanding what they are? And that is the depth component. And depth can come in many different forms, but ideally it is once you've already identified what your target is from a breadth perspective or targets you want to focus on, that's where the depth really comes in. Now, of course, you can apply depth at a larger scale. It's just not as targeted.
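Illustration (not from the episode): a first pass at pulling GraphQL queries and mutations out of a JavaScript bundle can be as simple as a regular expression over the source text. This is a rough sketch that assumes operations appear as plain named strings in the bundle. The function name and the sample bundle are hypothetical, and minified or persisted-query builds would need a different approach.

```python
import re

# Matches operation headers such as "query GetUser(" or "mutation UpdateCart {".
OPERATION_RE = re.compile(
    r"\b(query|mutation|subscription)\s+([A-Za-z_][A-Za-z0-9_]*)\s*[({]"
)


def extract_operations(js_source):
    """Return named GraphQL operations found in JavaScript source, by kind."""
    found = {"query": set(), "mutation": set(), "subscription": set()}
    for kind, name in OPERATION_RE.findall(js_source):
        found[kind].add(name)
    return {kind: sorted(names) for kind, names in found.items()}


# Made-up bundle fragment for demonstration only.
SAMPLE_BUNDLE = '''
const a = gql`query GetUser($id: ID!) { user(id: $id) { email role } }`;
const b = gql`mutation UpdateRole($id: ID!, $role: String!) { setRole(id: $id, role: $role) { id } }`;
const c = "query ListInvoices { invoices { id total } }";
'''
```

Running `extract_operations(SAMPLE_BUNDLE)` lists the operation names, which gives a starting inventory for the questions raised above about inputs, variables, permissions and roles.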
MG: It's just not as specific. Well, no, not necessarily like, you know, and this is an interesting thing of how breadth and depth kind of interplay with each other, right? You've got depth, and to your point, some of these things might be really specific, right? This is a certain kind of tech stack and application, how it works, and now I'm extracting that attack surface. And you might sort of build techniques to understand what that deeper attack surface is. And you might think, well, that's a waste of time going to that level of depth. But when you start to take that and think about ways that you can operationalize that and automate that at a large scale, suddenly it's not so much a niche thing anymore. Especially if you're looking at things in the context of modern development practices, modern frameworks, new things that are being adopted. Because, you know, on a single attack surface, you might only find a few instances of it. But if you're looking across multiple attack surfaces, then suddenly that technique or that approach can become a lot more valuable and, you know, find a lot more things.
Shubs: Yeah, I agree. And I actually, I do think depth is an investment. If you invest in depth and you understand these common ways of doing things, and you come up with these, I guess, almost like methodologies at the end of the day, it can be applied differently for every different company. But we've been really successful with that at Assetnote. Like for example, the second company I was going to mention, or like the second example I was going to mention, is that depth can also involve going to the point of discovering that shadow exposure and those vendor applications on an attack surface and then diving deep into the source code of these applications to see if you can find any systemic issues, vulnerabilities, so on and so forth. That is depth also. And, you know, these methodologies are distinct. One methodology is we're going to map out the JavaScript and the variables and the endpoints and everything else. The other methodology is we're going to find the things that we think are shadow exposure on your attack surface, things that have been forgotten about or just haven't had the security due diligence. Because I think something that you pointed out quite rightly recently is, you know, all of our research in ServiceNow. That's like, everyone knows they have ServiceNow. It's not like they didn't decide to have ServiceNow on their attack surface, but no one knew how insecure it was, essentially.
MG: Yeah, or the level of exposure that is in the attack surface, right? I mean, that's obviously a big part of what we focus on, and it is kind of interesting. I mean, we're veering a little bit into exposure, but one of the things that we do talk about when we think about the research that we do, or the kinds of issues that we're looking at, is, you know, we'll often hear from folks where we've done some research and it's a very kind of niche thing, right? It's a niche bit of software that, you know, you don't think, you know, everybody's going to use or there's kind of like a precondition or something. And so on its own, you know, I get this more from pen testers than anybody else. But what's really the impact of something like this? How common is it going to be for some of these preconditions to be present as an example? And the interesting trade-off that they don't think about is, when you're a pen tester, you're thinking about one application, or you're thinking about one system, typically. But when you start to widen that scope and you start to look at millions and millions of assets, suddenly things that are rare become a lot less rare. So you really have to truly understand, I guess, that particular exposure and why it might matter. But it is an interesting concept. And I think that conceptually can also relate to what we're talking about in depth. And that actually does lead into, you know, and like this kind of all flows together, right? So the other side of depth, I think, is context, right? We've spoken about this, you know, in a little bit of depth before. We did a presentation at BSides Canberra a few years ago, and we have an open source tool that was released alongside that research, called Kiterunner. And basically the core idea is thinking about contextual discovery. So understanding how things work and utilizing that to get better coverage from a discovery perspective.
So in the example of Kiterunner, and it's just sort of one example of this, so it's not like the only thing to think about with context, it was sort of recognizing that, say, content discovery, if you're talking about application content discovery, is again historically very similar. It's just tool focused, right? It's around like running as many queries and requests against a particular application with a word list as fast as you can. To be somewhat reductive about it, that's just generally been like the trend in terms of the discussion. But we wanted to kind of change that a little bit. And again, because we think about things more in this kind of framework, and think about, well, what are we actually looking for here? And if you think about modern applications, they're often really framework driven, right? So they're built on various application frameworks, you know, very API driven as well, in terms of how they're built these days. And those frameworks have certain expectations around how things are structured and how things work. So it's not just the path, right? It's also the method that's sent, you know, other headers, parameters that it expects in certain places. And so once you try to understand that context, and if you can bake that context into your technique of what you're doing, you can get much higher output. And, you know, we've sort of followed that a little bit. Obviously, we do that internally with our products very deeply, but there are some things that we have put out there publicly, like the first step on that was the word list project, right? Where it's like, okay, well, if you're going to run a word list against something, you know, let's be smarter about what that word list is and be more targeted and more contextually aware. And then that went even further with Kiterunner.
And so, and what we found, right, with just, you know, just thinking about that single problem space of application content discovery, we found that the results were miles ahead in terms of the extra stuff that you find. And when it comes to bringing it back into the context of exposure, you know, that one extra endpoint or that extra bit of content or that one extra asset, you know, can be the thing that really gets you done, right? From an organization defensive perspective, right? Which is the whole point of what we do. So context is actually, you know, a really important, I think, extension of depth, right?
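Illustration (not from the episode): the difference between a flat word list and contextual discovery is that each candidate carries its method, headers and expected parameters, not just a path. The sketch below is an assumption-laden toy and is not Kiterunner's data format or implementation. `Route`, `DEFAULTS`, `render` and the sample routes are all hypothetical names introduced for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A route as a framework would define it: method, path template, headers."""
    method: str
    template: str
    headers: tuple = ()  # tuple of (name, value) pairs


PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")

# Plausible stand-in values for path parameters (illustrative only).
DEFAULTS = {"id": "1", "uuid": "00000000-0000-0000-0000-000000000000"}


def render(route, base_url):
    """Turn a route definition into a concrete request description."""
    path = PLACEHOLDER_RE.sub(
        lambda m: DEFAULTS.get(m.group(1), "test"), route.template
    )
    return {
        "method": route.method,
        "url": base_url.rstrip("/") + path,
        "headers": dict(route.headers),
    }


# Made-up routes of the kind an API framework might expose.
ROUTES = [
    Route("GET", "/api/v1/users/{id}"),
    Route("POST", "/api/v1/users", (("Content-Type", "application/json"),)),
    Route("DELETE", "/api/v1/sessions/{uuid}"),
]
```

A plain word list would only ever try `GET /api/v1/users`, whereas rendering each `Route` produces requests shaped the way the framework expects them, which is the context being described.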
Shubs: Yeah. And sometimes the way I like to think of it is that we were six years early on attack surface management and four years early on API discovery, because everyone talks about API discovery now. And, you know, we released KiteRunner long before any of the high-profile API breaches or API-related issues. But actually, this is very important because we talk about word list-based approaches and how they've been the undercurrent of work in the last 10 years. And when we were building KiteRunner, we really understood that to be a lack of innovation in the space for how discovery is done. And we wanted to change that, and that's what we did. You know, it's not as easy and straightforward these days to just assume that everything is going to respond to a GET request in a specific format that's in this specific word list that I have. Everything is so dynamic now, and everything is so strict now. It's not like those golden days in the year 2000, where everything's a static PHP app, everything's static on the server, and everything is quite likely to be in a word list. Now almost every application in the world has some sort of API, and APIs are not known for their flexibility. They're not like, oh, that's cool, you sent me a GET request, I'm still going to respond and let you know that there's something here. No, actually, APIs are very, very stubborn, and they do not like being discovered. They're not really something that is easily accessible. And now, of course, there are many different ways to discover these and many points of amplification, like finding a Swagger JSON file or GraphQL introspection or whatever it may be. But nine times out of 10, you don't actually know what APIs are running, you don't know what the endpoints are, you don't know what the methods are, you don't know what the parameters are, and you don't know what headers you need to provide. In that case, none of the traditional tooling would be sufficient to find them, unless it's some old application running PHP, like I mentioned earlier, where it responds in a certain way when there's a directory and in a certain way when there's an endpoint. And even then, you're only able to infer that something is potentially there. You're not really able to guarantee that there's something there.
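To make the point about methods, parameters, and headers concrete, here is a minimal Python sketch of what "context" can look like when it is derived from an API specification instead of a flat word list. It is an illustration only, not how KiteRunner is implemented. The `Route` type, the `routes_from_swagger` function, the placeholder values, and the sample specification are all assumptions made up for this example. The code only builds candidate requests and sends nothing over the network, and it does not resolve `$ref` entries.

```python
from dataclasses import dataclass, field

# Hypothetical placeholder values, chosen by declared parameter type.
PLACEHOLDERS = {"integer": "1", "string": "test", "boolean": "true"}
HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options"}


@dataclass
class Route:
    """One request candidate: more than a path, it carries method and inputs."""
    method: str
    path: str
    query: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


def routes_from_swagger(spec: dict) -> list[Route]:
    """Turn a Swagger 2.0 style document into concrete request candidates."""
    routes = []
    base = spec.get("basePath", "").rstrip("/")
    for template, operations in spec.get("paths", {}).items():
        for method, op in operations.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            path, query, headers = base + template, {}, {}
            for p in op.get("parameters", []):
                value = PLACEHOLDERS.get(p.get("type", "string"), "test")
                location = p.get("in")
                if location == "path":
                    # Fill {id}-style placeholders so the path is routable.
                    path = path.replace("{" + p["name"] + "}", value)
                elif location == "query" and p.get("required"):
                    query[p["name"]] = value
                elif location == "header" and p.get("required"):
                    headers[p["name"]] = value
            routes.append(Route(method.upper(), path, query, headers))
    return routes


# Made-up specification used only to exercise the function above.
SAMPLE_SPEC = {
    "swagger": "2.0",
    "basePath": "/api/v1",
    "paths": {
        "/users/{id}": {
            "get": {"parameters": [
                {"name": "id", "in": "path", "required": True, "type": "integer"},
                {"name": "X-Api-Version", "in": "header", "required": True, "type": "string"},
            ]},
            "delete": {"parameters": [
                {"name": "id", "in": "path", "required": True, "type": "integer"},
            ]},
        },
        "/reports": {
            "post": {"parameters": [
                {"name": "format", "in": "query", "required": True, "type": "string"},
            ]},
        },
    },
}

for route in routes_from_swagger(SAMPLE_SPEC):
    print(route.method, route.path, route.query, route.headers)
```

A plain word list would only ever try something like `GET /users`. The candidates above instead carry the method, a filled-in path parameter, and the required header or query parameter, which is the kind of detail a strict API may insist on before it reveals that anything is there.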
MG: Yeah. And it's interesting. Coming back to this central point that we're trying to make around thinking about this as an approach, as opposed to a tool, think of something like context, right? So what we did with KiteRunner is we basically ingested, what was it? Millions, like tens of millions of Swagger docs and other API documentation to build this idea of the context, and then baked that into an automated tool that could run and apply that context on a mass scale. But you also don't need to go to that extreme. We go to that extreme because we're building a large-scale platform, right? But if you're just an individual trying to do this, you can still apply context in a manual way, or in a small-scale way. And that's why thinking about this more as an approach is actually kind of useful, because then you can adapt, right? It's not about, hey, you can just run KiteRunner. It's like, well, no, think about what you're looking at here. And particularly if you're an individual, maybe a pen tester or a bug hunter or something, you can apply that in a more reasoned way and a more reasonable way for what you're doing at a smaller scale. You don't have to do things like understand common API context by ingesting tens of millions of API documents and building that into a tool. So it is very useful to think about it that way. And I think if you start to approach it with that mindset, it starts to unlock things that you weren't really thinking about before, things you weren't able to unlock when you were more constrained in your thinking by a tool-centric approach. But you touched on amplification there. So, you know, obviously we've gone through breadth, depth, and context. How do we think about amplification as the next element?
Shubs: Yeah, it's really interesting because when you look at an attack surface for long enough, you will find jumping points. You will find something that you can jump off from and get a lot more information about what you're looking at, as well as potentially discovering more elements that add to the breadth and depth side of things. So amplification, really what that means is, okay, I've discovered an API specification that has now amplified the attack surface that I have access to. Or let's say I've discovered a URL shortener, and I can brute force this URL shortener to discover all these different parts of the attack surface I had no idea about. And in fact, at the end of the day, the people contributing this information are the internal employees of the organization, shortening URLs day to day, which means that half of these URLs are quite valuable. They're URLs to third parties, URLs to services that you may not know of, URLs to internal applications or directories that you may never have seen before, and things like that. But it can expand even further. Often what I see with amplification is we'll be doing an analysis of an attack surface and we'll see, oh, this page says that you have to download a thick client to interact with it. OK, that's fantastic. This thick client is written in .NET. We can decompile all the DLL files, and now suddenly we have a really good understanding of all the underlying APIs that exist in this application. That is another example of amplification. And, you know, it does get a little bit into that breadth and depth area, because you can maybe argue that even JavaScript files lead to amplification of your understanding of endpoints and things like that. But really, with amplification, I think there are these jumping points where you can go from one point to hundreds of different points.
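The URL shortener example can be sketched in a few lines of Python. This is a hedged illustration and not a description of any particular tool: the host `sho.rt.example` is hypothetical, and the token format (fixed length, lowercase letters and digits) is an assumption, since real shorteners vary in alphabet and length. The code only generates candidate URLs and sends no requests; enumeration like this belongs only on targets you are authorized to test.

```python
import itertools
import string

# Assumed token alphabet; real shorteners may also use uppercase or symbols.
DEFAULT_ALPHABET = string.ascii_lowercase + string.digits


def shortener_candidates(base_url: str, length: int, alphabet: str = DEFAULT_ALPHABET):
    """Lazily yield every candidate short URL with a token of `length` chars."""
    base = base_url.rstrip("/")
    for chars in itertools.product(alphabet, repeat=length):
        yield f"{base}/{''.join(chars)}"


def keyspace(length: int, alphabet: str = DEFAULT_ALPHABET) -> int:
    """Number of possible tokens, useful for judging whether a sweep is feasible."""
    return len(alphabet) ** length


# Peek at the first few candidates without materializing the whole keyspace.
preview = list(itertools.islice(shortener_candidates("https://sho.rt.example", 2), 3))
print(preview)      # ['https://sho.rt.example/aa', 'https://sho.rt.example/ab', 'https://sho.rt.example/ac']
print(keyspace(4))  # 1679616
```

The keyspace calculation is the amplification argument in miniature: one discovered host turns into roughly 1.7 million candidate links at four characters, and each link that resolves may point at a third party, an internal application, or a directory that never appeared in the breadth phase.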
MG: Yeah, the URL shortener is a good example of that, right? So if you think about it from a breadth perspective, well, most of the time you're finding it on a domain that's short, right? That's how most people implement this sort of stuff. So you're finding that with breadth. Going a little bit into depth and context, you're understanding that it is a URL shortener, and that often there'll be some sort of short token, I guess, that eventually gets expanded out to the resource that it's shortening, right? So that's where you've now got depth and context. Amplification is: now that you've understood that, with that one point in the breadth of their attack surface that you've discovered, how can you amplify that, right? Just from that one host and that one service. So jumping-off points, I think, is a really great way to look at that. And then, finally, focus, and focus is kind of an interesting one. We've spoken about some of this recently, which I think is kind of interesting, right? So it's really about taking data, right? Because if you look at what we've been talking about, it's easy to look at this as purely a data problem. And I think that's one of the constraints of the historic approach, right? It's like, well, it's just a data problem. I need a bigger word list. I need a faster tool. I need to be able to process more data. I need to go internet wide. And that's part of it to a degree, right? Particularly as you talk about some of these other concepts like breadth and depth. But really what you want, and really what matters, is turning that into information and thinking about it in the context of what you're trying to do, right? Especially if you're coming at it as we are, in terms of looking at the exposure that's in the attack surface.
It's like, how do you focus that data into real information that's actionable and useful for whatever you're trying to do? Whether that's just various things internally as an end user on the other side of our platform, dealing with this data and coming up with workflows, or what we then go and do with our platform, which is the exposure monitoring, right? And feeding that into that. And so, you know, I think it's often the missing piece. And I think it's actually the hardest piece out of all of this. And this is what we were talking about the other day: how do you apply an offensive instinct to this data and this recon to get information that's useful from a security perspective? And ultimately, the interesting idea, I think, and this is my opinion, I know that you agree, so we can talk about it, is that that's not the commodity skill. The commodity skill, believe it or not, is more the exploitation of some of these outcomes. Maybe we can talk about that concept a little bit more, around applying offensive instinct to recon, and why that's such a specialized and interesting skill that a lot of people don't really seem to have. But if I look at some of the top-tier hackers, or some of the folks that are really top tier in terms of their attack surface discovery capabilities, this is something that's common to all of them, right?
Shubs: Yeah. Yeah. I do think that exploiting things that are tricky is still tricky and still requires a lot of work. But I think that when you look at a large organization, or any organization that is even remotely mature in its information security practices, what ends up happening is that the exploitation side of discovering vulnerabilities is not usually the hardest part. It's finding those weak spots. And when you talk about applying this offensive security instinct to reconnaissance, I think the reason why it's so hard is because it is sometimes a tireless, unforgiving task that you have to just grit through with dedication and time to get to that outcome. And then after that point, once you've discovered this very vulnerable-looking application or this soft spot in the attack surface, everything after that is just straightforward. Everything after that is like, okay, this is a very easy pathway to success. But getting to that point is usually the biggest differentiator between someone that's successful in this sort of methodology of understanding an attack surface and someone that is potentially just better at the technical side of exploiting something. And look, this is not to knock people that are just really into CTFs or just really good at technical exploitation skills. But this is a differentiator in the top five.
MG: I don't think it's about necessarily saying that those skills are not important or difficult, or that they're easy necessarily. It's not so much about that. When we say it's more the commodity side of the skill, it's just to do with supply and demand, right? There are just more people that know how to do that sort of stuff, as opposed to the folks who are able to really apply what we've been talking about. And then also not only apply that, but apply that with an offensive mindset and be able to link the two. You know, there's a lot of stuff out there nowadays on recon, on attack surface management. And it's kind of always been tangentially related to the exposure side. But I find that it's very rare that both sides come together the way that they do with us and what we're trying to promote, because either it's very simplistic and basic discovery and maybe decent exposure monitoring, or it's really advanced and somewhat decent discovery, but not really good exposure monitoring. And they're so distinct. But for us, our philosophy has always been that these two are linked, and linked in a very important way. And the skill to be able to link those effectively is actually rare. So that's what I mean when we talk about this idea that the other side of it is a bit more commodity. It's just more common that those skills are there and present, as opposed to that skill, which is a little bit more niche, right? So that's kind of the idea.
Shubs: No, I understand. And the other thing is, everything we've talked about in this entire podcast episode is so difficult to learn. There aren't labs you can go to which will teach you how to do all of this. There's no prescribed environment that will give you this instinct. Unfortunately, it is something that you have to really work through over a long period of time. And you have to tackle each part of what we've discussed today and do it really well in order to be successful. If you miss out on even one of these, then you will generally miss stuff. And you'll see that in terms of other people picking up things that you may have missed. But yeah, these unicorns do exist. There are people that can do both the discovery and the exploitation part. But I will say that's kind of the exception to the rule. It's usually one or the other.
MG: Yeah, I mean, but that's the point of doing this, right? So obviously, for us, it's formed the basis of everything that we do at Assetnote. We think about it a certain way, we approach it a certain way, and that allows us to continue to build, right? And again, going back to the fundamental concept that we've been talking about, if you start to think about this more as an approach and as a mindset rather than a purely technical or purely tools-driven exercise, then you can evolve, right? You can learn more. You can start to apply those learnings in a different way. And if I think about what separates those guys that we're referring to, that small group that seems to be able to do this really well, it's because they have that mindset, right? And that is generic, in the sense that it can be applied to anything going forward. As things change, or as a new technology comes up, or as something else happens, you can then start to apply that and be a lot more effective, right? So it's not a static thing at the end of the day, it's more fundamental. And yeah, I mean, it's everything that we do at Assetnote. This is part of our philosophy, but also part of the point of doing this is, like you said, nobody talks about it like this. And, you know, I think it'd be really great if we could start to elevate that discussion a little bit, bring some of that into the larger discussion, and see what people think about it. But we could talk about this for hours. I know we talk about this for hours internally anyway, but I think folks don't have the same amount of time to really deep dive on these sorts of things.
But, you know, maybe we'll spread some stuff out over a few episodes, because I think there are some interesting things here that we've touched on at a reasonably high level that could each fill an entire podcast discussion. So there's probably a lot of stuff to come in the future there. But I think we can wrap it up here. I think it was a good discussion. It's always fun to talk about these things and put that out there. But yeah, thanks, and we'll call it there.