Can We Trust the Data?
Let’s explore an all-too-common scenario. Imagine this: you’re sitting in yet another meeting where someone pulls up a dashboard showing 47,000 “critical” vulnerabilities across your application portfolio. The room goes quiet. Someone inevitably asks, “So… which ones should we fix first?” And then comes the uncomfortable silence, followed by the usual suspects: “Let’s tackle all criticals first” or “I think we should start with high CVSS scores.” Sound familiar?
Here’s the thing: if the last chapter was about bringing people together with a unified language, this chapter is about unifying our data so we can actually focus on what matters. Because without trustworthy, contextual data, we’re just playing a very expensive guessing game about what to fix first.
The Great Data Disconnect
I’ve seen this scenario play out countless times across organizations of all sizes. Development teams are drowning in scan results that seem disconnected from reality. Security teams are frustrated because their carefully crafted vulnerability reports disappear into the void. Operations folks are getting paged at 2 AM for issues that may or may not actually impact production. And executives? They’re making million-dollar security decisions based on numbers that might as well be pulled from a magic eight ball.
The root cause isn’t that we don’t have enough data—we’re actually drowning in it. The problem is that our data lives in silos, lacks context, and often tells conflicting stories. Your static analysis tool says you have a SQL injection vulnerability. Your dynamic scanner can’t reproduce it. Your container scanner flags a library with a critical CVE, but that library isn’t even used in the code path that’s exposed to the internet. Your infrastructure team shows the server is fully patched, but your application security tool is still throwing alerts about an OS-level vulnerability.
Who do you believe? What do you fix first? How do you even begin to separate the signal from the noise?
Sharing Views: The Power of Collective Intelligence
The first step toward trustworthy data is getting everyone looking at the same picture. I don’t mean literally the same dashboard (though that helps), but rather ensuring that when we talk about risk, we’re all considering the same factors and working from the same foundational information.
Think about how air traffic controllers manage dozens of planes in the sky simultaneously. They don’t each have their own radar screen showing different views of the same airspace—that would be chaos. Instead, they have a unified view that incorporates multiple data sources: primary radar, secondary radar, weather data, flight plans, and communication logs. Each controller might focus on different aspects or regions, but they’re all working from the same comprehensive picture of reality.
Security data needs to work the same way. When your development team sees a vulnerability report, it should incorporate not just the raw scan results, but also deployment context, business criticality, actual exposure, compensating controls, and historical context about similar issues. When your security team presents risk metrics to leadership, those numbers should reflect the same reality that the development team sees in their daily work.
This isn’t just about better reporting; it’s about building trust. When teams can see how conclusions were reached and understand the context behind the numbers, they’re more likely to act on the recommendations. When a developer can see that a particular vulnerability is flagged as high priority because it’s in a customer-facing service with sensitive data exposure, rather than just because it has a high CVSS score, they understand the urgency.
I once worked with a team where the security group would send weekly vulnerability reports that the development team largely ignored. The reports were technically accurate but completely divorced from the developers’ reality. We spent time mapping the vulnerability data to actual applications, deployment environments, and business functions. Suddenly, the same data became actionable. Developers could see which issues affected their specific services, understand the business impact, and prioritize fixes accordingly. Trust in the data went up, and so did remediation rates.
The Prioritization Trap
Let’s talk about everyone’s favorite buzzword: prioritization. I can’t count how many conversations I’ve had where someone says, “We just need better prioritization,” as if there’s some magical algorithm that will solve all our security problems. Guess what? There isn’t.
Here’s what I’ve learned after years of helping organizations wrestle with this challenge: prioritization is not a technical problem that can be solved with a better tool. It’s a business decision that requires human judgment, contextual understanding, and clear communication about trade-offs.
Tools can absolutely help. They can aggregate data, apply scoring models, filter noise, and present information in useful ways. But at the end of the day, some human needs to look at the options and decide: Do we fix the authentication bypass in the customer portal first, or do we address the privilege escalation vulnerability in the internal admin tool? That decision depends on factors that no tool can fully capture: current threat landscape, business priorities, regulatory requirements, resource constraints, customer commitments, and a dozen other considerations that change based on context.
I’ve seen organizations spend months building elaborate prioritization frameworks with weighted scoring models that consider dozens of variables. The result is usually a number that feels scientific but doesn’t actually help teams make better decisions: the “critical” vulnerability might be in a system that’s being decommissioned next month, while the “medium” issue might be in the core authentication system that every customer touches.
The key insight is that prioritization needs to be contextual and flexible. What’s most important for your customer-facing e-commerce platform might be completely different from what matters for your internal HR system. The team working on your mobile app has different constraints and risk tolerances than the team managing your data warehouse infrastructure.
This is why the most successful security programs I’ve worked with focus less on finding the “perfect” prioritization algorithm and more on ensuring that teams have the contextual information they need to make good decisions for their specific situation.
The Missing Context Link
So here’s the question that keeps me up at night, and probably should keep you up too: “Do we have all the relevant data we need to separate real risk from noise for any given application?”
Notice I didn’t ask if we have all the data, period. The question is whether we have the relevant data, properly contextualized, to make informed decisions about risk.
Let me break this down with a real example. A few years ago, I was working with a fintech company that was getting hammered by their security scanners. They had thousands of findings, mostly around outdated dependencies and configuration issues. The security team was frustrated because remediation was slow. The development team was frustrated because they couldn’t tell which issues actually mattered. Everyone was working hard, but nothing felt like progress.
We started by asking a simple question: For each application, what data do we actually need to understand whether a security finding represents real risk? The answer wasn’t simple, but it was illuminating:
Application Context: What does this application do? Who uses it? What data does it handle? How is it deployed? What other systems does it interact with? Is it customer-facing or internal? What compliance requirements apply?
Technical Context: What did the security scans actually find? Are these theoretical vulnerabilities or confirmed exploitable issues? Are there compensating controls in place? What’s the actual attack surface? How is the application configured in production versus development?
Business Context: How critical is this application to business operations? What would happen if it were compromised? What would happen if it went down for maintenance? What are the current business priorities and constraints?
Historical Context: Have we seen similar issues before? How were they resolved? Have there been any actual security incidents related to this type of vulnerability? What’s our track record with remediation in this application?
The breakthrough came when we realized that they had most of this information, but it was scattered across different teams, tools, and systems. The CMDB had some application context. The security scanners had technical findings. The product managers understood business criticality. The incident response team had historical context. But none of this information was connected in a way that enabled informed decision-making.
Marrying the Data Together
The magic happens when you can marry these different types of data together in a meaningful way. This isn’t just about dumping everything into a data lake and hoping for the best; it’s about creating connections that preserve context and enable insights.
Think about it like putting together a puzzle. Each piece of data is valuable on its own, but the real picture only emerges when you can see how the pieces fit together. A SQL injection vulnerability (technical context) in a customer-facing payment processing system (application context) that handles credit card data (business context) and has had two security incidents in the past year (historical context) tells a very different story than the same technical vulnerability in an internal development tool that only processes test data.
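To make that concrete, here is a minimal sketch of what a single context-enriched finding record could look like. The field names and the plain-language risk summary are illustrative assumptions, not a prescribed schema; the point is simply that the raw technical finding travels together with its application, business, and historical context.

```python
from dataclasses import dataclass, field

# Hypothetical schema for a context-enriched finding. Field names are
# illustrative only; your CMDB, scanners, and incident tracker will
# dictate what is actually available.

@dataclass
class ApplicationContext:
    name: str
    internet_facing: bool          # actual exposure, not assumed exposure
    data_classification: str       # e.g. "cardholder data" or "test data"
    business_criticality: str      # e.g. "revenue-critical" or "internal tool"

@dataclass
class HistoricalContext:
    incidents_last_year: int       # related incidents for this application
    median_days_to_fix: int        # the team's remediation track record

@dataclass
class EnrichedFinding:
    finding_id: str
    title: str                     # raw technical finding from the scanner
    cvss: float
    confirmed_exploitable: bool    # theoretical vs. validated
    compensating_controls: list[str] = field(default_factory=list)
    app: ApplicationContext | None = None
    history: HistoricalContext | None = None

    def risk_story(self) -> str:
        """Summarize why this finding matters (or doesn't) in plain language."""
        exposure = "customer-facing" if self.app and self.app.internet_facing else "internal"
        data = self.app.data_classification if self.app else "unknown data"
        incidents = self.history.incidents_last_year if self.history else 0
        return (
            f"{self.title} in a {exposure} service handling {data}; "
            f"{incidents} related incidents in the past year."
        )
```

Notice that nothing here scores anything. It just keeps the puzzle pieces attached to one another so a human can read the whole picture at once.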
The most successful implementations I’ve seen start simple and evolve over time. They begin by connecting just a few key data sources and focus on answering specific questions rather than trying to build a comprehensive risk management platform on day one.
For example, start with: Can we automatically tag security findings with basic application context? If we know that a vulnerability scanner finding is associated with a specific application, can we automatically include information about that application’s business criticality, deployment environment, and data classification?
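A first pass at that can be as simple as a join between scanner output and whatever application inventory you already maintain. The record layouts, field names, and the service-name key in the sketch below are assumptions for illustration; in practice you would join on whatever identifier your scanners and CMDB actually share.

```python
# Minimal sketch: tag raw scanner findings with application context by
# joining on a shared key (here, a service name). The CMDB export format
# is an assumption; substitute whatever inventory you already have.

def load_cmdb_index(cmdb_records: list[dict]) -> dict[str, dict]:
    """Index CMDB records by service name for quick lookup."""
    return {record["service"]: record for record in cmdb_records}

def tag_findings(findings: list[dict], cmdb_index: dict[str, dict]) -> list[dict]:
    """Attach criticality, environment, and data classification to each finding."""
    tagged = []
    for finding in findings:
        app = cmdb_index.get(finding.get("service", ""), {})
        tagged.append({
            **finding,
            "business_criticality": app.get("criticality", "unknown"),
            "environment": app.get("environment", "unknown"),
            "data_classification": app.get("data_classification", "unknown"),
        })
    return tagged

# Example: a scanner finding for the payments service picks up its context.
cmdb = load_cmdb_index([
    {"service": "payments-api", "criticality": "revenue-critical",
     "environment": "production", "data_classification": "cardholder data"},
])
findings = [{"service": "payments-api", "title": "SQL injection", "cvss": 9.1}]
print(tag_findings(findings, cmdb)[0]["business_criticality"])  # revenue-critical
```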
Once that’s working reliably, you can start adding more sophisticated connections. Can we correlate vulnerability findings with actual network exposure data? Can we factor in compensating controls from our infrastructure monitoring? Can we include information about planned changes or upcoming deployments?
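As one illustration of where those connections can go, the sketch below folds exposure data and compensating controls into a simple, explainable decision. The field names and the rules are assumptions, and the logic is deliberately crude; what matters is that the adjustment is written down where anyone can read it and challenge it.

```python
# Minimal sketch of a "next connection": fold network exposure and
# compensating controls into the picture. The rules are assumptions and
# intentionally simple; the goal is reviewable logic, not a black box.

def needs_attention(finding: dict) -> tuple[bool, str]:
    """Return a decision plus the context that drove it, so humans can challenge it."""
    notes = []
    exposed = finding.get("internet_facing", False)
    waf_covered = "waf" in finding.get("compensating_controls", [])
    sensitive = finding.get("data_classification") in {"cardholder data", "pii"}

    if not exposed:
        notes.append("not reachable from the internet")
    if waf_covered:
        notes.append("existing WAF rule covers this class of attack")
    if sensitive:
        notes.append("handles sensitive data")

    flag = exposed and sensitive and not waf_covered
    return flag, "; ".join(notes) or "no additional context available"

# Example: the same class of vulnerability, two very different answers.
print(needs_attention({"internet_facing": True, "data_classification": "pii",
                       "compensating_controls": []}))
print(needs_attention({"internet_facing": False, "data_classification": "test data",
                       "compensating_controls": ["waf"]}))
```

Returning the reasoning alongside the decision is what keeps this from turning into yet another score nobody trusts.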
The key is to focus on connections that actually improve decision-making rather than just creating more data. Every piece of information you add should answer the question: “How does this help a human make a better decision about security risk?”
Building Trust Through Transparency
Here’s something I’ve learned the hard way: data is only as trustworthy as the process that creates it. Teams need to understand not just what the data says, but how it was collected, processed, and analyzed. Black box risk scores that can’t be explained or validated will always be viewed with suspicion, and rightfully so.
The most effective security programs I’ve worked with embrace transparency in their data processes. They document their data sources, explain their correlation logic, and make it easy for teams to drill down from high-level metrics to specific findings and recommendations. When a developer sees that their application has been flagged for attention, they can quickly understand why, see the specific issues that contributed to that assessment, and access the information they need to take action.
This transparency also enables continuous improvement. When teams can see how decisions are being made, they can provide feedback about accuracy, relevance, and context. They can point out when the data is missing important factors or when the conclusions don’t match their understanding of the risk landscape.
Moving the Needle
So where does this leave us? The path to trustworthy security data isn’t about finding the perfect tool or building the ultimate dashboard. It’s about creating systems and processes that bring together the different types of context needed to understand security risk in your specific environment.
Start by mapping what data you have and where it lives. Identify the key questions your teams need to answer to make good security decisions. Look for opportunities to connect related data sources in ways that preserve and enhance context rather than just aggregating numbers.
Most importantly, remember that data is only valuable if people trust it and act on it. The fanciest risk model in the world is useless if teams ignore its recommendations because they don’t understand or trust the underlying data.
In the next chapter, we’ll look at what happens once you have trustworthy, contextual data: how to use it to actually fix the things that matter most. But before we can talk about remediation strategies, we need to make sure we’re working from a foundation of data we can actually trust.
Because at the end of the day, security isn’t about having perfect information; it’s about making the best decisions we can with the information we have, while continuously improving our understanding of the risk landscape we’re trying to navigate.