SMRA pick of the week: Can we look to social media data to uncover threats & vulnerabilities? Newest research from Pacific Northwest National Laboratory shows startling results.
by Erik Costlow, Principal Product Evangelist on August 13, 2020
New research from the Pacific Northwest National Laboratory (PNNL) Data Sciences and Analytics Group shows that 25% of vulnerabilities appear on social media before they appear in the National Vulnerability Database (NVD). On average, nearly 90 days pass between a vulnerability being discussed on social media and the time it shows up in the NVD.
Vulnerabilities on Social Media
Application vulnerabilities show up on social media before they are logged in the NVD for several reasons. Developers who are just starting their careers, or who are still learning a specific piece of software, may not know that something is a vulnerability, that vulnerabilities need to be treated differently, or how to report them. In some cases, they may not know whether the “issue” they found is a true vulnerability. Naturally, they turn to the tools they regularly use to connect with other developers: social media channels like GitHub, Twitter, and the various forums and discussions housed on Reddit.
Sometimes, developers may submit a potential vulnerability for discussion with other developers on one of the above social channels, and neither they nor the maintainer takes the time required to report it in the NVD. This makes sense: Developers are measured on the amount of code they write and the velocity of release cycles—not on the number of vulnerabilities they find and report. For “hobby” open-source developers, working on a project is fun; reporting vulnerabilities in a bureaucracy is not.
But for those who rely exclusively on the NVD to identify new vulnerabilities, the risks of the wait can be significant. Once a vulnerability is made public, cyber criminals take notice, and attacks targeting that vulnerability soon follow. Relying solely on NVD data, and using the corresponding reports to decide if and when to upgrade a library, creates problematic blind spots.
Reporting Vulnerabilities in Real Time
There are several ways to report application vulnerabilities:
One approach is to submit a formal Common Vulnerabilities and Exposures (CVE) claim to the U.S. National Vulnerability Database (NVD). A CVE is an official designation that requires a fair amount of time and work from someone to review and verify the claim before it can be announced. All the while, the risk associated with that vulnerability is out there for cyber criminals to exploit as it awaits CVE recognition.
The upside is the veracity of CVEs, which go through a gatekeeper to keep out the noise from incorrect reports or misconfigured systems. But there are no requirements around reporting vulnerabilities to the NVD. It is a voluntary process that many developers simply do not bother with. Thus, while CVEs are helpful, they only provide a slice of the total number of vulnerabilities that exist.
Another way to report vulnerabilities is to participate in a bug bounty program offered directly by the software’s developer. Many companies run bug bounty programs, and some governments sponsor bounty programs for critical open-source technology. Defect reporting through such a program can initiate a direct conversation with the people who created the application, but a lot of companies do not offer one. In terms of visibility into potential risks, the bug bounty process is also less transparent than discussions in public social media forums. Finally, as with the NVD, the investigation may take time, leaving other users of the software exposed to the associated risks in the meantime.
Reasons Why Social Media Captures New Vulnerabilities
These drawbacks are some of the reasons why social media is seen as an effective channel for certain types of vulnerability reporting. There is essentially no barrier to entry. The conversation is open and public, and it stays on record for further input and reference. On social media, if I discover a potential vulnerability, I can send a tweet to a specific project manager or join an existing thread in an open-source community forum to get answers. Social media offers instant access and a much faster way to start a discussion with the right people, and the wider community can track the risk associated with each specific problem.
GitHub, in particular, is the world’s largest community for discussing source code—it is one of the primary locations where developers go to talk about their projects and compare notes. GitHub is huge and very influential—and it is still growing. In fact, the entire community for OpenJDK (open-source implementation of the Java Platform) recently moved over to GitHub from Mercurial. Nearly half of the time, a vulnerability discussion starts on GitHub and then moves over to Twitter and Reddit.
The PNNL report also shows that human-generated social alerts are far more effective than machine-generated ones. Developers, it seems, want to hear from other developers and not simply from bots. Automated alerts help with routine tasks like building reports. However, human attention on vulnerabilities reduces false-positive distractions and lets developers stay focused on writing code. The takeaway is that developers will pay less attention to a bot-generated alert unless a human has taken the time to look at what the bot says about a vulnerability and has determined that it is significant.
Is Open-Source Code More Vulnerable?
The reason open source is so prominently featured has nothing to do with open source being less secure or more secure than custom code. It is really a question of quantity: There is a lot of open-source code available—and it continues to grow. Due to cost, convenience, and time to market, developers use a lot of open-source components in their applications. Indeed, software today is often built from as much as 90% open-source code—including hundreds of discrete libraries in a single application.
Some additional findings in the PNNL report focus on vulnerabilities in open-source code. Research shows that 80% of code bases include at least one open-source vulnerability, with commercial code bases containing an average of 64 vulnerabilities.
High usage means wider proliferation of the existing vulnerabilities within open-source code. It is the same reasoning behind the fact that most car accidents happen close to home: that is where people spend their time. The risks of open source also extend beyond application security to licensing complexities. As a result, developers need specific ways to track open-source issues in real time; waiting for a CVE to be publicly identified is not sufficient for eliminating these risks from their systems.
Security Instrumentation Is More Effective Than Social Media
There are several ways that organizations can benefit from social media when it comes to application security. It starts with the basics: watching how people talk about their software to see whether any conversations contain sensitive data. But when it comes to analyzing posts for vulnerabilities in different libraries or frameworks, there are probably too many concurrent conversations for developers to effectively sort legitimate vulnerabilities from a sea of false alerts and general chatter. Gatekeepers like the NVD help filter out much of this noise when vetting CVE submissions.
Rather than monitoring social media reports, organizations should dedicate their energy toward developing robust security programs that can actually monitor and protect their systems. This includes activities such as hooking sensors into software to gain telemetry about security incidents, integrating security tools around a security information and event management (SIEM) solution, and developing a bug bounty program so that a broader community can report issues to you.
To understand what systems are actually doing, organizations need sensors and telemetry information to observe their applications. Whether trying to decrease “mean time to reporting” or “mean time to resolution,” developers need detailed information about how applications actually operate in order to diagnose the problem and find a solution.
Relying solely on external identification of vulnerabilities is always going to be slower and more generalized than being able to watch what is actually happening to an application in real time. Having sensors in place within the application code offers the highest level of visibility for understanding what is going on inside the application. The reality is that it is unlikely anyone will submit a CVE about a vulnerability within a piece of custom code. But sensors and telemetry data from within application runtime offer insights that inform developers on what vulnerabilities exist and which ones are most critical.
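To make the idea of a sensor inside application code more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how any particular security product works: the `sensor` decorator, the `TelemetryEvent` record, the `SUSPICIOUS_MARKERS` list, and the `read_report` function are all hypothetical names invented for this example, and the in-memory `EVENTS` list stands in for whatever SIEM or telemetry pipeline an organization actually uses.

```python
import functools
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class TelemetryEvent:
    """One observation recorded by a sensor (hypothetical record type)."""
    sensor: str
    function: str
    detail: str
    timestamp: float = field(default_factory=time.time)


# In-memory sink used only for this sketch; a real deployment would
# forward events to a SIEM or telemetry pipeline instead.
EVENTS: List[TelemetryEvent] = []

# Deliberately tiny, illustrative list of markers; not a real detection rule set.
SUSPICIOUS_MARKERS = ("../", "<script", "' or ")


def sensor(name: str) -> Callable:
    """Wrap a function so its string arguments are inspected at runtime."""
    def decorate(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            for value in list(args) + list(kwargs.values()):
                if not isinstance(value, str):
                    continue
                lowered = value.lower()
                for marker in SUSPICIOUS_MARKERS:
                    if marker in lowered:
                        EVENTS.append(
                            TelemetryEvent(name, func.__name__,
                                           f"input contains {marker!r}")
                        )
            # The sensor only observes; the wrapped call still runs.
            return func(*args, **kwargs)
        return wrapper
    return decorate


@sensor("path-traversal")
def read_report(filename: str) -> str:
    """Stand-in for application code that handles user-supplied input."""
    return f"contents of {filename}"


if __name__ == "__main__":
    read_report("summary.txt")        # benign input, no event recorded
    read_report("../../etc/passwd")   # suspicious input, one event recorded
    for event in EVENTS:
        print(event.sensor, event.function, event.detail)
```

The point of the sketch is the placement of the sensor: because it wraps the function itself, it sees the values the application actually receives, which is the kind of visibility an external vulnerability feed cannot provide.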
That ability to observe application runtime also empowers developers to automate what comes next in most circumstances, whether that is sending alerts or taking a remediation action to prevent an exploit from successfully compromising an application. And when an attack against an application in production targets a CVE that is not actually present, security teams can safely ignore that particular attack signature, thereby reducing time spent on false positives.
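The triage step described above can be sketched as a simple comparison between the CVEs an attack targets and the CVEs actually present in the running application. This is a rough illustration only: the `AttackAlert` type, the `triage` function, the placeholder CVE identifiers, and the IP addresses (taken from a documentation-only range) are all assumptions made for this example, and in practice the set of present CVEs would come from runtime telemetry or a dependency inventory.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class AttackAlert:
    """An observed attack attempt (hypothetical record type)."""
    source_ip: str
    targeted_cve: str


def triage(
    alerts: Iterable[AttackAlert], present_cves: Set[str]
) -> Tuple[List[AttackAlert], List[AttackAlert]]:
    """Split alerts into (actionable, ignorable).

    An alert is actionable only if the CVE it targets is actually
    present in the application, according to the supplied inventory.
    """
    actionable: List[AttackAlert] = []
    ignorable: List[AttackAlert] = []
    for alert in alerts:
        if alert.targeted_cve in present_cves:
            actionable.append(alert)
        else:
            ignorable.append(alert)
    return actionable, ignorable


if __name__ == "__main__":
    # Placeholder identifiers, not real CVE entries.
    present = {"CVE-0000-0001"}
    alerts = [
        AttackAlert("203.0.113.10", "CVE-0000-0001"),
        AttackAlert("203.0.113.11", "CVE-0000-0002"),
    ]
    actionable, ignorable = triage(alerts, present)
    print("needs attention:", actionable)
    print("safe to ignore:", ignorable)
```

The quality of this filtering depends entirely on how accurate the inventory of present CVEs is, which is why it pairs naturally with runtime observation.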
For more on vulnerabilities identified through social media, check out my interview on the Inside AppSec podcast—“When Application Vulnerabilities Are First Reported on Social Media: Strategies and Recommendations.”