Cyber Pop

A critical look at CVSS

Published on 24 May 2022

For a long time, assessing the severity of a security vulnerability remained a delicate exercise. On one side, security researchers painted the problem in its darkest light; on the other, software vendors and manufacturers systematically played it down, when they did not deny the problem outright.

With no common scale defined, you had to trust one side or the other, and learn to decipher the various security bulletins to work out whether the risk was really low, medium, high, or critical; whether you were “in the green”, “in the red”, or somewhere in between. Without a solid technical background, it was easy to fall into paranoia or, worse, to ignore the problems.

It was from this observation that CVSS (Common Vulnerability Scoring System) was born: a vulnerability scoring system created by FIRST (Forum of Incident Response and Security Teams) and widely supported by software vendors and manufacturers. Its main purpose is to calculate, compare, and understand the severity of security flaws. It is used by the majority of vulnerability databases (CVE, NVD, OSVDB, SecurityFocus, Secunia, etc.) and by many IT players (Cisco, Oracle, Adobe, Qualys, etc.).

This article refers to the version of CVSS that was current at the time of writing (CVSSv2).

A detailed and transparent scoring

CVSS allows three scores to be calculated:

  • A base score, which anyone can use to evaluate a problem. This is the score most commonly used, in particular by auditors.
  • An environmental score, which takes into account the specificities of the targeted environment and can revise the base score upwards or downwards depending on the context.
  • A temporal score, which takes into account the life cycle of the vulnerability: has it been fixed, are there workarounds, is it actively exploited?

Without going into the details of the score calculation (the algorithm is public), the base score rests on the evaluation of two metrics: the exploitability of the vulnerability and the impact it can have on security. Each of these two metrics is evaluated against three criteria.

For exploitability:

  • The access vector: can the vulnerability be exploited from the Internet, from an adjacent network, or does it require local access?
  • Access complexity: is it simple, moderately complex, or very complex to reach the vulnerable component?
  • Authentication prerequisites: can the vulnerable component be reached anonymously to exploit the flaw, or are one or more levels of authentication required?

For impact:

  • The impact on confidentiality: is it complete, partial, or non-existent?
  • The impact on integrity: is it complete, partial, or non-existent?
  • The impact on availability: is it complete, partial, or non-existent?

Once the different criteria are evaluated, we obtain a final score out of 10. The higher the score, the more critical the vulnerability. We also obtain (and this is where CVSS is particularly interesting) a detailed evaluation vector, which shows how each criterion was rated to produce the score.

Let’s take an example: a website is configured to return extremely detailed error pages, containing references to the offending code. In doing so, it discloses potentially sensitive information that allows a malicious person to better target their attacks.

This is a website, easily accessible from the Internet with a browser. No particular technical knowledge is required to push the server into error, and no authentication is needed.

Exploitability can therefore be expressed as follows:

  • Access Vector: Network (AV:N)
  • Access Complexity: Low (AC:L)
  • Authentication: None (Au:N)

The partial vector representing this metric will be written as follows: AV:N/AC:L/Au:N

The impact, for its part, concerns only confidentiality, and the damage is only partial:

  •  Confidentiality Impact: Partial (Confidentiality Impact: Partial)
  • Integrity Impact: None (Integrity Impact: None)
  • Availability Impact: None (Availability Impact: None)

The partial vector representing this metric will therefore be written: C:P/I:N/A:N

The final vector corresponding to this vulnerability will therefore be: AV:N/AC:L/Au:N/C:P/I:N/A:N. This vector corresponds to a score of 5.0, making it a medium criticality vulnerability. This score can be determined or verified very simply using the NIST CVSS calculator, available online.
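
To make the calculation concrete, here is a minimal Python sketch of the public CVSSv2 base score equation, using the metric weights from the CVSSv2 specification. The base_score helper is my own naming, purely for illustration; the official NIST calculator remains the reference.

    # Metric weights from the CVSSv2 specification.
    AV  = {"L": 0.395, "A": 0.646, "N": 1.0}    # Access Vector
    AC  = {"H": 0.35, "M": 0.61, "L": 0.71}     # Access Complexity
    Au  = {"M": 0.45, "S": 0.56, "N": 0.704}    # Authentication
    CIA = {"N": 0.0, "P": 0.275, "C": 0.660}    # C/I/A impact levels

    def base_score(vector):
        """Compute a CVSSv2 base score from a vector such as
        'AV:N/AC:L/Au:N/C:P/I:N/A:N'."""
        m = dict(part.split(":") for part in vector.split("/"))
        # Impact sub-score, combining the three impact criteria.
        impact = 10.41 * (1 - (1 - CIA[m["C"]])
                            * (1 - CIA[m["I"]])
                            * (1 - CIA[m["A"]]))
        # Exploitability sub-score, combining the three exploitability criteria.
        exploitability = 20 * AV[m["AV"]] * AC[m["AC"]] * Au[m["Au"]]
        f = 0 if impact == 0 else 1.176
        return round((0.6 * impact + 0.4 * exploitability - 1.5) * f, 1)

    print(base_score("AV:N/AC:L/Au:N/C:P/I:N/A:N"))  # -> 5.0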

A base score that is sometimes inadequate

The same problem can sometimes have a very different impact.

Let’s imagine that it is possible to anonymously access all the documents that are uploaded to a site by bypassing a validation step:

  • On a sharing site meant to host photos of users’ favorite dishes, the impact is potentially minor. At worst, someone will see the photo of the failed dim sum that Bernard was a little ashamed of (though not ashamed enough to keep it to himself, obviously).
  • On a site used to upload confidential documents, on the other hand, the impact is major: contracts, copies of identity documents, pay slips…

In both cases, we obtain the same vector and the same score (here 5.0, with the same vector we calculated above). This in no way reflects the actual situation, and it is a clear limitation of the base score. It is one of the reasons environmental criteria exist.

There are five such criteria, all of them optional:

  • The collateral damage potential related to the vulnerability (None, Low, Low-Medium, Medium-High, High)
  • The proportion of vulnerable systems in the target environment (None, Low, Medium, High)
  • The confidentiality requirement (Low, Medium, High)
  • The integrity requirement (Low, Medium, High)
  • The availability requirement (Low, Medium, High)

If we take the two examples above: in one case the collateral damage is low, in the other it is potentially catastrophic. Similarly, the confidentiality requirement is not the same (low for one, high for the other). By integrating these two criteria into the calculation, we obtain two very different final scores:

  • 4.5 for the first case (AV:N/AC:L/Au:N/C:P/I:N/A:N/CDP:L/CR:L). This medium score corresponds relatively well to the criticality of the flaw.
  • 8.0 for the second case (AV:N/AC:L/Au:N/C:P/I:N/A:N/CDP:H/CR:H). This score is consistent with the high criticality of the flaw.

Here we can clearly see the limits of the base score, and the value of refining it with environmental criteria.
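
For the curious, the environmental adjustment can be sketched the same way, again with the public CVSSv2 equations and weights. The env_score helper is my own naming, and the sketch assumes no temporal metrics are set (they then count as a neutral multiplier of 1):

    CIA = {"N": 0.0, "P": 0.275, "C": 0.660}                    # C/I/A impact levels
    CDP = {"N": 0.0, "L": 0.1, "LM": 0.3, "MH": 0.4, "H": 0.5}  # Collateral Damage Potential
    TD  = {"N": 0.0, "L": 0.25, "M": 0.75, "H": 1.0}            # Target Distribution
    REQ = {"L": 0.5, "M": 1.0, "H": 1.51}                       # C/I/A requirements

    def env_score(c, i, a, exploitability, cdp="N", td="H", cr="M", ir="M", ar="M"):
        # Adjusted impact: each impact level is weighted by the matching
        # environmental requirement, and the result is capped at 10.
        adj_impact = min(10, 10.41 * (1 - (1 - CIA[c] * REQ[cr])
                                        * (1 - CIA[i] * REQ[ir])
                                        * (1 - CIA[a] * REQ[ar])))
        f = 0 if adj_impact == 0 else 1.176
        adj_base = round((0.6 * adj_impact + 0.4 * exploitability - 1.5) * f, 1)
        # Without temporal metrics, the adjusted temporal score equals adj_base.
        return round((adj_base + (10 - adj_base) * CDP[cdp]) * TD[td], 1)

    expl = 20 * 1.0 * 0.71 * 0.704                          # AV:N / AC:L / Au:N
    print(env_score("P", "N", "N", expl, cdp="L", cr="L"))  # -> 4.5
    print(env_score("P", "N", "N", expl, cdp="H", cr="H"))  # -> 8.0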

However, it is sometimes difficult for an auditor to properly assess these environmental criteria. Without detailed knowledge of the target, which is quite common in black-box tests, assessing the collateral damage potential is perilous. Yet without this criterion, the score obtained in the second case is significantly lower (6.0 – AV:N/AC:L/Au:N/C:P/I:N/A:N/CR:H) and poorly reflects the criticality of the flaw.

Despite its qualities, CVSS thus has limitations that are sometimes hard to overcome, and a score can be emptied of its substance by underusing or misusing the optional criteria.

Confusing temporal criteria

The third set of criteria makes it possible to follow the life cycle of a vulnerability and, once again, to adjust its score. It covers:

  • Exploitability: is there public code to exploit the vulnerability, is it functional or only a proof of concept, or is there doubt about its existence?
  • The existence of countermeasures: is an official fix available, or only a temporary fix or a workaround?
  • The confidence in the very existence of the vulnerability: has it been confirmed, or are there only suspicions?

When one has full confidence in the existence of the vulnerabilities, as is often the case during an audit, these criteria are ill-suited. Moreover, since they can only lower the criticality of a vulnerability, one may wonder whether it is wise to use them at all: on the pretext that a vulnerability has not yet been exploited, is it desirable to downgrade its score? Isn’t this likely to give the reader a false sense of security, when a 0-day exploit may already be compromising their perimeter?
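
A quick sketch makes the asymmetry visible: in CVSSv2 every temporal multiplier is at most 1.0, so temporal metrics can only pull a score down. The weights are from the specification; temporal_score is my illustrative helper.

    E  = {"U": 0.85, "POC": 0.9, "F": 0.95, "H": 1.0}   # Exploitability
    RL = {"OF": 0.87, "TF": 0.90, "W": 0.95, "U": 1.0}  # Remediation Level
    RC = {"UC": 0.90, "UR": 0.95, "C": 1.0}             # Report Confidence

    def temporal_score(base, e="H", rl="U", rc="C"):
        # All multipliers are <= 1.0, so the result never exceeds the base.
        return round(base * E[e] * RL[rl] * RC[rc], 1)

    # A confirmed flaw with a functional exploit and an official fix:
    print(temporal_score(5.0, e="F", rl="OF", rc="C"))  # -> 4.1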

In truth, the matter is debated. Nevertheless, a consensus seems to be emerging in the ongoing discussions around CVSSv3, which would use temporal criteria to raise the score rather than lower it. In the meantime, I am among those who prefer a high score accompanied by a clear course of action.

It may nevertheless be interesting, for example for a CERT or as part of continuous monitoring of a perimeter’s security, to update a vulnerability’s score over time.

CVSS under criticism

In addition to these difficulties in judging context to obtain a score that reflects the reality of the problems, many criticisms have been leveled at CVSS.

The most common concerns the three-level impact assessment (none, partial, complete) and its lack of granularity: an error message revealing internal IP addressing and a path traversal vulnerability used to download the /etc/passwd file will both have a partial impact on confidentiality, but the consequences are potentially very different. Several companies have floated the idea of an additional level to reflect this difference. This is notably the case of Oracle, which has introduced a "partial+" level, thereby deviating from the CVSSv2 standard.

Another common criticism is the inability to properly assess password-related weaknesses with CVSS. It is common to find devices or applications on a network configured with a default password; using this password, an attacker takes control of the target, or even of the entire environment. But how should such a defect be scored with CVSS? If we set the "Authentication" criterion to "Single", the score drops sharply, failing to reflect the real criticality. But by setting it to "None", we also bend reality to obtain a score, which goes against the transparency principle of the rating. During our audits, we are regularly confronted with this dilemma.
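
To put numbers on the dilemma, here is a small illustration reusing the base equation from the first sketch. The scenario is hypothetical: a network service whose default password grants partial control (AV:N/AC:L, C:P/I:P/A:P). The choice of the Authentication value alone moves the flaw across the high/medium boundary:

    # CVSSv2 base equation, inlined for a hypothetical default-password flaw
    # with partial impact on confidentiality, integrity and availability.
    impact = 10.41 * (1 - (1 - 0.275) ** 3)           # C:P / I:P / A:P
    for au, weight in [("None", 0.704), ("Single", 0.56)]:
        exploitability = 20 * 1.0 * 0.71 * weight     # AV:N / AC:L
        score = round((0.6 * impact + 0.4 * exploitability - 1.5) * 1.176, 1)
        print(au, score)  # None -> 7.5 (high), Single -> 6.5 (medium)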

Another justified complaint is that CVSS cannot assess the criticality of an attack scenario. Sometimes several vulnerabilities of moderate criticality can be chained to completely compromise a perimeter, yet the CVSS assessment only covers each flaw taken independently. The auditor must then present the scenario as a whole and highlight the final risk, without being able to rely on a dedicated evaluation vector.

Finally, CVSS can be criticized for being too granular: it is sometimes difficult, if not pointless, to differentiate a score of 5.6 from a score of 5.8. FIRST has therefore proposed a simplified three-level scale in response to this criticism:

  • A score of 0 to 3.9 corresponds to low criticality
  • A score of 4.0 to 6.9 corresponds to medium criticality
  • A score of 7.0 to 10.0 corresponds to high criticality
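
In code, this simplified scale is nothing more than a threshold lookup; a trivial sketch:

    def severity(score):
        """Map a CVSS score (0-10) to the simplified three-level scale."""
        if score < 4.0:
            return "low"
        if score < 7.0:
            return "medium"
        return "high"

    print(severity(5.0))  # -> medium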

A rather positive assessment

Despite the criticisms that can be leveled at it, the CVSS rating is a step in the right direction:

  • For a company wishing to assess its security level, it provides a better understanding of the auditor’s assessment. Used with environmental criteria to refine the analysis, it also makes it possible to highlight high-risk vulnerabilities and better prioritize remediation plans.
  • For a company carrying out audits, it pushes consultants to weigh each problem and its impact carefully, and limits alarmist or fanciful evaluations. It also helps them back up their findings with the client, thanks in particular to the detailed vector.
  • Thanks to its wide adoption by the various players in the security market, it finally provides a common basis for discussion and evaluation. This standardization, despite its imperfections, is welcome.

Finally, FIRST is trying to answer the various criticisms of CVSSv2 by involving even more actors in the standardization process for version 3. This version should be stabilized during 2014. Let us bet that the debate between now and then will be lively.
