Add a new improver

This tutorial covers everything you need to know to quickly implement an improver. Many internal details about improvers can be found in the vulnerabilities/improver.py file. Make sure to go through Improver Overview before you begin writing one.

TL;DR

  1. Locate the importer whose data this improver will improve, in the vulnerabilities/importers/importer_name.py file.

  2. Create a new improver subclass inheriting from the Improver superclass defined in vulnerabilities.improver. By convention, an improver's name ends with Improver.

  3. Implement the interesting_advisories property to return a QuerySet of imported data (Advisory) you are interested in.

  4. Implement the get_inferences method to return an iterable of Inference objects for the given AdvisoryData.

  5. Add the newly created improver to the improvers registry at vulnerabilities/improvers/__init__.py.

Prerequisites

Before writing an improver, it is important to familiarize yourself with the following concepts.

Importer

Importers are responsible for scraping vulnerability data from various data sources and storing it in a structured fashion, without creating a complete relational model between vulnerabilities and their fixes. This data is stored in the Advisory model and can be converted to an equivalent AdvisoryData for various use cases. See Importer Overview for a brief overview of importers.

Importer Prerequisites

Improvers consume data produced by importers, and thus it is important to familiarize yourself with Importer Prerequisites.

Inference

Inferences express the contract between the improvers and the improve runner framework. An inference is intended to contain data points about a vulnerability without any uncertainties, which means that one inference will target one vulnerability with the specific relevant affected and fixed packages (in the form of PackageURLs). There is no notion of version ranges here: all package versions must be explicitly specified.

Because this concrete relationship is rarely available anywhere upstream, we have to infer these values, thus the name. As inferring something is not always perfect, an Inference also comes with a confidence score.
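To make this contract concrete, here is a minimal sketch of the kind of data an inference carries. SketchInference is a hypothetical stand-in written for this tutorial, not the real Inference class. Its field names, the 0 to 100 confidence scale, and the simplistic "@" check are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed upper bound of the confidence scale for this sketch.
SKETCH_MAX_CONFIDENCE = 100


@dataclass
class SketchInference:
    """Hypothetical stand-in for an Inference; not the real class."""

    aliases: List[str] = field(default_factory=list)
    affected_purls: List[str] = field(default_factory=list)
    fixed_purl: Optional[str] = None
    confidence: int = SKETCH_MAX_CONFIDENCE

    def __post_init__(self):
        if not 0 <= self.confidence <= SKETCH_MAX_CONFIDENCE:
            raise ValueError(f"confidence out of range: {self.confidence}")
        purls = list(self.affected_purls)
        if self.fixed_purl is not None:
            purls.append(self.fixed_purl)
        for purl in purls:
            # No version ranges: every package URL must pin one version.
            # This is a simplistic textual check, used only for the sketch.
            if "@" not in purl:
                raise ValueError(f"purl has no explicit version: {purl}")


# One inference targets one vulnerability with explicit package versions.
inference = SketchInference(
    aliases=["CVE-2021-23017"],
    affected_purls=["pkg:generic/nginx@1.20.0"],
    fixed_purl="pkg:generic/nginx@1.20.1",
    confidence=90,
)
```

A package URL without a version, such as pkg:generic/nginx, is rejected by this sketch, which mirrors the rule that an inference leaves nothing uncertain.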

Improver

All improvers must inherit from the Improver superclass and implement the interesting_advisories property and the get_inferences method.

Writing an improver

Locate the Source File

If the improver works on data imported by a specific importer, it lives in the same file as that importer, at vulnerabilities/importers/importer_name.py. Otherwise, if it is a generic improver, create a new file at vulnerabilities/improvers/improver_name.py.

Explore Package Managers (Optional)

If your improver depends on the discrete versions of a package, the package managers' VersionAPI located at vulnerabilities/package_managers.py could come in handy. You will need to instantiate the relevant VersionAPI in the improver's constructor and use it later in the implemented methods. See an already implemented improver (NginxBasicImprover) for an example usage.
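The idea of resolving a version range into discrete versions can be sketched as follows. SketchVersionAPI, its get method, and versions_in_range are hypothetical names invented for this illustration and do not reflect the real VersionAPI interface. The dotted-integer version parsing is also a simplifying assumption.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


class SketchVersionAPI:
    """Hypothetical stand-in for a package manager version source."""

    def __init__(self, known_versions: Dict[str, List[str]]):
        # A real implementation would query an upstream registry instead.
        self._known_versions = known_versions

    def get(self, package_name: str) -> List[str]:
        return self._known_versions.get(package_name, [])


def versions_in_range(
    api: SketchVersionAPI, package_name: str, lower: str, upper: str
) -> List[str]:
    """Return the known versions v with lower <= v < upper."""
    low, high = parse_version(lower), parse_version(upper)
    return [
        version
        for version in api.get(package_name)
        if low <= parse_version(version) < high
    ]


api = SketchVersionAPI({"nginx": ["1.0.0", "1.2.3", "1.10.0", "2.0.0"]})
affected = versions_in_range(api, "nginx", "1.2.0", "2.0.0")
```

In this sketch the version source is created once and reused, which is the same reason the text above suggests instantiating the VersionAPI in the improver's constructor.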

Implement the interesting_advisories Property

This property is intended to return a QuerySet of Advisory on which the Improver is designed to work.

For example, if the improver is designed to work on Advisories imported by ExampleImporter, the property can be implemented as

class ExampleBasicImprover(Improver):

    @property
    def interesting_advisories(self) -> QuerySet:
        return Advisory.objects.filter(created_by=ExampleImporter.qualified_name)

Implement the get_inferences Method

The framework calls the get_inferences method for every AdvisoryData obtained from the Advisory QuerySet returned by the interesting_advisories property.

It is expected to return an iterable of Inference objects for the given AdvisoryData. To avoid storing a lot of Inferences in memory, it is preferable to yield from this method.
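The interplay between the two members can be sketched without Django. In the sketch below, SketchAliasImprover and run_improver are hypothetical stand-ins: a plain list replaces the QuerySet, dictionaries replace AdvisoryData and Inference, and the runner loop is an assumed simplification of what the framework does.

```python
from typing import Dict, Iterator, List


class SketchAliasImprover:
    """Hypothetical improver-like class; it does not use the real base class."""

    def __init__(self, advisories: List[Dict]):
        self._advisories = advisories

    @property
    def interesting_advisories(self) -> List[Dict]:
        # The real property returns a Django QuerySet; a list stands in here.
        return self._advisories

    def get_inferences(self, advisory_data: Dict) -> Iterator[Dict]:
        # Yield one inference at a time instead of building a full list.
        for alias in advisory_data["aliases"]:
            yield {"aliases": [alias], "confidence": 100}


def run_improver(improver: SketchAliasImprover) -> Iterator[Dict]:
    """Assumed shape of the runner: one get_inferences call per advisory."""
    for advisory_data in improver.interesting_advisories:
        yield from improver.get_inferences(advisory_data)


improver = SketchAliasImprover(
    [
        {"aliases": ["CVE-2021-23017"]},
        {"aliases": ["CVE-2021-1234", "PYSEC-1337"]},
    ]
)
inferences = list(run_improver(improver))
```

Because get_inferences is a generator, the runner in this sketch holds only one inference in memory at a time until the caller chooses to collect them.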

A very simple improver that processes all advisories to create the minimal relationships obtainable from existing data can be found at vulnerabilities/improvers/default.py, which is an example of a generic improver. For a more sophisticated and targeted example, you can look at an already implemented improver (e.g., vulnerabilities/importers/nginx.py).

Improvers are not limited to improving discrete versions and may also improve aliases. One such example, which improves the data produced by the importer written in the importer tutorial, is shown below.

from datetime import datetime
from datetime import timezone
from typing import Iterable

import requests
from django.db.models.query import QuerySet
from packageurl import PackageURL
from univers.version_range import NginxVersionRange
from univers.versions import SemverVersion

from vulnerabilities.importer import AdvisoryData
from vulnerabilities.importer import Importer
from vulnerabilities.improver import MAX_CONFIDENCE
from vulnerabilities.improver import Improver
from vulnerabilities.improver import Inference
from vulnerabilities.models import Advisory
from vulnerabilities.severity_systems import SCORING_SYSTEMS


class ExampleImporter(Importer):
    ...


class ExampleAliasImprover(Improver):
    @property
    def interesting_advisories(self) -> QuerySet:
        return Advisory.objects.filter(created_by=ExampleImporter.qualified_name)

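    def get_inferences(self, advisory_data) -> Iterable[Inference]:
        for alias in advisory_data.aliases:
            # Look up extra identifiers known for this alias (may be empty).
            new_aliases = fetch_additional_aliases(alias)
            aliases = new_aliases + [alias]
            yield Inference(aliases=aliases, confidence=MAX_CONFIDENCE)


def fetch_additional_aliases(alias):
    # Hard-coded mapping for demonstration purposes only.
    alias_map = {
        "CVE-2021-23017": ["PYSEC-1337", "CERTIN-1337"],
        "CVE-2021-1234": ["ANONSEC-1337", "CERTDES-1337"],
    }
    # Default to an empty list so that an unknown alias does not raise a
    # TypeError when the result is concatenated in get_inferences.
    return alias_map.get(alias, [])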

Note

Use make valid to format your new code using black and isort automatically.
Use make check to check for formatting errors.

Register the Improver

Finally, register your improver in the improver registry at vulnerabilities/improvers/__init__.py.

 from vulnerabilities import importers
 from vulnerabilities.improvers import default

 IMPROVERS_REGISTRY = [
     default.DefaultImprover,
     importers.nginx.NginxBasicImprover,
     importers.example.ExampleAliasImprover,
 ]

 IMPROVERS_REGISTRY = {x.qualified_name: x for x in IMPROVERS_REGISTRY}
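The final line of the registry turns the list into a dictionary keyed by each class's qualified name, so that a class can be looked up from the string given on the command line. The sketch below illustrates that lookup with hypothetical stand-in classes. The qualified_name helper is an assumption that builds the name from the module path and class name, which may differ from the real attribute.

```python
class SketchDefaultImprover:
    """Hypothetical stand-in for an improver class."""


class SketchExampleImprover:
    """Hypothetical stand-in for an improver class."""


def qualified_name(cls: type) -> str:
    # Assumption: the name is the module path plus the class name.
    return f"{cls.__module__}.{cls.__qualname__}"


registry_list = [SketchDefaultImprover, SketchExampleImprover]

# Same list-to-dict step as in the registry module, using the helper above.
registry = {qualified_name(cls): cls for cls in registry_list}

# Resolve a class from the string a user would pass on the command line.
requested = qualified_name(SketchExampleImprover)
improver_class = registry[requested]
```

A class that is missing from the list cannot be found by name, which is why registration is a required step.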

Congratulations! You have written your first improver.

Run Your First Improver

If everything went well, you will see your improver in the list of available improvers.

 $ ./manage.py improve --list

 Vulnerability data can be processed by these available improvers:
 vulnerabilities.improvers.default.DefaultImprover
 vulnerabilities.importers.nginx.NginxBasicImprover
 vulnerabilities.importers.example.ExampleAliasImprover

Before running the improver, make sure you have imported the data. An improver cannot improve if there is nothing imported.

$ ./manage.py import vulnerabilities.importers.example.ExampleImporter

Importing data using vulnerabilities.importers.example.ExampleImporter
Successfully imported data using vulnerabilities.importers.example.ExampleImporter

Now, run the improver.

$ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover

 Improving data using vulnerabilities.importers.example.ExampleAliasImprover
 Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover

See Command Line Interface for command line usage instructions.

Enable Debug Logging (Optional)

For more visibility, turn on debug logs in vulnerablecode/settings.py.

DEBUG = True
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
        },
    },
    'root': {
        'handlers': ['console'],
        'level': 'DEBUG',
    },
}
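Django applies the LOGGING setting through logging.config.dictConfig by default, so the effect of this dictionary can be tried in isolation. The sketch below uses the same configuration outside of Django. The logger name sketch.improver and the message are hypothetical and chosen only for this illustration.

```python
import logging
import logging.config

# Same structure as the settings above, written compactly.
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "root": {
        "handlers": ["console"],
        "level": "DEBUG",
    },
}

# Apply the configuration the way Django does by default.
logging.config.dictConfig(LOGGING)

# Any logger without its own level inherits DEBUG from the root logger.
logger = logging.getLogger("sketch.improver")
logger.debug("Improving advisory id: %s", 1)
```

With the root level set to DEBUG, messages from every module that does not override its level reach the console handler.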

Invoke the improve command now and you will see (in a fresh database, after importing):

$ ./manage.py improve vulnerabilities.importers.example.ExampleAliasImprover

Improving data using vulnerabilities.importers.example.ExampleAliasImprover
Running improver: vulnerabilities.importers.example.ExampleAliasImprover
Improving advisory id: 1
New alias for <Vulnerability: VULCOID-23dd9060-3bc0-4454-bfbd-d16c08a966a6>: PYSEC-1337
New alias for <Vulnerability: VULCOID-23dd9060-3bc0-4454-bfbd-d16c08a966a6>: CVE-2021-23017
New alias for <Vulnerability: VULCOID-23dd9060-3bc0-4454-bfbd-d16c08a966a6>: CERTIN-1337
Improving advisory id: 2
New alias for <Vulnerability: VULCOID-fae4e06e-4815-45fe-ae95-8d2356ffb5b9>: CERTDES-1337
New alias for <Vulnerability: VULCOID-fae4e06e-4815-45fe-ae95-8d2356ffb5b9>: ANONSEC-1337
New alias for <Vulnerability: VULCOID-fae4e06e-4815-45fe-ae95-8d2356ffb5b9>: CVE-2021-1234
Finished improving using vulnerabilities.importers.example.ExampleAliasImprover.
Successfully improved data using vulnerabilities.importers.example.ExampleAliasImprover

Note

Even though CVE-2021-23017 and CVE-2021-1234 are not supplied by this improver, the output above shows them because the example skipped running the DefaultImprover. The DefaultImprover inserts the minimal data found by the importers into the database (here, the above two CVEs). To avoid this anomaly, run the importer, then the DefaultImprover, and then your improver, in that order.