Duplicates
Data quality is very important to us. Therefore, our moderation team invests additional time to identify false positives and potential duplicates.
Early Identification
During the moderation process we try to identify potential duplicates as early as possible. Big data analysis and machine learning tools support our team in providing the best possible data quality. A similar feature is available within every vulnerability entry, where the Relate View helps to determine vulnerabilities which share vulnerability and exploit attributes. For example, VDB-266630 lists some very similar vulnerabilities but no true duplicates.
Handling of Duplicates
On rare occasions a duplicate is added to the database. As soon as we are aware of this, we initiate the following process:
- Identify the parent entry, which is usually the first entry that was added to the database
- Merge the data of the new duplicate into the existing original
- Flag the duplicate as such and reference the original
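The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our internal implementation: the Entry class, its fields, and the rule that existing values of the original win over values of the duplicate are all hypothetical. The sketch also always picks the earlier entry as parent, whereas in practice this is only usually the case.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Entry:
    """Hypothetical, simplified model of a database entry."""
    entry_id: int
    added: int  # insertion order or timestamp; lower means added earlier
    data: Dict[str, str] = field(default_factory=dict)
    replaced_by: Optional[int] = None
    replaces: List[int] = field(default_factory=list)


def merge_duplicates(a: Entry, b: Entry) -> Tuple[Entry, Entry]:
    """Merge two entries describing the same issue; return (parent, duplicate)."""
    # Step 1: identify the parent (assumed here: the entry added first).
    parent, duplicate = sorted((a, b), key=lambda e: e.added)

    # Step 2: merge the duplicate's data into the original.
    # Assumption: values already present in the original are kept,
    # only missing fields are filled in from the duplicate.
    for key, value in duplicate.data.items():
        parent.data.setdefault(key, value)

    # Step 3: flag the duplicate and reference the original (plus a backlink).
    duplicate.replaced_by = parent.entry_id
    parent.replaces.append(duplicate.entry_id)
    return parent, duplicate
```

Filling only missing fields is the most conservative merge rule; a real merge may need per-field decisions by a moderator.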
Behavior of Entries
This has the following effects on the service, demonstrated here with VDB-243107, which was a duplicate of VDB-233216:
- The duplicate is hidden in most overview lists on the web site (e.g. recent, archive, search)
- Accessing the duplicate entry will enforce an HTTP redirect to the correct entry
- Accessing Diff and History Views of a duplicate remains possible
- Accessing the duplicate via API shows the obsolete duplicate entry data which contains the additional data field entry_replacedby (this is the indicator that this is a duplicate that got merged)
- Accessing the correct entry shows the correct entry data which might also contain the data field entry_replaces for backlinking purposes
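An API client can use the entry_replacedby field to detect a merged duplicate and continue with the correct entry. The following Python sketch shows one possible way to do this. It makes several assumptions: each entry has already been decoded into a flat dict keyed by field name, the fetch callable is a hypothetical stand-in for whatever code retrieves an entry by ID, and the hop limit is merely a safeguard in case replacements are chained.

```python
from typing import Callable, Optional


def replaced_by(entry: dict) -> Optional[str]:
    """Return the ID of the replacing entry, or None if not a merged duplicate.

    Assumption: the field appears as a top-level key named
    "entry_replacedby"; adapt the lookup to the actual response layout.
    """
    value = entry.get("entry_replacedby")
    return str(value) if value else None


def resolve_current(
    entry_id: str,
    fetch: Callable[[str], dict],  # hypothetical: returns the entry data for an ID
    max_hops: int = 5,
) -> str:
    """Follow entry_replacedby references until a non-duplicate entry is reached."""
    current = entry_id
    for _ in range(max_hops):
        target = replaced_by(fetch(current))
        if target is None:
            return current  # no replacement indicated: this is the correct entry
        current = target
    raise RuntimeError(f"more than {max_hops} replacements starting at {entry_id}")
```

For a quick check without network access, fetch can be backed by an in-memory dict, for example resolve_current("243107", records.__getitem__) with records holding the decoded entries.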
CVE Duplicates
If a CVE is a duplicate, we handle it as follows:
- If we are the responsible CNA, we will immediately flag the duplicate CVE as such and revoke it
- If we are not the responsible CNA, we will add the duplicate CVE to our existing entry (we cannot reject CVEs of other CNAs)
Split Methodology
Please also consider our splitting methodology, which affects duplicate handling as well.
Updated: 06/08/2024 by VulDB Documentation Team