Duplicates
Data quality is very important to us. Our moderation team therefore invests additional time in identifying false positives and potential duplicates.
Identification & Handling
On rare occasions a duplicate is added to the database. As soon as we become aware of it, we initiate the following process:
- Identify the parent entry, which is usually the first entry that was added to the database
- Merge the data of the new duplicate into the existing original
- Flag the duplicate as such and reference the original
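The three steps above can be sketched roughly as follows. This is only an illustration and not the actual moderation tooling: the `Entry` class, its attributes, and the rule "lowest ID means first added" are assumptions made for the example, as is the choice to let the original's values win when both entries carry the same field.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Entry:
    # Hypothetical in-memory model of a database entry (not the real schema).
    entry_id: int
    data: dict = field(default_factory=dict)
    replaced_by: Optional[int] = None            # set on a merged duplicate
    replaces: list = field(default_factory=list)  # backlinks kept on the original


def merge_duplicates(entries):
    """Merge entries describing the same issue and return the parent entry."""
    # Step 1: identify the parent. Assumption: the lowest ID was added first.
    parent = min(entries, key=lambda e: e.entry_id)
    for dup in entries:
        if dup is parent:
            continue
        # Step 2: merge the duplicate's data into the original.
        # Existing values on the original are kept; only missing fields are added.
        for key, value in dup.data.items():
            parent.data.setdefault(key, value)
        # Step 3: flag the duplicate and reference the original.
        dup.replaced_by = parent.entry_id
        parent.replaces.append(dup.entry_id)
    return parent
```

Keeping the original's values on conflict is a design choice for this sketch; a real merge would likely involve manual review of conflicting fields.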
Behavior of Entries
This has the following effects on the service:
- The duplicate is hidden in all overview lists on the web site (e.g. recent, archive, search)
- Accessing the duplicate will enforce an HTTP redirect to the original entry
- Accessing the duplicate via API shows the obsolete duplicate entry data, which contains the additional data field entry_replacedby (this is the indicator that this is a duplicate that got merged)
- Accessing the original entry via API shows the correct entry data, which might also contain the data field entry_replaces for backlinking purposes
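An API client may want to follow the entry_replacedby field to reach the current entry. The following is a minimal sketch under stated assumptions: `fetch_entry` is a hypothetical callable that returns the entry data as a dictionary for a given ID, and entry_replacedby is assumed to hold the ID of the original entry. The real response layout may differ, so check the API documentation before relying on it.

```python
def resolve_original(entry_id, fetch_entry, max_hops=5):
    """Follow entry_replacedby until an entry without that field is reached.

    fetch_entry is a hypothetical callable: entry ID -> dict of entry data.
    Returns a tuple (id_of_original, data_of_original).
    """
    seen = {entry_id}
    entry = fetch_entry(entry_id)
    hops = 0
    while True:
        # Assumption: the field holds the ID of the original entry.
        target = entry.get("entry_replacedby")
        if not target:
            # No replacement marker, so this entry is not a merged duplicate.
            return entry_id, entry
        if target in seen or hops >= max_hops:
            # Guard against reference loops or unexpectedly long chains.
            raise ValueError(f"cannot resolve original for entry {entry_id}")
        seen.add(target)
        hops += 1
        entry_id, entry = target, fetch_entry(target)
```

For example, with a dictionary standing in for the API, `resolve_original(205, fake_db.__getitem__)` returns the original entry if entry 205 carries entry_replacedby pointing at it.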
CVE Duplicates
If a CVE is a duplicate, we handle it as follows:
- If we are the responsible CNA, we will flag the duplicate CVE as such and revoke it
- If we are not the responsible CNA, we will add the duplicate CVE to our existing entry
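The two cases above amount to a simple decision, sketched here for illustration. The function name, its parameters, and the returned dictionary are hypothetical and do not describe an actual interface.

```python
def handle_duplicate_cve(duplicate_cve, responsible_cna, our_cna, entry_cves):
    """Decide how a duplicate CVE is treated (illustrative sketch only).

    entry_cves is the list of CVE identifiers already on the existing entry.
    """
    cves = list(entry_cves)  # copy so the caller's list is not modified
    if responsible_cna == our_cna:
        # We assigned the CVE ourselves: flag it as a duplicate and revoke it.
        return {"action": "revoke", "cve": duplicate_cve, "entry_cves": cves}
    # Another CNA is responsible: attach the duplicate CVE to our existing entry.
    if duplicate_cve not in cves:
        cves.append(duplicate_cve)
    return {"action": "add_to_entry", "cve": duplicate_cve, "entry_cves": cves}
```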
Split Methodology
Please consider our splitting methodology, which might also affect duplicate handling.