VirusShare.com - Because Sharing is Caring

Monday, 10 August 2020

The global pandemic gave me a great deal more free time than I was expecting in 2020, but I put some of it to good use and made significant updates and improvements to VirusShare. Officially speaking, this is the first major update since the project quietly launched over 9 years ago. While there have been many updates and tweaks to the backend systems that track down and ingest all the samples, this may be the first published change to the website and search features since 2011.

Previously, web-based search options included hash and antivirus detection strings. I am happy to announce that the available options have expanded a bit.

The full list of search options with examples can be found in the search help. New web search and filtering options include:

  • Fuzzy hash or Context Triggered Piecewise Hashes (CTPH)
  • Authentihash
  • Import Hash or Imphash
  • File size
  • File type
  • Mime type
  • Extension
  • TrID File Identifier
  • ExifTool metadata fields
  • Date added to VirusShare
  • Number of detections
  • Detection name by antivirus vendor
  • Regular expressions may be used with many of the above

Additionally, you can now get results for benign or non-detected samples. Benign files are not yet available for download.

Web crawlers have been used to find new malware samples since the beginning, and log data from these crawlers is now available to enhance your understanding of the provenance of a sample. You can search the crawler data by:

  • Hash value of the sample (limited to SHA256 at this time) to retrieve a list of unique URLs where the sample was found
  • URLs accessed by the crawlers
  • Fully Qualified Domain Names and the IP addresses resolved by the crawler at the time of connection

The crawler data also includes information about samples that are larger than the current size limit (32 MB) and were not retained at the time of crawling, as well as benign files that may not have been retained in the earlier years of the project. Historic web-crawler data from before July 2020 is in the process of being imported and indexed.
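
Because crawler lookups by hash are currently limited to SHA256, a sample's SHA256 digest has to be computed before searching. The following is a minimal sketch using only Python's standard library; the function name `sha256_of_file` and the chunk size are illustrative choices and not part of VirusShare.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks.

    Reading in chunks keeps memory use flat even for large samples.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting 64-character hex string is the value to use for a hash-based crawler search.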

Thanks to the retention of benign files and changes to the backend database, it is now more practical to rescan and recategorize samples should the detection status of a particular sample change after being added to the corpus. Some rescans and adjustments will begin in the near future. The first samples to be "fixed" will be the false-positive detections of the zero-byte null file and the file containing a single ASCII space character.
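
The two trivial files mentioned above have fixed, well-known digests, so they are easy to recognize in any hash list. As a small sketch (the names `TRIVIAL_SAMPLES` and `digests` are illustrative, not anything VirusShare publishes), their hashes can be computed directly from the byte content:

```python
import hashlib

# The two trivial inputs named in the text: an empty file and one ASCII space.
TRIVIAL_SAMPLES = {
    "zero-byte file": b"",
    "single ASCII space": b" ",
}


def digests(data):
    """Return the MD5 and SHA-256 hex digests of a bytes object."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


for label, data in TRIVIAL_SAMPLES.items():
    result = digests(data)
    print(f"{label}: md5={result['md5']} sha256={result['sha256']}")
```

A hash that matches one of these digests points to a trivial file, which is a useful sanity check before treating a detection as meaningful.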

Torrents of zip files containing collections of detected samples will continue to be created and shared. These are intended only as a snapshot of the state of the data at the time of creation and as a way for researchers to download a significant sample size efficiently; they will not be updated should the detection state of a sample change after the zip file is created. Unfortunately, there is no practical way to modify a shared and hashed zip file should there be a need to remove samples that may later be considered benign. Likewise, the list of MD5 hashes released with each zip file acts as a historical record, as the constant addition or subtraction of entries in these files is impractical to maintain.
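
Since the MD5 lists are a historical record, a match only says that a sample was included in a zip file when it was created, not that it is still considered malicious. The sketch below checks a hash against such a list. It assumes a layout of one hex MD5 per line with optional `#` comment lines, which is an assumption about the file format; `parse_md5_list` and `was_released` are hypothetical helper names.

```python
import re

# A full-line match for a 32-character lowercase hex MD5 digest.
MD5_PATTERN = re.compile(r"[0-9a-f]{32}")


def parse_md5_list(text):
    """Collect MD5 hashes from list text.

    Assumed layout: one hex MD5 per line; blank lines and lines
    starting with '#' are skipped. Anything else is ignored.
    """
    hashes = set()
    for line in text.splitlines():
        entry = line.strip().lower()
        if not entry or entry.startswith("#"):
            continue
        if MD5_PATTERN.fullmatch(entry):
            hashes.add(entry)
    return hashes


def was_released(md5_hex, released_hashes):
    """True if the hash appears in the list, i.e. it shipped in that zip file."""
    return md5_hex.strip().lower() in released_hashes
```

Treat a positive result as "present in that snapshot" and confirm the current detection status through a search before drawing conclusions.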

A REST API is now generally available to programmatically query the VirusShare database and receive results as JSON-formatted text. For more information about the API service, please refer to the APIv2 Documentation.
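
As a rough illustration of what a client for a JSON-returning REST API might look like, here is a sketch using only the Python standard library. The base URL, the request-type path segment, and the `apikey` and `hash` parameter names are assumptions made for this example; the APIv2 Documentation is the authority on the real endpoints, parameters, and rate limits.

```python
import json
import urllib.parse
import urllib.request

# Assumed base URL for illustration only; confirm it in the APIv2 Documentation.
API_BASE = "https://virusshare.com/apiv2"


def build_query_url(request_type, api_key, sample_hash, base=API_BASE):
    """Build a query URL; the parameter names here are assumptions."""
    query = urllib.parse.urlencode({"apikey": api_key, "hash": sample_hash})
    return f"{base}/{request_type}?{query}"


def fetch_json(url, timeout=30):
    """Send a GET request and decode the JSON-formatted response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (not executed here because it needs a valid key and network access):
# report = fetch_json(build_query_url("file", "YOUR_API_KEY", "<sample hash>"))
```

Keeping URL construction separate from the network call makes the request easy to inspect and test without contacting the service.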

VirusShare will continue its mission to provide free access to malware and data for the greater research community. VirusShare is a service hosted and maintained by Corvus Forensics, which will provide commercial services to support the specific needs of larger organizations, including enhanced API access, data feeds, and specialized searches of VirusShare's data. Please contact virusshare@corvusforensics.com to discuss how the VirusShare dataset can supplement your organization's cybersecurity research.

Looking forward to the future,
J-Michael Roberts
@VXShare / @Forensication