A huge, free, curated Android malware, adware and PUP set, separated by year, for academic purposes

Curated Android for Researching Malware APK Set (CARMA) is a free service provided by the Innovation and Labs area of ElevenPaths. It offers a set of malware samples, adware and other potentially dangerous files collected for the Android operating system. These samples may be used exclusively for research or academic purposes; any other use is forbidden. The sets are intended to provide quality samples for analysis with expert systems, machine learning, artificial intelligence or any other method that helps improve future detection of this kind of threat.

FAQ

What do you provide?

A set of complete malware samples in their original and unaltered format, sorted by year, origin and type of threat. In particular:

  • From Google Play:
    • Adware samples sorted by year: 2017, 2018 and 2019.
    • Malware samples sorted by year: 2017, 2018 and 2019.
    • PUA/Riskware samples sorted by year: 2017, 2018 and 2019.
  • From other markets:
    • Adware samples sorted by year: 2017, 2018 and 2019.
    • Malware samples sorted by year: 2017, 2018 and 2019.
    • PUA/Riskware samples sorted by year: 2017, 2018 and 2019.
  • Goodware:
    • Goodware samples sorted by year: 2017, 2018 and 2019.

Each set has a maximum of 1000 samples and a minimum of 10.

Where do these samples come from?

They are drawn from the several million samples we hold in Tacyt, an ElevenPaths service.

How and why have they been sorted this way?

Classifying malware based on antivirus verdicts has advantages, but also disadvantages. If you train a system on the findings of a single antivirus, it can learn at most what that antivirus already knows, or approximate its results. To make matters worse, if the samples used for training are ambiguously labeled (which happens often across antivirus engines), the system may end up learning from elements as different as adware and a Trojan, and consequently lose effectiveness.

For our set, we started from the verdicts of some renowned antivirus engines, provided by OPSWAT, and then applied additional rules. For instance, we required agreement on the labels when assessing a threat, and we required that the sets do not overlap. We also considered further variables: whether the markets have removed the samples, whether the samples stayed on the market long enough, and the consensus of several technologies on categorization.
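To make the idea of label agreement and non-overlapping sets concrete, here is a minimal sketch in Python. It is not the pipeline used to build CARMA: the function names, the category strings, the agreement threshold and the minimum number of engines are all hypothetical values chosen for illustration, and the market-related variables mentioned above are left out.

```python
from collections import Counter
from typing import Dict, List, Mapping, Optional

# Hypothetical category names; the real label vocabulary is not published here.
CATEGORIES = ("adware", "malware", "pua")


def consensus_label(
    verdicts: Mapping[str, Optional[str]],
    min_agreement: float = 0.75,  # assumed threshold, not a CARMA parameter
    min_engines: int = 3,         # assumed threshold, not a CARMA parameter
) -> Optional[str]:
    """Return a category only if enough engines flag the sample and agree on it."""
    flagged = [v for v in verdicts.values() if v in CATEGORIES]
    if len(flagged) < min_engines:
        return None  # too few engines detect it to trust any label
    category, votes = Counter(flagged).most_common(1)[0]
    if votes / len(flagged) >= min_agreement:
        return category
    return None  # engines disagree on the type of threat


def build_sets(samples: Mapping[str, Mapping[str, Optional[str]]]) -> Dict[str, List[str]]:
    """Assign each sample hash to at most one set, so the sets cannot overlap."""
    sets: Dict[str, List[str]] = {c: [] for c in CATEGORIES}
    for sample_hash, verdicts in samples.items():
        label = consensus_label(verdicts)
        if label is not None:
            sets[label].append(sample_hash)
    return sets
```

Because every sample receives at most one label, a sample whose engines split between, say, adware and malware is simply dropped rather than placed in both sets.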

The system is not perfect (and never will be), but it makes up for some common flaws that we have found. In addition, because we provide a significant number of samples (something analysts appreciate), the effect of the remaining flaws is mitigated.

The goal is quality research in the field of malware detection for Android.

May I use them for non-academic purposes?

No, you may not. You also may not obtain any payment or revenue, directly or indirectly, from these samples.

How can I get them, and in exchange for what?

You only need to justify the intended use via this form. We will reply to you manually. You must sign an engagement and understanding document in which the only commitment is mutual acknowledgement. This means that work performed with the samples must include, once published, an explicit mention of this page and of ElevenPaths’ Innovation and Labs. If the report is considered to be of sufficient quality, it will also be included on our website. If, after the samples have been provided, the work does not result in a deliverable, an acknowledgement via social networks is required whenever possible.

Why do you provide this for free?

We found that academic researchers usually work with very poor malware sets or have trouble obtaining a good one. We want the academic field to work with better samples, so that research improves and we all benefit from better malware, adware and PUP detection.

Contact Us


What specific set are you interested in?

Please ask only for the sets you really need, since they are quite large.