Detection of Algorithmically-Generated Domains: An adversarial machine learning approach

    Abstract

  • Domain name detection techniques are widely used to detect Algorithmically Generated Domain names (AGDs) used by botnets, and thereby to locate their Command and Control (C2) servers. A major difficulty for these techniques is detecting generated names that are meaningful. Machine learning techniques have been of great use in generalizing the attributes of meaningful, algorithmically generated names. To resist such techniques, attackers use the distribution of characters as a basis for generating meaningful domain names. Such techniques are adversarial attacks that attempt to fool machine learning methods. However, our experiments with more than 252,757 samples show that, in addition to the character distribution of domain names, randomness and pronounceability attributes are of great use in detecting such meaningful names. Using these additional attributes, we have been able to identify malicious domain names with an accuracy of 98.19%.
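To make the three attribute families named in the abstract concrete, the following is a minimal sketch of how they might be computed for a single domain name. It is not the paper's implementation: the function names, the use of Shannon entropy as the randomness measure, and the vowel ratio and longest consonant run as stand-ins for a pronounceability score are all assumptions chosen for illustration.

```python
import math
from collections import Counter

VOWELS = set("aeiou")


def shannon_entropy(label: str) -> float:
    """Shannon entropy (bits per character) of the label's character distribution.

    Assumption: entropy is used here as a simple proxy for "randomness".
    """
    if not label:
        return 0.0
    counts = Counter(label)
    n = len(label)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def vowel_ratio(label: str) -> float:
    """Fraction of alphabetic characters that are vowels (crude pronounceability cue)."""
    letters = [ch for ch in label if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch in VOWELS for ch in letters) / len(letters)


def longest_consonant_run(label: str) -> int:
    """Length of the longest run of consecutive consonants; long runs are hard to pronounce."""
    longest = run = 0
    for ch in label:
        if ch.isalpha() and ch not in VOWELS:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest


def extract_features(domain: str) -> dict:
    """Build a hypothetical feature dictionary from the first label of a domain name."""
    label = domain.lower().split(".")[0]
    return {
        "length": len(label),
        "entropy": shannon_entropy(label),
        "vowel_ratio": vowel_ratio(label),
        "longest_consonant_run": longest_consonant_run(label),
    }


if __name__ == "__main__":
    for name in ("google.com", "xkqzvbnw.net"):
        print(name, extract_features(name))
```

Feature dictionaries of this kind could then be passed to any standard classifier. The sketch only illustrates why a name that imitates a benign character distribution may still stand out on the randomness or pronounceability attributes.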

    Keywords:

  • malware, domain generation algorithms, poisoning attack, adversarial machine learning, pronunciation score.

    Downloadable sources

  • Part of the dataset and code for this paper is available here: [dataset and code]

The architecture of our proposed malicious domain name detector. The diagram presents three steps: training, testing, and applying an adversarial attack to the model generated by machine learning techniques.