Revealing non-alphabetical guises of spam-trigger vocables
- Authors:
- Rojas Galeano, Sergio Andres
- Resource type:
- Journal article
- Publication date:
- 2013
- Institution:
- Universidad Nacional de Colombia
- Repository:
- Universidad Nacional de Colombia
- Language:
- spa
- OAI Identifier:
- oai:repositorio.unal.edu.co:unal/42641
- Online access:
- https://repositorio.unal.edu.co/handle/unal/42641
- http://bdigital.unal.edu.co/32738/
- Keywords:
- Uncovering of spam vocables
- approximate string matching algorithm
- Rights:
- openAccess
- License:
- Attribution-NonCommercial 4.0 International
Summary: Unsolicited bulk email (spam) now accounts for nearly 75% of daily email traffic, a figure that underscores the need for better protection mechanisms against its dissemination. A clever trick recently exploited by spammers to circumvent text-based filters involves obfuscating blacklisted words with visually equivalent substitutions drawn from non-alphabetic symbols, in such a way that the text still conveys the semantics of the original word to the human eye (e.g. masking viagra as v1@gr@ or as v-i-a-g-r-a). In this paper we discuss how a simple yet effective adaptation of a classical string-matching algorithm may meet this challenge, revealing the similarity between genuine spam-trigger terms and their disguised alphanumeric variants.
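The summary does not say which classical algorithm is adapted or how, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a Levenshtein-style edit distance in which replacing a letter with a visually similar symbol costs nothing and non-alphanumeric filler characters can be skipped for free. The `LOOKALIKES` table, the cost values, and the function names are all hypothetical choices made for this example.

```python
# Hypothetical look-alike table (an assumption for illustration only):
# maps a disguising symbol to the letters it can visually stand in for.
LOOKALIKES = {
    "1": "il", "!": "il", "|": "il",
    "@": "a", "4": "a",
    "0": "o",
    "3": "e",
    "$": "s", "5": "s",
    "7": "t", "+": "t",
}


def sub_cost(term_char, text_char):
    """Cost of aligning a blacklist letter with a character of the candidate."""
    if term_char == text_char:
        return 0.0
    if term_char in LOOKALIKES.get(text_char, ""):
        return 0.0  # visually equivalent substitution (assumed free)
    return 1.0


def insert_cost(text_char):
    """Cost of an extra candidate character; separators such as '-' are free."""
    return 0.0 if not text_char.isalnum() else 1.0


def disguise_distance(term, candidate):
    """Weighted edit distance between a blacklisted term and a candidate token."""
    term, candidate = term.lower(), candidate.lower()
    # Row 0: matching the empty term prefix against each candidate prefix.
    prev = [0.0]
    for ch in candidate:
        prev.append(prev[-1] + insert_cost(ch))
    for t in term:
        curr = [prev[0] + 1.0]  # term letter with no candidate counterpart
        for j, ch in enumerate(candidate, start=1):
            curr.append(min(
                prev[j] + 1.0,                # term letter missing from candidate
                curr[j - 1] + insert_cost(ch),  # extra character in candidate
                prev[j - 1] + sub_cost(t, ch),  # match or (look-alike) substitution
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    for token in ["v1@gr@", "v-i-a-g-r-a", "vinegar"]:
        print(token, disguise_distance("viagra", token))
```

Under these assumed costs, both disguises quoted in the summary (v1@gr@ and v-i-a-g-r-a) score a distance of 0.0 from viagra, while an unrelated word such as vinegar scores above zero. A filter built on this idea would flag tokens whose distance to any blacklisted term falls below a chosen threshold; the threshold and the contents of the look-alike table would need tuning against real data.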