Approximate Object Location and Spam Filtering on Peer-to-Peer Systems
Feng Zhou
Li Zhuang
Ben Y. Zhao
Ling Huang
Anthony D. Joseph
John Kubiatowicz
ACM/IFIP/USENIX International Middleware Conference (Middleware 2003)
Paper Abstract
Recent work in P2P overlay networks allows for decentralized object location
and routing (DOLR) across networks based on unique IDs. In this paper, we
propose an extension to DOLR systems to publish objects using generic
feature vectors instead of content-hashed GUIDs, which enables the
systems to locate similar objects. We discuss the design of a distributed
text similarity engine, named Approximate Text Addressing (ATA),
built on top of this extension that locates objects by their text
descriptions. We then outline the design and implementation of a
motivating application on ATA, a decentralized spam-filtering service. We
evaluate this system with 30,000 real spam email messages and 10,000
non-spam messages, and find a spam identification ratio of over 97% with
zero false positives.
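
The idea of publishing an object under a feature vector rather than a single content-hashed GUID can be illustrated with a small sketch. Everything in it is an assumption made for illustration, not the paper's design: the fingerprint (the smallest hashes of word shingles), the names `feature_vector` and `ToyOverlay`, the parameters `shingle_len`, `size`, and `threshold`, and the in-memory dictionary that stands in for a real DOLR overlay.

```python
import hashlib
from collections import defaultdict


def feature_vector(text, shingle_len=4, size=8):
    """Hypothetical fingerprint: the `size` smallest hashes of word shingles."""
    words = text.lower().split()
    count = max(1, len(words) - shingle_len + 1)
    shingles = {" ".join(words[i:i + shingle_len]) for i in range(count)}
    hashes = sorted(
        int.from_bytes(hashlib.sha1(s.encode("utf-8")).digest()[:8], "big")
        for s in shingles
    )
    return set(hashes[:size])


class ToyOverlay:
    """In-memory stand-in for a DOLR layer: maps feature ID -> object names."""

    def __init__(self):
        self.index = defaultdict(set)

    def publish(self, name, text):
        # Publish the object once under every feature in its vector,
        # instead of once under a single content hash.
        for feature in feature_vector(text):
            self.index[feature].add(name)

    def locate(self, text, threshold=3):
        # Look up each feature of the query and count how many features
        # each published object shares with it.
        votes = defaultdict(int)
        for feature in feature_vector(text):
            for name in self.index.get(feature, ()):
                votes[name] += 1
        return {name for name, n in votes.items() if n >= threshold}


if __name__ == "__main__":
    overlay = ToyOverlay()
    overlay.publish("spam-1", "buy cheap pills now limited offer click here "
                              "to claim your free prize today")
    print(overlay.locate("buy cheap pills now limited offer click here "
                         "to claim your free prize today"))
```

An exact copy of a published text shares every feature and is always located. A lightly edited copy is located only if it still shares at least `threshold` features, so the threshold trades tolerance to edits against the risk of matching unrelated text.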