Efficient distributed privacy preserving clustering

Doğanay, Mahir Can

Date

2008

Author

Doğanay, Mahir Can

Abstract

People's growing privacy concerns in recent years are pushing data mining researchers to develop privacy preserving data mining algorithms. However, the algorithms proposed so far either remain ineffective on very large datasets or force a choice between privacy and result quality. Secure multiparty computation methods make it possible to protect data privacy without degrading the result quality of data mining algorithms. However, privacy preserving data mining algorithms built on secure multiparty computation are developed using public key encryption techniques, which makes them impossible to use on large real-world datasets. This thesis proposes a distributed clustering algorithm based on secret sharing instead of public key encryption. According to the tests performed, the algorithm runs much faster than the best algorithms proposed so far and requires less data transfer between parties.

With growing concerns about data privacy, researchers have turned their attention to developing algorithms for privacy preserving data mining. However, the methods proposed so far are either too inefficient to handle large datasets or trade privacy against the accuracy of the data mining results. Secure multiparty computation lets researchers develop privacy preserving data mining algorithms without sacrificing result quality for data privacy, and it provides formal privacy guarantees. On the other hand, algorithms based on secure multiparty computation often rely on computationally expensive cryptographic operations, which makes them infeasible in real-world scenarios. In this thesis, we study the problem of privacy preserving distributed clustering, propose an efficient and secure algorithm for it based on secret sharing, and compare it to the state of the art. Experiments show that our algorithm has a lower communication overhead and a much lower computation overhead than the state of the art.
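The abstract does not describe the protocol itself, so the following is only a minimal sketch of the general idea behind secret sharing, not the algorithm proposed in the thesis. It assumes additive secret sharing over a prime modulus and shows how several parties could compute the sum of their private values (for example, the per-cluster sums needed to update a centroid) using only additions and random numbers, with no public key operations. The names `PRIME`, `split_into_shares`, and `secure_sum` are hypothetical and chosen for illustration.

```python
import secrets

# Assumed modulus for share arithmetic; any prime larger than the
# largest possible sum would do. This choice is illustrative only.
PRIME = 2**61 - 1


def split_into_shares(value, n_parties, modulus=PRIME):
    """Split a non-negative integer into n additive shares.

    The shares are individually uniform random numbers, but they
    add up to `value` modulo `modulus`.
    """
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_values, modulus=PRIME):
    """Simulate a secret-sharing based sum among len(private_values) parties."""
    n = len(private_values)
    # Party i splits its private value; share j is sent to party j.
    all_shares = [split_into_shares(v, n, modulus) for v in private_values]
    # Party j adds the shares it received and publishes only that partial sum.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # The published partial sums add up to the true total.
    return sum(partial_sums) % modulus


if __name__ == "__main__":
    print(secure_sum([12, 30, 7]))
```

In this simulation all parties live in one process. In a real deployment each share would travel over a secure channel, and the security of the result would depend on assumptions (such as how many parties may collude) that the abstract does not state.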

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/217600

Collections

TEZLER (Theses)

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess