Collaborative filtering and content based hybrid models for recommending scientific articles

Öğretir, Mine

Date

2018

Author

Öğretir, Mine

Abstract

Recommendation systems (RS) are programs that help users find information in large data collections. In this thesis, we investigate hybrid models that use both implicit ratings, such as tags, bookmarks, or impressions, and content information, such as a user's profile or item properties. In the literature, recommendation approaches that use these two kinds of information are known as collaborative filtering (CF) and content-based methods, respectively. We investigate and compare two computational techniques, one based on matrix decomposition and the other on deep learning. For the matrix decomposition approach, we investigate Bayesian nonnegative matrix factorization (BNMF), which we extend to use side information, namely the titles and abstracts of scientific articles, alongside the implicit rating matrix. For the deep learning approach, we explore collaborative deep learning (CDL), which uses probabilistic matrix factorization as its CF method and a Bayesian stacked denoising autoencoder (SDAE) for content feature extraction. In our experiments, we apply these techniques to a CiteULike dataset with a rating density of 0.22%. Our experimental results show that CDL is more effective than coupled BNMF on this dataset. In our opinion, CDL performs better because of its Bayesian SDAE component, which has a nonlinear and deep structure.
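To make two terms from the abstract concrete, the following is a minimal sketch, not the thesis's implementation. It computes the rating density of a small implicit-feedback matrix and factorizes it with plain (non-Bayesian) nonnegative matrix factorization using multiplicative updates. The toy matrix, the function names `rating_density` and `nmf`, the rank, and the iteration count are all assumptions made for illustration; the BNMF and CDL models studied in the thesis add priors, side information, and an autoencoder that are not shown here.

```python
import numpy as np


def rating_density(R):
    """Fraction of user-item pairs with an observed (nonzero) interaction."""
    return np.count_nonzero(R) / R.size


def nmf(R, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates for the squared error ||R - W H||^2.

    This is a simplified stand-in, not the Bayesian NMF of the thesis.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    # Random nonnegative initialization of user and item factors.
    W = rng.random((n_users, rank)) + eps
    H = rng.random((rank, n_items)) + eps
    for _ in range(n_iter):
        # Element-wise updates keep both factors nonnegative.
        H *= (W.T @ R) / (W.T @ W @ H + eps)
        W *= (R @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: rows are users, columns are articles,
# 1 means the user bookmarked the article.
R = np.array([
    [1, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
], dtype=float)

density = rating_density(R)   # 6 observed entries out of 20
W, H = nmf(R, rank=2)
scores = W @ H                # predicted affinity for every user-article pair

print(f"density = {density:.2%}")
print(np.round(scores, 2))
```

A density of 0.22%, as reported for the CiteULike dataset, means that roughly 2 in every 1,000 user-article pairs carry an observed interaction, which is far sparser than this toy matrix.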

URI

https://acikbilim.yok.gov.tr/handle/20.500.12812/72539

Collections

TEZLER

Except where otherwise noted, this item's license is described as info:eu-repo/semantics/openAccess