GLORY Review

1. PROBLEM

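News recommendation has long used NLP techniques to extract semantic information for content-based recommendation.
NLP and ML are used to analyze the news articles (content) a user has read, extract the user's interest representations, and match them against candidate news articles.
Recently, the ability to handle text data has improved (Transformer attention), and deep learning methods that treat a user's reading history as a sequence are advancing rapidly.
Graph-based methods have also been proposed for modeling user preferences.
However, the paper points out that previous methods focus only on a single user's reading history and therefore lack a global perspective. It therefore uses both local and global representations for recommendation.
For example, in the paper's illustrative figure, u1's reading history contains only orange news, so recommendation is difficult when no candidate resembles the orange news. Using the histories of several users together, however, makes it possible to recommend news similar to the green and blue news.
Here, extracting a subgraph from the global news graph built from the histories of u2 and u3 provides information about the green and blue news.
The key points are generating the global news graph appropriately and integrating the extracted information so that it can be used well.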

2. SOLUTION : GLORY (Global-LOcal news Recommendation sYstem)

A global-aware historical news encoder uses the global news graph to provide a global perspective for historical news.
A global-aware candidate news encoder uses the global entity graph to address the sparsity of user behavior for candidate news. (* Candidates are newly published news, so user interaction information is scarce.)
A multi-head self-attention mechanism extracts user interests from historical news.

3. METHOD

3.1 Problem Formulation

click history sequence of a user u : H𝑢 = [𝑑1, 𝑑2, ..., 𝑑𝐻 ]
Each news article d𝑖 has a title, which contains a text sequence T𝑖 = [𝑤1,𝑤2, ...,𝑤𝑇 ] consisting of 𝑇 word tokens
Entity sequence E𝑖 = [𝑒1, 𝑒2, ..., 𝑒𝐸] (* here, an entity is a word that denotes a specific object or piece of data, as in NLP)
The objective is to predict the level of interest 𝑠𝑢,𝑐 for a given candidate news article 𝑑𝑐 and user 𝑢

3.2 Local Representation : learn local news representation h_ln and local entity representation h_le

Only the title is used; its words are embedded with GloVe word embeddings,
and multi-head self-attention (MSA) is applied over the words to obtain the word embedding vectors x_n = [x_1^w; x_2^w; ...; x_T^w].
Aggregating word representations: a text attention layer is used.
The local entity representation is obtained in the same way. (* pre-trained TransE entity embeddings are used)

* self attention pooling

Source: https://www.datasciencebyexample.com/2023/04/30/what-is-pooling-in-transformer-model/
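The attention pooling step can be sketched in a few lines. This is a minimal illustration, not the paper's exact layer: it assumes a simplified score of the form softmax(q · tanh(v)) with no learned projection matrix, and the names `attention_pool` and `query` are hypothetical.

```python
import math

def attention_pool(vectors, query):
    """Pool a list of vectors into one vector with softmax attention weights.

    Assumption: the score of each vector v is query . tanh(v); a real layer
    would usually apply a learned projection before tanh.
    """
    scores = [sum(q * math.tanh(x) for q, x in zip(query, v)) for v in vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights

# Two toy "word" vectors; a zero query gives uniform weights (plain averaging).
pooled, weights = attention_pool([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
```

With a non-zero query, vectors that align with the query receive larger weights, which is what lets the layer emphasize informative words or entities.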

3.3 Global-aware Historical News Encoder : h_gn

3.3.1 Global News Graph

Role: summarize users' reading histories
𝐺𝑛 = (𝑉𝑛, 𝐸𝑛), where 𝑉𝑛 and 𝐸𝑛 represent the sets of news articles and edges
a directed edge (𝑣𝑖 , 𝑣𝑗) : vi -> vj (reading order)
The edge weight is determined by how often this transition occurs across all reading histories.
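The graph construction described above can be sketched as follows. This is an illustrative sketch under the assumption that an edge is added for each pair of consecutively read articles; the function name `build_global_news_graph` and the toy news IDs are hypothetical.

```python
from collections import Counter

def build_global_news_graph(histories):
    """Return a Counter mapping directed edges (src, dst) to their weight.

    Each user's history is a list of news IDs in reading order; the weight of
    (src, dst) counts how often dst was read right after src over all users.
    """
    edges = Counter()
    for history in histories:
        for src, dst in zip(history, history[1:]):
            edges[(src, dst)] += 1
    return edges

# Toy reading histories of three users (hypothetical IDs).
histories = [["n1", "n2", "n3"], ["n1", "n2"], ["n2", "n3", "n1"]]
graph = build_global_news_graph(histories)
```

Here the transition n1 -> n2 appears in two histories, so its edge weight is 2, while n3 -> n1 appears once.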

3.3.2 Graph Encoder

to merge global and local information

1. extract the sub-graph from the global news graph

2. select news neighbors of multiple hops from the global news graph

3. select M_n neighbors based on the edge weight

4. adopt graph neural networks(GNN) to encode the subgraph to obtain global news embedding h_gn

GGNN (gated graph neural network) is used as the GNN (with input data and a hidden state).
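Steps 1 to 3 above can be sketched as a small breadth-first expansion. This is only a sketch under stated assumptions: neighbors are taken along outgoing edges, ties are broken by node ID, and the names `top_neighbors`, `extract_subgraph`, `m`, and `hops` are hypothetical. The GNN encoding of step 4 is not shown.

```python
def top_neighbors(edges, node, m):
    """Return up to m out-neighbors of node, highest edge weight first."""
    out = [(dst, w) for (src, dst), w in edges.items() if src == node]
    out.sort(key=lambda pair: (-pair[1], pair[0]))
    return [dst for dst, _ in out[:m]]

def extract_subgraph(edges, seeds, m, hops):
    """Expand from the seed news for a number of hops, keeping top-m neighbors."""
    nodes = set(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in top_neighbors(edges, node, m):
                if neighbor not in nodes:
                    nodes.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    # Keep only the edges whose endpoints were both selected.
    sub_edges = {e: w for e, w in edges.items() if e[0] in nodes and e[1] in nodes}
    return nodes, sub_edges

# Toy weighted edges (hypothetical): a user who read "a", 2 hops, top-1 neighbor.
edges = {("a", "b"): 3, ("a", "c"): 1, ("b", "d"): 2, ("c", "e"): 5}
nodes, sub_edges = extract_subgraph(edges, ["a"], m=1, hops=2)
```

In this toy graph the expansion picks b (the heaviest neighbor of a) and then d, so c and e are left out even though the c -> e edge is heavy.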

3.3.3 Historical News Aggregator : n_k

Similarly to Eq. (4), an attention pooling network is used to learn the historical news representation n_k

3.4 User Encoder : emb_user

Since n_k is available for every news article in the user history, emb_user is obtained with an attention pooling layer that uses a multi-head attention mechanism, as in Eq. (4).

3.5 Global-aware Candidate News Encoder

: News published each day lacks interaction history with users, so the global entity graph is used.

𝐺𝑒 = (𝑉𝑒, 𝐸𝑒 ), where 𝑉𝑒 and 𝐸𝑒 are the sets of entities and edges
(𝑣𝑖 , 𝑣𝑗) implies that 𝑣𝑗 is the entity of the last news article and 𝑣𝑖 is the entity of the subsequent news article
edge weight of (𝑣𝑖 , 𝑣𝑗) is determined by the count of news edge occurrences
select the top M_e neighbor entities for each entity in candidate news d_c
Embedding is done in the same way as for the local representation, and the entity token representations that pass through MSA are aggregated by an attention pooling network to obtain hc_ge.
The Candidate News Aggregator uses Eq. (9) (hc = [hc_ln; hc_le; hc_ge]) to obtain emb_cand.

3.6 News Recommendation

Negative sampling from non-clicked candidates -> data augmentation
NCE loss : Noise contrastive estimation loss (detailed explanation: https://velog.io/@maktub314159/NCE-MIL-NCE)
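The loss with negative sampling can be sketched as a softmax over one clicked candidate and its sampled non-clicked candidates. This sketch assumes the click score is the dot product of emb_user and emb_cand, which these notes do not state explicitly; the function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def nce_loss(emb_user, emb_pos, emb_negs):
    """Negative log-probability of the clicked news among sampled negatives.

    Assumption: the score is the dot product of user and candidate embeddings.
    """
    scores = [dot(emb_user, emb_pos)] + [dot(emb_user, n) for n in emb_negs]
    top = max(scores)  # log-sum-exp with max subtraction for stability
    log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_denominator - scores[0]

# One clicked candidate and two sampled non-clicked candidates (toy vectors).
loss = nce_loss([1.0, 0.0], [2.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
```

The loss shrinks as the clicked candidate's score grows relative to the sampled negatives, which is the behavior the training objective relies on.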

3. Experiments

metrics :

Mean Reciprocal Rank (MRR) is a ranking quality metric. It considers the position of the first relevant item in the ranked list. (It is computed only from the first relevant item.)
You can calculate MRR as the mean of Reciprocal Ranks across all users or queries.
A Reciprocal Rank is the inverse of the position of the first relevant item. If the first relevant item is in position 2, the reciprocal rank is 1/2.
MRR values range from 0 to 1, where "1" indicates that the first relevant item is always at the top.
Higher MRR means better system performance. (Source: https://www.evidentlyai.com/ranking-metrics/mean-reciprocal-rank-mrr)
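The definition above translates directly into code. The sketch below follows that definition (first relevant item only); the function names and the toy rankings are illustrative, and a list with no relevant item is assumed to contribute 0.

```python
def reciprocal_rank(ranked_labels):
    """Return 1/position of the first relevant item, or 0.0 if there is none."""
    for position, is_relevant in enumerate(ranked_labels, start=1):
        if is_relevant:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(all_ranked_labels):
    """Average the reciprocal ranks over all users or queries."""
    ranks = [reciprocal_rank(labels) for labels in all_ranked_labels]
    return sum(ranks) / len(ranks)

# Three ranked lists: first relevant item at position 2, position 1, and none.
mrr = mean_reciprocal_rank([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
```

For these three lists the reciprocal ranks are 1/2, 1, and 0, so the MRR is 0.5.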

Ablation Study (an experiment that removes an idea to check how the proposed method contributes to performance or to solving the problem)

Recommendation Diversity metric: ILAD@N, ILMD@N (described in the paper "Evaluating the Evaluation of Diversity in Natural Language Generation")
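A rough sketch of the two diversity metrics follows. It assumes that ILAD and ILMD stand for intra-list average distance and intra-list minimal distance, and that distance means cosine distance between the embeddings of the top-N recommended news; neither detail is stated in these notes, so treat both as assumptions.

```python
import math
from itertools import combinations

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def ilad_ilmd(top_n_embeddings):
    """Return (average, minimum) pairwise distance within one recommendation list."""
    distances = [cosine_distance(a, b) for a, b in combinations(top_n_embeddings, 2)]
    return sum(distances) / len(distances), min(distances)

# Toy top-3 list in which the first and third items are identical.
ilad, ilmd = ilad_ilmd([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
```

A list of near-duplicate items drives both values toward 0, while a list of mutually dissimilar items raises them.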

NRMS was chosen for comparison because it has the same structure as GLORY (MSA) but does not use a graph.

It fails to select the example above as a candidate.

4. Contributions

1) To the best of our knowledge, we are the first to propose a global perspective for constructing a homogeneous global news/entity graph in the news recommendation domain, which enables more effective utilization of rich historical interaction information

2) We introduce a global-aware historical news encoder and a global-aware candidate news encoder that leverage the global news graph and global entity graph, respectively, to enhance the representations of historical news and candidate news

3) Extensive experiments on real-world datasets demonstrate that GLORY achieves state-of-the-art performance

5. Limitations

1) Increased memory and time requirements during training, compared to using only local information

2) The experiments use a static global graph, which precludes testing on dynamically changing real-world data

6. Future work :

Dynamic global graphs that consider the freshness of news items and user behaviors -> real-time online recommendation systems
