Review: Rethinking Personalized Ranking at Pinterest: An End-to-End Approach

Information systems -> Personalization; Content ranking


I read this paper because I thought a close look at Pinterest would help with my level3 project.

Pinterest is very good at personalized recommendation!


1. Pinterest surfaces: Homefeed, Related Pins, and Search

2. Generate personalized recommendations based on the user's interactions

User-Pin Interaction (Action)

- Saving Pins to a board (repin)

- Clicking through to the underlying link

- Zooming in on a Pin (close-up)

- Hiding a Pin

3. Understanding the user -> user embedding => the user's intent is split into two parts: long-term interest and short-term intention

=> these are used to predict long-term future actions and the immediate next action, respectively

Model Architecture

2.1 Encode Long-term User Interest - PinnerFormer

- Long-term preferences are encoded with sequence modeling to obtain a long-term user embedding that is learned end-to-end.

- This lets the model predict the user's positive future engagement over the next 14 days (14-day time window).


2.1.1 Feature Representation 

Each action in the user sequence is represented by the Pin's PinSage embedding (an aggregation of visual, text annotation, and engagement information) plus metadata features of the user-Pin interaction (action type, timestamp, action duration, and surface).

(* Only the M most recent actions are used, for tractability.)
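A rough sketch of how one action could be turned into a feature vector, assuming the PinSage embedding is simply concatenated with encoded metadata. The `Action` class, the one-hot encodings, the log scaling, and the `ACTION_TYPES` / `SURFACES` vocabularies are my own illustrative assumptions, not the paper's exact encoding.

```python
import math
from dataclasses import dataclass
from typing import List

# Hypothetical vocabularies, chosen only for illustration.
ACTION_TYPES = ["repin", "click", "closeup", "hide"]
SURFACES = ["homefeed", "related_pins", "search"]


@dataclass
class Action:
    pinsage: List[float]   # PinSage embedding of the engaged Pin
    action_type: str
    timestamp: float       # seconds since epoch
    duration: float        # seconds spent on the action
    surface: str


def one_hot(value: str, vocab: List[str]) -> List[float]:
    return [1.0 if v == value else 0.0 for v in vocab]


def action_features(a: Action, now: float) -> List[float]:
    """Concatenate the PinSage embedding with simple metadata encodings."""
    age_days = max((now - a.timestamp) / 86400.0, 0.0)
    return (
        list(a.pinsage)
        + one_hot(a.action_type, ACTION_TYPES)
        + one_hot(a.surface, SURFACES)
        + [math.log1p(a.duration), math.log1p(age_days)]
    )


def user_sequence(actions: List[Action], now: float, m: int) -> List[List[float]]:
    """Keep only the M most recent actions, ordered oldest to newest."""
    if m <= 0:
        return []
    recent = sorted(actions, key=lambda a: a.timestamp)[-m:]
    return [action_features(a, now) for a in recent]
```

Truncating to the M most recent actions keeps the sequence length, and therefore the attention cost, bounded.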


User: M recent actions -> projected to the transformer's hidden dimension + fully learnable positional encoding -> standard transformer
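To make the input side of the user tower concrete, here is a minimal sketch of the projection plus learnable positional encoding, written in plain Python so it runs without a deep learning library. The sizes, the random initialization, and the helper names (`init_matrix`, `project`, `embed_sequence`) are assumptions for illustration; in a real model both `w_in` and `pos_table` would be trained, and the result would be fed to the transformer encoder, which is not shown here.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def init_matrix(rows: int, cols: int, rng: random.Random, scale: float = 0.02) -> Matrix:
    """Small random values standing in for trainable parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def project(x: Vector, w: Matrix) -> Vector:
    """Linear map from the input feature size to the hidden size (w is in_dim x hidden)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def embed_sequence(seq: List[Vector], w: Matrix, pos_table: Matrix) -> List[Vector]:
    """Project each action and add the learned vector for its position."""
    if len(seq) > len(pos_table):
        raise ValueError("sequence is longer than the positional table")
    out = []
    for pos, x in enumerate(seq):
        h = project(x, w)
        out.append([hj + pj for hj, pj in zip(h, pos_table[pos])])
    return out


rng = random.Random(0)
in_dim, hidden, max_len = 5, 4, 8               # illustrative sizes
w_in = init_matrix(in_dim, hidden, rng)         # stands in for the trainable projection
pos_table = init_matrix(max_len, hidden, rng)   # one trainable row per position
tokens = embed_sequence([[1.0] * in_dim for _ in range(3)], w_in, pos_table)
```

Because the positional table is a free parameter per position rather than a fixed sinusoid, identical actions at different positions end up with different token vectors.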



Reference ([CLS] token): "What is purpose of the [CLS] token and why is its encoding output important?" - Data Science Stack Exchange

Reference (PreNorm): https://www.tutorialexample.com/post-norm-and-pre-norm-residual-units-explained-deep-learning-tutorial/


Pin (item): an MLP followed by L2 normalization
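A small sketch of the L2 normalization step only (the MLP is omitted). The helper names are mine. If the user embedding is normalized the same way, which is an assumption here, the dot product between user and Pin embeddings is a cosine similarity bounded in [-1, 1].

```python
import math
from typing import List


def l2_normalize(v: List[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / max(norm, eps) for x in v]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


pin = l2_normalize([3.0, 4.0])     # unit-length Pin embedding
user = l2_normalize([1.0, 1.0])    # hypothetical user embedding
score = dot(user, pin)             # cosine similarity between user and Pin
```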


Training Objective : "Dense All Action" loss

Predict all positive actions in the next 28 days, averaging the contribution of each action.

It is called a "dense" loss because the positive-action loss is computed at every output position of the transformer encoder.
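Here is how I picture the dense loss, as a hedged sketch rather than the paper's implementation. It assumes that each output position is paired with one positive action sampled from its future window and scored with a softmax over that positive plus a shared set of negatives. The sampling, the negative set, and the function names are all my assumptions.

```python
import math
import random
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def softmax_loss(user: Vector, positive: Vector, negatives: List[Vector]) -> float:
    """Cross-entropy of picking the positive Pin among positive + negatives."""
    logits = [dot(user, positive)] + [dot(user, n) for n in negatives]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]


def dense_all_action_loss(
    outputs: List[Vector],
    future_positives: List[List[Vector]],
    negatives: List[Vector],
    rng: random.Random,
) -> float:
    """outputs[i] is the encoder output at position i.
    future_positives[i] holds Pin embeddings of positive actions in the
    prediction window that follows position i."""
    losses = []
    for user_vec, candidates in zip(outputs, future_positives):
        if not candidates:
            continue  # no positive action to predict from this position
        target = rng.choice(candidates)
        losses.append(softmax_loss(user_vec, target, negatives))
    # Averaging over positions is what makes the loss "dense".
    return sum(losses) / len(losses) if losses else 0.0
```

A loss computed only at the last position would supervise one output per sequence; computing it at every position gives many more training signals from the same sequence.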


2.2 Capture Short-term User Intention - Real-time User Sequences

Short-term intention: the past P actions from real-time logging

Features: action timestamp, action duration, action type, and PinSage embedding

-> sequence layers (MHSA, i.e. multi-head self-attention, block) -> embedding
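To remind myself what the MHSA block does, here is a stripped-down sketch: a single attention head with identity query/key/value projections, followed by mean pooling to get one embedding. A real MHSA block has several heads with learned projections, and the mean pooling step is my assumption about how the sequence is reduced to an embedding.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: List[float]) -> List[float]:
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product self-attention (no learned projections)."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, seq)) for j in range(d)])
    return out


def mean_pool(seq: List[Vector]) -> Vector:
    """Average the per-action outputs into one short-term embedding."""
    n = len(seq)
    return [sum(v[j] for v in seq) / n for j in range(len(seq[0]))]


recent_actions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # P = 3 toy action vectors
short_term_embedding = mean_pool(self_attention(recent_actions))
```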


The model must not overreact to the user's most recent actions -> a time window mask is used -> this lowers model sensitivity and increases diversity
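A minimal sketch of what a time window mask could look like, assuming it hides actions that happened within `window_seconds` before the request time so that attention cannot use them. The exact masking rule and how the window length is chosen are assumptions on my part; the function names are hypothetical.

```python
import math
from typing import List


def time_window_mask(timestamps: List[float], request_time: float,
                     window_seconds: float) -> List[bool]:
    """True = action stays visible; False = action falls inside the masked window."""
    return [(request_time - t) > window_seconds for t in timestamps]


def masked_softmax(scores: List[float], mask: List[bool]) -> List[float]:
    """Softmax over visible positions only; masked positions get weight 0."""
    visible = [s for s, keep in zip(scores, mask) if keep]
    if not visible:
        return [0.0] * len(scores)
    m = max(visible)
    exps = [math.exp(s - m) if keep else 0.0 for s, keep in zip(scores, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: the two newest actions fall inside a 10-second window.
mask = time_window_mask([0.0, 50.0, 95.0, 99.0], request_time=100.0, window_seconds=10.0)
weights = masked_softmax([0.0, 0.0, 5.0, 5.0], mask)
```

Even though the newest actions have the highest raw scores in this toy example, they receive zero attention weight, which is the sense in which the mask reduces sensitivity to very recent behavior.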


2.3 Model Serving

This architecture naturally performs better than the previous one, but it comes at a high price: increased infrastructure cost and serving latency.



3.1 Homefeed Ranking / Related Pins Ads Ranking 

Previously: a weighted average of a user's top-k PinnerSage embeddings was used as a feature.

Now this is replaced by:

-> a single PinnerFormer embedding

-> the real-time action sequence
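For comparison, the previous feature (a weighted average of the top-k PinnerSage embeddings) can be sketched as follows. I am assuming each PinnerSage embedding comes with an importance weight and that the top k are picked by that weight; the function name is hypothetical.

```python
from typing import List


def weighted_average_top_k(embeddings: List[List[float]], weights: List[float],
                           k: int) -> List[float]:
    """Average the k highest-weighted embeddings, weighted by their importance."""
    ranked = sorted(zip(weights, embeddings), key=lambda p: p[0], reverse=True)[:k]
    total = sum(w for w, _ in ranked)
    dim = len(embeddings[0])
    if total == 0:
        return [0.0] * dim
    return [sum(w * e[j] for w, e in ranked) / total for j in range(dim)]


user_feature = weighted_average_top_k(
    embeddings=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    weights=[3.0, 1.0, 0.1],
    k=2,
)
```

Averaging collapses several interest vectors into one point, which is one way to see why a single embedding trained end-to-end plus the raw action sequence can carry more information.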


Long-term and short-term interest together -> better personalization

Future work: try modeling the user's actions as a sequence as well, and apply this architecture to candidate generation models too


