0

Our system can collect data on user queries and domain clicks based on their behavior. We aim to enhance this system by predicting the domain corresponding to a user query. For instance, if a user enters a query not included in our list, such as "xoilac tv," we will predict the list of domains that the user is likely to click, such as [xoilac.tv, 'youtube']

Example dataset from log


data = [
    ("Lỗi Office 2013 không mở được file", "tainhanh.site"),
    ("yoko onsen", "www.klook.com"),
    ("bamboo", "www.bambooairways.com"),
    ("xoilac tv", "xoilac.tv"),
    ("xoilac tv", "youtube.tv"),
    ("hộp mực máy in 226dw", "toanphat.com"),
]

given q: 'xoi lac'

the output is the list of domain with highest probability :
[ {xoilac.tv, 0.99}, {'youtube.tv', 0.90},..]

Is it possible to approached as a sequence modeling problem, specifically a Seq2Seq ? where the input is the query and the output is the website domain users are likely to visit. Can anyone help?

phuongdo
  • 271
  • 1
  • 4
  • 9

0 Answers0