I have a question about best practices for attaching metadata to our keen.io pageview events. Internally we use three keyword categories to identify a piece of content, and those keywords live in <meta> tags on every page. A good example would be something like this:
<meta name="namespace:tier1" content="Programming" />
<meta name="namespace:tier2" content="Web Development, Web Operations" />
<meta name="namespace:tier3" content="JavaScript, Analytics, jQuery, HTML, CSS" />
We want to be able to segment our users based on those tiers, and do queries like this:
- See all traffic segmented by tier1 keywords
- See the most popular tier2 keywords that belong to a specific tier1 keyword
- ... and so on.
Here's my question: It seems like we could just send this metadata along with the pageview event, but we'd end up with a lot of redundant data that could live in a separate place. For example, if we scraped the keywords for our pages every day, we could index them by URL and not have all that duplicate metadata in keen.io.
How would you approach this? Am I stuck in SQL land, and should I just not worry about the duplicate data?
A related question: our keywords are basically lists, and the keen.io documentation says that we should stay away from lists. Would I need to create a Metadata event for every single word then? It seems like overkill to send 10+ requests on every pageview.
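To make the "just send it along" option concrete, here is a rough sketch of the single event I have in mind. The helper names (`splitKeywords`, `buildPageviewEvent`) and the property names are made up for illustration and are not anything keen.io requires. The actual call that sends the event is left out, and in the browser the `meta` object would be filled from the page's meta tags rather than hard-coded.

```javascript
// Hypothetical helper: split a comma-separated meta content string
// into a clean array of keywords.
function splitKeywords(content) {
  return (content || "")
    .split(",")
    .map(function (word) { return word.trim(); })
    .filter(function (word) { return word.length > 0; });
}

// Hypothetical helper: build one pageview event that carries all three tiers.
// The property names (url, keywords.tier1, ...) are my own invention.
function buildPageviewEvent(url, meta) {
  return {
    url: url,
    keywords: {
      tier1: splitKeywords(meta["namespace:tier1"]),
      tier2: splitKeywords(meta["namespace:tier2"]),
      tier3: splitKeywords(meta["namespace:tier3"])
    }
  };
}

// Example using the meta tag values from above; the URL is a placeholder.
var pageview = buildPageviewEvent("/some/article", {
  "namespace:tier1": "Programming",
  "namespace:tier2": "Web Development, Web Operations",
  "namespace:tier3": "JavaScript, Analytics, jQuery, HTML, CSS"
});
```

This is one request per pageview, but every tier ends up as a list property, which is exactly the part I'm unsure about.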