Task: I want to split a column called "website" in a Hive table to get all the websites it contains, which are delimited by either a space or \n.

Issue: When I use either of the following queries:

SELECT website,split(website, '[\\s]') as websites FROM temp_pages
SELECT website,split(website, '[\\s, \\n]') as websites FROM temp_pages

Neither query gives the desired results in every case. Here are the results that I get:

Expected Output - delimited on space
Input: http://www.insync4all.com http://www.insync4all.nl
Output: ["http://www.insync4all.com","http://www.insync4all.nl"]

Unexpected output - delimited on \n
When the value contains \n, the websites are not split on it. Instead, the output shows \\n:

Input: www.imtherealthing.com\nwww.childmodelmagazine.com
Output: ["www.imtherealthing.com\\nwww.childmodelmagazine.com"]

Can someone help me split the website field on \n? It would also be good to understand what is going wrong in the \n case.
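
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the \\n in the output suggests the column holds the two literal characters backslash and n, not an actual newline. The patterns \s and \n only match real whitespace, so they would never match that two-character sequence. A minimal sketch of a pattern that also matches a literal backslash followed by n, assuming Hive's usual string-literal unescaping (where '\\\\' in the query reaches the regex engine as \\):

```sql
-- Assumption: website contains a literal backslash followed by "n",
-- not a real newline character.
-- '\\s+'   reaches the regex engine as \s+  (one or more whitespace characters)
-- '\\\\n'  reaches the regex engine as \\n  (a literal backslash, then "n")
SELECT website,
       split(website, '\\s+|\\\\n') AS websites
FROM temp_pages;
```

If the query is passed through a shell (for example with hive -e "..."), the shell may consume one level of backslashes, so the escaping may need adjusting.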

ShikharDua