0

Help me guys, I'm really lost here. I have a big text file, full of links, and I'm trying to separate them based on which website the link belongs. I was trying to do it with the csplit command, but I'm not really sure how I would do it, as it would depend on the text content.

Text example:

www.unix.com/man-page/opensolaris/1/csplit/&hl=en
www.unix.com/shell-programming-and-scripting/126539-csplit-help.html/RK=0/RS=iGOr1SINnK126qZciYPZtBHpEmg-
www.w3cschool.cc/linux/linux-comm-csplit.html
www.linuxdevcenter.com/cmd/cmd.csp?path=c/csplit+"csplit"&hl=en&ct=clnk

So in this example the first two links would be in one file, and the 2 left would be in one file each. How would this work? I really don't have any idea if this is even possible. (novice programmer)

  • possible duplicate of [Split one file into multiple files based on pattern](http://stackoverflow.com/questions/8061475/split-one-file-into-multiple-files-based-on-pattern) – NeronLeVelu Jan 26 '15 at 07:35

1 Answers1

2

try :

awk 'BEGIN{FS="/"} {print > $1}' [your file name]

output:

cat www.unix.com 
www.unix.com/man-page/opensolaris/1/csplit/&hl=en
www.unix.com/shell-programming-and-scripting/126539-csplit-help.html/RK=0/RS=iGOr1SINnK126qZciYPZtBHpEmg-
cat www.linuxdevcenter.com 
www.linuxdevcenter.com/cmd/cmd.csp?path=c/csplit+"csplit"&hl=en&ct=clnk
cat www.w3cschool.cc 
www.w3cschool.cc/linux/linux-comm-csplit.html

{print > $1} will redirect output to separate files based on $1, in this case, the domain name.

qqibrow
  • 2,942
  • 1
  • 24
  • 40