1

How to put each token of a line on its own line using Unix tools?

[in]:

some sentences are like this.
some sentences foo bar that

[out]:

some
sentences
are
like
this.

some
sentences
foo
bar
that

I can do this in Python 2 as shown below, but is there a Unix way to achieve the same output?

>>> import codecs
>>> outfile = codecs.open('outfile.txt','w','utf8')
>>> intext = "some sentences are like this.\nsome sentences foo bar that"
>>> infile = codecs.open('infile.txt','w','utf8')
>>> print>>infile, intext
>>> infile.close()  # flush to disk before reading the file back
>>> for line in codecs.open('infile.txt','r','utf8'):
...     for token in line.split():
...         print>>outfile, token
...     print>>outfile  # blank line after each input line
... 
>>> outfile.close()
>>> exit()

alvas@ubi:~$ cat outfile.txt 
some
sentences
are
like
this.

some
sentences
foo
bar
that
thefourtheye
alvas
  • see also, http://stackoverflow.com/questions/21779272/reverse-newline-tokenization-in-one-token-per-line-files-unix?noredirect=1#comment32949628_21779272 – alvas Feb 14 '14 at 12:27

3 Answers

2

Using sed:

$ cat infile.txt
some sentences are like this.
some sentences foo bar that
$ sed 's/\s\+\|$/\n/g' infile.txt > outfile.txt
$ cat outfile.txt
some
sentences
are
like
this.

some
sentences
foo
bar
that
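
Where sed lacks the GNU extensions used above (`\s`, `\+`, `\|`, and `\n` in the replacement), a similar result can probably be had with POSIX awk. This is a minimal sketch of an alternative technique, not part of the answer above. The function name `tokenize` is made up for illustration, and the sample file is recreated so the snippet runs on its own.

```shell
# tokenize: hypothetical helper name. Prints each whitespace-separated
# token on its own line, followed by a blank line after every input line.
tokenize() {
    awk '{ for (i = 1; i <= NF; i++) print $i; print "" }' "$@"
}

# Recreate the sample input from the question, then tokenize it.
printf 'some sentences are like this.\nsome sentences foo bar that\n' > infile.txt
tokenize infile.txt > outfile.txt
```

Because awk splits fields on runs of blanks and ignores leading whitespace, this behaves like the `split()` call in the question's Python loop.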
falsetru
1

Using xargs (this prints one token per line, but without the blank line between input lines):

xargs -n1 < file
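
One thing to check before relying on xargs here: it gives quotes and backslashes special meaning, and an unmatched apostrophe (as in `it's`) makes it fail. The following is a minimal sketch of a swapped-in alternative using `tr`, not the method of this answer. The sample text and the file name `tokens.txt` are assumptions for illustration.

```shell
# Turn every run of whitespace into a single newline; quotes pass through untouched.
printf "it's one line\nand another\n" | tr -s '[:space:]' '\n' > tokens.txt
```

This yields one token per line with no blank separator lines.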
BMW
0
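sed -e 's/ \|$/\n/g' < text

should do it. Note that this splits on single spaces only, so tabs are not treated as separators and consecutive spaces produce empty lines.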

Ricardo Cárdenes