
Here is the sample file with the duplicate lines:

abs
bsa
bsc
abs
bsa
bsb

Here is what the output should be (no duplicates):

abs
bsa
bsc
bsb

I tried the uniq -u command, but it removes every line that has a duplicate instead of keeping one copy, so would it be better to use sed or awk? Any suggestions?

Thanks!

    Use a programming language which offers associative arrays (a.k.a. hashes) or sets. Store each line into the set/hash, when you encounter it the first time. You output the line only if it has not been in your array before. You **can** do it in *bash* (search the man-page for associative arrays) or *awk* (where **every** array is associative), or in nearly any other language (*zsh*, *Ruby*, *Perl*, ....). A problem could be if the input is so huge that you run out of memory. – user1934428 Aug 11 '17 at 06:32
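A minimal sketch of the bash approach the comment describes, assuming bash 4 or later (associative arrays) and input on stdin; the filename `dedupe.sh` is just for illustration:

```shell
#!/usr/bin/env bash
# Print each input line only the first time it appears, using a
# bash associative array as a set (requires bash 4+).
declare -A seen
while IFS= read -r line; do
  # ${seen[$line]+x} expands to "x" only if the key already exists
  if [[ -z ${seen[$line]+x} ]]; then
    seen[$line]=1
    printf '%s\n' "$line"
  fi
done
```

Run as `bash dedupe.sh < yourfile`. As the comment warns, every distinct line is held in memory, so this can be a problem for very large inputs.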

1 Answer


Note: `uniq` does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use `sort -u` without `uniq`:

sort -u yourfile

The output:

abs
bsa
bsb
bsc
RomanPerekhrest
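Note that `sort -u` also sorts the output (abs, bsa, bsb, bsc), which differs from the first-occurrence order shown in the question (abs, bsa, bsc, bsb). If the original order matters, the classic awk idiom based on the associative-array approach from the comment keeps each line's first occurrence in place:

```shell
# 'seen' is an awk associative array keyed by the whole line ($0).
# '!seen[$0]++' is true only the first time a line is encountered,
# and awk's default action for a true pattern is to print the line.
awk '!seen[$0]++' yourfile
```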