2

Basically, after I sort, I want my columns to be separated by tabs. Right now they are separated by two spaces. The man pages did not have anything related to output formatting (at least I didn't notice it).

If it's not possible, I guess I have to use awk to sort and print. Is there a better alternative?

EDIT: To clarify the question, the location of the double spaces is not consistent. I actually have data like this:

<date>\t<user>\t<message>.

I sort by date (year, month, day, and time), which looks like

Wed Jan 11 23:44:30 CST 2012

and I want the sorted output to be in the same format as the original file, that is

<date>\t<user>\t<message>.

EDIT 2: It seems my test for tabs was wrong. I was copying and pasting raw lines from bash to my Windows box, which is why the tabs showed up as spaces. I downloaded the whole file to Windows, and now I can see that the fields are tab separated.

Also, I figured out that the field separators (\t \n , : ; etc.) are the same in the new file after sorting. That means that if the original file has tab-separated fields, the sorted file is also going to be tab separated.
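For reference, a minimal sketch of that kind of sort. It assumes bash and a `sort` that supports the `-M` month key (GNU and BSD `sort` do; it is not POSIX), and it assumes every date has exactly the shape `Wed Jan 11 23:44:30 CST 2012` with a three-letter time zone, so the character offsets inside field 1 are fixed. The function name `sort_by_date` and the sample rows are made up for illustration.

```shell
#!/usr/bin/env bash
# Sort "<date>\t<user>\t<message>" lines by year, month, day, then time.
# Assumption: field 1 always looks like "Wed Jan 11 23:44:30 CST 2012",
# so the year sits at characters 25-28, the month at 5-7, and so on.
sort_by_date() {
  LC_ALL=C sort -t $'\t' \
    -k1.25,1.28n \
    -k1.5,1.7M \
    -k1.9,1.10n \
    -k1.12,1.19
}

# Hypothetical sample rows; note the valid double space inside one message.
printf '%s\t%s\t%s\n' \
  'Wed Jan 11 23:44:30 CST 2012' 'bob'   'second  message' \
  'Tue Dec 27 08:15:00 CST 2011' 'alice' 'first' \
  'Wed Jan 11 23:44:31 CST 2012' 'carol' 'third' |
  sort_by_date
```

`sort` only reorders whole lines, so the tabs (and any double spaces inside a message) come out exactly as they went in.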

One last thing, the "correct" answer was not exactly the correct solution to the problem. I don't know if I can comment on my own thread and mark it as correct. If it is OK to do that, please let me know.

Thanks for the comments guys. Really appreciate your help!

javaCity
  • `sort(1)` really aims to just sort lines. Editing the lines while you're working with them might be better done in another tool. – sarnold Jun 13 '12 at 02:49
  • Just to clarify, you are trying to replace two spaces with tabs as the output of a sort command? – Levon Jun 13 '12 at 03:02
  • I don't see how it is possible to know which of the possibly multiple double-blank spaces in the same line to replace with a TAB if the location isn't known minimally (e.g., it's always the 2nd double-blank in the line or something like that). I'll be curious to see what solutions emerge. – Levon Jun 13 '12 at 03:28

3 Answers

5

Pipe your output to column:

sort <whatever> | column -t -s $'\t'

jpaugh
  • I'm not sure what your sorted output looks like, but I use this for the output of `mount`, and it works well for that. (Although using `-s\t` makes each line of that hideously long.) – jpaugh Jun 13 '12 at 02:57
  • It is sorting right now. It takes a long time (it's about 2G of text). I will update in the comments after it is done. – javaCity Jun 13 '12 at 03:15
  • I think I might've missed out some essential part of the task I am supposed to do. If you can, please refer to the edit section of my main post. Thanks! – javaCity Jun 13 '12 at 03:25
  • 2
    @javaCity I struggled to understand a similar experience with `line too long` (on very short lines) until I learned that you will see that error unless the very end of whatever you input to `column` always ends in a newline. In other words, you might have short lines, and those lines (obviously) are delineated by `\n`, but you MUST also ensure that the final line ends in `\n`, too. – pestophagous Nov 04 '19 at 22:43
  • On my system, `sort` will add a trailing newline if it's missing. But, to be sure, you can pipe the output to `hexdump -C`. – jpaugh Nov 05 '19 at 18:55
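
The trailing-newline check discussed in these comments can also be scripted. A minimal sketch, assuming a POSIX-style shell with `tail` and `mktemp` available; the function name `ends_with_newline` and the temporary file are made up for illustration.

```shell
#!/usr/bin/env bash
# ends_with_newline FILE: succeed if FILE is non-empty and its last byte is "\n".
ends_with_newline() {
  # Command substitution strips trailing newlines, so a final "\n" yields "".
  [ -s "$1" ] && [ -z "$(tail -c 1 -- "$1")" ]
}

# Hypothetical input whose last line lacks a newline.
tmp=$(mktemp)
printf 'a\tb\nc\td' > "$tmp"

if ends_with_newline "$tmp"; then
  echo "final newline present"
else
  echo "final newline missing"
fi
rm -f "$tmp"
```

If the check fails, appending a newline to the data before handing it to `column` is one way to avoid the error described above.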
2

From what I understand, the file is already sorted and you want to replace the two separating spaces with a TAB character. In that case, use the following:

sed 's/  /\t/g' < sorted_file > new_formatted_file

(Be careful to copy the two spaces in the regular expression correctly.)

higuaro
  • Thanks for your reply :) I have a problem though. Other strings in the line might have 2 spaces between them, which is valid. I don't want to replace those valid spaces with tabs. – javaCity Jun 13 '12 at 03:07
  • Are they at the beginning or at the end of the line? – higuaro Jun 13 '12 at 03:08
  • Middle of the line. It's like this `` I want that to be `` – javaCity Jun 13 '12 at 03:12
  • 1
    Could you please post a couple of lines of your data and the same lines in the format you want them to be? – higuaro Jun 13 '12 at 03:28
  • Well, I found the mistake. I was copy pasting raw data from bash into my text file in local Windows box. Seems like it doesn't work that way. I copy pasted the whole file instead and then it worked with tab! – javaCity Jun 13 '12 at 05:04
2

You can use sed:

 sort data.txt  | sed 's/  /\t/g'
                         ^^
                         ||
                      2 blank spaces

This will take the output of your sort operation and substitute a single tab for 2 consecutive blanks.
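
One caveat worth hedging: `\t` on the replacement side of `s///` is understood by GNU sed, but other sed implementations may emit a literal `t` instead. A minimal sketch of a more portable variant that lets the shell supply the tab character; the function name `two_spaces_to_tab` is made up for illustration.

```shell
#!/usr/bin/env bash
# Replace every run of exactly two consecutive spaces with one tab.
# printf produces the literal tab, so sed never has to interpret "\t".
two_spaces_to_tab() {
  sed "s/  /$(printf '\t')/g"
}

# Hypothetical sample line with two double-space separators.
printf 'a  b  c\n' | two_spaces_to_tab
```

As with the command above, this replaces every double space on the line, including any that are a valid part of the data.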

Levon
  • Your answer is similar to this one: http://stackoverflow.com/a/11007694/1140134, which will be problematic for me because the two spaces are in the middle of the line and there might be other valid double spaces in the same line. – javaCity Jun 13 '12 at 03:13
  • @javaCity so you only want to replace *some* double-blank spaces on the same line with a TAB? That constraint is important enough to go into the original question I think. Is this location going to be consistent and will you know where it is? – Levon Jun 13 '12 at 03:15
  • Thank you for your reply. No, the location is not consistent. I actually have data like this: `<date>\t<user>\t<message>`. I sort by date by year, month, day and time which looks like `Wed Jan 11 23:44:30 CST 2012` and then have the output of the sorted data like the original file that is `<date>\t<user>\t<message>`. – javaCity Jun 13 '12 at 03:19
  • I had a different problem as stated in main post. However, I have marked your answer as solved because it didn't cause any error and was an honest approach. Thank you. – javaCity Jun 13 '12 at 05:05