64

I need to edit a few text files (output from sar) and convert them into CSV files.

I need to change every whitespace character (there may be tabs between the numbers in the output) to a comma, using sed or awk (a simple shell script on Linux).

Can anyone help me? None of the commands I tried changed the file at all; I tried gsub.

Benjamin W.

9 Answers

86
tr ' ' ',' <input >output 

This substitutes each space with a comma. If needed, you can first make a pass with the -s flag (squeeze repeats), which replaces each run of a repeated character listed in SET1 (here, the space) with a single occurrence of that character.

Squeezing repeated tabs first and then substituting them:

tr -s '\t' <input | tr '\t' ',' >output 
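
As a minimal sketch of the same idea in a single pass: when `-s` is combined with translation, tr squeezes the translated characters, so mixed runs of spaces and tabs collapse to one comma. The sample line is made up, and the second set is written as `',,'` so that both sets have the same length. Note that commas already present in the input would be squeezed too.

```shell
# Made-up sample line with mixed runs of spaces and tabs.
sample=$(printf 'a  b\t\tc \td')

# Translate space and tab to commas, then squeeze repeated commas
# (with translation, -s applies to the characters of the second set).
result=$(printf '%s\n' "$sample" | tr -s ' \t' ',,')

printf '%s\n' "$result"
```

For files, the same command would read `tr -s ' \t' ',,' <input >output`.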
Alberto Zaccagni
  • 4
    I don't know the reason, however, only this method using "tr" works for my case. Both sed and awk failed to deal with blank spaces in my file that was generated by a Java program. – Leo5188 Nov 03 '11 at 02:13
  • Thanks! The squeeze option `-s` really is what I was looking for. – Jona Engel Sep 20 '21 at 05:51
30

Try something like:

sed -E 's/[[:space:]]+/,/g' orig.txt > modified.txt

The character class [[:space:]] will match all whitespace (spaces, tabs, etc.); the class name has to sit inside a bracket expression, and -E enables + as "one or more". If you just want to replace a single character, e.g. just a space, use that only.

EDIT: Actually, [[:space:]] also includes the carriage return, so this may not do what you want. The following will replace only tabs and spaces.

sed -E 's/[[:blank:]]+/,/g' orig.txt > modified.txt

as will (with GNU sed, which understands \t inside a bracket expression)

sed -E 's/[\t ]+/,/g' orig.txt > modified.txt

In all of this, you need to be careful that the items in your file that are separated by whitespace don't contain whitespace of their own that you want to keep, e.g. two words.
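
A small sketch of both points, using made-up sample data. POSIX classes such as [:blank:] only work inside a bracket expression (the bare form is either rejected or treated as a set of letters, depending on the sed), and `-E` is assumed to be available for extended regular expressions, as it is in current GNU and BSD sed. The second part assumes the fields are tab-separated and shows how translating only tabs keeps the spaces inside a field.

```shell
# Made-up sample: fields separated by runs of spaces and tabs.
line=$(printf 'cpu  12.5\t\t3.1 84.4')

# The class name goes inside a bracket expression; -E enables '+'.
converted=$(printf '%s\n' "$line" | sed -E 's/[[:blank:]]+/,/g')

# Made-up tab-separated sample whose first field contains a space.
tabbed=$(printf 'two words\t42')

# Translating only the tab leaves the space inside the field alone.
kept=$(printf '%s\n' "$tabbed" | tr '\t' ',')

printf '%s\n%s\n' "$converted" "$kept"
```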

dave
  • Isn't sed a line-oriented tool? If so, it should not matter that \n is included in [:space:] – glenn jackman Aug 13 '09 at 17:31
  • 14
    GNU sed requires this syntax: sed 's/[[:space:]]\+/,/g' filename – glenn jackman Aug 13 '09 at 17:32
  • 1
    @glennjackman thanks that worked! and complementing your comment, I use `-r` so `sed -r "s'[[:blank:]]+','g"` – Aquarius Power Jul 06 '14 at 06:11
  • 3
    *OSX 10.10.5*: I would like `\s+` to work: `sed -E 's/\s+/,/g' orig.txt > modified.txt`, but it doesn't. And even `sed 's/[\t ]+/,/g' orig.txt > modified.txt` fails to match tabs. The only sed command that worked for me was: `sed -E 's/[[:space:]]+/,/g' orig.txt > modified.txt` – 7stud Jan 17 '17 at 19:07
27

Without seeing your input file, this is only a guess:

awk '{$1=$1}1' OFS=","

Redirect the output to another file and rename it as needed.
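
A short sketch of why this works, on a made-up line: with the default field separator, awk splits on runs of spaces and tabs and ignores leading blanks, and assigning `$1 = $1` forces the record to be rebuilt with `OFS` between the fields. The explicit `print` is used here in place of the trailing `1`.

```shell
# Made-up sample line with leading blanks and mixed separators.
line=$(printf '  12:00:01   all\t0.50  99.10')

# Reassigning a field makes awk rejoin all fields with OFS.
result=$(printf '%s\n' "$line" | awk -v OFS=',' '{ $1 = $1; print }')

printf '%s\n' "$result"
```

Applied to a file, this might look like `awk -v OFS=',' '{ $1 = $1; print }' input.txt > output.csv` (hypothetical file names).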

ghostdog74
  • 2
    I assume the final 1 after the closing curly brace is an always-true pattern that prints the line? I'd go with the more readable `{$1=$1; print}`. – tzot Apr 20 '10 at 01:20
  • 2
    Yes, it's an awk idiom: a true condition, whose default action is to print the line to stdout. – ghostdog74 Apr 20 '10 at 01:59
11

What about something like this :

cat texte.txt | sed -e 's/\s/,/g' > texte-new.txt

(Yes, the cat and the pipe are unnecessary; you could also use < to read from the file directly. I used cat first to display the file's content, and only afterwards added sed to the command line.)

EDIT: as @ghostdog74 pointed out in a comment, there's definitely no need for the cat/pipe; you can give the name of the file to sed:

sed -e 's/\s/,/g' texte.txt > texte-new.txt

If "texte.txt" is this way :

$ cat texte.txt
this is a text
in which I want to replace
spaces by commas

You'll get a "texte-new.txt" that'll look like this :

$ cat texte-new.txt
this,is,a,text
in,which,I,want,to,replace
spaces,by,commas

I wouldn't simply replace the old file with the new one (that could be done with sed -i, if I remember correctly; and as @ghostdog74 said, it can create a backup on the fly): keeping the original might be wise as a safety measure (even if it means renaming it to something like "texte-backup.txt").
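
For illustration, a sketch of the in-place variant with a backup, run in a temporary directory so no real file is touched. It assumes a sed that accepts a suffix attached to `-i` (both GNU and BSD sed do) and uses `[[:blank:]]` instead of `\s`, which is a GNU extension.

```shell
# Work in a throwaway directory with a made-up one-line file.
workdir=$(mktemp -d)
printf 'this is a text\n' > "$workdir/texte.txt"

# Edit in place; the original content is saved as texte.txt.bak.
sed -i.bak -E 's/[[:blank:]]+/,/g' "$workdir/texte.txt"

edited=$(cat "$workdir/texte.txt")
backup=$(cat "$workdir/texte.txt.bak")
printf 'edited: %s\nbackup: %s\n' "$edited" "$backup"

# Clean up the temporary directory.
rm -r "$workdir"
```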

Pascal MARTIN
  • 1
    Yep, I edited my answer while you were posting your comment, to say about -i (even though I'd recommend not using it, to keep a backup of the file -- that can always be useful) ; didn't think about sed myfile.txt, though ; good point, thanks! – Pascal MARTIN Aug 13 '09 at 10:56
8

This command should work:

sed "s/\s/,/g" < infile.txt > outfile.txt

Note that you have to redirect the output to a new file. The input file is not changed in place.

Dawie Strauss
5

sed can do this:

sed 's/[\t ]/,/g' input.file

That prints the result to the console, whereas

sed -i 's/[\t ]/,/g' input.file

edits the file in place.

ezpz
3

Here's a Perl script which will edit the files in-place:

perl -i.bak -lpe 's/\s+/,/g' files*

Consecutive whitespace is converted to a single comma.
Each original file is kept as a backup with a .bak suffix.

These command-line options are used:

  • -i.bak edit in-place and make .bak copies

  • -p loop around every line of the input file, automatically print the line

  • -l removes newlines before processing, and adds them back in afterwards

  • -e execute the perl code

Chris Koknat
1

If you want to replace an arbitrary sequence of blank characters (tab, space) with one comma, use the following:

sed -E 's/[\t ]+/,/g' input_file > output_file

or

sed -r 's/[[:blank:]]+/,/g' input_file > output_file

If some of your input lines have redundant leading spaces that should not become commas, strip them first and then convert the remaining blank characters to commas. In that case, use the following:

sed -E 's/^ +//' input_file | sed -E 's/[\t ]+/,/g' > output_file
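
The two steps can also be combined into a single sed call with several `-e` expressions. This is only a sketch on a made-up line; it additionally strips trailing blanks, which is an assumption about what is wanted, and it anchors the first expression with `^` so that only leading blanks are removed.

```shell
# Made-up sample with leading and trailing blanks and mixed separators.
line=$(printf '   10:15:01\t all   0.25  ')

# 1) drop leading blanks, 2) drop trailing blanks,
# 3) turn each remaining run of blanks into one comma.
result=$(printf '%s\n' "$line" |
  sed -E -e 's/^[[:blank:]]+//' \
         -e 's/[[:blank:]]+$//' \
         -e 's/[[:blank:]]+/,/g')

printf '%s\n' "$result"
```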
Technext
H.Alzy
0

This worked for me.

sed -e 's/\s\+/,/g' input.txt > output.csv
Nilucshan Siva