-2

I am running an ldapsearch query that returns results as follows:

John Joe jjoe@company.com +1 916 662-4727  Ann Tylor Atylor@company.com (987) 654-3210  Steve Harvey sharvey@company.com 4567893210  (321) 956-3344  ...

As you can see, the fields of each personal record are separated by a single blank, and the records themselves are separated by two blanks. The phone numbers may or may not start with +1, and may contain blanks or parentheses.

I would like to transform these entries to the following format:

John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344
...

So basically, replace each single blank with a comma "," and each double blank with a newline, so that in the end I have one comma-separated personal record per line.

I am trying awk and have managed to replace each blank with ",", which turns <blank><blank> into a double comma ",,".
But I can't figure out how to turn ",," into <RETURN>.

11/22/2017 ----****** UPDATE ******--------- 11/22/2017

I made this thread too crowded. I will post a fresh question with more details.

Asghar
  • What have you tried? Where does your attempt have problems? Please add your attempt and your results to the question, so that we know what you need help with. – ghoti Nov 17 '17 at 03:40
  • Also, does Steve Harvey have two phone numbers? The double space before the last telephone number in your sample input makes it appear to be a new record. – ghoti Nov 17 '17 at 03:43
  • I was doing: `ldapsearch -LLL -x -H ldaps: -b "ou=people,dc=,dc=edu" -D uid=,ou=applications,dc=,dc=edu -w givenname sn mail telephoneNumber | awk -F ":" '{printf $2}{printf "\n"}' | awk -F "uid" '{printf $1}' | tr " " ","` – Asghar Nov 17 '17 at 17:52
  • The stuff you have tried is a vital part of your question, and should be included in your question, not just added to comments after-the-fact. Next time you use StackOverflow, consider including your work so far in the question, and I'm sure you'll get a larger number of high quality answers. – ghoti Nov 17 '17 at 20:39
  • @ghoti Excellent point. This was my first time submitting (using) the StackOverflow. I certainly will adhere to your recommendation next time! – Asghar Nov 20 '17 at 17:13
  • I thought the solution from @tshiono will resolve my issue. It does to a certain point, but not totally, because I found there are other fields in the data which I was not aware of before. Here is the full picture of the problem and what I have done so far: The data in the file can have one of the following formats: – Asghar Nov 21 '17 at 19:14

3 Answers

2

Your request needs a series of substitutions, which can be done with sed.

$ cat sed-script
s/\ \ ([A-Za-z])/\n\1/g;          # two spaces followed by a letter start a new record: replace them with '\n'
s/\ \ /,/g;                       # replace the remaining double spaces with ','
s/([A-Za-z]) /\1,/g;              # replace a space that follows a letter with ','
s/\+1//g;                         # remove every +1 prefix
s/[ ()-]//g;                      # remove spaces, parentheses, and dashes
s/([^0-9])([0-9]{3})/\1(\2) /g;   # wrap the first 3 digits of each number in parentheses
s/([0-9]{4})([^0-9]|$)/-\1\2/g;   # prepend '-' to the last 4 digits, also at the end of the input

$ sed -r -f sed-script file 
John,Joe,jjoe@company.com,(916) 662-4727
Ann,Tylor,Atylor@company.com,(987) 654-3210
Steve,Harvey,sharvey@company.com,(456) 789-3210,(321) 956-3344,...
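
The last steps of the script, which normalize the phone numbers, can also be looked at in isolation. The following is only a sketch and not part of the script above: `normalize_phone` is a hypothetical helper name, and it assumes each number has exactly ten digits after an optional leading +1 and that your sed supports `-E`.

```shell
# Hypothetical helper (not from the answer): normalize one phone number
# to the form (AAA) BBB-CCCC.
# Assumption: ten digits remain after an optional leading +1 is removed.
# The three sed commands, in order:
#   1. drop a leading +1 country code
#   2. delete everything that is not a digit
#   3. regroup the ten digits as (AAA) BBB-CCCC
normalize_phone() {
    printf '%s\n' "$1" | sed -E '
        s/^\+1//
        s/[^0-9]//g
        s/^([0-9]{3})([0-9]{3})([0-9]{4})$/(\1) \2-\3/
    '
}

normalize_phone '+1 916 662-4727'
normalize_phone '4567893210'
```

Input that does not reduce to exactly ten digits is left as bare digits, so unexpected values are easy to spot in the result.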
CWLiu
1

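If your Input_file is the same as the sample shown, the following awk may help. It starts a new line after every phone number that ends in the form ddd-dddd; it does not insert the commas or reformat the phone numbers.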

awk --re-interval '{gsub(/[0-9]{3}-[0-9]{4} +/,"&\n");print}'  Input_file

I have an old version of awk, so I passed --re-interval; newer versions of awk do not need it.

Explanation of the solution:

awk --re-interval '{               ## --re-interval enables interval expressions such as {3}; needed because I have an old version of awk.
gsub(/[0-9]{3}-[0-9]{4} +/,"&\n"); ## gsub (global substitute) matches 3 consecutive digits, a dash (-), 4 consecutive digits, and the spaces that follow, then replaces the match with itself (&) plus a newline.
print                              ## print the modified line of Input_file
}'  Input_file                     ## name the Input_file here.
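
For the full output format requested in the question, a single awk program could split on the double blanks and rebuild each record. This is a rough sketch under several assumptions, not a tested solution for real ldapsearch output: `to_csv` and `phone` are hypothetical names, every record is assumed to start with first name, last name, and e-mail, and a chunk that does not start with a letter is assumed to be an extra phone number of the current record. It avoids interval expressions, so it does not need --re-interval.

```shell
# Hypothetical sketch: read the one-line ldapsearch dump on stdin and
# print one comma-separated record per line.
to_csv() {
    awk '
    # Return s as (AAA) BBB-CCCC; values without exactly 10 digits are returned unchanged.
    function phone(s,    d) {
        d = s
        sub(/^\+1/, "", d)          # drop a leading +1 country code
        gsub(/[^0-9]/, "", d)       # keep digits only
        if (length(d) != 10) return s
        return "(" substr(d, 1, 3) ") " substr(d, 4, 3) "-" substr(d, 7, 4)
    }
    {
        out = ""
        n = split($0, chunk, /[ ][ ]/)          # two blanks separate chunks
        for (i = 1; i <= n; i++) {
            if (chunk[i] == "") continue
            if (chunk[i] ~ /^[A-Za-z]/) {       # a leading letter opens a new record
                if (out != "") print out
                m = split(chunk[i], f, /[ ]/)   # one blank separates fields
                rest = ""
                for (j = 4; j <= m; j++) rest = rest (j > 4 ? " " : "") f[j]
                out = f[1] "," f[2] "," f[3]
                if (rest != "") out = out "," phone(rest)
            } else {                            # extra phone number of the current record
                out = (out == "" ? "" : out ",") phone(chunk[i])
            }
        }
        if (out != "") print out
    }'
}

printf '%s\n' 'John Joe jjoe@company.com +1 916 662-4727  Ann Tylor Atylor@company.com (987) 654-3210' | to_csv
```

Treating a non-letter chunk as a second phone number is a guess based on the Steve Harvey line in the sample; if such a chunk can be something else in your data, that branch needs adjusting.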
RavinderSingh13
0

Just for your interest, here is how you could do it with Perl:

perl -e '
while (<>) {
    s/  /\n/g;
    s/ /,/g;
    s/(\+1,)?\(?(\d{3})\)?[-,]?(\d{3})[-,]?(\d{4})/($2) $3-$4/g;
    print;
}' file
tshiono