I have a huge text file in the format shown below. It is made up of sections, each of which has a field called department, and I want to count how many times each department value occurs.
The result should be a CSV file like the one in the Expected output section.
I would prefer a solution that uses sed, head/tail, or awk. The file is really large (more than 50,000 lines), so an efficient method would be much appreciated.
Input format:
# Person1 Perosn2, AADDC Users, dummydata.somecompany.com
dn: CN=Person1 Perosn2,OU=AADDC Users,DC=dummydata,DC=somecompany,DC=com
objectClass: top
department: 234ABC
name: Person1 Perosn2
objectGUID:: MbCDVZpKbEWRxDUA5iN5IA==
userPrincipalName: abcdef@dummydata.somecompany.com
objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=dummydata,DC=somecompany
,DC=com
dSCorePropagationData: 16010101000000.0Z
lastLogonTimestamp: 132173602593105876
preferredLanguage: en-US
msDS-AzureADMailNickname: abcdef
# Person1 Perosn2, AADDC Users, dummydata.somecompany.com
dn: CN=Person1 Perosn2,OU=AADDC Users,DC=dummydata,DC=somecompany,DC=com
objectClass: top
department: 234ABC
name: Person1 Perosn2
objectGUID:: MbCDVZpKbEWRxDUA5iN5IA==
userPrincipalName: abcdef@dummydata.somecompany.com
objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=dummydata,DC=somecompany
,DC=com
dSCorePropagationData: 16010101000000.0Z
lastLogonTimestamp: 132173602593105876
preferredLanguage: en-US
msDS-AzureADMailNickname: abcdef
# Person3 Perosn4, AADDC Users, dummydata.somecompany.com
dn: CN=Person1 Perosn2,OU=AADDC Users,DC=dummydata,DC=somecompany,DC=com
objectClass: top
department: XYZ012
name: Person1 Perosn2
objectGUID:: MbCDVZpKbEWRxDUA5iN5IA==
userPrincipalName: abcdef@dummydata.somecompany.com
objectCategory: CN=Person,CN=Schema,CN=Configuration,DC=dummydata,DC=somecompany
,DC=com
dSCorePropagationData: 16010101000000.0Z
lastLogonTimestamp: 132173602593105876
preferredLanguage: en-US
msDS-AzureADMailNickname: abcdef
Expected output:
234ABC,2
XYZ012,1
What I tried:
I used this command to extract the department lines from the file:
grep '^department: *' file.txt
However, I am not sure whether the expected output can be produced with a single command such as sed, grep, or awk.
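For reference, here is a minimal sketch of the kind of counting described above, done in a single awk pass. It rests on several assumptions: every department value is plain text on one line (not base64-encoded with `department::`), no value contains a comma, and the output order does not matter beyond being sorted. The function name `count_departments` and the file names `sample.ldif` and `departments.csv` are hypothetical and only used for illustration.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative, not from the original post).
# Reads the file once, so it should cope with large inputs.
count_departments() {
    awk '/^department:/ {
             sub(/^department:[ \t]*/, "")   # drop the field name and leading blanks
             sub(/[ \t\r]+$/, "")            # drop trailing blanks or a Windows CR
             count[$0]++                     # tally this department value
         }
         END {
             # Iteration order of "for (d in count)" is unspecified, hence the sort.
             for (d in count) print d "," count[d]
         }' "$1" | sort
}

# Small made-up sample in the same shape as the input above.
cat > sample.ldif <<'EOF'
dn: CN=Person1
department: 234ABC
name: Person1
dn: CN=Person2
department: 234ABC
name: Person2
dn: CN=Person3
department: XYZ012
name: Person3
EOF

count_departments sample.ldif > departments.csv
cat departments.csv
```

On the real data the call would presumably be `count_departments file.txt > departments.csv`. A pipeline built on the grep command above, followed by sort and `uniq -c`, might also work, but it would need an extra step to turn the counts into CSV.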