7

I have a CSV file in which some column name have white spaces in it and some column names are without the white space between characters. I want to remove the white spaces from all the header names that has white space in it. Please help. Thank you!

Attaching screenshot for reference. Example: 'First Name' I want 'FirstName'

I am using ReplaceText processor in which under Search value I have passes \s to search just the header row white spaces and replacement value as Empty string. Also my evaluation mode is 'Line-by-Line'. so now the ouput file is showing as FirstName,LastNameshraddha,srivastavsanstuti,srivastav So it's showing everything in one line. I want white spaces only to be removed from header row and do not touch/merge the data values at all.Thanks

Please tell me how to remove them. Thank you!

enter image description here

Shrads
  • 883
  • 19
  • 39

3 Answers3

3

Try with below ReplaceText configs:

Search Value

(.*)

Replacement Value

${'$1':replace(" ","")} //we are applying NiFi expression language replace function on the captured group.

Character Set

UTF-8

Maximum Buffer Size

1 MB

Replacement Strategy

Regex Replace

Evaluation Mode

Entire text //works with Line-By-Line mode also

enter image description here

Refer to this link for more details regards to NiFi expression language.

Input flowfile:

First Name,Last Name
shraddha,srivastav
sanstuti,srivastav

Output flowfile:

FirstName,LastName
shraddha,srivastav
sanstuti,srivastav
notNull
  • 30,258
  • 4
  • 35
  • 50
3

@Shu is on the right direction, but the problem is that all spaces in the text will be replaced. In order to replace spaces ONLY in header row, in Shu's solution change:

  1. Search Value:

(?s)(^[^\n]*)(.*$)

  1. Replacement Value:

${'$1':replace(" ","")}$2

notNull
  • 30,258
  • 4
  • 35
  • 50
EdbE
  • 204
  • 1
  • 4
1

You should be able to do the following match:

/[\s]/g

It matches all white Space. Then simply replace with an empty string. Edit:

Try this instead:

/(?<!\n.+)[ ]/g

It will match a Space only if there's no Newline before it. In other Words, only the first line will match. Again, replace with an empty string.

Poul Bak
  • 10,450
  • 5
  • 32
  • 57
  • Thank you for replying. I tried your Regex in search value and set the replacement value as empty string but it didnt ermove the white space in the output file. Any reason why? – Shrads Sep 20 '18 at 16:39
  • May be a stupid question: Do you save the file, after you have replaced? – Poul Bak Sep 20 '18 at 17:06
  • So my flow is GetFile->ReplaceText->PutFile. So after replacement the file is there in the putfile directory. Can you tell me what I am doing wrong? My replaceText has Evaluation Mode as 'Line-by-Line'. Please help! – Shrads Sep 20 '18 at 17:12
  • I tried passing \s in search value and replacement value as Empty string but now the ouput file is showing as FirstName,LastNameshraddha,srivastavsanstuti,srivastav So it's showing everything in one line. Please help – Shrads Sep 20 '18 at 17:26