
What I have tried so far is as follows:

[Image: NiFi flow for the record count]

Kindly describe in detail the ways in which a .csv file can be read. From what I have learned so far, you need to provide a schema name for the file and then define the schema in the form of .avro or text. Is it necessary to provide a schema?

Thanks in advance.

Lamanus
Izhar Ali
  • Do you just want the record count or do you want to do something else with the contents of the csv? – Fudgy Sep 19 '19 at 12:41

3 Answers


Use GetFile -> CalculateRecordStats, with a CSV Reader as the record reader. The record.count attribute is then available on the flowfile without any further settings.
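To make clear what a record count means here, the following is a minimal Python sketch, not NiFi code. It assumes the first line of the file is a header that supplies the field names, which is roughly what a record reader does when no explicit schema is given. The function name count_csv_records is hypothetical.

```python
import csv
import io


def count_csv_records(text: str) -> int:
    """Count data records in CSV text, treating the first row as the header.

    Assumption: field names come from the header line, so no separate
    schema is needed just to count records.
    """
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines that appear inside quoted fields.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return sum(1 for _ in reader)


if __name__ == "__main__":
    sample = "id,name\n1,alice\n2,bob\n"
    print(count_csv_records(sample))
```

Note that a quoted field containing a newline still counts as one record, which is where a record-aware count differs from a plain line count.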

Lamanus

Use the CalculateRecordStats processor to read your CSV, either with a schema you provide or by letting NiFi take the schema from the header.

  • The CalculateRecordStats processor adds the record.count attribute to the flowfile.

  • You can also add a user-defined property; NiFi then adds the counts for that property to the flowfile as well.

(or)

Use the QueryRecord processor and add a new property containing a SQL query:

select count(*) cnt from FLOWFILE

  • Define the record reader/writer with an Avro schema to get the count of records in the flowfile.

  • Then use an ExtractText processor to capture the record count and keep it as a flowfile attribute.

  • Use the extracted attribute value in your email.
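The capture step above can be sketched in Python as a way to try out a regular expression before putting it into ExtractText. This is only an illustration: it assumes the record writer emits a header line (cnt) followed by the count on its own line, which depends on how the writer is configured. The pattern and the function name extract_count are hypothetical, and NiFi uses Java regular expressions, so the pattern should be re-checked there.

```python
import re
from typing import Optional

# Hypothetical pattern: the first line that consists only of digits.
COUNT_PATTERN = re.compile(r"^(\d+)\s*$", re.MULTILINE)


def extract_count(content: str) -> Optional[int]:
    """Return the first all-digit line as an int, or None if there is none."""
    match = COUNT_PATTERN.search(content)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Assumed shape of the query output: header line, then the count.
    print(extract_count("cnt\n3\n"))
```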

notNull

Apart from the great suggestions above, there is the CountText processor, which is simple enough if your CSV is properly formatted and terminated with newlines. This processor counts the number of lines present in the incoming text. It provides attributes such as:

Name                        Description
text.line.count             The number of lines of text present in the FlowFile content
text.line.nonempty.count    The number of lines of text (with at least one non-whitespace character) present in the original FlowFile
text.word.count             The number of words present in the original FlowFile
text.character.count        The number of characters (given the specified character encoding) present in the original FlowFile

You can easily use these attributes in a PutEmail processor, or even update the filename with the count using Expression Language. For example, use UpdateAttribute to set filename to FooBar_${text.line.count}.csv
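One thing to keep in mind with a line-based count is that the header line is counted too. The Python sketch below only approximates the two line attributes from the table (it is not how CountText is implemented) and shows how a data-row count and a filename could be derived from them. The function names and the has_header flag are hypothetical.

```python
def text_counts(content: str) -> dict:
    """Approximate text.line.count and text.line.nonempty.count."""
    lines = content.splitlines()
    return {
        "text.line.count": len(lines),
        # A line is non-empty if it has at least one non-whitespace character.
        "text.line.nonempty.count": sum(1 for line in lines if line.strip()),
    }


def data_row_count(content: str, has_header: bool = True) -> int:
    """Non-empty lines minus the header line (assumes no multi-line fields)."""
    nonempty = text_counts(content)["text.line.nonempty.count"]
    return max(nonempty - 1, 0) if has_header else nonempty


def filename_with_count(content: str) -> str:
    """Build a name in the spirit of FooBar_${text.line.count}.csv."""
    return f"FooBar_{text_counts(content)['text.line.count']}.csv"


if __name__ == "__main__":
    sample = "id,name\n1,alice\n\n2,bob\n"
    print(text_counts(sample))
    print(data_row_count(sample))
    print(filename_with_count(sample))
```

If the CSV can contain quoted fields with embedded newlines, a line count will overstate the number of records, so a record-based processor is the safer choice in that case.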

Pushkr