
Problem:

Dumping the filtered output prints a warning about accessing a non-existent field:

Attempt to access field which was not found in the input

Steps:

  1. Loaded a tab-delimited file into relation a:

    a = LOAD '/user/a6000518-a/AdobeHourlySampleHit/hit_data.tsv' USING PigStorage('\t'); 
    

This file contains 952 columns.

  2. I want to list the values in the 374th column (index `$373`, zero-based), so I filtered out null values and projected that column:

    b = FILTER a BY $373 is not null;
    c = FOREACH b GENERATE $373;
    DUMP c;
    

Dumping the results produces the expected output but also prints this warning message:

2015-08-20 16:50:53,179 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigHadoopLogger - org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject(ACCESSING_NON_EXISTENT_FIELD): Attempt to access field which was not found in the input
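
One way to investigate this, sketched under the assumption (not confirmed here) that the warning comes from rows with fewer than 374 tab-separated fields, is to count such rows. The relation names `raw`, `counts`, `short_rows`, `grp` and `num_short` are made up for illustration, and the negative limit passed to `STRSPLIT` is assumed to keep trailing empty fields:

    -- Load each line as a single chararray so the raw field count can be inspected
    raw = LOAD '/user/a6000518-a/AdobeHourlySampleHit/hit_data.tsv' USING TextLoader() AS (line:chararray);
    -- Split on tabs and record how many fields each line has
    counts = FOREACH raw GENERATE SIZE(STRSPLIT(line, '\t', -1)) AS n;
    -- Keep only lines that are too short to contain a 374th column
    short_rows = FILTER counts BY n < 374;
    grp = GROUP short_rows ALL;
    num_short = FOREACH grp GENERATE COUNT(short_rows);
    DUMP num_short;

If no line is short, `GROUP ... ALL` produces no group and the dump should print nothing; a non-zero count would suggest that some rows do not have all 952 columns.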

Could you please let me know where I could've made a mistake?

Thanks!

sideshowbarker