
Hello, how should I modify my code to read dataset2 properly?

%%writefile read_rdd.py
import argparse

def read_RDD(argv):
  parser = argparse.ArgumentParser() # get a parser object
  parser.add_argument('--test_set', metavar='test_set', type=ParallelMapDataset)
  args = parser.parse_args(argv) # read the value
  args.test_set.take(3)
  for i in args.test_set:
    print(i)

And to execute it:

test_set = dataset2     #dataset2 cannot be inserted
!gcloud dataproc jobs submit pyspark --cluster $CLUSTER --region $REGION \
    ./read_rdd.py \
    --  --test_set $test_set 

Additional information:

type(dataset2) = tensorflow.python.data.ops.dataset_ops

I tried changing type=ParallelMapDataset to type=argparse.FileType('r'), but that didn't work either.

Currently I cannot submit the job at all. Instead I get:

/bin/bash: -c: line 0: syntax error near unexpected token `('
/bin/bash: -c: line 0: `gcloud dataproc jobs submit pyspark --cluster bigdatapart2-cluster --region us-central1 ./read_rdd.py -- --test_set '

1 Answer


Please note that the arguments you pass via gcloud dataproc jobs submit pyspark are translated into a standard command line. Try wrapping the argument in quotes:

test_set = dataset2     #dataset2 cannot be inserted
!gcloud dataproc jobs submit pyspark --cluster $CLUSTER --region $REGION \
    ./read_rdd.py \
    --  --test_set "$test_set"
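
To illustrate why the quotes matter, and what the script actually receives, here is a minimal local sketch. It assumes the notebook substitutes the text form of dataset2 into the command; dataset_text is a hypothetical stand-in for that text, and the real value may look different. No Spark or gcloud is involved, only the standard library.

```python
import argparse
import shlex

# Hypothetical stand-in for the text produced by str(dataset2); the real text may differ.
dataset_text = "<ParallelMapDataset shapes: (None,), types: tf.float32>"

# shlex.quote wraps the text so a POSIX shell treats it as one literal argument.
command = "./read_rdd.py --test_set " + shlex.quote(dataset_text)

# shlex.split mimics how the shell would hand the arguments to the script.
script_argv = shlex.split(command)[1:]

parser = argparse.ArgumentParser()
parser.add_argument("--test_set", type=str)
args = parser.parse_args(script_argv)

# The script only ever sees text, never the dataset object itself.
print(type(args.test_set).__name__)
print(args.test_set == dataset_text)
```

So quoting gets the job submitted, but the value that arrives in read_rdd.py is a plain string describing the dataset, not the dataset.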
David Rabinowitz
  • Thanks, that works well, but when I try to turn the FileType into an RDD using sc = pyspark.SparkContext.getOrCreate() and test_set_rdd = sc.parallelize(args.test_set), I get TypeError: 'FileType' object is not iterable. And when I call argv.test_set.batch(s).take(n), I get AttributeError: 'list' object has no attribute 'test_set'. How can I make test_set a readable RDD, then? – Tomasz Kaczmarski May 13 '20 at 22:15
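
One possible direction for the follow-up, as a rough sketch rather than a confirmed fix: pass a path as a plain string and let the job load the data itself. This assumes dataset2 has first been exported as text files to a location the cluster can read (for example a Cloud Storage path); the helper names parse_args and read_rdd and the path in the usage note are hypothetical. It also shows that the parsed values live on args (the Namespace returned by parse_args), while argv is only the raw list of strings, which would explain the AttributeError.

```python
import argparse


def parse_args(argv):
    """Parse the job arguments; --test_set is a path string, not a dataset object."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--test_set", type=str, required=True)
    return parser.parse_args(argv)


def read_rdd(argv, n=3):
    """Load the file(s) at --test_set as an RDD of lines and return the first n."""
    args = parse_args(argv)  # args is the Namespace; argv is just the raw list
    import pyspark  # imported here so the parser can be tried without Spark installed

    sc = pyspark.SparkContext.getOrCreate()
    rdd = sc.textFile(args.test_set)  # one RDD element per line of text
    return rdd.take(n)


# In the submitted script, something like: read_rdd(sys.argv[1:])
```

With this shape, the submit command would pass something like --test_set "gs://my-bucket/test_set.txt" (a made-up path), and only that string has to survive the command line.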