My issue seems to be that the CLI can't see the models I've created in this project under the AutoML Tables product. Any help with this is greatly appreciated.

I'm trying to use the CLI because I can't submit 7000+ CSVs through the web interface one at a time (the web interface is limited to one input file). I've tried importing the CSVs into a BigQuery table, but the imports fail after about 3M rows. The CSVs contain about 7.3B rows in total.

I'd love to get all of them imported into BigQuery and cut that 7.3B figure down to the rows I can reasonably expect to produce non-zero results, but I can't get all of my CSVs into BigQuery either.

Anyway, for the CLI:

I've read through the gcloud documentation here: https://cloud.google.com/ai-platform/prediction/docs/batch-predict but it seems to cover "AI Platform", not the AutoML Tables product. When I try the CLI instructions listed on this page:

gcloud ai-platform jobs submit prediction myJob --model risk_vs_reward_v2 --input-paths "gs://portfolio_ml/test sets/v2 tests/*.csv" --output-path "gs://portfolio_ml/test set results/v2 results" --region us-central1 --data-format text

I get:

Job [myJob] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe myJob

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs myJob
ERROR: (gcloud.ai-platform.jobs.submit.prediction) Project [portfolio-ml] not found: Field: prediction_input.model_name Error: The model resource: "risk_vs_reward_v2" was not found. Please create the model resource first by using 'gcloud ai-platform models create risk_vs_reward_v2'.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: "The model resource: \"risk_vs_reward_v2\" was not found. Please\
      \ create the model resource first by using 'gcloud ai-platform models create\
      \ risk_vs_reward_v2'."
    field: prediction_input.model_name

And when I check whether gcloud can see any of the three models I now have in AutoML Tables:

gcloud ai-platform models list

It comes back with 0 models:

WARNING: Using endpoint [https://ml.googleapis.com/]
Listed 0 items.

I have made sure the correct project is selected with:

gcloud config set project portfolio-ml
geeklimit

1 Answer

I ended up using AutoIt to cycle through my CSVs in the Cloud Storage bucket and pass each one to a Cloud CLI command on my local machine. I'm lucky that my input CSVs are numbered, but in case anyone else needs to do something similar:

(Note: a CSV with 1M rows takes about 20s to import into BigQuery from Cloud Storage.)

#include <MsgBoxConstants.au3>

; Press ESC to stop the script in case something goes wrong and you can't click the AutoIt window
HotKeySet("{ESC}", "Terminate")
Func Terminate()
   Exit
EndFunc

local $fileStart = 1
local $fileEnd = 1000
local $loadCommand
local $i = $fileStart

local $startTime = TimerInit()
local $avgProcTime = 0
local $remainingTime = 0

while $i <= $fileEnd

   ; see the 'bq load' documentation to customize this for your BigQuery table
   $loadCommand = 'cmd.exe /c bq load --autodetect --noreplace v2.test_criteria "gs://project_ml/test sets/v2 tests/test sets/testset-' & $i & '.csv"'
   RunWait($loadCommand,"",@SW_HIDE)

   ; This section is optional but helpful. Without it, monitor load progress on the BigQuery page, or change the @SW_HIDE flag on the RunWait call above so the cmd window and bq's status text are visible
   $avgProcTime = TimerDiff($startTime)/($i-$fileStart+1)
   $remainingTime = ($fileEnd-$i)*$avgProcTime
   ConsoleWrite('Loaded file ' & $i & ' of ' & $fileEnd & '. ' & Round((TimerDiff($startTime)/1000)/60,1) & 'm elapsed, ' & Round($remainingTime/1000/60,1) & 'm remaining, ' & Round($avgProcTime/1000,1) & 's per file average.' & @LF)

   ;this is not optional ;)
   $i = $i + 1

WEnd
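
For anyone not on Windows, roughly the same loop can be sketched as a shell script. This is only a sketch under assumptions: it reuses the table name, bucket path, and bq load flags from the AutoIt script above, it expects the bq tool to be installed and authenticated, and the names DRY_RUN, FILE_START, FILE_END, csv_uri, load_one and load_all are made up for illustration. It defaults to a dry run that only prints the commands, so nothing is loaded until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: load numbered CSVs from Cloud Storage into BigQuery one at a time.
# Dry run by default (commands are printed); set DRY_RUN=0 to really call bq.

FILE_START="${FILE_START:-1}"
FILE_END="${FILE_END:-1000}"
DRY_RUN="${DRY_RUN:-1}"

# Table and path prefix are taken from the AutoIt script; adjust for your project.
TABLE="v2.test_criteria"
PREFIX="gs://project_ml/test sets/v2 tests/test sets/testset-"

# Build the Cloud Storage URI for file number $1.
csv_uri() {
  printf '%s%s.csv' "$PREFIX" "$1"
}

# Load one file, or print the command that would run when DRY_RUN=1.
load_one() {
  uri="$(csv_uri "$1")"
  if [ "$DRY_RUN" = "1" ]; then
    printf 'bq load --autodetect --noreplace %s "%s"\n' "$TABLE" "$uri"
  else
    bq load --autodetect --noreplace "$TABLE" "$uri"
  fi
}

# Walk FILE_START..FILE_END, stopping at the first failed load.
# Progress goes to stderr so stdout holds only the (dry-run) commands.
load_all() {
  i="$FILE_START"
  start="$(date +%s)"
  while [ "$i" -le "$FILE_END" ]; do
    load_one "$i" || { echo "Load failed on file $i" >&2; return 1; }
    echo "Processed file $i of $FILE_END, $(( $(date +%s) - start ))s elapsed" >&2
    i=$((i + 1))
  done
}

load_all
```

Run it once as-is to check the generated commands, then again with DRY_RUN=0 (and FILE_START set to wherever a previous run stopped) to do the actual loads. Unlike the AutoIt version, this stops at the first failed load instead of moving on to the next file.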
geeklimit