0

For self learning purpose,I am trying to create a end to end dataflow in Google cloud:

1.Creating a mysql table using cloud sql 2.Using dataproc to create a temporary cluster to run the sqoop job using template. 3.Put the extracted data in BQ from storage bucket.

I am getting stuck in accessing the mysql table through sqoop.

ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: Access denied for user 'test'@'localhost' (using password: YES)
java.sql.SQLException: Access denied for user 'test'@'localhost' (using password: YES).

I tried resolving this issue by: 1.Replacing the localhost with public ip. 2.Executing

GRANT ALL ON everlytics.* TO test@'<ip>' IDENTIFIED BY '1234';  
flush privileges;

My code snippet is below:

gcloud  dataproc workflow-templates set-managed-cluster $template_name --zone "asia-south1-a" \
 --cluster-name=$cluster_name \
 --region "asia-south1" \
 --scopes=default,sql-admin \
 --initialization-actions=gs://dataproc-initialization-actions/cloud-sql-proxy/cloud-sql-proxy.sh \
 --properties=hive:hive.metastore.warehouse.dir=$bucket/hive-warehouse \
 --metadata=enable-cloud-sql-hive-metastore=false \
 --metadata=additional-cloud-sql-instances=$instance_name=tcp:3306 \
 --master-machine-type n1-standard-1 \
 --master-boot-disk-size 20 \
  --num-workers 2 \
--worker-machine-type n1-standard-2 \
--worker-boot-disk-size 20 \
--image-version 1.2 &&
gcloud  dataproc workflow-templates add-job hadoop \
--step-id=customers_564456778 \
--region="asia-south1" \
--workflow-template=$template_name \
--class=org.apache.sqoop.Sqoop \
--jars=$bucket/sqoop-1.4.7-hadoop260.jar,$bucket/avro-tools-1.8.2.jar,$bucket/mysql-connector-java-5.1.48.jar  \
-- import -Dmapreduce.job.user.classpath.first=true \
--driver com.mysql.jdbc.Driver \
--username=test \
--password=1234 \
--query "select * from everlytics.customers where customerNumber>0 and \$CONDITIONS" \
--target-dir $bucket/$table_name \
--split-by customerNumber -m 2 

Mysql version-5.6

Please advise if I am doing it correctly.

Durga
  • 127
  • 11

1 Answers1

0

It might be configuration issues...

Verify by running below command in Cloud SQL 1)gcloud sql connect [cloud-sql-instnace-name] --user=root

And ensure below too 2)Makesure you added your machine ip address in connection list in Cloud SQL Instance.

  • Hi Varun,Thanks for your answer.But it didn't workout,I am still getting the issue.I am using Cloud-SQL and the cluster is created on a temporary basis.Once the job completes,the cluster is deleted and there is no concept to add ip in connections – Durga Mar 06 '20 at 11:11
  • I'm assuming you able to connect through Cloud shell and you are facing issues from Dataproc only. Could you confirm --scopes 'https://www.googleapis.com/auth/cloud-platform' parameter while you create cluster (If you use browser window : Under Project access --> Allow API access to all Google Cloud services in the same project.) – varunkumar inbaraj Mar 16 '20 at 17:08