Is there an equivalent of the R/Python put_file() methods for taking an object from a Scala notebook in DSX and saving it as a data asset for the project? If so, is there any documentation? I am looking for something like what is outlined in this article:
https://datascience.ibm.com/blog/working-with-object-storage-in-data-science-experience-python-edition/
I have already written the CSV file I want within the notebook; I just need to save it to the project.

1 Answer

Try the following steps and code snippets:

Step 1: Generate the credentials first. For any file you have already uploaded from your browser, open the Files tab of the 'Find and Add Data' pane in DSX and click 'Insert to code' -> 'Insert SparkSession DataFrame'. This generates code like the following:

def setHadoopConfig2db1c1ff193345c28eaffb250b92d92b(name: String) = {

    val prefix = "fs.swift.service." + name
    sc.hadoopConfiguration.set(prefix + ".auth.url", "https://identity.open.softlayer.com" + "/v3/auth/tokens")
    sc.hadoopConfiguration.set(prefix + ".auth.endpoint.prefix","endpoints")
    sc.hadoopConfiguration.set(prefix + ".tenant", "<tenant id>")
    sc.hadoopConfiguration.set(prefix + ".username", "<userid>")
    sc.hadoopConfiguration.set(prefix + ".password", "<password>")
    sc.hadoopConfiguration.setInt(prefix + ".http.port", 8080)
    sc.hadoopConfiguration.set(prefix + ".region", "dallas")
    sc.hadoopConfiguration.setBoolean(prefix + ".public", false)
}

val name = "keystone"
setHadoopConfig2db1c1ff193345c28eaffb250b92d92b(name)
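
As a rough sketch, the same settings can also be built as a plain Map, which makes it easy to check the keys before applying them. The helper name swiftSettings and its parameters are hypothetical and not part of the code DSX generates. Hadoop's Configuration stores every value as a string, so the port and the public flag are written as strings here.

```scala
// Hypothetical helper (not generated by DSX): returns the Swift settings
// for one service name as key/value pairs instead of applying them directly.
def swiftSettings(name: String, tenant: String, user: String, password: String): Map[String, String] = {
  val prefix = "fs.swift.service." + name
  Map(
    prefix + ".auth.url" -> "https://identity.open.softlayer.com/v3/auth/tokens",
    prefix + ".auth.endpoint.prefix" -> "endpoints",
    prefix + ".tenant" -> tenant,
    prefix + ".username" -> user,
    prefix + ".password" -> password,
    prefix + ".http.port" -> "8080",
    prefix + ".region" -> "dallas",
    prefix + ".public" -> "false"
  )
}

// Placeholders stand in for the real credentials, as in the generated code.
val settings = swiftSettings("keystone", "<tenant id>", "<userid>", "<password>")

// In the notebook, where `sc` is the predefined SparkContext, apply them with:
// settings.foreach { case (key, value) => sc.hadoopConfiguration.set(key, value) }
```
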

val data_frame1 = spark.read.option("header","true").csv("swift://<your DSX project name>.keystone/<your file name>.csv")
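
The path passed to csv() has the form swift://<container>.<service name>/<object name>, where the service name is the value used in the Hadoop configuration ("keystone" here). A minimal sketch of a helper that builds such paths follows; the function swiftUrl and the container name "MyProject" are made up for illustration.

```scala
// Hypothetical helper: builds swift://<container>.<service name>/<object name>.
def swiftUrl(container: String, serviceName: String, objectName: String): String =
  s"swift://${container}.${serviceName}/${objectName}"

// "MyProject" is an assumed container name; use your own DSX project's container.
val inputUrl = swiftUrl("MyProject", "keystone", "input.csv")
val outputUrl = swiftUrl("MyProject", "keystone", "output.csv")
```

Using one helper for both the read path and the write path keeps the container and service name identical in both places.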

Step 2: Create data_frame2 from data_frame1, for example by applying some transformation.

Step 3: Use the same container and project name when saving the data in data_frame2 to a file in Object Storage:

data_frame2.write.option("header","true").csv("swift://<same DSX project name as before>.keystone/<name of the output file>.csv")

Note that the credentials generated in step 1 can be used to save any dataframe in the current notebook, even if you never read data from a file.

Davis Broda