Good day. I'm having trouble finding a way to load 28 million records into a BigQuery table. I can do it with INSERT commands, but that is not efficient: it takes around a week to finish. Does anyone know of a better way to load this through VB.NET? Thanks a lot.

Wairhard

2 Answers

In general, processing a lot of data at once is called "bulk" or "batch" processing.

In your case, it's called batch loading data.

There are several ways to do it, so you have to check which one makes the most sense for you. Maybe there is an API you can call directly from VB, or maybe you can write the data to a file and load it by hand, or automate that process.

nvoigt
  • Thanks a lot. Yeah, I've found some, but there's no example code for it anywhere; that's why I came here, hoping someone has done it before and could point me in the right direction. – Wairhard Jul 06 '23 at 12:59

I downloaded Google.Cloud.BigQuery.V2 from NuGet and used the code below (you need .NET 6 or above for it to work). The files must be in .csv format: I download the data from Oracle and split it into several CSV files, which I iterate over to load them into BigQuery. To authenticate, you'll need to download GoogleCloudSDKInstaller.exe and run the command "gcloud auth application-default login" from its shell; this generates the files from which your credentials are picked up automatically. You can read more about it here: https://cloud.google.com/docs/authentication/provide-credentials-adc?hl=es-419#how-to

    ' Requires the Google.Cloud.BigQuery.V2 NuGet package and, at the top of the file:
    ' Imports Google.Cloud.BigQuery.V2

    ' Folder containing the CSV files to load
    Dim Drctrs As New System.IO.DirectoryInfo(Application.StartupPath & "\Results")
    Dim projectId As String = "edw-sandbox"
    Dim datasetId As String = "AD_HOC"
    Dim tableId As String = "TABLE"
    Dim client As BigQueryClient = BigQueryClient.Create(projectId)
    Dim uploadCsvOptions As UploadCsvOptions = New UploadCsvOptions()
    Dim fleinfo() As System.IO.FileInfo = Drctrs.GetFiles

    For Each fle In fleinfo
        ' Using closes the file stream once the load job has finished
        Using stream As System.IO.FileStream = System.IO.File.Open(fle.FullName, IO.FileMode.Open)
            ' The schema argument is Nothing, so the destination table is expected to exist already
            Dim job As BigQueryJob = client.UploadCsv(projectId, datasetId, tableId, Nothing, stream, uploadCsvOptions)
            job.PollUntilCompleted().ThrowOnAnyError()
        End Using
        ' Show the table's row count after each file
        Dim TABLE As BigQueryTable = client.GetTable(datasetId, tableId)
        MsgBox(TABLE.Resource.NumRows)
    Next
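
For the "split it into several CSV files" step, here is a minimal sketch of one way to do it using only System.IO. It is not the exact code I used: the module name `CsvSplitter`, the `SplitCsv` helper, the `part_0001.csv` naming, and the paths and chunk size in `Main` are all made up for illustration. It assumes one record per line (no quoted fields containing line breaks) and no header row.

```vbnet
Imports System.IO

Module CsvSplitter
    ' Hypothetical helper: splits sourcePath into files of at most
    ' linesPerFile lines each, written to outputDir as
    ' part_0001.csv, part_0002.csv, ...
    ' Assumes one record per line and no header row.
    Sub SplitCsv(sourcePath As String, outputDir As String, linesPerFile As Integer)
        Directory.CreateDirectory(outputDir)
        Dim partNumber As Integer = 0
        Dim linesInPart As Integer = 0
        Dim writer As StreamWriter = Nothing
        Try
            ' File.ReadLines reads lazily, so the whole export is never held in memory
            For Each line As String In File.ReadLines(sourcePath)
                ' Start a new part file on the first line and whenever the current one is full
                If writer Is Nothing OrElse linesInPart >= linesPerFile Then
                    If writer IsNot Nothing Then writer.Dispose()
                    partNumber += 1
                    linesInPart = 0
                    writer = New StreamWriter(Path.Combine(outputDir, $"part_{partNumber:D4}.csv"))
                End If
                writer.WriteLine(line)
                linesInPart += 1
            Next
        Finally
            ' Close the last part file even if reading fails midway
            If writer IsNot Nothing Then writer.Dispose()
        End Try
    End Sub

    Sub Main()
        ' Illustrative values only: adjust the paths and chunk size to your export
        SplitCsv("export.csv", "Results", 1000000)
    End Sub
End Module
```

The resulting files in the output folder can then be picked up by the upload loop above. If your export has a header row or quoted multi-line fields, this line-based split would need to be adapted.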
Wairhard