
I am currently importing data from Azure Data Lake Gen2 using pandas in Azure Databricks, which works fine. After processing the data, I want to export the pandas DataFrame back to the Azure Data Lake Gen2 account. The export itself runs without errors, but when I try to open the file from Azure storage, it is reported as corrupted.

I tried to install xlsxwriter to save the file, but the Azure Databricks library setup does not support the pip install xlsxwriter command.

The code below shows how I import data from the Azure Data Lake storage account and how I am trying to export data from Azure Databricks back to that account.

import pandas as pd

getpath = r'/dbfs/mnt/native/internal/sales/wholesalesellthru/123/current/123.xlsx'

getweekdata = pd.read_excel(getpath, sheet_name=0, skiprows=2)

getdata = pd.read_excel(getpath, sheet_name=0, skiprows=3)

getdata = getdata.iloc[:, [0,1,2,3,4]]

Here I am trying to save the pandas DataFrame "getdata" to the Azure Data Lake Gen2 account.

outputpath = r'/dbfs/mnt/native/internal/sales/wholesalesellthru/123/current/output.xlsx'

getdata.to_excel(outputpath, header=True)

The file ends up corrupted in the Azure Data Lake Gen2 account.
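One workaround that may be worth trying, assuming the corruption comes from the /dbfs mount not handling the random writes that Excel writers perform (the post does not confirm this), is to write the workbook to local disk first and then copy it to the mounted path in a single sequential pass. The sketch below is only an illustration: save_via_local_copy and write_func are hypothetical names, not part of pandas or Databricks.

```python
import os
import shutil
import tempfile
import zipfile


def save_via_local_copy(write_func, dest_path):
    """Write a workbook to a local temp file, then copy it to dest_path.

    write_func is a hypothetical callback that receives a local file path
    and writes the workbook there, for example:
        lambda p: getdata.to_excel(p, header=True)
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, os.path.basename(dest_path))
        write_func(local_path)

        # An .xlsx file is a ZIP archive, so this is a cheap sanity check
        # that the local copy is intact before it is uploaded.
        if not zipfile.is_zipfile(local_path):
            raise ValueError("local file is not a valid .xlsx (ZIP) archive")

        # Sequential copy to the destination; no seeking on the mount.
        shutil.copyfile(local_path, dest_path)
    return dest_path
```

With the names from the question, the call would look like save_via_local_copy(lambda p: getdata.to_excel(p, header=True), outputpath). If the copied file still opens as corrupted, the assumption about random writes is probably not the cause.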
