I have a script that converts part of an nc file to a csv file. The script itself works, but I have to specify the exact path of each input file and output csv. I would like to run the script on all nc files in folder test1 and write the results to folder test2 as csv files with the same names. I tried modifying the script, but it hasn't worked. Here is my script.

import netCDF4
from netCDF4 import num2date, date2num, date2index
import pandas as pd
import numpy as np
import sys
import os

path = r"C:\\Users\\chz08006\\Documents\\test1"

for filename in os.listdir(path):
    netcdf_file = r"C:\\Users\\chz08006\\Documents\\test1\\"+filename
    csv_file = r"C:\\Users\\chz08006\\Documents\\test2\\"+filename

    f = netCDF4.Dataset(netcdf_file)
    ssha = f.variables["ssha"]
    lon = f.variables['lon']
    lat = f.variables['lat']
    #time = f.variables['time']
    timedim = ssha.dimensions[0]
    times = f.variables[timedim]
    dates = num2date(times[:], times.units)

    dates = [date.strftime('%Y-%m-%d %H:%M:%S') for date in dates]
    lon_list= list(lon)
    lat_list = list(lat)
    ssha_list = list(ssha)
    lon_list = [x-360 if x>= 180 else x for x in lon_list]
    df = pd.DataFrame({'Time':dates,'Longitude':lon_list,'Latitude':lat_list,'SSHA':ssha_list})
    df.to_csv(csv_file)

My failed attempt at modifying the script was

path = r"C:\\Users\\chz08006\\Documents\\test1"

for filename in os.listdir(path):
    netcdf_file = r"C:\\Users\\chz08006\\Documents\\test1\\"+filename
    csv_file = r"C:\\Users\\chz08006\\Documents\\test2\\"+filename

Previously, it would have been

netcdf_file = r"C:\\Users\\chz08006\\Documents\\test1\\example1.nc"
csv_file = r"C:\\Users\\chz08006\\Documents\\test2\\exampleresult.csv"

where example1 was the nc file name and exampleresult would be the csv name.

martineau
Bob

1 Answer

You can use the glob module to get a list of the files with the .nc extension.

import glob

for netcdf_file in glob.glob(r'C:\Users\chz08006\Documents\test1\*.nc'):
    print(netcdf_file)

You can use os.path.split to split the file path into a parent directory path and file name.

import glob
import os

for netcdf_file in glob.glob(r'C:\Users\chz08006\Documents\test1\*.nc'):
    directory, ncfilename = os.path.split(netcdf_file)
    print(directory)        # C:\Users\chz08006\Documents\test1
    print(ncfilename)       # filename.nc

You can use os.path.splitext to split the file name into its base name and extension.

for netcdf_file in glob.glob(r'C:\Users\chz08006\Documents\test1\*.nc'):
    directory, ncfilename = os.path.split(netcdf_file)
    print(directory)        # C:\Users\chz08006\Documents\test1
    print(ncfilename)       # filename.nc

    name, ext = os.path.splitext(ncfilename)
    print(name)             # filename
    print(ext)              # .nc
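
One detail worth checking is that os.path.splitext keeps the leading dot in the extension and splits only at the last dot. The sketch below uses ntpath, the Windows flavour of os.path from the standard library, so the backslash paths behave the same on any operating system. The file names example1.nc and ssha.2019.07.nc are made up for illustration.

```python
import ntpath  # Windows implementation of os.path, importable on every platform

# Hypothetical input path, modelled on the paths in the question.
path = r"C:\Users\chz08006\Documents\test1\example1.nc"

directory, ncfilename = ntpath.split(path)
name, ext = ntpath.splitext(ncfilename)

print(directory)   # C:\Users\chz08006\Documents\test1
print(ncfilename)  # example1.nc
print(name)        # example1
print(ext)         # .nc

# Only the last dot separates the extension from the rest of the name.
print(ntpath.splitext("ssha.2019.07.nc"))  # ('ssha.2019.07', '.nc')
```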

Now you can build the CSV file name and then use os.path.join to build the CSV file path.

import glob
import os

csvparent = r"C:\Users\chz08006\Documents\test2"

for netcdf_file in glob.glob(r'C:\Users\chz08006\Documents\test1\*.nc'):
    directory, ncfilename = os.path.split(netcdf_file)
    print(directory)        # C:\Users\chz08006\Documents\test1
    print(ncfilename)       # filename.nc

    name, ext = os.path.splitext(ncfilename)
    print(name)             # filename
    print(ext)              # .nc

    csvname = name + ".csv"
    csvpath = os.path.join(csvparent, csvname)
    print(csvpath)          # C:\Users\chz08006\Documents\test2\filename.csv

Now the variable csvpath contains what you need: the path to a CSV file inside the test2 directory with the same base name as the .nc file but a .csv extension.
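
Putting the pieces together, a minimal sketch of the whole loop could look like the following. The helper names build_csv_path and convert_all are hypothetical. The convert argument stands for a function you would write yourself to wrap the netCDF-to-DataFrame code from the question; it is an assumption here, not part of the original script.

```python
import glob
import os


def build_csv_path(netcdf_file, csv_dir):
    """Return the path in csv_dir with the same base name as netcdf_file but a .csv extension."""
    ncfilename = os.path.basename(netcdf_file)
    name, _ext = os.path.splitext(ncfilename)
    return os.path.join(csv_dir, name + ".csv")


def convert_all(nc_dir, csv_dir, convert):
    """Call convert(netcdf_file, csv_file) for every .nc file in nc_dir.

    convert is a user-supplied function (hypothetical here) that reads
    one netCDF file and writes one CSV file.
    """
    os.makedirs(csv_dir, exist_ok=True)  # create the output folder if it is missing
    written = []
    for netcdf_file in sorted(glob.glob(os.path.join(nc_dir, "*.nc"))):
        csv_file = build_csv_path(netcdf_file, csv_dir)
        convert(netcdf_file, csv_file)
        written.append(csv_file)
    return written
```

Here convert would contain the body of the loop from the question, from netCDF4.Dataset(netcdf_file) through df.to_csv(csv_file), and the call would be convert_all(r"C:\Users\chz08006\Documents\test1", r"C:\Users\chz08006\Documents\test2", convert).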

I hope this is helpful.

Takashi
  • Would suggest to use `pathlib.Path` instead, it's even nicer for splitting files from their directories and extensions. If you have `p = Path('qwerty/asdf.txt')` then `p.parent` is the directory the file is in and `p.stem` is the filename without the extension. – alkasm Jul 25 '19 at 01:57
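
The pathlib approach suggested in the comment could be sketched as follows. The function names csv_path_for and iter_conversions are hypothetical; Path.stem, Path.parent and Path.glob are standard-library features.

```python
from pathlib import Path


def csv_path_for(netcdf_file, csv_dir):
    """Map <any folder>/name.nc to csv_dir/name.csv."""
    return Path(csv_dir) / (Path(netcdf_file).stem + ".csv")


def iter_conversions(nc_dir, csv_dir):
    """Yield (netcdf_file, csv_file) pairs for every .nc file directly inside nc_dir."""
    for netcdf_file in sorted(Path(nc_dir).glob("*.nc")):
        yield netcdf_file, csv_path_for(netcdf_file, csv_dir)
```

If a library expects plain strings rather than Path objects, wrapping the paths in str(...) before passing them on is a safe choice.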