How to find mean CTD profile from multiple CTD data files (row-wise average of same variable in multiple data files) in Python

Question

I have had a difficult time trying to write this question. I have multiple CTD data files (files that contain ocean temperature values with depth). I have plotted them on one figure to see how temperature changes with depth. What I would like to do now is plot a mean profile (just one line) of the average temperature across all the files against depth. In other words, a row-wise average of each variable across the multiple data files.

My data is in cnv format which is just a column of temperature values and another column of depth values. Each data set does not have the same number of depth and temperature values (i.e. not the same number of rows).

This is what my code looks like just for plotting each file, and I have attached the figure it produces:

import glob

import numpy as np
import matplotlib.pyplot as plt
from seabird.cnv import fCNV

filenames = sorted(glob.glob('dSBE19plus*.cnv')) #load multiple files
filenames = filenames[0:8]

fig1 = plt.figure()
ax1 = fig1.add_subplot(111)
for f in filenames:
    print(f)

    data = fCNV(f)
    # Assign variable names
    depth = data['prdM']
    temp  = data['tv290C']
    salt  = data['PSAL']
    fluo  = data['flECO-AFL']
    turbidity = data['turbWETntu0']

    # Plot this cast's temperature profile
    ax1.plot(temp, depth)

# Format the axes once, after all casts are plotted
ax1.set_xlabel('Temperature (C)')
ax1.xaxis.set_label_position('top') # this moves the label to the top
ax1.xaxis.set_ticks_position('top') # this moves the ticks to the top
ax1.set_ylabel('Depth (m)')
ax1.set_ylim([100, 0]) # reversed limits so depth increases downward
ax1.set_xlim([15, 26])

fig1.savefig('ctd_plot.png')

Figure of each CTD data set plotted

I hope my question makes sense.

Many thanks

Marko Lipka · Answer 1 · 2018-08-20T21:05:40.857

You could combine the multiple CTD data files, bin them according to depth (or pressure, "prdM" in your case), and average each parameter grouped by the bins.

I don't know how to do this in Python but here is an R function for the binning of CTD data:

library("tidyverse")

binCTD.mean <- function(.data, # data.frame to be binned
                        .binvar, # variable name to bin over (usually depth or pressure)
                        .breaks, # breaks as in cut() (either single value giving the number of equally distributed bins or a vector of cut points)
                        .binwidth = NA # alternatively to .breaks the binwidth can be set (overwrites .breaks)
) {
    # calculate .breaks from .binwidth if provided:
    if (!is.na(.binwidth)) { 
        .breaks <- seq(0, ## starting from the water surface makes sense?!
                       ceiling(max(.data[, .binvar])), # to highest depth (rounded up)
                       by = .binwidth) # in intervals of given binwidth
    }

    # new parameter "bins", cut according to given breaks (or binwidth)
    .data$bins <- cut(x = .data[, .binvar], breaks = .breaks)

    # return new data frame with averaged parameters, grouped by bins
    .data %>% # I LOVE the pipe operator %>% !! 
        group_by(bins) %>% # dplyr function to group data
        summarise_all(mean, na.rm = TRUE) # dplyr function to apply functions on all parameters in the data frame (actually a tibble, which is kind of the better data.frame)
}

You can find the R code for the CTD binning on GitHub as well, together with an explanatory example. To read SBE CTD .cnv files you can use this R function, also on GitHub (tested with SBE CTD .cnv files collected on different German research vessels).
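Since the question asks for Python, here is a rough pandas sketch of the same binning idea. It is not a tested port of the R function: `bin_ctd_mean` and the column names `depth` and `temp` are made up for illustration, and the two small casts are invented numbers standing in for real files.

```python
import numpy as np
import pandas as pd


def bin_ctd_mean(data, binvar, binwidth=1.0):
    """Average every column of `data` within bins of `binvar`.

    `data` is a DataFrame holding the rows of all casts stacked together.
    """
    # Bin edges from the surface (0) down to the deepest value, rounded up
    upper = np.ceil(data[binvar].max())
    breaks = np.arange(0.0, upper + binwidth, binwidth)

    # Label each row with its bin; the Series is renamed so that the
    # grouping key does not clash with the `binvar` column itself.
    # include_lowest=True keeps a value of exactly 0 in the first bin.
    bins = pd.cut(data[binvar], bins=breaks, include_lowest=True).rename("bin")

    # Mean of every column per bin; empty bins are dropped
    return data.groupby(bins, observed=True).mean()


# Two hypothetical casts with different numbers of rows
cast_a = pd.DataFrame({"depth": [0.5, 1.5, 2.5],
                       "temp": [20.0, 18.0, 16.0]})
cast_b = pd.DataFrame({"depth": [0.4, 1.2, 1.8, 2.6],
                       "temp": [22.0, 20.0, 18.0, 14.0]})

# Stack all casts, then average within 1 m depth bins
combined = pd.concat([cast_a, cast_b], ignore_index=True)
mean_profile = bin_ctd_mean(combined, "depth", binwidth=1.0)
print(mean_profile)
```

With the loop from the question, each file could presumably be turned into such a frame from `data['prdM']` and `data['tv290C']` (assuming those convert cleanly to plain arrays), collected in a list, and passed to `pd.concat`. The mean line would then be something like `ax1.plot(mean_profile['temp'], mean_profile['depth'])`. Because the rows are grouped by depth bin rather than by row number, it does not matter that the files have different lengths.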

Cheers, Marko

Please use links only as a supplementary resource in your answer. Links can go dead, or the content on the other side can be changed to no longer answer the question. You can edit your answer to include the information you linked to, and still use the links as a citation. — mypetlion, Aug 20 '18 at 20:48
@mypetlion Thanks for pointing that out. I have now integrated the relevant function into the answer and only referred to the source and another helper function in the links. — Marko Lipka, Aug 20 '18 at 21:09
