
I have a process that generates a value, and I want to forward this value into a value output channel. However, I can't seem to get it working in one go: I always have to write a file to the output and then define a new channel from the first:

process calculate {

    input:
    file div from json_ch.collect()
    path "metadata.csv" from meta_ch

    output:
    file "dir/file.txt" into inter_ch

    script:
    """
    echo ${div} > alljsons.txt
    mkdir dir
    python3 $baseDir/scripts/calculate.py alljsons.txt metadata.csv dir/
    """
}

ch = inter_ch.map{file(it).text}

ch.view()

how do I fix this?

thanks!

best, t.

tristan

1 Answer


If your script performs a non-trivial calculation, writing the result to a file like you've done is absolutely fine; there's nothing really wrong with this approach. However, since the 'inter_ch' channel already emits files (or paths), you could simply use:

ch = inter_ch.map { it.text }

It's not entirely clear what the objective is here. If the desire is to reduce the number of channels created, consider switching to the new DSL 2 instead. This won't let you avoid writing your calculated result to a file, but it might let you avoid an intermediary channel.
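As a rough illustration of the DSL 2 route, here is a minimal sketch of the process from the question. The parameters params.jsons and params.metadata are hypothetical stand-ins for however json_ch and meta_ch are actually created; the result is still written to dir/file.txt, but no named intermediary channel is declared:

```groovy
nextflow.enable.dsl = 2

process calculate {

    input:
    path jsons              // all JSON files, staged together
    path "metadata.csv"     // staged under this fixed name

    output:
    path "dir/file.txt"

    script:
    """
    echo ${jsons} > alljsons.txt
    mkdir dir
    python3 $baseDir/scripts/calculate.py alljsons.txt metadata.csv dir/
    """
}

workflow {
    // params.jsons and params.metadata are assumptions, not from the question
    json_ch = Channel.fromPath(params.jsons)
    meta_ch = Channel.fromPath(params.metadata)

    calculate(json_ch.collect(), meta_ch)

    // read the file content straight off the process output
    calculate.out.map { it.text }.view()
}
```

Here the process output is accessed directly via calculate.out, so the only channel manipulation left is the map that reads the file's text.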

On the other hand, if your Python script actually does something rather trivial and can be refactored away, it might be possible to assign a (global) variable below the script: keyword, so that it can be referenced in your output declaration, like the line x = ... in the example below:

Valid output values are value literals, input value identifiers, variables accessible in the process scope and value expressions. For example:

process foo {

  input:
  file fasta from 'dummy'

  output:
  val x into var_channel
  val 'BB11' into str_channel
  val "${fasta.baseName}.out" into exp_channel

  script:
  x = fasta.name
  """
  cat $x > file
  """
}

Other than that, your options are limited. You might have considered using the env output qualifier, but this just adds some syntactic sugar to your shell script at runtime, such that an output file is still created:

Contents of test.nf:

process test {

    output:
    env myval into out_ch

    script:
    '''
    myval=$(calc.py)
    '''
}

out_ch.view()

Contents of bin/calc.py (chmod +x):

#!/usr/bin/env python
print('foobarbaz')

Run with:

$ nextflow run test.nf 
N E X T F L O W  ~  version 21.04.3
Launching `test.nf` [magical_bassi] - revision: ba61633d9d
executor >  local (1)
[bf/48815a] process > test [100%] 1 of 1 ✔
foobarbaz

$ cat work/bf/48815aeefecdac110ef464928f0471/.command.sh 
#!/bin/bash -ue
myval=$(calc.py)

# capture process environment
set +u
echo myval=$myval > .command.env
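Applied to the process from the question, the env qualifier might look something like the sketch below. It assumes that calculate.py writes a single-line value to dir/file.txt (the variable name result is made up for illustration), and it keeps the DSL 1 syntax used in the question:

```groovy
process calculate {

    input:
    file div from json_ch.collect()
    path "metadata.csv" from meta_ch

    output:
    env result into ch      // emits the value of the shell variable 'result'

    script:
    """
    echo ${div} > alljsons.txt
    mkdir dir
    python3 $baseDir/scripts/calculate.py alljsons.txt metadata.csv dir/
    # assumption: dir/file.txt holds a single-line value
    result=\$(cat dir/file.txt)
    """
}

ch.view()
```

Note the escaped \$ so that the command substitution is evaluated by Bash rather than interpolated by Nextflow. As shown above, a .command.env file is still written behind the scenes.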
Steve
    Thank you very much for this elaborate answer! My pipeline works - but the way I did it felt very non-best practice -> hence the question. Thank you - I'll play around with env output qualifier! – tristan Dec 08 '21 at 14:27