Questions tagged [pipeline]

A pipeline is a sequence of functions (or the equivalent thereof), composed so that the output of one is the input of the next, in order to create a compound transformation. Famously, a shell pipeline looks like "command | command2 | command3" (but use the tag "pipe" for this). The term is also used in computer architecture for a sequence of serial stages that execute in parallel over the elements being fed into a pipe, in order to increase the overall throughput.

In a command-line interface or shell, a pipeline uses the pipe operator ("|") to take the output of one function or command and feed it as input to another. This is done in a series like "command1 | function1 | command2". For questions related to the pipe operator, use the "pipe" tag.
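The chaining idea described above, where each step's output becomes the next step's input, can be sketched in Python as plain function composition. This is a minimal illustration rather than any particular library's API: the `pipeline` helper and the two stage functions are hypothetical names chosen for the example.

```python
from functools import reduce
from typing import Any, Callable


def pipeline(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose stages left to right: each stage receives the previous output."""
    def run(value: Any) -> Any:
        # Thread the value through every stage in order.
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


# Hypothetical stages, used only for illustration.
def strip_blank(lines):
    """Drop lines that are empty or contain only whitespace."""
    return [line for line in lines if line.strip()]


def to_upper(lines):
    """Upper-case every remaining line."""
    return [line.upper() for line in lines]


# Roughly analogous in spirit to "strip_blank | to_upper" in a shell.
process = pipeline(strip_blank, to_upper)
result = process(["a", "", "b"])
```

Calling `pipeline()` with no stages returns a function that hands its input back unchanged, which keeps the helper well defined for an empty chain.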

In computer architecture, a pipeline is a process consisting of a sequence of stages that must be performed in serial order on each element passing through the pipe, but that may execute in parallel on the different elements inside it, such that the overall throughput does not depend on the length of the pipe. Most CPUs use this technique in hardware to process instructions.
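The throughput claim can be checked with a little arithmetic. The sketch below assumes an idealized model that is not stated in the text above: every stage takes exactly one cycle and there are no stalls or hazards. The function names are made up for the example.

```python
def sequential_cycles(stages: int, elements: int) -> int:
    """Cycles needed when each element finishes all stages before the next starts."""
    return stages * elements


def pipelined_cycles(stages: int, elements: int) -> int:
    """Cycles needed when stages overlap (assumed: one cycle per stage, no stalls)."""
    if elements == 0:
        return 0
    # The first element needs `stages` cycles to drain through the pipe;
    # every later element completes one cycle after the previous one.
    return stages + elements - 1


def throughput(stages: int, elements: int) -> float:
    """Average elements completed per cycle in the pipelined case."""
    return elements / pipelined_cycles(stages, elements)


# 5 stages, 100 elements: 500 cycles unpipelined versus 104 pipelined.
unpipelined = sequential_cycles(5, 100)
overlapped = pipelined_cycles(5, 100)
```

Under these assumptions the depth of the pipe only adds a fixed start-up cost of `stages - 1` cycles, so for a long stream of elements the throughput approaches one element per cycle regardless of the number of stages.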

A similar technique is also applied in software (software pipelining) to increase the parallelism of a given loop by reordering it so that its data dependencies are arranged in a pipelined manner.

More broadly, "pipeline" is synonymous with "workflow."

5444 questions
20
votes
4 answers

How to perform string manipulation while declaring env vars in GitHub Actions

I have a GitHub repository like the following: johndoe/hello-world. I am trying to set the following environment variables in GitHub Actions: env: DOCKER_HUB_USERID: ${{ github.actor }} REPOSITORY_NAME: ${GITHUB_REPOSITORY#*\/} …
20
votes
3 answers

Efficient XSLT pipeline in Java (or redirecting Results to Sources)

I have a series of XSL 2.0 stylesheets that feed into each other, i.e. the output of stylesheet A feeds B feeds C. What is the most efficient way of doing this? The question rephrased is: how can one efficiently route the output of one…
Chris Scott
  • 1,721
  • 14
  • 27
19
votes
5 answers

Required context class hudson.FilePath is missing Perhaps you forgot to surround the code with a step that provides this, such as: node

When I load another Groovy file in a Jenkinsfile, it shows me the following error: "Required context class hudson.FilePath is missing Perhaps you forgot to surround the code with a step that provides this, such as: node" I made a Groovy file which contains…
manish soni
  • 515
  • 1
  • 7
  • 19
19
votes
4 answers

AttributeError when using ColumnTransformer into a pipeline

This is my first machine learning project and the first time that I use ColumnTransformer. My aim is to perform two steps of data preprocessing, and use ColumnTransformer for each of them. In the first step, I want to replace the missing values in…
Giulia
  • 205
  • 1
  • 2
  • 5
19
votes
3 answers

Pipeline: Multiple classifiers?

I read the following example on Pipelines and GridSearchCV in Python: http://www.davidsbatista.net/blog/2017/04/01/document_classification/ Logistic Regression: pipeline = Pipeline([ ('tfidf', TfidfVectorizer(stop_words=stop_words)), ('clf',…
Christopher
  • 2,120
  • 7
  • 31
  • 58
19
votes
1 answer

how to compare two fields in a document in pipeline aggregation (mongoDB)

I have a document like below : { "user_id": NumberLong(1), "updated_at": ISODate("2016-11-17T09:35:56.200Z"), "created_at": ISODate("2016-11-17T09:35:07.981Z"), "banners": { "normal_x970h90":…
19
votes
1 answer

sklearn pipeline - Applying sample weights after applying a polynomial feature transformation in a pipeline

I want to apply sample weights and at the same time use a pipeline from sklearn which should make a feature transformation, e.g. polynomial, and then apply a regressor, e.g. ExtraTrees. I am using the following packages in the two examples…
stefanE
  • 193
  • 2
  • 7
19
votes
2 answers

how to use GNU Time with pipeline

I want to measure the running time of some SQL query in PostgreSQL. Using the Bash built-in time, I could do the following: $ time (echo "SELECT * FROM sometable" | psql) I like GNU time, which provides more formats. However I don't know how to do it…
stderr
  • 1,038
  • 1
  • 11
  • 18
19
votes
3 answers

Why does 2>&1 need to come before a | (pipe) but after a "> myfile" (redirect to file)?

When combining stderr with stdout, why does 2>&1 need to come before a | (pipe) but after a > myfile (redirect to file)? To redirect stderr to stdout for file output: echo > myfile 2>&1 To redirect stderr to stdout for a pipe: echo 2>&1 |…
Rob Bednark
  • 25,981
  • 23
  • 80
  • 125
18
votes
5 answers

frameworks for representing data processing as a pipeline

Most data processing can be envisioned as a pipeline of components, the output of one feeding into the input of another. A typical processing pipeline is: reader | handler | writer As a foil for starting this discussion, let's consider an…
ErikR
  • 51,541
  • 9
  • 73
  • 124
18
votes
2 answers

What's the difference between -> and |> in reasonml?

A period of intense googling provided me with some examples where people use both types of operators in one code, but generally they look just like two ways of doing one thing, they even have the same name
Crysknight
  • 291
  • 2
  • 8
18
votes
2 answers

Get last element of pipeline in powershell

This might be weird, but stay with me. I want to get only the last element of a piped result and assign it to a variable. I know how I would do this in "regular" code of course, but this must be a one-liner. More specifically, I'm interested…
dozacinc
  • 193
  • 1
  • 2
  • 5
18
votes
2 answers

Scrapy pipeline to export csv file in the right format

I made the improvement according to the suggestion from alexce below. What I need is like the picture below. However each row/line should be one review: with date, rating, review text and link. I need to let item processor process each review of…
W.S.
  • 647
  • 1
  • 6
  • 19
18
votes
8 answers

Is it possible to terminate or stop a PowerShell pipeline from within a filter

I have written a simple PowerShell filter that pushes the current object down the pipeline if its date is between the specified begin and end date. The objects coming down the pipeline are always in ascending date order so as soon as the date…
Dan Finucane
  • 1,547
  • 2
  • 18
  • 27
18
votes
2 answers

How to set the redis timeout waiting for the response with pipeline in redis-py?

In the code below, is the pipeline timeout 2 seconds? client = redis.StrictRedis(host=host, port=port, db=0, socket_timeout=2) pipe = client.pipeline(transaction=False) for name in namelist: key = "%s-%s-%s-%s" % (key_sub1, key_sub2, name,…
hupantingxue
  • 2,134
  • 3
  • 19
  • 24