Questions tagged [pipeline]

A pipeline is a sequence of functions (or their equivalent), composed so that the output of one becomes the input of the next, creating a compound transformation. Famously, a shell pipeline looks like "command1 | command2 | command3" (but use the tag "pipe" for this). The term is also used in computer architecture to describe a sequence of serial stages that execute in parallel over the elements fed into the pipe, in order to increase overall throughput.

In a command line interface or shell, a pipeline uses the pipe operator ("|") to pass the output of one function or command as input to another, in a series such as "command1 | function1 | command2". For questions about the pipe operator itself, use the "pipe" tag.
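
Outside a shell, the same chaining idea can be expressed as plain function composition. The following is a minimal sketch rather than a reference implementation; the helper name `pipeline` and the example stages are hypothetical and chosen only for illustration.

```python
from functools import reduce


def pipeline(*stages):
    """Return a function that feeds its input through each stage in order."""
    def run(value):
        # The output of one stage becomes the input of the next.
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


# Hypothetical three-stage pipeline: trim, upper-case, add a suffix.
shout = pipeline(str.strip, str.upper, lambda s: s + "!")
print(shout("  hello "))  # HELLO!
```

With no stages, `pipeline()` returns its input unchanged, which mirrors an empty chain of commands.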

In computer architecture, a pipeline is a process consisting of a sequence of stages that must be performed in serial order on each element passing through the pipe, but that may execute in parallel across the elements inside it, so that the overall throughput does not depend on the length of the pipe. Most CPUs use this technique in hardware to process instructions.
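
The throughput claim can be checked with a small back-of-the-envelope model. This sketch assumes an idealized pipeline in which every stage takes exactly one cycle and there are no stalls or hazards; the function names are hypothetical.

```python
def sequential_cycles(n_items, n_stages):
    """Cycles needed when each item finishes all stages before the next starts."""
    return n_items * n_stages


def pipelined_cycles(n_items, n_stages):
    """Cycles needed when stages overlap: fill the pipe once, then one item per cycle."""
    if n_items == 0:
        return 0
    return n_stages + n_items - 1


def throughput(n_items, n_stages):
    """Items completed per cycle in the idealized pipelined case."""
    return n_items / pipelined_cycles(n_items, n_stages)


# 4 items through 3 stages: 12 cycles sequentially, 6 cycles pipelined.
print(sequential_cycles(4, 3), pipelined_cycles(4, 3))
```

Under these assumptions the pipe length only adds a one-time fill cost, so for long input streams the throughput approaches one item per cycle regardless of the number of stages.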

A similar technique is applied in software (software pipelining) to increase the parallelism of a loop by reordering it so that its data dependencies are arranged in a pipelined manner.
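
The reordering can be illustrated with a toy loop. This is only a structural sketch: `load` and `compute` are hypothetical stand-ins for a memory access and an arithmetic step, and in Python the rewrite brings no speedup. In practice a compiler applies this transformation so that hardware can overlap the two operations.

```python
def load(x):
    return x        # stand-in for a memory load


def compute(v):
    return v * v    # stand-in for an arithmetic step


def plain_loop(xs):
    out = []
    for x in xs:
        v = load(x)               # compute must wait for this load
        out.append(compute(v))
    return out


def pipelined_loop(xs):
    out = []
    if not xs:
        return out
    v = load(xs[0])               # prologue: load for the first iteration
    for i in range(1, len(xs)):   # kernel: the two steps are now independent
        nxt = load(xs[i])         # load for iteration i
        out.append(compute(v))    # compute for iteration i - 1
        v = nxt
    out.append(compute(v))        # epilogue: finish the last iteration
    return out
```

Both versions produce the same results; the difference is that inside the kernel the load and the compute belong to different iterations, so neither depends on the other.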

More broadly, "pipeline" is synonymous with "workflow."

5444 questions
13
votes
2 answers

how to tune parameters of custom kernel function with pipeline in scikit-learn

Currently I have successfully defined a custom kernel function (pre-computing the kernel matrix) using a def function, and now I am using the GridSearchCV function to get the best parameters. So, in the custom kernel function, there is a total of 2…
ZAWD
  • 651
  • 7
  • 31
13
votes
2 answers

How to do Onehotencoding in Sklearn Pipeline

I am trying to one-hot encode the categorical variables of my Pandas dataframe, which includes both categorical and continuous variables. I realise this can be done easily with the pandas .get_dummies() function, but I need to use a pipeline so I can…
Desiré De Waele
  • 152
  • 1
  • 1
  • 10
13
votes
2 answers

Scrapy pipeline spider_opened and spider_closed not being called

I am having some trouble with a Scrapy pipeline. My information is being scraped from sites OK and the process_item method is being called correctly. However, the spider_opened and spider_closed methods are not being called. class…
Jim Jeffries
  • 9,841
  • 15
  • 62
  • 103
13
votes
3 answers

Unix tr command to convert lower case to upper AND upper to lower case

So I was searching around and using the command tr you can convert from lower case to upper case and vice versa. But is there a way to do this both at once? So: $ tr '[:upper:]' '[:lower:]' or $ tr A-Z a-z Will turn "Hello World ABC" to "hello…
truffle
  • 455
  • 1
  • 9
  • 17
13
votes
1 answer

Script only seems to process the last object from the pipeline

I have a script that I'm trying to add pipeline functionality to. I'm seeing strange behavior, though, where the script seems to be run only against the final object in the pipeline. For example param( [parameter(ValueFromPipeline=$true)] …
ASTX813
  • 313
  • 3
  • 13
13
votes
4 answers

Designing an extensible pipeline with Python

Context: I'm currently using Python to code a data-reduction pipeline for a large astronomical imaging system. The main pipeline class passes experimental data through a number of discrete processing 'stages'. The stages are written in separate…
user943354
12
votes
3 answers

Publish a pipeline Azure Devops code coverage report

I am trying to publish a detailed report online in my Azure DevOps Pipeline, but all I get is a link to download the coverage file (which can no longer be read with the Community edition since Visual Studio 2019). This is my…
12
votes
4 answers

Gitlab Shell Script Permission Denied

I am running a simple shell script with GitLab CI/CD and I am getting Permission denied. Kindly suggest. When I do chmod +x test.sh it says operation not permitted. stages: - build build: stage: build script: - ls - ./test.sh Shell…
Mani
  • 721
  • 3
  • 10
  • 24
12
votes
1 answer

Step references task InvokeRESTAPI at version 1.152.1 which is not valid for the given job target

I have an Azure pipeline which needs to send a REST request to an endpoint. I am trying to use the built-in task InvokeRESTAPI@1 to do this, but it errors when running on Azure DevOps. Script: --- trigger: batch: true branches: include: …
Timmo
  • 2,266
  • 4
  • 34
  • 54
12
votes
4 answers

Kubernetes can analytical jobs be chained together in a workflow?

Reading the Kubernetes "Run to Completion" documentation, it says that jobs can be run in parallel, but is it possible to chain together a series of jobs that should be run in sequential order (parallel and/or…
Kermit
  • 4,922
  • 4
  • 42
  • 74
12
votes
1 answer

How to combine features with different dimensions output using scikit-learn

I am using scikit-learn with Pipeline and FeatureUnion to extract features from different inputs. Each sample (instance) in my dataset refers to documents with different lengths. My goal is to compute the top tfidf for each document independently,…
Abrial
  • 421
  • 1
  • 5
  • 20
12
votes
2 answers

Alternate different models in Pipeline for GridSearchCV

I want to build a Pipeline in sklearn and test different models using GridSearchCV. Just an example (please do not pay attention to which particular models are chosen): reg = LogisticRegression() proj1 = PCA(n_components=2) proj2 = MDS() proj3 =…
sooobus
  • 841
  • 1
  • 9
  • 22
12
votes
2 answers

Scikit-Learn: Avoiding Data Leakage During Cross-Validation

I've just been reading up on k-fold cross-validation and have realized that I'm inadvertently leaking data with my current preprocessing setup. Usually, I have a train and test dataset. I do a bunch of data imputation and one-hot encoding on my…
anon_swe
  • 8,791
  • 24
  • 85
  • 145
12
votes
1 answer

I see it, but I don't believe it. Legal names in R, piping operations, and the dot

In trying to understand the base R "Bizarro pipe" as described in the Win Vector blog, I confirmed that simple examples produce pipelike behavior in R with no packages installed. For example: > 2 ->.; exp(.) [1] 7.389056 I found that the dot is…
andrewH
  • 2,281
  • 2
  • 22
  • 32
12
votes
2 answers

How to optimize a sklearn pipeline, using XGboost, for a different `eval_metric`?

I'm trying to use XGBoost, and optimize the eval_metric as auc (as described here). This works fine when using the classifier directly, but fails when I'm trying to use it in a pipeline. What is the correct way to pass a .fit argument to the…
sapo_cosmico
  • 6,274
  • 12
  • 45
  • 58