Questions tagged [pipeline]

A pipeline is a sequence of functions (or their equivalent) composed so that the output of each becomes the input of the next, forming a compound transformation. A well-known example is the shell pipeline, "command1 | command2 | command3" (but use the tag "pipe" for this). In computer architecture, the term describes a sequence of stages that each element passes through in order, while different elements occupy different stages at the same time, which increases overall throughput.
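
As a minimal sketch of the function-composition sense described above, the following Python snippet chains stages so that each output feeds the next. The helper name `pipeline` and the three stage functions are hypothetical, chosen only for illustration; they do not come from any particular library.

```python
from functools import reduce
from typing import Any, Callable


def pipeline(*stages: Callable[[Any], Any]) -> Callable[[Any], Any]:
    """Compose stages left to right: the output of each stage feeds the next."""
    def run(value: Any) -> Any:
        return reduce(lambda acc, stage: stage(acc), stages, value)
    return run


# Hypothetical stages, used only to illustrate the idea.
def strip(text: str) -> str:
    return text.strip()


def split_words(text: str) -> list[str]:
    return text.split()


def count(items: list[str]) -> int:
    return len(items)


# Roughly analogous to a shell pipeline of three commands.
word_count = pipeline(strip, split_words, count)
print(word_count("  a sequence of functions  "))
```

With no stages, `pipeline()` returns a function that hands its input back unchanged, which is a convenient identity case when stages are assembled dynamically.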

In a command line interface or shell, a pipeline uses the pipe operator ("|") to pass the output of one function or command as input to the next, chained in a series such as "command1 | function1 | command2". For questions about the pipe operator itself, use the "pipe" tag.

In computer architecture, a pipeline is a sequence of stages that each element must pass through in serial order, while different elements can occupy different stages in parallel, so that the overall throughput does not depend on the length of the pipe. Most CPUs use this technique in hardware to process instructions.
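
The throughput claim above can be checked with some simple arithmetic. The sketch below assumes an idealized pipeline in which every stage takes exactly one cycle and there are no stalls or hazards; real hardware is more complicated. The function names are hypothetical.

```python
def unpipelined_cycles(n_items: int, n_stages: int) -> int:
    """Each item runs through all stages before the next item starts."""
    return n_items * n_stages


def pipelined_cycles(n_items: int, n_stages: int) -> int:
    """Idealized pipeline: the first item needs n_stages cycles to fill
    the pipe, then one further item completes on every following cycle."""
    if n_items == 0:
        return 0
    return n_stages + (n_items - 1)


n_items = 1000
for n_stages in (5, 20):
    cycles = pipelined_cycles(n_items, n_stages)
    # Throughput approaches one item per cycle regardless of pipe length.
    print(n_stages, unpipelined_cycles(n_items, n_stages), cycles, n_items / cycles)
```

Under these assumptions, the length of the pipe adds only a fixed fill latency, so for a long stream of items the throughput stays close to one item per cycle whether the pipe has 5 stages or 20.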

A similar technique is applied in software (software pipelining) to increase the parallelism of a loop by reordering its operations so that data dependencies are arranged in a pipelined manner.

More broadly, "pipeline" is synonymous with "workflow."

5444 questions
25
votes
9 answers

Inform right-hand side of pipeline of left-side failure?

I've grown fond of using a generator-like pattern between functions in my shell scripts. Something like this: parse_commands /da/cmd/file | process_commands However, the basic problem with this pattern is that if parse_command encounters an error,…
Bittrance
  • 2,202
  • 2
  • 20
  • 29
25
votes
3 answers

CI/CD pipeline with PostgreSQL failed with "Database is uninitialized and superuser password is not specified" error

I'm using Bitbucket pipeline with PostgreSQL for CI/CD. According to this documentation PostgreSQL service has been described in bitbucket-pipelines.yml this way: definitions: services: postgres: image: postgres:9.6-alpine It worked just…
neverwalkaloner
  • 46,181
  • 7
  • 92
  • 100
25
votes
1 answer

Notify all group members of failed pipelines in GitLab

The goal is to have everyone get a notification for every failed pipeline (at their discretion). Currently any of us can run a pipeline on this project branch, and the creator of the pipeline gets an email, no one else does. I have tried setting the…
JynXXedRabbitFoot
  • 996
  • 1
  • 7
  • 17
25
votes
3 answers

Put customized functions in Sklearn pipeline

In my classification scheme, there are several steps including: SMOTE (Synthetic Minority Over-sampling Technique) Fisher criteria for feature selection Standardization (Z-score normalisation) SVC (Support Vector Classifier) The main parameters to…
24
votes
3 answers

sklearn pipeline - how to apply different transformations on different columns

I am pretty new to pipelines in sklearn and I am running into this problem: I have a dataset that has a mixture of text and numbers i.e. certain columns have text only and rest have integers (or floating point numbers). I was wondering if it was…
Javiar Sandra
  • 827
  • 1
  • 10
  • 25
23
votes
3 answers

Haskell performance implementing unix's "cat" program with Data.ByteString

I have the following Haskell code, implementing a simple version of the "cat" unix command-line utility. Testing performance with "time" on a 400MB file, it's about 3x slower. (the exact script I am using to test it is below the code). My questions…
statusfailed
  • 738
  • 4
  • 15
22
votes
3 answers

Performance of x86 rep instructions on modern (pipelined/superscalar) processors

I've been writing in x86 assembly lately (for fun) and was wondering whether or not rep prefixed string instructions actually have a performance edge on modern processors or if they're just implemented for back compatibility. I can understand why…
RyanS
  • 253
  • 2
  • 9
22
votes
2 answers

IIS7 Integrated vs Classic Pipeline - which uses more ASP.NET threads?

With integrated pipeline, all requests are passed through ASP.NET, including images, CSS. Whereas, in classic pipeline, only requests for ASPX pages are by default passed through ASP.NET. Could integrated pipeline negatively affect thread…
frankadelic
  • 20,543
  • 37
  • 111
  • 164
21
votes
2 answers

How to use Github Release Version Number in Github Action

I have created a Github repo that has got an action to build the npm package and publish it to npmjs.com. The trigger for my action is the creation of a new release in Github. When creating the new release, Github is asking me for a version number.…
Woozar
  • 1,000
  • 2
  • 11
  • 35
21
votes
10 answers

Pipeline OrdinalEncoder ValueError Found unknown categories

Please take it easy on me. I’m switching careers into data science and don’t have a CS or programming background—so I could be doing something profoundly stupid. I've researched for a few hours without success. Objective: get Pipeline to run with…
Pablo Honey
  • 311
  • 1
  • 2
  • 5
21
votes
3 answers

How to pass a parameter to only one part of a pipeline object in scikit learn?

I need to pass a parameter, sample_weight, to my RandomForestClassifier like so: X = np.array([[2.0, 2.0, 1.0, 0.0, 1.0, 3.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 5.0, 3.0, 2.0,…
makansij
  • 9,303
  • 37
  • 105
  • 183
21
votes
7 answers

Scrapy, Python: Multiple Item Classes in one pipeline?

I have a Spider that scrapes data which cannot be saved in one item class. For illustration, I have one Profile Item, and each Profile Item might have an unknown number of Comments. That is why I want to implement Profile Item and Comment Item. I…
Nina
  • 211
  • 1
  • 2
  • 4
21
votes
3 answers

Is the following possible in PowerShell: "Select-Object ."?

The scenario: I'm using Select-Object to access properties of a piped object, and one of those properties is itself an object. Let's call it PropertyObject. I want to access a property of that PropertyObject, say Property1. Is there any nice and…
Simon Elms
  • 17,832
  • 21
  • 87
  • 103
21
votes
5 answers

Output binary data on PowerShell pipeline

I need to pipe some data to a program's stdin: First 4 bytes are a 32-bit unsigned int representing the length of the data. These 4 bytes are exactly the same as C would store an unsigned int in memory. I refer to this as binary data. Remaining…
johnnycrash
  • 5,184
  • 5
  • 34
  • 58
21
votes
3 answers

Writing items to a MySQL database in Scrapy

I am new to Scrapy, I had the spider code class Example_spider(BaseSpider): name = "example" allowed_domains = ["www.example.com"] def start_requests(self): yield self.make_requests_from_url("http://www.example.com/bookstore/new") …
Shiva Krishna Bavandla
  • 25,548
  • 75
  • 193
  • 313