Questions tagged [pipelining]

17 questions
54
votes
5 answers

What is the exact ingenuity of Unix pipe

I have heard the story of how Douglas McIlroy came up with the concept and how Ken Thompson implemented it in one night. As far as I understand, pipe is a system call which shares a piece of memory between two processes where one process writes and…
aoak
  • 661
  • 5
  • 7
5
votes
2 answers

What are the distinction and relation between batch processing and stream processing systems?

Designing Data-Intensive Applications says Batch processing systems (offline systems) Chapter 10 A batch processing system takes a large amount of input data, runs a job to process it, and produces some output data. Jobs often take a …
5
votes
1 answer

OO design in a data processing pipeline

I'm wondering how to design a fairly simple class whose properties are complex to compute. Also, the properties depend upon each other for computation. An example using graphs and graph processing (think nodes and edges, not charts or scatterplots)…
kdbanman
  • 1,447
  • 13
  • 19
2
votes
0 answers

Data pipeline with fallbacks and callbacks

I'm refactoring our current design for how we download static data. It's a mess of deep class hierarchy and callback hell and I want to convert it to a more elegant straightforward design. Here are two examples of the steps we take to download…
pek
  • 121
  • 4
1
vote
1 answer

Data pipeline design - robust and resilient to future variations

I need to build a data pipeline to populate a database from various files. This is a common scenario. However, I want to have expert opinions for implementing a pipeline that is robust, modular and resilient to future variations. There are several…
Imtiaz
  • 23
  • 5
1
vote
0 answers

Design patterns for versioning steps across data/workflow pipelines?

I'm sure this has been touched upon by a number of questions, but I'm struggling to draw the boundaries between code, data and configuration versions when working with a large DAG (think Airflow or Dagster). There are a few things that change…
MYK
  • 311
  • 1
  • 6
1
vote
1 answer

Should I use different Buildspec files for branch builds and deployment builds?

Very recently a discussion came up regarding the usage of different buildspec.yaml files, one for branch builds and the other for deployment builds and I was wondering, since after some research I wasn't able to find the best practices on this topic…
Pmsmm
  • 113
  • 3
1
vote
3 answers

What makes a data pipeline scalable? Best practices for scalable design?

I have been searching about this topic for a few days and have not yet found anything in books, courses or tutorials. What is a way to make data pipelines more scalable, that doesn't involve NoSQL or major investments like Hadoop clusters? Most of…
1
vote
2 answers

Remove all side-effects from business logic

I'm looking for feedback for a design pattern that aims to remove all side-effects from business logic. I'm using PHP but the pattern can be applied to any OOP language. The point is to enforce pure business logic from the framework, by not…
1
vote
0 answers

HTTP/REST and chained processing protocol/convention

Is there a protocol or a convention that supports REST (ok, maybe we should use HTTP here instead) processing chain and some neat features to help with that? Let me explain what I mean. Let's assume I have some public REST service available. Using…
1
vote
1 answer

Perfect video processing pipeline?

I'm working on a modular video processing pipeline. It's currently presented as a tree of modules. Each module has a set of "results" and can dynamically request data results from other modules. Each request and result are marked with frame ID. User…
Anton3
  • 127
  • 1
1
vote
1 answer

pipeline step with two outputs which will be used by different later steps

I am creating a Java package which offers an API based on a pipeline pattern. That is, I have a series of steps which can be plugged together in any combination provided their inputs match the output of the step before. Initial source and final sink…
0
votes
2 answers

Does Automated Pipeline Mean CI/CD Pipeline?

I am not sure if I have been using the term wrongly (and including it in my CV), so some inputs from the community will be appreciated. I am not a DevOps, but a noob machine learning engineer. So I basically developed certain pipelines to perform…
0
votes
3 answers

Is microservices architecture a good candidate for a pipeline?

I have a monolithic application which can be divided into multiple steps, and different steps have variable scaling requirements; however, the input of the next step is the output of the current step. Currently, we are not able to handle the no of…
0
votes
1 answer

How could I pipeline two sequential workflows where there is room for overlap in the processes?

I'll break this post into two parts, because I'm trying to abstract the concept, but will explain my implementation at the end. I have two workflows, Workflow A and Workflow B. Part of Workflow B relies on the results of Workflow A. They can be…
Sidney
  • 191
  • 5