
I have a reporting system which gets time-series data from numerous meters (referred to here as raw_data).

I need to generate several reports based on different combinations of the incoming raw_data, e.g.:

report1 = raw_data_1 + raw_data_34 - raw_data_15
report2 = ...

Also, there are several higher-order reports that depend on other reports, e.g.:

report67 = report3 + report5

In these reports I aggregate the data over all time units (hour, day, month, year). The reports currently run once a day.

Currently each report is processed one by one in a loop, which is not an efficient way of processing.

I am looking for a way to combine the operations for all the reports and run them over the whole raw_data set in a single Spark job.
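To show the kind of structure I have in mind: since some reports depend on other reports, the formulas form a dependency graph, and a topological ordering lets all of them be evaluated in one pass. The following is a minimal plain-Python sketch of that idea (the formula encoding, the sample values, and names like raw_data_1 are illustrative assumptions, not my actual data); in Spark the same ordering could be used to build all report columns as expressions on one DataFrame so a single job computes everything.

```python
# Sketch (assumed encoding): each report is a list of (sign, source)
# terms, where a source is either a raw_data series or another report.
from graphlib import TopologicalSorter

formulas = {
    "report1": [(+1, "raw_data_1"), (+1, "raw_data_34"), (-1, "raw_data_15")],
    "report2": [(+1, "raw_data_2")],
    "report67": [(+1, "report1"), (+1, "report2")],
}

# Hypothetical sample values standing in for one time bucket of meter data.
raw = {"raw_data_1": 10.0, "raw_data_34": 5.0, "raw_data_15": 3.0, "raw_data_2": 7.0}

def evaluate(formulas, raw):
    # A report depends on every other report its formula references.
    deps = {name: {src for _, src in terms if src in formulas}
            for name, terms in formulas.items()}
    values = dict(raw)
    # static_order() yields each report only after its dependencies,
    # so every term is already computed when we reach a report.
    for name in TopologicalSorter(deps).static_order():
        values[name] = sum(sign * values[src] for sign, src in formulas[name])
    return {name: values[name] for name in formulas}

print(evaluate(formulas, raw))
# report1 = 10 + 5 - 3 = 12, report2 = 7, report67 = 12 + 7 = 19
```

The point is that the loop over reports disappears as a scheduling concern: the graph decides the order once, and all formulas can then be applied to the same raw_data in a single job.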
