25

We have a lot of business logic in Excel files and we would like them to integrate in a web application (a Node.js web application). We don't want to rebuild the logic in a programming language. Instead, we would like to insert data into the Excel files we have, and read the calculation results from the same Excel file back.

Note: "We do not use Excel as data source. We want use Excel for the actual calculations."

What would be an appropriate setup for that? Is this possible?

(Brainstorming ideas: virtual machine with Windows and Office installed, OneDrive, SharePoint, etc.)

kaiser
  • 193
  • 1
  • 6
Martin Böschen
  • 380
  • 3
  • 8
  • Do you mean using these excel files as data sources? – Laiv Jan 24 '22 at 10:45
  • No, not as data source, I want them to use for the calculations. – Martin Böschen Jan 24 '22 at 11:28
  • 21
    You will need Windows [COM support for node.js](https://stackoverflow.com/questions/14356064/using-activex-com-components-with-nodejs-is-it-possible). However, you should read [this SO post](https://stackoverflow.com/questions/7215664/so-i-really-cant-use-excel-object-model-to-be-used-server-side), what you are trying has a high risk to become a pretty unstable solution which won't scale well. – Doc Brown Jan 24 '22 at 12:21
  • 40
    What is the *fundamental* reason that you don't want to move the calculation out of Excel? If the problem is the *work* involved in doing so, then be prepared for far more work getting the web interface to work properly and maintaining that approach over the long term. – Steve Jan 24 '22 at 13:41
  • 2
    Rebuilding the logic seems more reasonable. Even if you have hundred of excels. Besides the risk introduced by Hans, there are others like concurrency and version control, debugging or traceability. Some of these things are solved by databases out of the box. If the solution is harder and more expensive than the problem it solves, then it's not a good solution. – Laiv Jan 24 '22 at 14:19
  • 3
    Thank you very much for the engaging discussion so far. I have two more reasons why I would like to do this besides saving our invest in developing the excel logic. 1. I would like the business department to develop templates, so they can change the business logic in future without having to go through an IT change process. 2. Our calculations produce a bill, and I would like to hand over an excel with all the logic over to the customer, so in doubt he can follow the complete logic. Feel free to comment on these additional points. Thank you! – Martin Böschen Jan 24 '22 at 16:15
  • 12
    @MartinBöschen, why then are you thinking a web-based front end is desirable? It won't be possible for the departmental users to alter the Excel template without potentially breaking the web front end which interacts with it. And if you're still handing the Excel file out freely, then why have the complications of a separate non-Excel front end? – Steve Jan 24 '22 at 19:05
  • 4
    You might just want to use google sheets. You could cut and paste into it and there's a lot of solutions for interacting with them. – Mark Rogers Jan 24 '22 at 22:06
  • 3
    If the administrative overhead to ask your devs to make a change ends up making it less efficient for your devs to alter the logic than having your business department do it in Excel themselves, then you _really_ need to address the administrative barriers your company puts in place. – Flater Jan 24 '22 at 23:29
  • 1
    Re #2, it is possible, using libraries that read/write Excel files, to have a backend that outputs Excel format files, which is different than actually integrating the Excel runtime into your application. So you could potentially have an Excel template, have your backend read in the template and fill values into certain cells, and then save out the modified file to send to customers. However, this is likely a bad idea, and I hesitate to even mention it, as your backend won't have any access to the calculations in the spreadsheet, your customers will need Excel, and this solution is fragile. – Zach Lipton Jan 25 '22 at 00:01
  • Might be worth the time to look into [this](https://betterprogramming.pub/google-spreadsheet-as-a-backend-b6b51541f1e1) – Thomas Jan 25 '22 at 11:36
  • If you read/write xlsx files using some library, it should be possible to include the formulas into files written, and it could also be possible to transform Excel files created by the business people into executable code for your backend. The second path isn't easy, and probably not cost effective, but it would be possible if manual translation by developers isn't feasible. – Hans-Martin Mosner Jan 25 '22 at 11:37
  • Perhaps [add](https://softwareengineering.stackexchange.com/posts/436215/edit) some *description* of the ***level of complexity*** to your question? *"a lot of business logic"* may not necessarily mean complex. E.g., *"We don't want to rebuild the logic in a programming language"* may indicate it isn't (currently) complexified by the use of [VBA](https://en.wikipedia.org/wiki/Visual_Basic_for_Applications). Is the complexity so low that it could almost trivially be ported to [OpenOffice](https://en.wikipedia.org/wiki/OpenOffice.org) [Calc](https://en.wikipedia.org/wiki/LibreOffice_Calc)? – Peter Mortensen Jan 25 '22 at 13:41
  • - cont': (The OpenOffice part is not a suggestion - just for illustration.) In other words, is the complexity in the business rules themselves or in the sub set of Excel features used by the application? – Peter Mortensen Jan 25 '22 at 13:43
  • I haven't used it but Excel has a javascript API that might be useful https://docs.microsoft.com/en-us/office/dev/add-ins/reference/overview/excel-add-ins-reference-overview – hojusaram Jan 25 '22 at 16:04
  • This is generally what Power BI is built around. Its business intelligence is based around Excel and your power users would manipulate and upload dashboards and calculations from Excel. – Chris Schaller Jan 26 '22 at 12:35
  • 1
    @MartinBöschen Have you thought about using a JDBC driver? https://sourceforge.net/projects/xlsql/ There are also ODBC drivers that can treat Excel workbooks as databases. As the accepted answer says, it isn't a good idea to do this but if you really need to... I don't have a high enough reputation post an answer, hence this comment. – Dave Gremlin Jan 26 '22 at 16:10
  • The answer is a bit ambiguous as the title mentions Excel, but it seems the only thing you really care about is actually having the application logic in Excel. The bit about having the Excel.exe doing the computation is not really the core here. It is about not having to duplicate the efforts, having a single source of truth. In that case, I think the answer mentioning SmartXLS is a good option, as it lets you do that. – oligofren Jan 27 '22 at 04:39

6 Answers6

66

Not the answer you were hoping for

While this may be possible somehow, it is likely a dead-end solution. You should seriously reconsider the decision to not want to rebuild the logic in a language that is better suited for server operation.

Running Excel as a backend processor would create a number of difficulties:

  • You need to design some way of running multiple instances of Excel without interference between them, which means that you would need to copy the spreadsheet file for each instance and use that instance for only one session. A related problem is to tear down the Excel process once the related session isn't active anymore, which isn't easy to detect.
  • You create a dependency on a runtime backend that is able to run Excel in the way your application expects. Since Excel is intended as an interactive desktop application, your use case probably isn't covered in Microsoft's future plans, and it is possible that with a newer Excel version you will be forced to either rebuild the integration, or keep your old version that does not get security updates anymore.
  • Speaking of security, you're probably (not) aware of the security issues of using an application that isn't meant to be accessed by internet users. Web applications using SQL database backends have been riddled with SQL injection vulnerabilities, and unless the interface between your web server and the Excel-based calculation backend is either really restricted or very well-designed to be secure, you might be in for some unpleasant surprises.

If you do a serious cost/risk analysis, your Excel based solution idea will probably come out way behind a rewrite (which isn't easy or cheap, but given well formulated requirements, can be done using a straightforward and reliable software development process).

Hans-Martin Mosner
  • 14,638
  • 1
  • 27
  • 35
  • 3
    Indeed. Note that with your approach, it might still be a good idea to somehow be able to call Excel from a script. E.g. in Python, in order to automate tests, and compare the output of the new branch to the old spreadsheets. – Eric Duminil Jan 24 '22 at 19:24
  • 10
    I know way too many business that think excel is a good platform for software engineering. Every way you measure it, it's going to be cheaper in the long run to just re-write it in an actual programming language that can be maintained and managed properly. – Steve Butler Jan 24 '22 at 22:14
  • 8
    I did something similar (essentially, re-purposed a desktop application as a rendering engine for a custom file format), and you're missing what's probably the biggest downside: performance. I estimate that the desktop app is one to two orders of magnitude slower than a dedicated rendering engine would be, and I expect an Excel backend would have similar issues. – Mark Jan 25 '22 at 03:49
  • 8
    Is it not also the case that you would need an Excel licence for every user authorized to access the application? – grahamj42 Jan 25 '22 at 07:43
  • @SteveButler Interestingly there's a current effort at Microsoft to turn Excel's formula language into a Turing complete programming language. I'm not sure this is useful to the OP because I think it still might require a rewrite, but I'm curious to see how it works out. Excel is like a religion to people and they are blind to its flaws. – JimmyJames Jan 25 '22 at 16:11
  • 1
    @JimmyJames that sounds like an abomination. I've definitely seen the cult of excel in the wild. Just because you can doesn't mean you should. – Steve Butler Jan 25 '22 at 19:39
  • @SteveButler I just think it's kind of funny that they've had this sort of degenerate formula language for decades and then, all of the sudden, they are going to try to fix that. Is making this a proper language useful for the people who have made a career out mastering the vagaries of Excel? Does the flock just follow the shepherd? I'm expecting this to go over like a lead balloon but who knows. – JimmyJames Jan 25 '22 at 22:00
  • @JimmyJames Just as the turing completeness of various generic type specifications wasn't the goal but just a consequence of adding additional features, the same is probably also true here. In this case from a quick look, the addition that made excel turing complete (a relatively low bar to clear, lots of things you wouldn't think are turing complete - it's also not a very useful property in itself) are custom functions. And even as someone who hasn't done much with Excel I can see the usefulness in those. – Voo Jan 26 '22 at 13:58
  • 4
    *is possible that with a newer Excel version you will be forced to either rebuild the integration, or keep your old version that does not get security updates anymore.* Based on my experience with Excel automation over the decades, it's not "possible", it's *guaranteed* that updates will break stuff - the question is merely one of how long you've got until that happens – Chris H Jan 26 '22 at 15:51
  • Do note that your answer depends on the basic assumption that the OP requires Excel (the program) to do the calculations. If you think about his main goals: reusing the excel sheet as a single source of truth for both data and logic, there is no dependency on the actual program, just being able to have _something_ do the calculations present in the formulas. If you have that _something_ (another answer mentions one), all those downsides are no longer present. You have new ones, of course :) – oligofren Jan 27 '22 at 04:43
  • +10 If I could. Also, web servers run as services on servers and Excel is not designed to run in such a context - it's designed to, and must, run in a user's desktop context so this is just folly trying to Frankenstein it to a web server. – J... Jan 27 '22 at 13:51
  • 88% of spreadsheets contain errors - https://www.marketwatch.com/story/88-of-spreadsheets-have-errors-2013-04-17 . A lay person doing an occasional odd job might use a Swiss army knife. The expert will have a toolkit. – Graham Laight Jan 27 '22 at 14:50
  • @Graham Compared to the 100% of business applications that contain errors that number is impressively low. – Voo Jan 27 '22 at 15:43
23

Microsoft has an official in-depth response to this question: Considerations for server-side Automation of Office

They provide no support for automating Excel directly in the backend, and they recommend that you use various programming libraries for accessing and manipulating Excel documents directly.

Most server-side Automation tasks involve document creation or editing. Office 2007 supports new Open XML file formats that let developers create, edit, read, and transform file content on the server side. These file formats use the System.IO.Package.IO namespace in the Microsoft .NET 3.x Framework to edit Office files without using the Office client applications themselves. This is the recommended and supported method for handling changes to Office files from a service.

The Open XML file formats are a public standard.

You may be able to find a library for evaluating Excel formulas that works in your programming language of choice, or you could implement a basic one that supports the subset of functions you need for your business logic.

DBN
  • 339
  • 1
  • 4
  • 6
    This doesn't answer OPs question. Because his main concern is to use Excel as a Calculation Platform so he doesn't have to rewrite business logic. In this case, although useful, the mentioned platform only serves to manipulate the files, but doesn't get the calculations made by Excel. – fernando.reyes Jan 25 '22 at 21:55
10

This support doc, while warning against attempting this, and clearly stating that it's a completely unsupported scenario, nevertheless lays out a roadmap for what you'd need to do if you attempt it.

  1. User Identity: You must run as a regular user with a full profile loaded.

  2. Interactivity with the desktop: You must run in a desktop session, and you need access to the desktop session.

  3. Reentrancy and scalability: You must serialize access to the Office application to avoid potential deadlocks or data corruption.

  4. Resiliency and stability: You must plan on the Office apps crashing, throwing up popup windows or becoming unresponsive

  5. Server-side security: It's just not very secure.

  6. Licensing: All your end users must have Office licenses.

7

If you can call out to a .NET or Java library or process, I can recommend SmartXLS (I have no affiliation other than a satisfied customer circa 2018).

It provides a headless object model/library that has absolutely none of the runtime issues of Excel itself (e.g. pop up dialogs, COM etc.). Unlike some that just allow you to build files, this one actually performs the calculations etc. and lets you read back calculated results.

Some of the object model & methods are not exactly how I'd do it but the underlying engine is sound, fast and very flexible. The support is pretty good too.

LoztInSpace
  • 1,149
  • 6
  • 8
  • This seems pretty good and actually answers the question! _Formula support (absolute and relative references, names, 3D cell references, more than 260 supported functions)._ The OP never said he needed Excel to do the computations, just hand over the excel document and get the calculations back. This should fit like hand in glove. That being said, if you go this route, you need to have automated testing in place and make good use of Git so that you store all versions of the excel file, as it is a good chunk of the actual application and might break easily without the business dep knowing – oligofren Jan 27 '22 at 04:30
  • At 200 USD/developer for a year the price is an absolute steal too. – oligofren Jan 27 '22 at 04:36
  • Sounds interesting, maybe sufficient for the OP. However, I am pretty sure this solution has limits. For example, I guess it cannot evaluate Excel formulas using embedded UDFs (written in VBA), otherwise they must include the full VBA execution environment. – Doc Brown Jan 27 '22 at 06:41
  • 1
    @DocBrown Correct. Last time I used it there was no VBA support but all native Excel functions were implemented. Of course all solutions will have limitations but unless the OP engages my consultancy services, I'm not going to do a full analysis or speculate on whether they need VBA. :) – LoztInSpace Jan 27 '22 at 08:07
3

You might be interested in ExcelJS. Using this Node.js library, you can load, modify, and save Excel files. So instead of invoking Excel you would load the spreadsheet into your Node backend and persist it in memory, and then update the cells you need.

We want use Excel for the actual calculations.

ExcelJS does not perform calculations itself. To perform the calculations you could use the library formula.js. You would need to read the formulas and cells from ExcelJS first. Depending on the complexity of your spreadsheet this may or may not be a viable solution.

  • 2
    As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Jan 25 '22 at 12:38
  • 5
    What does the calculation? Some Excel server? Independent of Excel? Preferably, please respond by [editing (changing) your answer](https://softwareengineering.stackexchange.com/posts/436242/edit), not here in comments (***without*** "Edit:", "Update:", or similar - the answer should appear as if it was written today). – Peter Mortensen Jan 25 '22 at 16:11
  • 1
    Being able to access the cell values as a DB does not fix the issue of computation, e.g. find the result of a forumula, which was the whole point here. From its docs: _Note that ExcelJS cannot process the formula to generate a result, it must be supplied._ The answer mentioning SmartXLS is different in this regard as it actually computes. – oligofren Jan 27 '22 at 04:28
0

Excel is not meant to be a database. And it's not meant to be an application that is interconnected.

Still you might be able to transform your companies way of digitizing processes in a multi-step process. The easy way out of your current situation might be converting Excel to Google Spreadsheets. You get an API for free, which involves proxies, request throttling and everything else you can imagine. Google even hosts the whole stack for you. It's basically a whole app stack including a database.

While I have no idea what the actual logic is, you can write and update source cells, while reading derived and calculated values from target cells.

Keep the data limit in mind:

Your dataset is 5 million cells or fewer.

kaiser
  • 193
  • 1
  • 6