The established practice when releasing code publicly is to accompany it with a license. Of the many flavors that exist (BSD, MIT, GPL, LGPL, Boost ...) some prohibit using the code in proprietary software, unless you pay an agreed fee of course.

I have two questions on this:

  • How is such a violation detectable? The question stems from the fact that commercial products almost never show their code, so you can't know what they use. I am NOT asking about the legal ramifications (that's off-topic); I'm asking about the technical approach to detecting such usage.

  • Are there any known cases where such a violation has been reported and prosecuted? If so, how was it detected?

Philipp
  • 1
    I'm voting to close this question because legal advice or aid are explicitly off-topic per [help/on-topic] – gnat Aug 28 '15 at 09:57
  • 3
    @gnat I had to put a tag so I put legal. What I'm interested in is how such a violation is detectable in terms of software engineering (as the question states) – Nikos Athanasiou Aug 28 '15 at 09:59
  • 2
    Whether there are technical ways would very much depend on the type of code and the technology used. But one rather common way such things become known is through disgruntled former employees or possibly competitors of a company. – thorsten müller Aug 28 '15 at 10:05
  • 7
    @gnat the OP is obviously not asking for legal advice, he's explicitly asking "how is such a violation **detectable**" - it's a technical question – Jivan Aug 28 '15 at 10:12
  • 3
    Are there any known cases? Yes quite a few, there was a big case against Cisco that was settled out of court. Also see https://en.wikipedia.org/wiki/BusyBox#GPL_lawsuits . In the case of busybox, a firmware upgrade was analysed and found to contain the code. – Jaydee Aug 28 '15 at 10:12
  • "Out of the many flavors that exist (BSD, MIT, GPL, LGPL, Boost) there are a lot that prohibit using the code in commercial products" – Huh? Not a single one of those you listed prohibits using the code in commercial products. All of the licenses you listed are both Open Source and Free Software, and both the Open Source Definition and the Free Software Definition explicitly *forbid* such usage restrictions. – Jörg W Mittag Aug 28 '15 at 10:15
  • @JörgWMittag Do you use GPL code in commercial applications (and applications where you can't disclose the code) ??? You know you'd have to make your source code available then right? – Nikos Athanasiou Aug 28 '15 at 10:19
  • 1
    Updated title to reflect what is actually being asked. Removed the legal tag. Added open-source tag – Michael Durrant Aug 28 '15 at 10:32
  • recommended reading: **[Why do 'some examples' and 'list of things' questions get closed?](http://meta.programmers.stackexchange.com/a/7538/31260)** – gnat Aug 28 '15 at 10:34
  • 1
    I was recently driving a Volkswagen Polo, and when toying around with the buttons on the console, I actually found a menu item labeled "Copyrights", which displayed a copy of the GPL right on the car's console. If a 10000€ car isn't a commercial product, I don't know what is. My phone runs a GPL'd operating system as does my router, my DVD player and one of my tablets, and as my bank account can tell you, those are *definitely* commercial products. My iPad and my MacBook run BSD-licensed OS's. Until recently, I ran Windows XP, which includes a lot of BSD-licensed software. – Jörg W Mittag Aug 28 '15 at 10:36
  • @JörgWMittag There's a distinction between a product using GPL software and a product that **is** software. The Android code is open source and you can download it, but LG doesn't mind. If I made a program that I plan to sell, there would be no point in releasing all its code, since users could then get it for free. [Whatever touches GPL becomes GPL](http://programmers.stackexchange.com/a/47048/120443), e.g. if Autodesk used GPL components then AutoCAD would become open source: why pay $1400 for a license when I can get it for free? Also, there are free products where you pay for the packaging or support... – Nikos Athanasiou Aug 28 '15 at 10:48
  • 2
    This question might be more successful on https://opensource.stackexchange.com - but please take the comments you get seriously and improve your question accordingly. The GPL does not forbid commercial use. It makes the pay-by-copy business model infeasible, but [there are many other business models for software which work well or even better with a license like the GPL](https://opensource.stackexchange.com/questions/88/how-can-large-open-source-projects-be-monetized). – Philipp Aug 28 '15 at 10:52
  • XCode is commercial software which until recently used GCC, which is probably *THE* prototypical GPL-licensed software. Apple switched to Clang, but not for reasons of licensing, the main reason to switch to Clang was that Clang is explicitly designed as an online interactive compiler library for embedding into IDEs, refactoring tools, syntax highlighting editors, whereas GCC was (at that time) a monolithic command-line batch-compiler. – Jörg W Mittag Aug 28 '15 at 10:55
  • 1
    @Philipp This is not discussion on GPL or any other specific license type per se. I wasn't planning to elaborate a [13 page document](http://mchapman.com/amb/soft/gpl.pdf) but on the techniques that can be used to enforce it. – Nikos Athanasiou Aug 28 '15 at 10:57
  • @JörgWMittag There's an [exception](http://www.gnu.org/licenses/gcc-exception.html) to GPL regarding GCC I suggest reading – Nikos Athanasiou Aug 28 '15 at 11:02
  • Where did I mention the GCC runtime library? I'm talking about embedding GCC into XCode, not about embedding the GCC runtime library into code compiled with GCC. – Jörg W Mittag Aug 28 '15 at 11:11
  • Not a single one of the licenses you list as examples prohibits use in commercial applications, so there is nothing to detect. You are basically asking us how to detect something which doesn't exist, which simply doesn't make sense. – Jörg W Mittag Aug 28 '15 at 11:14
  • @JörgWMittag Have you ever installed XCode on your PC? I'm not mocking, that's a serious question; I want to know if you get the difference between using the compiler and mixing the compiler's source code with code that will be compiled into a new program. VS can use clang, eclipse can use gcc. I mentioned packaging and services earlier. The fact that XCode uses **the binaries** and runtime libraries (which are excluded from GPL) of a compiler to compile source you write with it (i.e. it can "talk" to a compiler) does not mean that the source of a specific compiler is embedded into its code – Nikos Athanasiou Aug 28 '15 at 11:28
  • @JörgWMittag On the "asking us how to detect something which doesn't exist" : Most of those licenses require attribution to the original author and distribution of a copy of the license itself. In commercial **software** products that do not disclose source code (lots of times on purpose as it's not desired to disclose what technology they used) how would these terms be satisfied ? Are there no cases where they're violated? If I talked about "creative commons license" would you be OK? I already mentioned that this is not a question on licenses per se – Nikos Athanasiou Aug 28 '15 at 11:31
  • 5
    What you really want to know is how a license breach in **closed source** software can be detected, not in "proprietary" or "commercial" software. I suggest you consider editing your question to make that clear to people like @JörgWMittag. And the specific license where "keeping the source code closed" *is* a license breach is GPL, and maybe LGPL, so why don't you ask specifically about this? – Doc Brown Aug 28 '15 at 11:59
  • The question is also relevant for the case where a commercial proprietary code base is suspected to be stolen by a competitor, though the question isn't worded this way. – rwong Aug 28 '15 at 13:36

1 Answer

Disclaimer: this is my layman's understanding. I am not involved or educated in anything like this. Consider my answer to be somewhat untrustworthy. I have no legal knowledge or training.

This is fairly basic software forensics, which has plenty of overlap with software reverse engineering as far as the technical skill set is concerned.

It is not sufficient to identify the evidence. One also has to be able to testify in court that no law was broken during the search for that evidence (including contracts, copyrights, or anything else forbidden by law). Otherwise the evidence might not be usable in a lawsuit.

As a result, such work must be carried out under heavy supervision by legal experts. But the work in that environment would be pretty much like standard software reverse engineering. (See disclaimer above. This is just my imagination and my reading of popular articles describing the same.)

As for practical techniques, notice that a lot of software contains constants, string literals (hardcoded strings), etc., which allow for quick, indiscriminate scanning for potential targets. Typically, this "scanning" has to be done by a separate third party, not by the same business entity that will perform the detailed analysis, in order to comply with the law. The accuracy of this potential-target search might be low: it has frequently been publicly reported that false positives do occur, and that they occasionally cause legitimate software to be temporarily taken offline. (Reports of the latter can be found online; the former viewpoint about "low accuracy" is personal opinion and is not substantiated.)
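To make the string-literal scan concrete, here is a minimal sketch in Python, in the spirit of the Unix `strings` tool. The function names and the `KNOWN_LITERALS` set are hypothetical choices for illustration only; a real scan would derive its signatures from the open-source project's own source tree, and a hit is a lead for further analysis rather than proof.

```python
import re

# Hypothetical signature strings, for illustration only. A real scan would
# collect distinctive literals (error messages, version banners) from the
# open-source project's source code.
KNOWN_LITERALS = {
    b"BusyBox v",
    b"applet not found",
}


def extract_strings(blob: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return pattern.findall(blob)


def scan(blob: bytes, signatures=KNOWN_LITERALS) -> set[bytes]:
    """Return the signatures that occur inside any extracted string."""
    found = set()
    for text in extract_strings(blob):
        for sig in signatures:
            if sig in text:
                found.add(sig)
    return found


# A made-up byte blob standing in for a firmware image.
sample = b"\x00\x01BusyBox v1.2.3 (demo)\x00\xff\x7fELF\x00applet not found\x00"
hits = scan(sample)
```

Short or generic literals match unrelated programs easily, which is one plausible source of the false positives mentioned above.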

After the initial targets have been found and the work is transferred to a proper software forensics team, the team would reverse-engineer the binaries to recreate the structure, sequence, and organization of the compiled software. Superficially, the compiled code does not directly resemble the source code, but software forensics can identify plenty of evidence that can lead a judge to conclude that the binary is highly unlikely to have been produced in any way other than by compiling the source code that is allegedly being infringed.
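As a very rough illustration of comparing a suspect binary against a reference, the sketch below measures how many byte n-grams of a reference build (compiled from the open-source code) also appear in the suspect file. This is only a crude proxy and an assumption on my part, not the method a forensics team would actually use: compilers, optimization levels and linking all change the bytes, so real analysis works at the level of functions and control flow. The names `ngrams` and `containment` are hypothetical.

```python
def ngrams(data: bytes, n: int = 8) -> set[bytes]:
    """Return the set of all n-byte windows in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def containment(reference: bytes, suspect: bytes, n: int = 8) -> float:
    """Fraction of the reference's n-grams that also occur in the suspect."""
    ref_grams = ngrams(reference, n)
    if not ref_grams:
        return 0.0
    return len(ref_grams & ngrams(suspect, n)) / len(ref_grams)


# Made-up data: the "suspect" embeds the reference bytes between padding,
# while the "unrelated" blob shares no 8-byte window with the reference.
reference = bytes(range(64))
suspect = b"\x90" * 16 + reference + b"\xcc" * 16
unrelated = bytes(reversed(range(64)))

embedded_score = containment(reference, suspect)
unrelated_score = containment(reference, unrelated)
```

Containment is used here instead of a symmetric similarity because a library is usually a small part of a much larger product, so the interesting question is how much of the library reappears, not how similar the two files are overall.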

rwong
  • The keywords found in this answer gave me much to think and google about; thank you for that and for presenting a cohesive process of detection; it made asking worthwhile even if they close this thread – Nikos Athanasiou Aug 29 '15 at 08:18
  • This might give you an idea, but someone who wanted to be sneaky could trivially alter the source so this wouldn't work; and someone who forgot to be sneaky could trivially produce other programs, with source, which would provide false positives. – ddyer Aug 29 '15 at 17:33
  • @ddyer: that's correct; that is the obfuscation part which I didn't mention. My guess is that most infringement investigations focus on the easier cases first, that is, when the infringing party did not use obfuscation. Otherwise it would have taken an entirely different route - it would require one or more whistleblowers, and then some subpoena to collect evidence directly from the infringing party's workplace. The cost would only be justified if it is a dispute between two large commercial software companies where code theft is suspected. – rwong Aug 30 '15 at 03:29