Is there any algorithm pattern to protect any content in the web to ensure I am the first one who created it?

Question

A few years ago, a hacker (I don't remember who) fully disclosed a vulnerability in some system and, to make sure nobody else took credit for it, created some kind of PGP key.

What I understood at the time is that he created a key to establish that he was the one who discovered it, without disclosing who he actually was. He just set up a mechanism that would let him prove later that he was the author of the disclosure.

Ok. I get how algorithms and cryptography work. But I still don't understand how you can create a key to protect a given content disclosed in the web to prove you are the one who created it first! It is just words!

Is it really possible? What process would let you prove it empirically? Did I understand it correctly, or have I probably missed something regarding this case?

I hope this question is specific enough. Basically, it is about how to protect content that you created on the web (a paragraph, a piece of code, a word, etc.) and establish that you are the one who created it first, within a given context.

With my current knowledge I don't see how that is possible, but I am curious whether there is a practical way to do it. Is there?

Maybe asking at http://crypto.stackexchange.com/ would be better idea? — Euphoric, Dec 15 '14 at 13:08
Can I just copy/paste the question there? I mean, this is pretty general and doesn't related directly into how a given algorithm works, it is more about a practical usage that would enable someone to achieve the desired outcome. — Fagner Brack, Dec 15 '14 at 13:10
@FagnerBrack - Don't copy/paste. Just flag the question for migration to crypto SE. — mouviciel, Dec 15 '14 at 13:12
`But I still don't understand how you can create a key to protect a given content disclosed in the web to prove you are the one who created it first! It is just words!` There's a difference between proving you created something and proving you were the first to create something. The PGP key can prove the hacker submitted the vulnerability report. There's no guarantee the vulnerability wasn't discovered or reported before him, but he can at least show that he had reported it on a certain date. — Doval, Dec 15 '14 at 13:16
There was a story about a mathematician that discovered a proof and didn't want to publish it yet but also didn't want someone else to get credit. So instead he published the letters of sentence of the proof sorted alphabetically. I forgot what his name was. — Paul, Dec 15 '14 at 20:25
@Paul - that would be Robert Hooke, Hooke's Law en.wikipedia.org/wiki/Hooke's_law — James McLeod, Dec 15 '14 at 20:45
@Paul Read [establishment of priority](http://en.wikipedia.org/wiki/Anagram#Establishment_of_priority) in the Wikipedia page on Anagrams. Galileo did it, Hooke did it, [Huygens](http://www.sil.si.edu/DigitalCollections/HST/Huygens/huygens-introduction.htm) did it... it was a fairly standard approach back then. — , Dec 16 '14 at 01:27
Take a photo of you holding the content in front of an authoritative clock. — Petah, Dec 16 '14 at 06:52
[Zero knowledge proof](http://en.wikipedia.org/wiki/Zero-knowledge_proof). — Damon, Dec 16 '14 at 16:13
@mouviciel: Flagging for migration is a cumbersome process. An alternative is to delete the question and repost it on the correct site. — Robert Harvey, Dec 16 '14 at 19:08
@FagnerBrack if you do wish to take it to Crypto.SE you should either (a) flag this question for migration so that all the answers and comments go with it, or (b) reconsider the cryptographic aspects of the question and as a *new* one that builds on what you have learned here and asks a deeper crypto question. The copy, paste, and delete option is only a viable choice when you *can* delete the question (once there's an up voted answer, you can't delete the question). — , Dec 16 '14 at 23:40
I am not going to flag for migration. My arguments are that this question is pretty general and doesn't related directly into how a given algorithm works, it is more about a practical usage that would enable someone to achieve the desired outcome. — Fagner Brack, Dec 17 '14 at 08:58
I understand the crypto site to be more like stackoverflow, when there is technical details to be discussed deeper regarding cryptography. Programmers, in other hand, is more about concepts and practical usage, and that is what this question is all about (whether you have an implementation with code, or not). — Fagner Brack, Dec 17 '14 at 08:59

score 39 · Answer 1 · answered Dec 15 '14 at 13:21

In days of old, scientists would publish anagrams of their work to be able to say "I thought of this idea." (see the 'history' and 'establishment of priority' sections of the Wikipedia article on anagrams). They wanted to be able to take credit for the idea, but also to give other scientists the chance to publish their own results if they had other ideas, without building on the original one.

For example, Galileo published SMAISMRMILMEPOETALEVMIBVNENVGTTAVIRAS, which was an anagram of altissimvm planetam tergeminvm observavi, which translated from Latin reads "I observed the highest planet in threefold shape". He got it wrong: Saturn (the 'highest' planet known at the time) isn't built of three parts. Fifty years later, Christiaan Huygens published AAAAAAA CCCCC D EEEEE G H IIIIIII LLLL MM NNNNNNNNN OOOO PP Q RR S TTTTT UUUUU, which in Latin is Annulo cingitur, tenui, plano, nusquam cohaerente, ad eclipticam inclinato, which translates to "It is surrounded by a thin flat ring that does not touch it and is inclined against the ecliptic."

While those are now historical curiosities, they show a concept that was already important back then: publishing a 'hash' for which it is easy to confirm "this hash encodes this text." It is easy to go from the known text to the anagram or the hash, but hard to work out the text from the hash if you don't already know it.
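To make the anagram idea concrete, here is a small Python sketch that turns Huygens' Latin sentence into its sorted-letter form. The function name `alphabetical_anagram` is my own and purely illustrative; the point is only that going from the sentence to the letter groups is mechanical, while going back is not.

```python
def alphabetical_anagram(sentence):
    """Return the sentence's letters, upper-cased, sorted, and grouped by letter."""
    letters = sorted(ch for ch in sentence.upper() if ch.isalpha())
    groups = []
    for ch in letters:
        if groups and groups[-1][0] == ch:
            groups[-1] += ch          # extend the current run of the same letter
        else:
            groups.append(ch)         # start a new run
    return " ".join(groups)


sentence = ("Annulo cingitur, tenui, plano, nusquam cohaerente, "
            "ad eclipticam inclinato")
print(alphabetical_anagram(sentence))
```

Anyone who is later shown the sentence can rerun this and compare the result with what was published earlier.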

With modern mechanisms, we have other ways of doing hashes, many of them closely related to cryptography, such as the cryptographic hash function. The idea is still the same: it is easy to go from the text you know to the hash, but hard to go from the hash to a text you don't know.

And so, if you have a program that does something, you could publish a hash of the program. Then, when you are ready to disclose it (possibly after the company fixes the issue, or after a period of time), you can publish the actual code and everyone can see that, yes, this code matches that hash.
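The publish-the-hash-first approach could look roughly like this in Python. This is a minimal sketch assuming SHA-256 from the standard `hashlib` module; the names `commitment` and `matches`, and the sample `exploit` bytes, are placeholders of mine.

```python
import hashlib


def commitment(document):
    """Hex SHA-256 digest that can be published ahead of the document itself."""
    return hashlib.sha256(document).hexdigest()


def matches(document, published_hash):
    """Check a later-revealed document against the hash published earlier."""
    return commitment(document) == published_hash


exploit = b"print('proof of concept for the bug')\n"
published = commitment(exploit)      # publish only this value now

# ... later, once the code itself is disclosed ...
print(matches(exploit, published))                 # the real file
print(matches(exploit + b"# tweaked", published))  # any altered file
```

For short or guessable texts, someone could try candidate texts until the hash matches, so in practice one might append a random secret value before hashing and reveal it together with the text.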

Very nice bit of history about the use of hash before the digital age :) — mika, Dec 17 '14 at 12:48

score 30 · Answer 2 · valenterry · 2014-12-15T13:45:13.240

You can do that quite easily. If you have a plaintext text, a secret key S and a public key P, you compute S(text) and get the cipher.

Now you can publish the cipher and P, but not S. Everyone can then decrypt the cipher with P by computing P(cipher). If you later want to prove that you are the one who created the cipher (and therefore the original text), you can either publish S or, if you don't want anyone to know S, create another message S("I was really the one who found the text first") and publish it. This works because, without S, there is no practical way to create a cipher for which P(cipher) results in meaningful text.

That is how you can prove it.
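As a toy illustration of the S(text) / P(cipher) relationship, here is textbook RSA with tiny primes in Python. Everything here is an assumption for demonstration: the primes 61 and 53, the exponent 17, and the byte-by-byte encoding are nowhere near secure, and real PGP signs a hash of the message with much larger keys.

```python
# Toy textbook RSA with tiny primes: for illustration only, not secure.
p, q = 61, 53
n = p * q                              # public modulus
e = 17                                 # public exponent, so P = (e, n)
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent, so S = (d, n); Python 3.8+


def S(text):
    """Apply the secret key to each byte of the text, giving the 'cipher'."""
    return [pow(b, d, n) for b in text.encode("utf-8")]


def P(cipher):
    """Apply the public key; a cipher made with S decodes back to the text."""
    return bytes(pow(c, e, n) for c in cipher).decode("utf-8")


cipher = S("The bug is in the login form")
print(P(cipher))
```

Note that this only ties the text to whoever holds S. It says nothing about when the text was written.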

edited Dec 15 '14 at 13:45

answered Dec 15 '14 at 13:18


How does this help? Anyone who can decrypt cipher can just republish the same way with their own secret key and you have no way of proving which party published first and which copied from the other without trusting some third-party record of publication time. – R.. GitHub STOP HELPING ICE Dec 15 '14 at 18:04
@R.. Content on the internet usually has a date associated with it (e.g.: forum posts). If there are multiple people claiming to be the original inventors/ discoverers, then you just check the dates. – Paul Dec 15 '14 at 18:26
You can also include some content in the plaintext that is encrypted via P(content). The true owner of S will be able to decrypt that additional information, whereas a reposter will not be able to do that. – Dancrumb Dec 15 '14 at 19:30
@Paul: But then the encryption is irrelevant. It's equivalent to just posting your plaintext content and relying on the date. – R.. GitHub STOP HELPING ICE Dec 15 '14 at 20:16
@R.. In this case, you have no chance. Even if you go for a steganographical approach, anyone can just "get the idea" of what you wrote and tell it in his own words. – valenterry Dec 15 '14 at 20:25
@R.. the idea is that, without knowing `S`, it is very hard to come up with a piece of text `cipher2` such that `P(cipher2)` isn't gibberish. With `S` it is trivial, as `S("the text you want")` = `cipher2`. Doing the reverse amounts to defeating the crypto method. – congusbongus Dec 16 '14 at 02:07
@congusbongus: Doing that is not necessary to mount an attack, because as the attacker you can change P. – R.. GitHub STOP HELPING ICE Dec 16 '14 at 05:14
@R.. no, as the original author publishes the cipher along with P. Only the real original author can easily claim authorship of the cipher + P pair since they can produce more messages that decode properly with P; an attacker who must change P cannot claim to be the same author that published the original cipher. – congusbongus Dec 16 '14 at 05:17
@congusbongus: Like I said, the attacker can easily publish the same plaintext with a different P, and it's impossible for an observer to determine who is the original author of the plaintext and who is the attacker who copied it without trusting a third-party timestamp. The cryptography here is purely security theater. – R.. GitHub STOP HELPING ICE Dec 16 '14 at 06:20
@R.. but it's the cipher text and P that are published, not the plaintext. The plaintext is produced by deciphering the cipher text using P. – congusbongus Dec 16 '14 at 06:28
@congusbongus: That's an irrelevant detail since having P and the ciphertext gives you the plaintext, with which you can construct a new S', P', and new ciphertext that decrypts (with P') to the same plaintext. – R.. GitHub STOP HELPING ICE Dec 16 '14 at 06:31
Let us [continue this discussion in chat](http://chat.stackexchange.com/rooms/19501/discussion-between-congusbongus-and-r). – congusbongus Dec 16 '14 at 06:37

JoelFan · Answer 3 · 2014-12-16T19:40:19.160

It's possible to hash the data you wish to timestamp and turn it into a Bitcoin address. This is known as trusted timestamping. By making a small payment (a satoshi, or 0.00000001 BTC) to it, the payment is stored on the blockchain along with the address you paid to.

Since only the hash is stored on the Bitcoin blockchain, no one can tell what data you stored, but given the pre-hashed data you can prove the data was created prior to the block that contains the payment made to that address.
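One way to turn a document hash into something shaped like a Bitcoin address is sketched below. The exact derivation is not spelled out above, so this assumes one possibility: take the first 20 bytes of the document's SHA-256 digest as the address payload and Base58Check-encode it with version byte 0x00 (the legacy address format). The helper names are mine.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version, payload):
    """Encode version + payload + 4-byte double-SHA-256 checksum in Base58."""
    raw = version + payload
    checksum = hashlib.sha256(hashlib.sha256(raw).digest()).digest()[:4]
    data = raw + checksum
    num = int.from_bytes(data, "big")
    out = ""
    while num > 0:
        num, rem = divmod(num, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is written as the character '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def document_to_address(document):
    """Assumed scheme: the first 20 bytes of SHA-256(document) become the payload."""
    payload = hashlib.sha256(document).digest()[:20]
    return base58check(b"\x00", payload)


print(document_to_address(b"Full text of my vulnerability report"))
```

Since nobody holds a private key for an address built this way, the satoshi sent to it is presumably gone for good; it is the price of the timestamp.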

score 1 · Answer 4 · answered Dec 15 '14 at 15:49

A very simple way to establish that you are the first one to publish something, without revealing who you are immediately but having the option to do this later:

1. Publish it on a well-known public source (there everyone can see that you published it).
2. In this publication, add a line: Originally published on dd/mm/yyyy by the owner of xxx@gmail.com

No need to encrypt anything.

Of course there is the chance that you don't want to publish your results yet. In that case you need to encrypt everything except the line with the email address. However, now it may be harder to get this published on a renowned site.

Dennis Jaheruddin


What if the "renowned site" changes the date or the publication e-mail? It can happen due to bad intentions or even if the site is hacked. (OK, I am being quite paranoid here, but that's the point. If there is a way for someone other than the original author to change the proof, then your solution is not really deterministic.) The point here is not to rely only on human witnesses, but on a way for anyone to confirm authorship with certainty, without your content having to be "published" or "patented" by a trusted entity. – Fagner Brack Dec 15 '14 at 16:08
@FagnerBrack, http://arxiv.org/ has proven sufficient for nearly 1M research papers. – Brian S Dec 15 '14 at 22:45
Can I put just anything there, or does it have to pass through some peer review and only get authorized under specific constraints? This question is not aimed directly at research papers; it is about the possibility of achieving the same result (or an equivalent) in a programmatic way. – Fagner Brack Dec 15 '14 at 22:52
@FagnerBrack arXiv is strictly for eprints of academic papers (a vulnerability in a system _can_ be the subject of a paper); however, it does show that trusted repositories can certainly exist online. – cpast Dec 16 '14 at 03:41
Use archive.org to publish - like arXiv, it's a very long lasting and trusted repository, but unlike arXiv it's free to upload content there. The timestamp on anything you publish there would be considered very reliable. – Steve Midgley Dec 16 '14 at 05:28
I agree, fame is the only way to spread knowledge fast and reliably. – bigstones Dec 16 '14 at 06:35

mika · Accepted Answer · 2014-12-17T13:27:19.250

> I probably missed something regarding this case?

I think the bit you are missing is a trusted entity.

When you hash the file with the content you want to certify, you can show the world that you are the owner of this document without disclosing the document itself. This is all very well, but how can you prove you had this document at some specific time in the past?

This is what Trusted Timestamping is about. Here is an extract from wikipedia:

> The technique is based on digital signatures and hash functions. First a hash is calculated from the data. A hash is a sort of digital fingerprint of the original data: a string of bits that is different for each set of data. If the original data is changed then this will result in a completely different hash. This hash is sent to the TSA*. The TSA concatenates a timestamp to the hash and calculates the hash of this concatenation. This hash is in turn digitally signed with the private key of the TSA. This signed hash + the timestamp is sent back to the requester of the timestamp who stores these with the original data.
>
> (*) Time Stamping Authority
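The flow in that extract can be mimicked in a few lines of Python. This is only a sketch under loud assumptions: the "TSA" here is a local function, its key is a toy RSA key built from two Mersenne primes (not secure), and the names `tsa_timestamp` and `verify` are mine. Real services follow a standardized protocol (RFC 3161) with proper certificates.

```python
import hashlib
import math
from datetime import datetime, timezone

# Toy RSA key for the pretend TSA, built from two Mersenne primes. NOT secure.
PRIME_A, PRIME_B = 2**61 - 1, 2**89 - 1
N = PRIME_A * PRIME_B
E = 65537                                   # public exponent
PHI = (PRIME_A - 1) * (PRIME_B - 1)
assert math.gcd(E, PHI) == 1
D = pow(E, -1, PHI)                         # the TSA's private exponent


def _digest_as_int(data):
    """SHA-256 of the data, reduced to fit the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def tsa_timestamp(doc_hash):
    """TSA side: append the time to the hash, hash again, sign the result."""
    stamp = datetime.now(timezone.utc).isoformat()
    signature = pow(_digest_as_int(doc_hash + stamp.encode()), D, N)
    return stamp, signature


def verify(document, stamp, signature):
    """Anyone holding the TSA's public key (E, N) can run this check."""
    doc_hash = hashlib.sha256(document).digest()
    return pow(signature, E, N) == _digest_as_int(doc_hash + stamp.encode())


document = b"Full details of the vulnerability"
stamp, signature = tsa_timestamp(hashlib.sha256(document).digest())
print(stamp, verify(document, stamp, signature))
```

The requester only ever sends the hash, so the TSA never sees the document; the stored `stamp` and `signature` are what you present later together with the original data.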

I have been using Universign's Timestamping Service, which has a nice set of tools to make the whole operation easy to perform. There are plenty of companies offering similar services.

It is interesting to note, as @JoelFan mentioned, that bitcoin provides a way to get a trusted entity that is not centralized (why should I trust anyone?). The bitcoin chain provides a timeline (you can prove that one document embedded in the bitcoin chain was created before another one further down the chain). To my understanding, however, you would still be missing the actual date and time of the event.

Also, Trusted Timestamping is a valid reference in litigation.

There are date-times embedded in the block headers in bitcoin... although they are not enforced by the protocol, they are generally trusted to be at least "ballpark" accurate (i.e. well within 1 day accuracy)... it's also possible to inspect the entire blockchain from the block in question until today to make sure the date-times are monotonically increasing — JoelFan, Dec 17 '14 at 15:12
Sorry to take too long to accept an answer. The hacker probably used a trusted entity, I couldn't find a reasonable way to protect the content authorship without a trusted entity. — Fagner Brack, Mar 12 '15 at 08:53

score 0 · Answer 6 · answered Dec 16 '14 at 14:29

This is a different take on valenterry's answer.

Here's how you would do it using PGP:

1. Generate a public/private key pair.
2. Keep the private key and make sure it stays secret.
3. Encrypt your idea with your public key: P(idea)
4. Put P(idea) somewhere that is trusted (not by you, but in general) and that will log the time.
5. When you need to prove you had the idea first, get the timestamp from when you stored the data, and decrypt your data with your secret key: S(P(idea)) => idea
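The steps above can be sketched in Python with a toy RSA key standing in for a real PGP key pair. Treat all of it as an assumption for illustration: the key is built from two Mersenne primes and is not secure, it only fits very short messages, and real PGP uses hybrid encryption with random padding. The sample idea string is made up.

```python
# Toy RSA key pair built from two Mersenne primes. NOT secure, illustration only.
p, q = 2**61 - 1, 2**89 - 1
n, e = p * q, 65537                      # public key P
d = pow(e, -1, (p - 1) * (q - 1))        # secret key S, never published


def P(idea):
    """Encrypt with the public key; the result is what gets stored publicly."""
    m = int.from_bytes(idea.encode("utf-8"), "big")
    if m >= n:
        raise ValueError("idea too long for this toy key")
    return pow(m, e, n)


def S(ciphertext):
    """Decrypt with the secret key when it is time to prove authorship."""
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big").decode("utf-8")


stored = P("bug in login form")   # step 4: hand this to the timestamping storage
print(S(stored))                  # step 5: only the key holder can do this
```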

This method doesn't require releasing your secret key, which in general is a bad idea. Granted, you can always make a new PGP key pair - it doesn't cost anything, but you shouldn't be recklessly handing out secret keys if you want to be credible.

The hardest part is proving the time, but for documenting a vulnerability we don't need a 100% bullet-proof-and-verifiable-in-court solution; we just need something that's 'good enough'. The logs of a cloud storage provider (Dropbox, Rackspace, Google, etc.) are probably good enough, assuming they implement a secure service.

It's also worth noting that being the first to timestamp an idea has never meant that you were the first person to think of it. If someone thought of it before you but never registered the idea with a timestamping mechanism, then they can't prove they had it before you. So, if we're trying to figure out who had it first, and all we know is the time you came up with it, then we have to assume you came up with it first (the other person could be spewing lies).

score -1 · Answer 7 · answered Dec 17 '14 at 21:06

Ugh, so many of these answers are missing the point.

1) What the hacker did had nothing to do with encryption.

2) What the hacker did had nothing to do with time (the time stamp, etc).

What the hacker did was publicly sign the release document. When you PGP sign something (an e-mail, a Word document, etc.), you create a signature that is computed from the hash of the document being signed and your own private key. Now, to prove that you are the creator of the document, you just need to "show" the private key, as presumably only the author knows it. Cryptographically speaking, you can "show" that you are in possession of the private key without actually revealing the key itself.

So, in effect, he digitally signed the document. The only person who can reproduce that signature is someone with his private key. Nothing in it says that the document was made today, or yesterday, or that it was the first instance of it ever to exist. No amount of hashing timestamps or whatever will change that.

The only way to digitally sign something IN TIME, is to use the blockchain a la bitcoin. There could be no digital currency without time verification - the fact that person A sent money to person B is irrelevant unless we know when. You cannot go into a shop with a piece of paper and say "my mum sent me $100 once. I would like to buy some bread", because a receipt of a transaction doesn't mean the money still belongs to you. You might have given it to someone else in the interim. The blockchain solves this issue by getting a large number of people (bitcoin miners) to all agree on the fact that the transaction happened at a certain time (and then by recording that time in the blockchain forever).

this doesn't seem to offer anything substantial over points made and explained in prior 6 answers (in particular, a lot was written about trusted timestamping, and bitcoin approach was presented already) — gnat, Dec 17 '14 at 21:55
