
One of the features in our project requires a comparison algorithm that takes two versions of a text and reports the percentage change between them. While researching, I came across the Google java-diff-utils project.

Has anyone used java-diff-utils to compare text? With this utility I can get a list of "deltas", which I assume I can use to compute the percentage difference between the two versions of the text. Is this the right approach?

If you have implemented a text comparison algorithm in Java, could you give me some pointers?

gnat
java_mouse

1 Answer


What does "the % of difference" mean? If you start with a block of text and replace the characters in every other word with "q"s, has it changed by 50%? If every other word is replaced with a single "q", has it changed by more than 50%? How much more?

I think the problem is too complex to have a single number as the answer.

This is normally handled with three numbers: inserted, deleted, and replaced. But the definition of "replaced" can become problematic.
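To make the three numbers concrete, here is a minimal sketch in plain Java that does not use java-diff-utils. It aligns two token lists with a longest-common-subsequence table and counts inserted, deleted, and replaced tokens. The class name `DiffStats`, the method `count`, and the rule for "replaced" (within one run of changes between two matching tokens, each deletion is paired with one insertion) are assumptions chosen for illustration; other definitions of "replaced" are equally defensible.

```java
import java.util.List;

public class DiffStats {

    /**
     * Returns {inserted, deleted, replaced} token counts between a and b.
     * Hypothetical helper: "replaced" is ASSUMED to mean a deletion paired
     * with an insertion inside the same run of changes.
     */
    public static int[] count(List<String> a, List<String> b) {
        int n = a.size(), m = b.size();

        // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = a.get(i).equals(b.get(j))
                        ? lcs[i + 1][j + 1] + 1
                        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        int inserted = 0, deleted = 0, replaced = 0;
        int runDel = 0, runIns = 0; // deletions/insertions in the current run
        int i = 0, j = 0;
        while (i < n || j < m) {
            if (i < n && j < m && a.get(i).equals(b.get(j))) {
                // A matching token closes the current run: pair up what we can.
                int pairs = Math.min(runDel, runIns);
                replaced += pairs;
                deleted += runDel - pairs;
                inserted += runIns - pairs;
                runDel = 0;
                runIns = 0;
                i++;
                j++;
            } else if (j == m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
                runDel++; // token a[i] has no counterpart in b
                i++;
            } else {
                runIns++; // token b[j] has no counterpart in a
                j++;
            }
        }
        // Close the final run, if any.
        int pairs = Math.min(runDel, runIns);
        replaced += pairs;
        deleted += runDel - pairs;
        inserted += runIns - pairs;

        return new int[] {inserted, deleted, replaced};
    }
}
```

For example, `DiffStats.count(List.of("a", "b", "c"), List.of("a", "x", "c", "d"))` returns `{1, 0, 1}`: "b" was replaced by "x" and "d" was inserted. Turning these counts into a single percentage still requires choosing a denominator (original length, revised length, or both), which is exactly where the ambiguity described above comes back in.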

Jim