Let's say I have:
procedure1() {
--body of first procedure--
}
Then I rename it to procedure2 and create a new procedure1 above it:
procedure1() {
--body of second procedure--
}
procedure2() {
--body of first procedure--
}
More than once, a line-based diff tool has highlighted everything from --body of second procedure-- all the way to procedure2() { as new code inside procedure1.
This is bound to happen, since most diff tools are oblivious to the underlying structure of the source code. AST-based diff tools don't work very well either, for several reasons. I know that what people really want is a semantic diff tool, but that's not going to happen.
I haven't seen any discussion, though, of whether it would be practical to annotate the code so that a line-based diff tool could understand the underlying structure of the source code.
For example, I could throw in some UUIDs in the code, like this:
//BeginBlock{E999A3BF-626E-428F-A2C1-6AFF0CD22BF2}
procedure1() {
--body of first procedure--
}
//EndBlock
And the modified code would look like this:
//BeginBlock{7C734F0A-92F4-45EB-B653-DBB9A0F18354}
procedure1() {
--body of second procedure--
}
//EndBlock
//BeginBlock{E999A3BF-626E-428F-A2C1-6AFF0CD22BF2}
procedure2() {
--body of first procedure--
}
//EndBlock
The point is to attach tokens (unique within a file or project) to markers that reflect part of the structure of the source code.
An IDE could update those annotations automatically and they could help a diff tool better detect structural changes. The tool would scan the code once to identify the sections of the program (and how they've been moved around) and then compare those blocks having the same ID.
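To make the idea concrete, here is a rough sketch in Python of what such a tool might do. It assumes the //BeginBlock{...} and //EndBlock marker syntax from the examples above, assumes blocks are not nested, and ignores any code outside the markers. The function names split_blocks and block_diff are made up for illustration. It pairs blocks by ID and runs an ordinary line diff only within each pair; it does not try to report how blocks were reordered.

```python
import difflib
import re

# Marker syntax copied from the examples above; everything else is hypothetical.
BEGIN = re.compile(r"^\s*//BeginBlock\{([0-9A-Fa-f-]+)\}\s*$")
END = re.compile(r"^\s*//EndBlock\s*$")


def split_blocks(text):
    """Map each block ID to the lines between its markers, in file order."""
    blocks = {}
    current_id = None
    for line in text.splitlines():
        begin = BEGIN.match(line)
        if begin:
            current_id = begin.group(1)
            blocks[current_id] = []
        elif END.match(line):
            current_id = None
        elif current_id is not None:
            blocks[current_id].append(line)
    return blocks


def block_diff(old_text, new_text):
    """Compare two versions block by block, pairing blocks by ID, not position."""
    old, new = split_blocks(old_text), split_blocks(new_text)
    report = {}
    for block_id in new:
        if block_id not in old:
            report[block_id] = ("added", [])
            continue
        # Ordinary line diff, but restricted to the two blocks sharing this ID.
        changes = [
            line
            for line in difflib.ndiff(old[block_id], new[block_id])
            if line.startswith(("+ ", "- "))
        ]
        report[block_id] = ("modified" if changes else "unchanged", changes)
    for block_id in old:
        if block_id not in new:
            report[block_id] = ("removed", [])
    return report


OLD = """//BeginBlock{E999A3BF-626E-428F-A2C1-6AFF0CD22BF2}
procedure1() {
--body of first procedure--
}
//EndBlock
"""

NEW = """//BeginBlock{7C734F0A-92F4-45EB-B653-DBB9A0F18354}
procedure1() {
--body of second procedure--
}
//EndBlock
//BeginBlock{E999A3BF-626E-428F-A2C1-6AFF0CD22BF2}
procedure2() {
--body of first procedure--
}
//EndBlock
"""

for block_id, (status, changes) in block_diff(OLD, NEW).items():
    print(block_id, status, changes)
```

With the two versions from my example, the new block should show up as added, and the only change reported for the original block should be the renamed header line, rather than a new body spliced into procedure1.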
Do you think this approach is practical?