I am looking into the Rope Data Structure, used by some text editors like the xi editor.
I get the basics of how it works, and have seen a few sample implementations. But I am confused about how to properly (optimally) handle "upgrading" a rope so that it is syntax highlighted. As far as I can tell, it works in the following steps.
- You start off with a plain string (not a rope), which is a contiguous array of characters.
- You then convert that to a rope, maybe by breaking the string every 128 characters (I saw that recommendation somewhere). So now you have a basic, non-syntax-highlighted rope, which is just a bunch of 128-character chunks concatenated together.
- Then somehow you convert this rope into a syntax-highlighted rope.
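To make the first two steps concrete, here is a minimal sketch of building a basic rope by chunking a string. The names (`Leaf`, `Concat`, `build_rope`, `to_string`) are made up for illustration and are not taken from xi or any other implementation; the chunk size is a parameter so the small example trees in this question can be reproduced with 3-character chunks.

```python
from dataclasses import dataclass
from typing import List, Union

CHUNK = 128  # leaf size from the recommendation above; treat it as tunable


@dataclass
class Leaf:
    text: str

    @property
    def length(self) -> int:
        return len(self.text)


@dataclass
class Concat:
    left: "Node"
    right: "Node"
    length: int  # cached total length of both subtrees


Node = Union[Leaf, Concat]


def build_rope(s: str, chunk: int = CHUNK) -> Node:
    """Split s into chunk-sized leaves and join them into a balanced tree."""
    leaves: List[Node] = [Leaf(s[i:i + chunk]) for i in range(0, len(s), chunk)]
    if not leaves:
        leaves = [Leaf("")]  # an empty string still gets one (empty) leaf
    return _join(leaves)


def _join(nodes: List[Node]) -> Node:
    """Recursively pair up nodes so the tree depth stays logarithmic."""
    if len(nodes) == 1:
        return nodes[0]
    mid = len(nodes) // 2
    left, right = _join(nodes[:mid]), _join(nodes[mid:])
    return Concat(left, right, left.length + right.length)


def to_string(node: Node) -> str:
    """Flatten the rope back into a plain string (in-order leaf walk)."""
    if isinstance(node, Leaf):
        return node.text
    return to_string(node.left) + to_string(node.right)


# With 3-character chunks this produces the leaves abc, xyz, foo, bar.
rope = build_rope("abcxyzfoobar", chunk=3)
```

The third step is the part I don't know how to do well, so it is not sketched here.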
So it is as if you go from here (the "original" tree):
             _ 1 _
           /       \
         2           5
       /   \       /   \
      3     4     6     7
      |     |     |     |
     abc   xyz   foo   bar
To something like this, let's say (the "target" tree):
             _ 1 _
           /       \
         2           5
       /   \       /   \
      3     4     6     7
      |     |     |     |
     / \   xyz   foo   / \
    R   G             B   R
    |   |             |   |
    a   bc            ba  r
where R = red, G = green, B = blue.
That is, some extra "style" nodes were inserted to give the letters color. Basically just simple syntax highlighting for demonstrating the question.
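As a rough sketch of what I mean by the target tree, here it is modeled with style nodes mixed into the rope. All names (`Style`, `plain_text`, `styled_runs`) are hypothetical and only there to illustrate the question. Note that getting the plain string back means walking the whole tree and skipping the style nodes.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple, Union


@dataclass
class Leaf:
    text: str


@dataclass
class Style:
    color: str      # "R", "G" or "B" in the diagram above
    child: "Node"   # the subtree this color applies to


@dataclass
class Concat:
    left: "Node"
    right: "Node"


Node = Union[Leaf, Style, Concat]


def plain_text(node: Node) -> str:
    """Recover the unstyled string by walking the tree and ignoring Style nodes."""
    if isinstance(node, Leaf):
        return node.text
    if isinstance(node, Style):
        return plain_text(node.child)
    return plain_text(node.left) + plain_text(node.right)


def styled_runs(node: Node, color: Optional[str] = None) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (color, text) pairs in document order; color is None when unstyled."""
    if isinstance(node, Leaf):
        yield (color, node.text)
    elif isinstance(node, Style):
        yield from styled_runs(node.child, node.color)
    else:
        yield from styled_runs(node.left, color)
        yield from styled_runs(node.right, color)


# The "target" tree from the diagram: abc -> R(a) G(bc), bar -> B(ba) R(r).
target = Concat(
    Concat(Concat(Style("R", Leaf("a")), Style("G", Leaf("bc"))), Leaf("xyz")),
    Concat(Leaf("foo"), Concat(Style("B", Leaf("ba")), Style("R", Leaf("r")))),
)
```

Here `plain_text(target)` gives back `"abcxyzfoobar"`, but only by visiting every node.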
So the question is how you should treat the two different trees. I am unsure whether you should keep the "original" tree unmodified, construct a new tree for the syntax-highlighted version, and somehow keep a mapping between the two, or whether you should modify the original tree directly to create the target tree, so that you end up with only 1 tree instead of 2. The tricky part with the second approach is saving: to write the text back out in unhighlighted form, you have to iterate through the syntax-highlighted target tree and regenerate an original (basic, unhighlighted) tree from it. If you are doing text editing, this means every file save would have to iterate through the whole syntax-highlighted tree, generate a new tree, and overwrite the file contents with the new basic tree's bytes. That seems inefficient, and I'm getting confused thinking about what to do here.
Another way to look at this is to imagine 2 or n views rendering the same rope. View 1 wants the unhighlighted text, and view 2 wants the syntax-highlighted text. I'm not sure what to do here. If both views referenced the same rope data structure, it seems you would have to construct an additional tree for the syntax-highlighted version, separate from the original tree, so that it doesn't mess with the original tree used by the first view. So it seems there would need to be some sort of mapping between these "derived" ropes.
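To illustrate the "keep the original unmodified plus a mapping" idea for two views, here is a minimal sketch where the mapping is just a list of offset ranges with colors, kept beside the text instead of inside the tree. This is only my guess at what such a mapping could look like, not a claim about how xi or any other editor does it. A plain `str` stands in for the rope to keep the example short, and `Span` and `render_highlighted` are made-up names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Span:
    start: int   # inclusive character offset into the shared text
    end: int     # exclusive character offset
    color: str


def render_highlighted(text: str, spans: List[Span]) -> List[Tuple[Optional[str], str]]:
    """Cut text into (color, piece) runs.

    Assumes spans are sorted by start and do not overlap.
    Pieces not covered by any span get color None.
    """
    runs: List[Tuple[Optional[str], str]] = []
    pos = 0
    for sp in spans:
        if sp.start > pos:
            runs.append((None, text[pos:sp.start]))  # unstyled gap before this span
        runs.append((sp.color, text[sp.start:sp.end]))
        pos = sp.end
    if pos < len(text):
        runs.append((None, text[pos:]))  # unstyled tail
    return runs


text = "abcxyzfoobar"  # shared by both views and never modified
spans = [Span(0, 1, "R"), Span(1, 3, "G"), Span(9, 11, "B"), Span(11, 12, "R")]

view1 = text                             # view 1: unhighlighted
view2 = render_highlighted(text, spans)  # view 2: highlighted
```

In this sketch saving would just write `text` as-is, but the spans would presumably have to be shifted or recomputed after every edit, and I don't know whether that is the right trade-off.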