Changing to relocatable code for PIC microcontrollers

Question

I've come back to PIC programming after 10 years and I've been relearning everything. I'm looking at the section in the MPASM manual where it discusses relocatable code, and I'm puzzled as to why absolute code is used at all.

Take this case:

        processor pic16f88
        #include p16f88.inc

.data1  udata      0x20
var1    res        1
var2    res        1

.reset  code 0
        pagesel    Init
        goto       Init

        code
Init:
        ....

        end

For all intents and purposes, isn't that absolute code?

So, from what I can see you can change the udata line and remove the 0x20. Then the linker will place it where it wants to. But you can override that in the linker script and specify an exact position:

section name=.data1 ram=gpr0
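As a rough illustration of that override, the data section from the example can be left without an address and tied to a RAM region by name in the linker script. This is only a sketch: it assumes the device's linker script defines a region called gpr0, as the line above implies. Note that the linker still picks the exact addresses inside that region.

```asm
; Assembly source: no address is given, so the linker decides
; where the section goes.
.data1  udata
var1    res        1
var2    res        1

; Linker script (.lkr), not assembly: maps the section named .data1
; onto the RAM region called gpr0 (assumed region name).
;   SECTION NAME=.data1 RAM=gpr0
```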

I'm mentioning this on stackexchange because it's rarely mentioned and it's the only reason one would ever use absolute code over relocatable (again, in my humble opinion).

While typing this, Stackexchange suggested this link which is a superb response and one of the few places I've seen the linker mentioned.

I do have a few questions though:

Can I use a dot to lead a section name? Like ".reset" above. I've seen it used but I'm not sure of its validity. In my mind it's a way of keeping labels (Initial caps), variable names (all lower) and section names (start with dot) in separate "namespaces".
I'm puzzled as to the idata directive. If the data is initialised, who does so? Is there a code block I have to call to set initial data. I'd love to use this instead of setting initial data in an init section.
I want to use a large block of data for a buffer. I'm using a PIC16F1829 so I can use the FSR in a flat memory mode to point cleanly across banks. My issue is: how do I tell the assembler and the linker that I'm using, say, banks 3 and 4 for this purpose? If I use the keyword "protected" in the linker, do I have to use udata in the assembler? Or can I just pick some memory, put it in FSR and start writing?

And: Why on earth do books and tutorials (Gooligum excepted) insist on using absolute code, and why do lecturers keep using it? It does seem totally bizarre to promote absolute over relocatable long past its due date.

I realise this is opinion and might not be within Stackexchange's guidelines either but it's terribly important to know that relocatable code exists!

What's also rarely mentioned (thank you gooligum) is that the udata and code directives keep any variables/code so declared within the same bank. You are not required to banksel for every variable. — carveone, Jul 09 '14 at 11:39
Related: For an utterly marvellous tool set (and whole environment and more) from the PIC master see http://www.embedinc.com/pic/dload.htm - Olin could tell you much about it BUT you'd have to pay his consulting rate, which you'd like to avoid. SO just wade into the page. You'll find that absolute code has no place in his world. — Russell McMahon, Jul 09 '14 at 11:51
When I checked back to see why I used the dot in ".reset" that was from a post by Olin too so this all stemmed from him to begin with :-) — carveone, Jul 09 '14 at 11:59
@Russell: If someone wanted personal support for their project, I'd charge them. However, I'm willing to answer reasonable questions about the tools here in a public forum where everyone can see the answers and I can point others to them in the future. On a separate note, Russell, my consulting fee can often be a lot less than the cost of a screwup in your PIC project, and having your employees re-invent the wheel on company time. — Olin Lathrop, Jul 09 '14 at 14:54
I posted the question partially as I had a question and partially because a) I felt aggrieved that the book I'd learnt from didn't mention the linker and b) I wanted others (everyone!) to know that it's there and it solves some aggravating management problems. — carveone, Jul 09 '14 at 14:58
@OlinLathrop - I hope that you did not take any of that as a complaint or negative comment - it very genuinely wasn't meant that way. [We disagree on some aspects of how to do things, as you know, but when it comes to PIC technical aspects I'm an enthusiastic part of your cheering squad :-) ]. I expected you'd have something to say on this question as it's one of your specialist and/or favorite areas. I am also aware of your dislike of people contacting you privately on matters that can be handled in public. So a little advance path pointing seemed likely to be helpful :-). — Russell McMahon, Jul 09 '14 at 15:30
@Russell: No problem. I was mostly responding for others to see. I didn't want someone walking away thinking they'd have to pay to get any support at all. Yes, don't expect me to answer your *private* question unless it comes with a check, because I don't get any other benefit from that. Asking here though is different. I want people to use my tools, so will try to provide *public* support like answering questions here when I can. — Olin Lathrop, Jul 09 '14 at 15:47

score 5 · Accepted Answer · answered Jul 09 '14 at 14:48

There are two main reasons you still see absolute mode MPASM code out there: Originally that's all there was, and there are a lot of religious people out there with strongly held beliefs. They think absolute mode is "easier" somehow or that using the linker is hard to learn. Basically, that's what they know and don't want to bother learning a different way, even if it's a far better way, so they make excuses.

Yes, using relocatable mode is obvious. It is all I have ever used in well over 100 PIC projects. I did have the advantage of starting with PICs right after the linker was introduced (1998?), so I never had any investment in absolute code to protect. In any case, this is now ancient history, and there simply is no good reason to use absolute mode today. Note that absolute mode isn't even an option with any of the newer toolchains, like the one for the dsPIC.

Some substantial advantages of using relocatable code:

It is possible to actually allocate RAM for variables. The RES directive, which is the only way to do this, is only available in relocatable mode.
The common hack of using CBLOCK in absolute mode to define variables creates symbols with sequential values that only you know represent addresses of variables. Since the system doesn't know the memory locations are used for these variables, it can't detect and tell you about collisions or overflows.
You get to use modules, meaning different parts of your code are separately built. This provides, among other things, a separate namespace for local symbols in each module. Separate modules can be written separately, each having a local variable called COUNT or a label called LOOP without conflict, for example.
You can easily prevent code sections from crossing page boundaries.
In absolute mode, the code just ends up where it ends up, with no detection or warning that different parts are on different pages, and therefore require PCLATH manipulation to jump or call between them. Worse yet, this can change every build as code is modified. It might be fine one build, then you get a subtle bug when a page boundary happens to end up between the start and end of a loop.
Code is somewhat insulated from memory layout details. Some old PICs started user RAM at 20h. Code that used CBLOCK h'20' to define variable symbols (remember that CBLOCK doesn't really define variables) will break on many newer PICs without warning. Code that used UDATA and RES will be fine, or will get a linker error if the RAM region is overflowed.
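The difference between the CBLOCK hack and real allocation can be sketched as below. The symbol names count and temp are made up for illustration, and the two halves are alternatives, not parts of one source file.

```asm
; Alternative A, absolute mode: CBLOCK only gives two symbols the
; values 0x20 and 0x21. No memory is reserved, so nothing can check
; for collisions or for running past the end of RAM.
        cblock  h'20'
count
temp
        endc

; Alternative B, relocatable mode: RES reserves one byte for each
; variable inside a data section. The linker picks the addresses and
; reports an error if the RAM region overflows.
        udata
count   res     1
temp    res     1
```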

There are other advantages, but these are so compelling as to make absolute code a blatantly stupid choice. There really is no excuse.

While the overall advantages are overwhelming, there are some issues that need to be considered in relocatable mode as opposed to absolute mode.

The main one is that when using a bare UDATA, you don't know the bank of variables at build time. This prevents bank-setting optimizations. I get around this by specifying the bank for local variables of a module, and usually a single bank for the limited global state. Local variables within a module are forced to a particular bank by something like UDATA .BANK2, where .BANK2 is a section defined in the linker file that is forced to bank 2. That still lets the linker allocate variables within the bank, and you will get an error if you put too much stuff in any one bank. This scheme means you end up doing bank allocation per module, but that's still a lot better than the all-manual bank allocation without overflow detection that you get in absolute mode.
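A minimal sketch of that arrangement follows. It assumes the linker script has a DATABANK region for bank 2 called gpr2 (region names vary by device, so check the .lkr file), and the variable and label names are hypothetical. In MPASM syntax the section name goes in the label field in front of udata.

```asm
; Linker script (.lkr), not assembly: tie the section name .BANK2 to
; the bank 2 RAM region (gpr2 is an assumed region name).
;   SECTION NAME=.BANK2 RAM=gpr2

; Module source: everything in this section lands somewhere in bank 2.
.BANK2  udata
rxcount res     1
rxstate res     1

        code
rx_init:
        banksel rxcount         ; one bank select covers both variables
        clrf    rxcount
        clrf    rxstate
        return
```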

Since in my code the banks of most variables are known at build time, I can optimize bank setting. I have macros that set the bank and track the current bank setting. On a classic PIC 16, the DBANKIF (set direct bank if needed) macro emits 0, 1, or 2 BSF/BCF instructions on the bank bits in STATUS. A second redundant DBANKIF never emits any code. I can therefore use DBANKIF in front of most variable references, and only the minimum necessary bank setting instructions are actually included in the code. This results in nicely optimized code, with a single assembly constant to change if I want all the local variables in a different bank. The bank switching code will be automatically adjusted accordingly.
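To give a feel for how such build-time tracking can work, here is a rough sketch for a classic PIC 16 with bank bits RP0 and RP1 in STATUS. These are not the actual macros described above; the tracking variable curbank, the parameter name newbnk, and the value 4 meaning "unknown" are all assumptions made for this illustration.

```asm
curbank set     4               ; build-time state: 0..3 = known bank, 4 = unknown

unbank  macro                   ; invalidate the tracked bank, e.g. after a label
curbank set     4
        endm

dbankif macro   newbnk          ; newbnk: build-time constant 0..3
        if      curbank > 3     ; bank unknown: set both bits explicitly
          if    (newbnk) & 1
            bsf STATUS, RP0
          else
            bcf STATUS, RP0
          endif
          if    (newbnk) & 2
            bsf STATUS, RP1
          else
            bcf STATUS, RP1
          endif
        else                    ; bank known: touch only the bits that differ
          if    ((newbnk) ^ curbank) & 1
            if  (newbnk) & 1
              bsf STATUS, RP0
            else
              bcf STATUS, RP0
            endif
          endif
          if    ((newbnk) ^ curbank) & 2
            if  (newbnk) & 2
              bsf STATUS, RP1
            else
              bcf STATUS, RP1
            endif
          endif
        endif
curbank set     newbnk          ; remember the bank for the next invocation
        endm
```

With this sketch, a first "dbankif 2" after "unbank" emits two instructions, and a second "dbankif 2" straight after it emits none.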

Since the DBANKIF and related macros track the bank state in source code order, you do have to pay some attention. For example, this system can't know that code from another place with a different live bank setting may jump into other code. For this reason, I have macros that either tell the build-time logic what the bank setting actually is, or tell it explicitly that it doesn't know. For example, most code labels have UNBANK following them. That tells the build-time bank tracking system to invalidate any assumptions. The next DBANKIF will explicitly set both bank bits, then the system starts tracking from there again.

The way I deal with pages is to use the convention that the upper two bits of PCLATH are always set to the page of the currently executing code, and that each page is defined as a separate memory region in the linker file. That guarantees that any one code section won't straddle a page boundary. Usually each module contains a single named code section, so effectively code within a module can use local GOTO and CALL without PCLATH manipulation. The flip side is that you have to assume code in any other section can be on another page, if your PIC has more than one page. This means before a CALL to a remote subroutine, you have to set PCLATH for the target page, then restore it after the return. I have GCALL (global call) and GJUMP (global jump) macros that do just that in a single line of source code.
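One possible shape for such macros is sketched below, using the standard pagesel directive. These are not the actual GCALL and GJUMP implementations, and the linker-script addresses assume a part with two 2K-word pages.

```asm
; Linker script (.lkr), not assembly: one region per page, so a code
; section placed in either region cannot straddle the boundary.
;   CODEPAGE NAME=page0 START=0x0   END=0x7FF
;   CODEPAGE NAME=page1 START=0x800 END=0xFFF

gcall   macro   target
        pagesel target          ; point PCLATH at the target's page
        call    target
        pagesel $               ; after return: PCLATH back to this page
        endm

gjump   macro   target
        pagesel target          ; no restore: execution continues at target
        goto    target
        endm
```

A module that calls a routine in another module would declare it with extern and then write, for example, "gcall uart_send", where uart_send is a hypothetical routine name.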

Yes, I hit a certain point in code size and started to run into problems. Specifically the management of variables and code and the approaching code page wall. If I moved a variable I had to manually check all the bank selection code to see that it was consistent. Now that I know each source file can be held within a code page I can goto (prefer BRA) at will within that source and then use a macro to far call outside them. Heck, I'm not a computer and it's not within my competence to go around hand placing code! Thanks for the answer - it details a lot of the edge cases I wasn't convinced about. — carveone, Jul 09 '14 at 15:16
The only thing I did wonder about was: if I place sections using the linker script, I presume I can use every other part of memory for things like buffers. Seems wrong somehow though. I presume there's no problem with doing a "buffer res 80" and taking the whole page! — carveone, Jul 09 '14 at 15:30
@carv: "buffer res 80" would take a whole bank (not page) on many classic PIC 16. There is nothing wrong with that. The linker will guarantee that nothing else will collide with the buffer, or give you an error if you try to force something to the same bank. If you define the buffer in a named section, you can even know the bank at build time. That matters little on a 16F1xxx with full-width FSRs, but it can help on a classic PIC 16 that only has an 8-bit FSR. — Olin Lathrop, Jul 09 '14 at 15:41
Sorry yes, "bank" not "page". That works for me! I need (ok, _prefer_) a 120 byte SPI buffer and knowing the banks means I can have two of them adjacent on a full-width FSR which simplifies the SPI write code. That'll do nicely. Thanks! — carveone, Jul 09 '14 at 16:41
I've not used MPASM much, but when coding for another platform where code paging was important (6502) I much preferred to use an absolute-mode assembler along with macros to flag page-boundary problems. When programming under tight constraints, it may be necessary to impose "interesting" conditions upon how things are laid out; trying to get a relocatable-code toolchain to produce code that obeys constraints it doesn't understand can be a lot harder than trying to make absolute-mode code do likewise. — supercat, Apr 09 '15 at 21:28
@supe: Perhaps that's how it was with the 6502 tools, but this is not true of MPASM/MPLIB/MPLINK. Relocatable mode is a superset of absolute mode. You can force pieces of code in relocatable mode to specific addresses when you want to, for example. — Olin Lathrop, Apr 09 '15 at 22:02

score 0 · Answer 2 · answered Jul 17 '14 at 13:20

I'd like to follow up with a longer comment about this question I had: "I want to use a large block of data for a buffer. I'm using a PIC16F1829 so I can use the FSR in a flat memory mode to point cleanly across banks...."

I want to post this answer for others' interest as it shows how incredibly easy it is on the PIC16F1XXX processors to use larger buffers in tandem with relocatable code. If you look at the LKR file for the processor of interest, in my case the PIC16F1829, you see these parts:

DATABANK   NAME=gpr0  START=0x20  END=0x6F  SHADOW=linear0:0x2000
DATABANK   NAME=gpr1  START=0xA0  END=0xEF  SHADOW=linear0:0x2050
....
SECTION   NAME=LINEAR0 RAM=linear0         // Linear Memory

Note the SHADOW= parts and the LINEAR0 section. Then just do this in your code:

; Allocate two buffers 100 and 150 bytes in "linear" memory.
LINEAR0      UDATA
buf1         RES        d'100'
buf2         RES        d'150'

And that's it! It really is that straightforward. The linker knows exactly where it puts these so you don't even have to think about it. Use the FSRs to write:

 movlw   high    buf1
 movwf   fsr0h
 movlw   low     buf1
 movwf   fsr0l

FSR0 now points to buf1 and you can use movwi and moviw to write to and read from the buffers. Relocatable code, the linker and the PIC16F1x make this so easy.
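As a small follow-on, a loop that clears buf1 through FSR0 might look like the sketch below. It assumes buf1 is defined in the same source file as shown earlier; the counter variable and the labels are hypothetical, and register names are written as they are spelled in the device include file.

```asm
        udata
count   res     1               ; hypothetical loop counter

        code
clear_buf1:
        movlw   high buf1       ; FSR0 = linear address of buf1
        movwf   FSR0H
        movlw   low buf1
        movwf   FSR0L
        banksel count
        movlw   d'100'          ; buf1 is 100 bytes long
        movwf   count
        clrw                    ; W = 0, the value to store
clr_loop:
        movwi   FSR0++          ; store W at [FSR0], then increment FSR0
        decfsz  count, f
        bra     clr_loop
        return
```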
