44

Can anyone recommend a small, free implementation of AES-128 Rijndael for microcontrollers? Ideally for the PIC18, though a general implementation in C would also be useful.

Compiling the axTLS implementation for PIC18 and encrypting/decrypting a block requires 6KB of ROM and 750 bytes of RAM.

Compiling rijndael-alg-fst.c for PIC18 and encrypting/decrypting a block requires 28KB ROM and 0.5KB RAM.

Compiling Brian Gladman's 8-bit AES for PIC18 and encrypting/decrypting a block requires 19KB of ROM and 190 bytes of RAM.

Are there better-optimised, PIC-specific variants available?

(updated RAM requirements for axTLS version)

davidcary
Toby Jaffey
  • 1
    Is this for a bootloader? – Daniel Grillo Apr 20 '11 at 11:00
  • No, it's for a network application – Toby Jaffey Apr 20 '11 at 11:04
  • Microchip has an implementation for dsPIC and PIC 24 that has a code size of 3,018 bytes, but it only had encryption, no decryption. Guessing this doesn't cut it for you though. – Kellenjb Apr 21 '11 at 02:28
  • @Kellenjb Interesting, but I'm looking for something small for 8 bit micros – Toby Jaffey Apr 21 '11 at 09:31
  • Might try asking this on StackOverflow.com with the "embedded" tag on the question – Tall Jeff Apr 21 '11 at 18:02
  • A major question is does it need to be AES? There are other crypto schemes more optimised to low-end micros. Similarly does it have to be PIC18? There may be more suitable targets. Which compiler are you using? The paid-for versions of hitech C have significantly better code efficiency than the free ones. – mikeselectricstuff Apr 22 '11 at 08:38
  • 1
    @mikeselectricstuff Yes, it needs to be AES. I am trying to interoperate with an existing system using AES-128. I'm interested in any small AES implementation, but I am currently targeting PIC18. I'm using the HiTech Pro picc18 compiler. – Toby Jaffey Apr 22 '11 at 15:10
  • you need both encrypt and decrypt? – old_timer Apr 24 '11 at 14:56
  • @dwelch Yes, CBC mode for encrypt/decrypt and CBC MAC – Toby Jaffey Apr 24 '11 at 18:18

8 Answers

19

I'm wondering how you got 7.5kB of RAM usage with axTLS. Looking at the code, all the context is stored in this structure:

typedef struct aes_key_st 
{
    uint16_t rounds;
    uint16_t key_size;
    uint32_t ks[(AES_MAXROUNDS+1)*8];
    uint8_t iv[AES_IV_SIZE];
} AES_CTX;

The size of this structure is 2 + 2 + 4 * 15 * 8 + 16 = 500 bytes. I see no global variables in aes.c, and the automatic variables are all small, so stack usage is also reasonable. So where does the 7.5kB go? Perhaps you're trying to use the whole library instead of just extracting the AES implementation from it?

Anyway, this implementation looks pretty simple, I'd rather stick to this code and try to optimize it. I know it can be tricky, but learning the AES details can help you at least to estimate the absolute minimum RAM usage.
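To double-check the context size on your own toolchain, you can let the compiler do the sum. This is only a minimal sketch: the values AES_MAXROUNDS = 14 and AES_IV_SIZE = 16 are assumptions inferred from the 15 and 16 in the calculation above, and aes_ctx_size and AES128_MIN_KS_BYTES are hypothetical names introduced here for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed values, inferred from the size calculation above. */
#define AES_MAXROUNDS 14
#define AES_IV_SIZE   16

typedef struct aes_key_st
{
    uint16_t rounds;
    uint16_t key_size;
    uint32_t ks[(AES_MAXROUNDS + 1) * 8];
    uint8_t iv[AES_IV_SIZE];
} AES_CTX;

/* Hypothetical name: bytes needed to hold a complete AES-128 key schedule,
 * i.e. 11 round keys (the cipher key plus 10 rounds) of 16 bytes each. */
#define AES128_MIN_KS_BYTES ((10 + 1) * 16)

/* Size of the context as the compiler lays it out, padding included. */
static size_t aes_ctx_size(void)
{
    return sizeof(AES_CTX);
}
```

The ks array alone is 480 bytes, while an expanded AES-128 key schedule only needs 176 bytes, so if you only ever use 128-bit keys, shrinking that array is probably the first thing to try.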

Update: I've just compiled this library on IA-32 Linux and written a simple CBC AES-128 encryption test. I got the following results (the first hex number on each line is the section size):

 22 .data         00000028  0804a010  0804a010  00001010  2**2
                  CONTENTS, ALLOC, LOAD, DATA
 23 .bss          00000294  0804a040  0804a040  00001038  2**5
                  ALLOC

That's just 660 bytes of .bss (I've declared AES_CTX as a global variable). Most of .data is occupied by the IV and key. I don't include .text here, as you'll get a totally different result on PIC (the data sections should be nearly the same size on both architectures).

Code Painters
  • I misread by a factor of 10 on the axTLS version. You're right. But, I'm still interested in more efficient versions of AES... – Toby Jaffey Apr 22 '11 at 15:10
  • 6
    Efficient in terms of size or speed? What are the constraints, actually? Keep in mind that smaller libraries will likely be slower - if you look into the source code of the bigger (in terms of code section) libraries, most of the bloat is due to pre-calculated constant arrays. – Code Painters Apr 23 '11 at 19:35
  • 1
    In terms of RAM and ROM footprint. Speed isn't an issue, but I'm looking to cram a lot of functionality into a small device. – Toby Jaffey Apr 23 '11 at 21:13
14

I know this question is a bit old, but I've just recently had to research it myself as I'm implementing AES128 on a PIC16 and an 8051, and so I was curious about this question too.

I've used something like this: http://cs.ucsb.edu/~koc/cs178/projects/JT/aes.c and my RAM usage is a couple of hundred bytes, with a binary size of less than 3 KB of ROM.

My best advice is to read the Wikipedia page http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation and understand the different modes, for instance how AES in OFB mode uses ECB-style block encryption as its basic building block. Also, the XORing in OFB mode makes it a symmetrical operation, so encryption and decryption are the same function, which also saves space.
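To illustrate the OFB point, here is a minimal sketch of why one routine can serve for both directions. The names ofb_xcrypt and toy_block_encrypt are hypothetical, and toy_block_encrypt is only a stand-in so that the example is self-contained: it is not AES and provides no security, so swap in a verified AES-128 block encryption.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 16

/* Hypothetical stand-in for a real AES-128 block encryption.
 * It is NOT a cipher; it only keeps this sketch self-contained. */
static void toy_block_encrypt(uint8_t block[BLOCK_SIZE],
                              const uint8_t key[BLOCK_SIZE])
{
    size_t i;
    for (i = 0; i < BLOCK_SIZE; i++)
    {
        block[i] = (uint8_t)((block[i] ^ key[i]) + 0x3D + i);
    }
}

/* OFB mode: the block cipher only ever runs in the encrypt direction,
 * on a feedback register that starts out as the IV. The data is simply
 * XORed with the resulting keystream, so calling this a second time with
 * the same key and IV restores the original data. */
static void ofb_xcrypt(uint8_t *data, size_t len,
                       const uint8_t key[BLOCK_SIZE],
                       const uint8_t iv[BLOCK_SIZE])
{
    uint8_t feedback[BLOCK_SIZE];
    size_t i;

    for (i = 0; i < BLOCK_SIZE; i++)
    {
        feedback[i] = iv[i];
    }
    for (i = 0; i < len; i++)
    {
        if (i % BLOCK_SIZE == 0)
        {
            /* Produce the next keystream block from the previous one. */
            toy_block_encrypt(feedback, key);
        }
        data[i] ^= feedback[i % BLOCK_SIZE];
    }
}
```

Because only the forward block operation is needed, a build that uses OFB exclusively can leave out the inverse cipher and the inverse S-box altogether.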

When I understood how AES really worked, I could implement it in C and then test it against the NIST specification** (do this! much code found online is flawed) and only implement what I absolutely needed.

I was able to fit AES128 on an 8051 alongside some other RF firmware by doing this customization and optimization. The RAM usage (for the whole system) went down from ~2.5 KB to just below 2 KB, meaning we did not have to upgrade to an 8051 with 4 KB of SRAM, but could keep using the cheaper 2 KB SRAM version.

** Test Vectors are in Appendix F in: http://csrc.nist.gov/publications/nistpubs/800-38a/addendum-to-nist_sp800-38A.pdf

EDIT:

Finally got the code on Github: https://github.com/kokke/tiny-AES-c

I've optimized a bit for size. GCC size output when compiled for ARM:

$ arm-none-eabi-gcc -O2 -c aes.c -o aes.o
$ size aes.o
   text    data     bss     dec     hex filename
   1024       0     204    1228     4cc aes.o

So the resource usage is now 1KB code, 204 bytes RAM.

I don't remember how to build for the PIC, but if the 8bit AVR Atmel Mega16 is anything like the PIC, the resource usage is:

$ avr-gcc -Wall -Wextra -mmcu=atmega16 -O2 -c aes.c -o aes.o
$ avr-size aes.o
   text    data     bss     dec     hex filename
   1553       0     198    1751     6d7 aes.o

So 1.5K of code and 198 bytes of RAM.

Morten Jensen
  • I wonder how [an implementation I made back in 2001](http://www.kylheku.com/~kaz/rijndael.html) would stack up. It doesn't generate the S-boxes; they are static. – Kaz Sep 13 '13 at 00:31
6

I recently took the axTLS implementation and worked on shrinking it as much as I could. You can easily generate the S-boxes yourself and save yourself a few hundred bytes.

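#include <stdint.h>

static uint8_t aes_sbox[256];   /** AES S-box  */
static uint8_t aes_isbox[256];  /** AES iS-box */

/* Fill both tables at start-up instead of storing them in ROM. */
void AES_generateSBox(void)
{
    uint32_t t[256], i;
    uint32_t x;

    /* t[i] = 3^i in GF(2^8); 3 generates every non-zero element. */
    for (i = 0, x = 1; i < 256; i ++)
    {
        t[i] = x;
        x ^= (x << 1) ^ ((x >> 7) * 0x11B);
    }

    aes_sbox[0] = 0x63;
    for (i = 0; i < 255; i ++)
    {
        /* t[255 - i] is the multiplicative inverse of t[i]. */
        x = t[255 - i];
        /* Duplicate the byte so the shifts below act as rotations,
           then apply the AES affine transform. */
        x |= x << 8;
        x ^= (x >> 4) ^ (x >> 5) ^ (x >> 6) ^ (x >> 7);
        aes_sbox[t[i]] = (x ^ 0x63) & 0xFF;
    }

    /* Invert the mapping to obtain the inverse S-box. */
    for (i = 0; i < 256;i++)
    {
         aes_isbox[aes_sbox[i]]=i;
    }
}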

You can get the full source at: http://ccodeblog.wordpress.com/2012/05/25/aes-implementation-in-300-lines-of-code/

Andrew
3

I've been doing an implementation in C, AES-128 only, called aes-min, with MIT license. It targets small microprocessors (e.g. 8-bit) with little RAM/ROM.

It has optional on-the-fly key schedule calculation to reduce memory requirements (avoiding the need for the full expanded key schedule in RAM).
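As a rough illustration of the on-the-fly idea in general (a sketch under my own assumptions, not code taken from aes-min), each AES-128 round key can be derived in place from the previous one, so only 16 bytes of key material need to live in RAM at any time. The names sbox_init, xtime and next_round_key are hypothetical; the S-box is built at start-up with the same power-table method as the generator posted in another answer here.

```c
#include <stdint.h>

static uint8_t sbox[256];   /* forward S-box, filled by sbox_init() */

/* Multiply by 2 in GF(2^8), reducing by the AES polynomial 0x11B. */
static uint8_t xtime(uint8_t v)
{
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1B : 0x00));
}

/* Build the S-box in RAM: t[i] = 3^i, and t[255 - i] is the inverse of t[i]. */
static void sbox_init(void)
{
    uint8_t t[256];
    uint16_t i, x;

    for (i = 0, x = 1; i < 256; i++)
    {
        t[i] = (uint8_t)x;
        x ^= (x << 1) ^ ((x >> 7) * 0x11B);
    }
    sbox[0] = 0x63;
    for (i = 0; i < 255; i++)
    {
        x = t[255 - i];
        x |= x << 8;                                    /* shifts act as rotations */
        x ^= (x >> 4) ^ (x >> 5) ^ (x >> 6) ^ (x >> 7); /* affine transform */
        sbox[t[i]] = (uint8_t)((x ^ 0x63) & 0xFF);
    }
}

/* Advance a 16-byte AES-128 round key to the next round key, in place.
 * *rcon must be 0x01 for the first call; it is updated for the next one. */
static void next_round_key(uint8_t rk[16], uint8_t *rcon)
{
    uint8_t i;

    /* First word: XOR with SubWord(RotWord(last word)) and the round constant. */
    rk[0] ^= sbox[rk[13]] ^ *rcon;
    rk[1] ^= sbox[rk[14]];
    rk[2] ^= sbox[rk[15]];
    rk[3] ^= sbox[rk[12]];

    /* Remaining three words: XOR each with the word just before it. */
    for (i = 4; i < 16; i++)
    {
        rk[i] ^= rk[i - 4];
    }
    *rcon = xtime(*rcon);
}
```

Starting from the cipher key with the round constant at 0x01, ten calls to next_round_key walk through all the round keys, so the 176-byte expanded schedule never has to exist in RAM. The price is that the key schedule is recomputed for every block.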

Craig McQueen
2

This rather old question already has some great answers, but I see no harm in throwing in my own two cents, since my library satisfies some requirements that none of the others do. So please allow me to introduce

µAES

  • It is fully compliant with ANSI-C or ISO C89, i.e. portable.
  • Supports almost all standard block cipher modes and some less-popular ones including ECB, CBC, CFB, OFB, CTR, GCM, CCM, OCB, EAX (EAX') and even Poly1305.
  • All-in-one with no dependencies on any other external library. The whole code is included in a C file and its header: micro_aes.h
  • It is highly flexible, and many features are controlled by macros, so one can discard the inessential parts simply by disabling the associated macros.
  • Small, even a bit smaller than tiny-AES. Also the block-cipher APIs are optimized as much as possible. For example, if you disable all macros except the GCM related ones and set the key size to 256-bit, compiled code size for the resulting AES-256-GCM API would be around 3 KB.

Full disclosure: I am the owner of this repository.

polfosol
1

You may find this implementation interesting. It's from an open source AVR crypto-library.

You can find some general (outdated) information and statistics about code size and performance here.

AES:

AES information

I only played around with the SHA-1 source from that lib, so I can't comment on AES.

Rev
1

I'm using Texas Instruments' implementation for the MSP430 on a Freescale S08SH8 microcontroller with 512 bytes of RAM and 8 KB of flash, and also on an Arduino, without any rework.

http://www.ti.com/lit/an/slaa547a/slaa547a.pdf

http://www.ti.com/tool/AES-128

0

The smallest AES128 I wrote for the PIC series can run in 900 instructions and 42 bytes of RAM. I use it myself on the PIC12 series, but the PIC10F206 is also possible :-).

I can't disclose the code since it belongs to my company, but I wrote it in asm for the PIC10/12/16 series. Encryption takes 444 bytes of code including the 256-byte lookup table; this also includes the key load function, which is some 25 bytes.

I would advise everyone to read the AES paper and implement it yourself! Most implementations are very bad and use way too much RAM and ROM.

I also implemented AES128 for the dsPIC and PIC24; it uses about 70% less code space than Microchip's library and is also a bit faster. The dsPIC and PIC24 implementation numbers:

" Encryption takes about 2995 cycles. 79.10uS @ 40 MIPS, 197.75uS @ 16 MIPS"

" DecKeySetup takes about 567 cycles. 14.20uS @ 40 MIPS, 35.43uS @ 16 MIPS"

" Decryption takes about 3886 cycles. 97.15uS @ 40 MIPS, 242.88uS @ 16 MIPS"

" Total code size is 1050 Words incl tables."

The beauty of the PIC24 core is that some instructions are 32-bit, which makes it much easier to build a small AES128 implementation. My code uses all the 32-bit instructions available and is completely 32-bit in operation, so I can port it quickly to the PIC32 or other 32-bit CPUs.

AES is very simple to implement; most people just don't even try!

Look at the link: http://www.cs.bc.edu/~straubin/cs381-05/blockciphers/rijndael_ingles2004.swf

Kevin Vermeer
  • Is it open source? Can you post the code? – Toby Jaffey Jul 17 '11 at 21:59
  • 2
    @Paul - Welcome to Electrical Engineering! Your answer is interesting and encouraging, but it isn't really useful without more detail. 900 instructions could probably fit in a code block! Please use the "edit" link below the answer to improve it. – Kevin Vermeer Jul 17 '11 at 22:08
  • @PaulHolland great news, where is the code? – Frank Jul 19 '11 at 09:11
  • 2
    @Paul - You'd be getting a pile of upvotes instead of the downvote you currently have if you explained how you wrote it and posted the code! If you can't post the code for licensing reasons, at least explain how you wrote it and how Joby could parallel your work. – Kevin Vermeer Jul 19 '11 at 13:13