0

I have built a string to enum converter, which converts a known set of strings that it receives into enum, in order to satisfy a function pointer. Some of you may recognize some elements of this question elsewhere. This is written in Arduino, so C++.

enum input {FIRMV, VALCN, RQSPC, LIDVM, LIDVC, PREVC, STPWM, CHPWM, STPER, CHPER, SHIGH, PREST, PREND, RQPRE, CLPRE,
            RQDAT, CUPTC, LIDTC, CURPR, CUPTT, CUFAC, LIFAC, STFAC, MTFAC, PPFAC, PCOFF, CUPFM, CUPFC, LIDFM,
            LIDFC, PREFC, FCPWM, FCPER, FHIGH, INVAL
           };

input convert(String str) { //A string to enum converter. Strings are sequences of characters used in Arduinos.
  if (str == "FIRMV") return FIRMV;
  else if (str == "VALCN") return VALCN;
  else if (str == "RQSPC") return RQSPC;
  else if (str == "LIDVM") return LIDVM;
  else if (str == "LIDVC") return LIDVC;
  else if (str == "PREVC") return PREVC;
  else if (str == "STPWM") return STPWM;
  else if (str == "CHPWM") return CHPWM;
  else if (str == "STPER") return STPER;
  else if (str == "CHPER") return CHPER;
  else if (str == "SHIGH") return SHIGH;
  else if (str == "PREST") return PREST;
  else if (str == "PREND") return PREND;
  else if (str == "RQPRE") return RQPRE;
  else if (str == "CLPRE") return CLPRE;
  else if (str == "RQDAT") return RQDAT;
  else if (str == "CUPTC") return CUPTC;
  else if (str == "LIDTC") return LIDTC;
  else if (str == "CURPR") return CURPR;
  else if (str == "CUPTT") return CUPTT;
  else if (str == "CUFAC") return CUFAC;
  else if (str == "LIFAC") return LIFAC;
  else if (str == "STFAC") return STFAC;
  else if (str == "MTFAC") return MTFAC;
  else if (str == "PPFAC") return PPFAC;
  else if (str == "PCOFF") return PCOFF;
  else if (str == "CUPFM") return CUPFM;
  else if (str == "CUPFC") return CUPFC;
  else if (str == "LIDFM") return LIDFM;
  else if (str == "LIDFC") return LIDFC;
  else if (str == "PREFC") return PREFC;
  else if (str == "FCPWM") return FCPWM;
  else if (str == "FCPER") return FCPER;
  else if (str == "FHIGH") return FHIGH;
  else return INVAL;
}

Is there a better way of converting the string to enum that doesn't involve slowly running through every layer of an if-else stack? Please let me know if more information is needed.

HFOrangefish
  • 151
  • 7
  • 3
    "This is written in Arduino, so C++." You need to clarify what you mean by this. In particular, do you have the STL available, or are you just doing "C with classes and a dodgy hacked up `string` implementation"? – Philip Kendall Aug 31 '22 at 15:00
  • I haven't written c++ since college, so I might sound a bit daft, but can't you use a `String` in a switch statement? That was my initial thought. The compiler would basically create a table for you. – Greg Burghardt Aug 31 '22 at 22:34
  • It's worth noting that `else` is extraneous code in this case. The branching will work exactly the same without it. You could autogenerate this code from the a specification doc by cutting and pasting into Excel and writing a formula. Would take all of five minutes and would be highly reliable. There is no reason to get too clever about this. – John Wu Sep 01 '22 at 06:46
  • I would also consider putting them in alphabetical order. Otherwise commits and diffs could be difficult to follow if the list should ever change. – John Wu Sep 01 '22 at 06:53
  • 1
    @JohnWu wouldn’t ordering according to the probability (most ised command first) lead to better performance (especially on a slall arduino cpu and if this function is frequently called)? – Christophe Sep 01 '22 at 07:22

3 Answers3

6

As already hinted at in the answer to the first question, a table might be a good idea, even if only mapping from string to enum. The code would be a bit more readable, even though not necessarily shorter nor faster.

If you need speed, you might sort the entries and use a binary search which is simple enough to implement, but that's probably overkill for 33 entries - my rough estimate is about 4-5 comparisons on average compared to about 16 comparisons on average for the simple linear search, and you might improve on that if you put often-used commands at the start of the table. Likely not worth the effort of the binary search.

Hans-Martin Mosner
  • 14,638
  • 1
  • 27
  • 35
  • 1
    Good point about choosing a good ordering when using linear search - I like that. – Toby Speight Aug 31 '22 at 15:50
  • Of course, binary search isn't much effort, given that it's right there in the C++ Standard Library. – Toby Speight Aug 31 '22 at 15:51
  • I don't want to write a separate answer, [here's an implementation](https://gist.github.com/jaskij/d80601be11cd477e052f5653de102079) using an array lookup with minimal C++ feature usage (`std::string` is a placeholder here, and the only change needed to have the code work in C is using `strncmp` and removing that `static_cast`). I skipped some of the entries in the array for brevity. – jaskij Aug 31 '22 at 17:34
  • Why O(logn) for a binary search when you could use hashing in O(1) ? – Christophe Sep 01 '22 at 07:21
  • 2
    O(x) is helpful in when the sets are large, but for really small sets hashing a string might be more expensive than a small number of naive comparisons. It's possible that 30 elements is above the break-even point, but that would have to be measured, and it might have code size costs that you don't want to bear in a microcontroller. – Hans-Martin Mosner Sep 01 '22 at 07:30
  • @Christophe advanced data structures are, in general, quite problematic in embedded programming, especially if they have to sit in RAM (although in this case it could likely be put in the flash). In this particular case, unless it's part of a hot loop, doing a hash table is just not worth the effort. – jaskij Sep 01 '22 at 14:18
  • @jaskij I hear what you say, but keeping a balanced binary tree for an ordinary map sounds more time consuming than an unordered map and the current code risks to create a lot of conversions/constructions to create the String objects from the string litteral in the if-else chain. I don’t know how String is defined and if the compiler could use constant propagation to avoid this step. – Christophe Sep 01 '22 at 18:19
  • @Christophe urgh. You're right. I thought it was smarter - you could easily overload the comparison operators for `char*` and use `strncmp` internally, but Arduino's library doesn't do it. You can easily work around that by using `String::c_str` and `strncmp`. – jaskij Sep 01 '22 at 19:40
  • 1
    As a sidenote, that documentation is horrible. It doesn't even tell you the return type, at least not straight. Depending who you're targeting to consume your API, `String::String(char*)` should be `explicit`. – jaskij Sep 01 '22 at 19:41
6

First thought that comes to mind is skepticism that this is a bottleneck in your program's execution. Most arduino programs spend most of their time asleep, so you should have plenty of leftover CPU available to run through the list as-is.

If for some reason, you still want to speed it up a little bit, I think for a good trade-off between execution speed and code complexity, I think I would switch on the first character then test the full string after that.

Of your 34 commands, they start with 8 unique characters. More evenly distribute the names of the commands if possible, and you can further speed up things.

You'd end up with something that looks like this:

input convert(String str) { //A string to enum converter. Strings are sequences of characters used in Arduinos.
  switch(str[0]) {
    case 'C':
        if (str == "CHPER") return CHPER;
        else if (str == "CHPWM") return CHPWM;
        ...etc...
        else return INVAL;    
    case 'F':
        ...etc...    
    case 'M':
        if (str == "MTFAC") return MTFAC;        
        else return INVAL;
    case 'S':
        if (str == "SHIGH") return SHIGH;
        else if (str == "STFAC") return STFAC;
        else if (str == "STPER") return STPER;
        else if (str == "STPWM") return STPWM;        
        else return INVAL;             
    return INVAL;
  }
}

If you have an abundance of leftover flash, you could use a code generation tool to create an tree of nested switches to narrow down the branches each character at a time.

Additionally, you can skip a comparison by also skipping the first character once you've switched on the first: if (str[1] == "HPER") return CHPER; at the expense of a little bit of readability, and possibly flash usage.

Arduinos, and embedded devices in general play by different rules than desktop apps and webserver applications, so some of the approaches that make sense on an arduino would be crazy-talk for a server application, and vise-versa. There are many situations where just iterating through a loop can be many times faster than more sophisticated algorithms, particularly when dealing with small datasets as is common on embedded devices like these.

Using the C++ STL on arduinos can be pretty treacherous on arduinos because if you aren't very familiar with it you can suddenly explode your code size.

whatsisname
  • 27,463
  • 14
  • 73
  • 93
3

Sounds like a good candidate for a std::map, or perhaps a std::unordered_map.

If you already have a function (probably using switch) that converts the enum to string, then you could use that to populate the map, ensuring the two are consistent.

Also, I've no idea what String is (not being an Arduino developer), but consider accepting a std::string_view rather than passing by value.

Toby Speight
  • 550
  • 3
  • 14
  • 1
    `String` is an [Arduino thing](https://www.arduino.cc/reference/en/language/variables/data-types/stringobject/); picking from [an answer to the OP's earlier question](https://softwareengineering.stackexchange.com/a/440732/136413), they are [possibly a bad idea](https://cpp4arduino.com/2020/02/07/how-to-format-strings-without-the-string-class.html). – Philip Kendall Aug 31 '22 at 15:05
  • 1
    Excellent idea not to build an additional object for parameter passing. `const&String` should do, even for Arduino specific types. – Christophe Sep 01 '22 at 07:14
  • Maybe it’s worth to explain that unordered_map uses hashing and provides a result in O(1) rather than O(log(n)) for an ordered map or binary trees. – Christophe Sep 01 '22 at 07:17
  • @Christophe, anyone that doesn't know the difference between the types really ought to re-read their documentation! The scaling properties might be misleading, because for small containers like this, the overheads of hashing and bucketing (both memory and CPU) may well dominate. – Toby Speight Sep 01 '22 at 09:05
  • @TobySpeight the assumption here is to build the map once and use an easy hash. Keeping things as they are requires a lot of conversions (between const char* and String) a an awful lot of bytes compared. But of course, if you have a convincing benchmark, I may withdraw my claim ;-) – Christophe Sep 01 '22 at 10:25
  • @Christophe, `std::map()` should be able to look up dissimilar types without conversion, provided the appropriate relational operators have been defined. As I said, I don't have the definition of the non-standard `String` here, but I anticipate no reason why it couldn't be comparable to a standard string or string-view without conversion. – Toby Speight Sep 01 '22 at 10:36
  • Or just make the map keys be `String` objects. – Toby Speight Sep 01 '22 at 10:38