Pythonic way of writing `for my_var in my_iterable`

Question

A very common pattern in Python is:

for something in ['an', 'iterable']:
    func(something)

I almost always see the iterable as a list (like in this example), but it could very well be a tuple, set, or something else. Is the list the most performant and pythonic way to do this?

Similarly, I often see the following:

if something_else in ['another', 'iterable']:
    func(something_else)

In this case, we're actually searching for `something_else` (note the `if` instead of `for`), which would probably be more performant if the iterable were a set. However, this might not necessarily be true for very small quantities. Is a list still the Pythonic way to do this?

Are you asking which data structure is more appropriate for iterating and determining object presence, or how best to write these exact scenarios of having a list of string literals in a `for` or `if` statement? — Kyle McVay, May 01 '19 at 02:55

score 2 · Answer 1 · answered May 01 '19 at 03:40

All of the options you've laid out for the first case (iteration) are perfectly Pythonic. If you're going to write a for loop and iterate through an iterable, then it doesn't really matter what that iterable is (unless order matters, then obviously use a tuple, list, or something that maintains order and not a set or dict). That's the whole point of having an iterable as an abstraction!
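As a minimal sketch of that point (the function name `shout_all` is made up for illustration), the same loop body works unchanged whichever iterable you hand it:

```python
def shout_all(words):
    """Upper-case every word from any iterable (hypothetical helper)."""
    result = []
    for word in words:  # works for a list, tuple, set, generator, ...
        result.append(word.upper())
    return result


from_list = shout_all(['an', 'iterable'])
from_tuple = shout_all(('an', 'iterable'))
from_generator = shout_all(w for w in ('an', 'iterable'))
# A set works too, but its iteration order is not guaranteed.
from_set = shout_all({'an', 'iterable'})
```

Only the set result can come back in a different order, which is the one caveat mentioned above.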

For the second case (checking a collection contains an item), I'd always recommend using a set. Sure, it barely matters if your list is small. But code changes over time, and usually that change is an addition, not a subtraction. It's common for people to extend the collection to include more elements to test for. Set up future maintainers (possibly including yourself) for success, not a performance headache. Also, a set is just the right tool/abstraction for the job, since if you just want to know whether an item is in a collection, then duplicates never matter.
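One way this could look in practice, as a sketch rather than the only right answer: keep the candidates in a named constant so that adding entries later doesn't change the lookup cost. The names `ALLOWED_WORDS` and `is_allowed` are hypothetical, and using a `frozenset` here is my own choice to signal that the collection isn't meant to be mutated.

```python
# Hypothetical names, for illustration only.
ALLOWED_WORDS = frozenset({'another', 'iterable'})


def is_allowed(something_else):
    """Return True when something_else is one of the allowed words."""
    return something_else in ALLOWED_WORDS
```

One thing to keep in mind: the items you test must be hashable. Checking an unhashable value such as a list against a set raises `TypeError`, whereas the same check against a list just compares for equality.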

If you're curious about performance, never guess! Always measure. It does make a slight difference even with a small iterable:

In [1]: def test_if_in(x):
   ...:     if x in ['one', 'two', 'three']:
   ...:         return 'Yes'
   ...:     else:
   ...:         return 'No'
   ...:

In [2]: def test_if_in_set(x):
   ...:     if x in {'one', 'two', 'three'}:
   ...:         return 'Yes'
   ...:     else:
   ...:         return 'No'
   ...:

In [3]: %timeit test_if_in('abc')
The slowest run took 10.23 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 167 ns per loop

In [4]: %timeit test_if_in_set('abc')
The slowest run took 11.04 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 116 ns per loop

In [5]: %timeit test_if_in_set('one')
The slowest run took 11.25 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 114 ns per loop

In [6]: %timeit test_if_in('one')
The slowest run took 11.73 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 109 ns per loop

Even though both are blindingly fast, the set version is about 1.4x faster (116 ns vs. 167 ns) when the element is not in the collection. That makes sense: the list version has to compare against every element, whereas the set does an O(1) lookup on average. The two are about the same when the element happens to be the first item in the list. The difference only matters if you're calling the function a lot, but the gap widens as the collection grows. The set starts to dominate list/tuple/non-hashed containers very quickly.
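If you want to see how the gap changes with size on your own machine, here is a rough sketch using the standard `timeit` module instead of IPython's `%timeit`. The helper name `time_membership` and the sizes are arbitrary choices of mine, and the numbers you get will differ from the ones above.

```python
import timeit


def time_membership(size, repeats=1000):
    """Time a worst-case lookup (item absent) in a list and in a set."""
    items = [str(i) for i in range(size)]
    as_list = list(items)
    as_set = set(items)
    missing = 'not present'  # never in the collection, so the list is scanned fully
    list_time = timeit.timeit(lambda: missing in as_list, number=repeats)
    set_time = timeit.timeit(lambda: missing in as_set, number=repeats)
    return list_time, set_time


if __name__ == '__main__':
    for size in (3, 30, 300, 3000):
        list_time, set_time = time_membership(size)
        print(f'{size:>5} items: list {list_time:.6f}s, set {set_time:.6f}s')
```

The lambda adds a constant call overhead to both measurements, so treat the output as a comparison between the two containers, not as absolute lookup times.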

dicts keep order now — Boris Verkhovskiy, May 15 '19 at 11:59
