43

Erlang and Ruby both come with functions for flattening arrays. It seems like such a simple and useful tool to add to a language. One could do this:

>>> mess = [[1, [2]], 3, [[[4, 5]], 6]]
>>> mess.flatten()
[1, 2, 3, 4, 5, 6]

Or even:

>>> import itertools
>>> mess = [[1, [2]], 3, [[[4, 5]], 6]]
>>> list(itertools.flatten(mess))
[1, 2, 3, 4, 5, 6]

Instead, in Python, one has to go through the trouble of writing a function for flattening arrays from scratch. This seems silly to me, flattening arrays is such a common thing to do. It's like having to write a custom function for concatenating two arrays.

I have Googled this fruitlessly, so I'm asking here; is there a particular reason why a mature language like Python 3, which comes with a hundred thousand various batteries included, doesn't provide a simple method of flattening arrays? Has the idea of including such a function been discussed and rejected at some point?

Hubro
  • 676
  • 1
  • 7
  • 13
  • 2
    What is an example of when you would need such a function? I’ve definitely needed to flatten an *n*-D array into a 1D array, but I don’t think I’ve ever needed the heterogeneous equivalent. – Jon Purdy Aug 24 '14 at 05:52
  • 3
    @detly: I happened to miss flattening lately when using several queries to retrieve data from different sources. Each query returns a list of dictionaries, so in the end I have a list of lists of dictionaries to be turned into a list of dictionaries. I used a loop + `extend` but flatten would have been much more elegant. However, I wounder if this pattern is common enough to justify having flatten in the standard library. – Giorgio Aug 24 '14 at 09:19
  • 1
    @Giorgio - I think the general argument against it is that it's a little bit magic. I mean, imagine if you introduce a bug into your code that inadvertently changes the structure of your data. `flatten` will still work, but produce completely the wrong results. Functions that expect a certain data structure (a) partially document your data structures, and (b) fail when they should. There's also the edge cases... How should it work on lists containing generators, which could be infinite? What about lists containing strings? – detly Aug 24 '14 at 12:23
  • 4
    "I mean, imagine if you introduce a bug into your code that inadvertently changes the structure of your data. flatten will still work, but produce completely the wrong results.": This is one reason why I like statically-typed languages. ;-) – Giorgio Aug 24 '14 at 12:50
  • What evidence do you have that flattening is common? In the past five years or so, I think I've needed to do this once or twice. – Bryan Oakley Aug 28 '14 at 12:43
  • 4
    @detly [I guess you haven't been around StackOverflow much?](http://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python) – Izkata Aug 29 '14 at 00:42
  • 2
    @BryanOakley See prior comment as well (though not for multi-level lists, flattening in general _is_ common) – Izkata Aug 29 '14 at 00:42
  • @Izkata - that is a question about flattening a list of lists, not a list of arbitrarily nested lists. – detly Aug 29 '14 at 01:33
  • 1
    @BryanOakley: I said it's common because multiple languages has it "built-in" and because I use it a lot. For instance, some times I have a function with an argument that accepts a single object or an array of objects: `[arg].flatten`. Other times I need to run a shell command and I have the executable path in one variable and the arguments spread over two other variables. These all need to be in one flat array, so flatten is convenient: `system([exec, args1, args2].flatten)`. I have loved using Python for years without flatten, but I still miss it after using Ruby for a bit. – Hubro Aug 29 '14 at 10:03
  • 3
    It is built-in in Mathemaica, and I use it extensively. – Per Alexandersson Jan 26 '15 at 22:06
  • Correct me if I'm wrong, but flattening could be easily replaced by an algorithm that traverses the entire tree, am I right? This is usually the preferred method since you don't need to update the flattened array everytime the tree changes and you normally don't need to perform operations on a tree in a depth-wise manner more than once that would merit flattening the tree in the first place in order to optimize. – Neil Sep 28 '15 at 06:21
  • Specifically for the function call case you just use an asterisk on a list. ``func(*args)``. No need for a ``flatten`` call when invoking ``system()``. – Sean Perry Dec 15 '15 at 21:36
  • @SeanPerry That fails if they pass in an `int`. `flatten` would, presumably, not fail on that. – Nic Mar 01 '18 at 21:48

3 Answers3

40

Proposals for a flatten function to be added to the standard library appear from time to time on python-dev and python-ideas mailing lists. Python developers usually respond with the following points:

  1. A one-level flatten (turning an iterable of iterables into a single iterable) is a trivial one-line expression (x for y in z for x in y) and in any case is already in the standard library under the name itertools.chain.from_iterable.

  2. What are the use cases for a general-purpose multi-level flatten? Are these really compelling enough for the function to be added to the standard library?

  3. How would a general-purpose multi-level flatten decide when to flatten and when to leave alone? You might think that a rule like "flatten anything that supports the iterable interface" would work, but that would lead to an infinite loop for flatten('a').

See for example Raymond Hettinger:

It has been discussed ad nauseam on comp.lang.python. People seem to enjoy writing their own versions of flatten more than finding legitimate use cases that don't already have trivial solutions.

A general purpose flattener needs some way to be told what is atomic and what can be further subdivided. Also, it is not obvious how the algorithm should be extended to cover inputs with tree-like data structures with data at nodes as well as the leaves (preorder, postorder, inorder traversal, etc.)

Gareth Rees
  • 1,449
  • 10
  • 9
  • 1
    Just to be explicit, this means that the one-level `flatten` function can be defined as `lambda z: [x for y in z for x in y]`. – Christopher Martin Apr 16 '15 at 09:24
  • 1
    "A general purpose flattener needs some way to be told what is atomic and what can be further subdivided.": This sounds like a problem that can be solved using OOP: each object could have a `flatten` method. The implementation of this method should recursively call `flatten` on its subcomponent, if the object is a composite. Unfortunately, AFAIK not every value is an object in Python. In Ruby it should work though. – Giorgio Sep 28 '15 at 11:54
  • 1
    a flatten helper for a one-level flatten rather than a continued "for in for in" is already a good enough case IMO. easily readable – dtc Jun 15 '16 at 22:01
  • 2
    @Giorgio Python shies away from such methods. [Protocols are preferred,](http://lucumr.pocoo.org/2011/7/9/python-and-pola/#protocols) and I find they're vastly smoother to work with than an OOP design since you often don't even need to implement very much at all. – jpmc26 Dec 13 '17 at 22:58
  • @Giorgio Having a ``flatten`` method only moves the problem, it does not solve it. What is and isn't atomic depends on the use-case, not the types. Same for traversal order. In one case, you may want to flatten a list of lists of tuples completely, in another case those tuples are coordinates and should be preserved. – MisterMiyagi Jun 11 '20 at 07:52
  • Somehow Ruby managed to solve this long ago. It seems the common Python excuse for missing or mis-features is, "who would need that?", or "we couldn't figure out how it should work." The itertools.chain approach cannot handle `[1, [3, 4]]`. You might want a list like this when collecting items which could be single or lists themselves. It's also handy for ensuring you can call [0] on a value which may or may not be a list. – michael_teter Apr 29 '22 at 08:28
8

It does come with such a method but it doesn't call it flatten. It's called "chain". It returns an iterator which you'd then need to use the list() function on to turn it back into a list. If you don't want to use a *, you can use the second "from_iterator" version. It works the same in Python 3. It will fail if the list input is not a list of lists.

[[1], [2, 3], [3, 4, 5]] #yes
[1, 2, [5, 6]] #no

There was at one time a flatten method in the compiler.ast module but this was deprecated in 2.6 and then removed in 3.0. Arbitrary depth recursion, necessary for arbitrarily nested lists does not function well with Python's conservative maximum recursion depth. The reasoning for compiler's removal were largely due to it being a mess. Compiler was turned into ast but flatten was left behind.

Arbitrary depth can be achieved with numpy's arrays and that library's flatten.

  • The `chain.from_iterator` function, as you said, can only be used to flatten two-dimensional lists. An *actualy* flatten function, which accepts *any* amounts of nested lists and returns a one-dimensional list, would still be massively useful in lots of cases (at least in my opinion) – Hubro Aug 24 '14 at 05:48
  • 2
    @Hubro: "in lots of cases" — can you name six? – Gareth Rees Sep 13 '14 at 11:24
  • 2
    @GarethRees: I gave a few examples here: http://programmers.stackexchange.com/questions/254279/why-doesnt-python-have-a-flatten-function-for-lists/254280?noredirect=1#comment513862_254279 – Hubro Sep 14 '14 at 22:58
  • I would also go so far as to argue that if these other language do in fact provide such a feature to flatten a list in the very simple way described, that is one of the most compelling arguments in support of adding that simple ability to Python. – Bobort Sep 15 '16 at 20:50
  • Does it return an iterator or a generator? – jpmc26 Dec 13 '17 at 23:01
-1

... maybe because it's not that difficult to write one yourself

def flatten(l): return flatten(l[0]) + (flatten(l[1:]) if len(l) > 1 else []) if type(l) is list else [l]

... and then flatten all you want :)

>>> flatten([1,[2,3],4])
[1, 2, 3, 4]
>>> flatten([1, [2, 3], 4, [5, [6, {'name': 'some_name', 'age':30}, 7]], [8, 9, [10, [11, [12, [13, {'some', 'set'}, 14, [15, 'some_string'], 16], 17, 18], 19], 20], 21, 22, [23, 24], 25], 26, 27, 28, 29, 30])
[1, 2, 3, 4, 5, 6, {'age': 30, 'name': 'some_name'}, 7, 8, 9, 10, 11, 12, 13, set(['set', 'some']), 14, 15, 'some_string', 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
>>> 
Shreyas
  • 129
  • 1
  • 10
    asker is aware of that: "in Python, one has to go through the trouble of writing a function for flattening arrays from scratch". This doesn't even attempt to address the question asked, "This seems silly to me, flattening arrays is such a common thing to do. It's like having to write a custom function for concatenating two arrays." – gnat Apr 08 '16 at 13:24
  • 1
    Out of topic... But super cool :-) !! – SeF Apr 12 '18 at 13:18
  • this answer is like telling OP he's not a good developer because he didn't know how to code the function himself. I suggest you modify the beginning of your answer because this is useful code for those who stumble upon the question, even if off-topic – Federico Bonelli Feb 13 '19 at 12:14