The Voidspace Techie Blog

Using operator.add instead of a lambda in reduce will make it look cleaner and clearer.


Oh, also:

list(itertools.chain(*nestedList))


Gravatar How about using operator.add instead of the lambda? It turns out that this is faster than your reduce and lambda, which is in turn faster than the list comprehension (based on some measurements with timeit).


Gravatar Clearly, this does not qualify as a one-liner, but it is (I think) the more general solution. It will flatten things of arbitrary depth, not just a depth of 2. With some ;'s it could be mashed into one line.


def flat( aValue ):
  if type(aValue) in ( list, tuple ):
    if len(aValue) > 1:
      return flat(aValue[0])+flat(aValue[1:])
    return flat(aValue[0])
  else:
    return [aValue]


Gravatar Try this:


import timeit
setup = """
import itertools
import operator
a = [range(100) for _ in range(100)]
"""


t1 = timeit.Timer("list(itertools.chain(*a))", setup=setup)
t2 = timeit.Timer("reduce(operator.add, a)", setup=setup)
t3 = timeit.Timer("reduce(lambda x, y: x + y, a)", setup=setup)
t4 = timeit.Timer("[item for sublist in a for item in sublist]", setup=setup)


The numbers I got:

>>> t1.repeat(number=1000)
[0.5737299919128418, 0.57281088829040527, 0.57294988632202148]
>>> t2.repeat(number=1000)
[9.346325159072876, 9.3642270565032959, 9.3441071510314941]
>>> t3.repeat(number=1000)
[9.540198802947998, 9.5289080142974854, 9.5225989818572998]
>>> t4.repeat(number=1000)
[1.4107460975646973, 1.4074599742889404, 1.4073691368103027]


On the other hand, using range(10) in both places instead of range(100) made the list comprehension solution fastest.


Gravatar Try this:


import timeit
setup = """
import itertools
import operator
a = [range(100) for _ in range(100)]
"""


t1 = timeit.Timer("list(itertools.chain(*a))", setup=setup)
t2 = timeit.Timer("reduce(operator.add, a)", setup=setup)
t3 = timeit.Timer("reduce(lambda x, y: x + y, a)", setup=setup)
t4 = timeit.Timer("[item for sublist in a for item in sublist]", setup=setup)


The numbers I got:

>>> t1.repeat(number=1000)
[0.5737299919128418, 0.57281088829040527, 0.57294988632202148]
>>> t2.repeat(number=1000)
[9.346325159072876, 9.3642270565032959, 9.3441071510314941]
>>> t3.repeat(number=1000)
[9.540198802947998, 9.5289080142974854, 9.5225989818572998]
>>> t4.repeat(number=1000)
[1.4107460975646973, 1.4074599742889404, 1.4073691368103027]


On the other hand, using range(10) in both places instead of range(100) made the list comprehension solution fastest.


Gravatar I implemented a sort-of MapReduce and the reduce part ended up being a function all of its own:

def reducer(input, output):
for x in input:
if type(x) == type([]):
reducer(x, output)
else:
output.append(x)
return filter(None, output)
# or, if you're certain your tree
# is non-sparse:
return output


Basically what you're doing is compacting a tree, and you can't really make the assumption that you've only got two levels. This pretty much requires some sort of recursion... and I don't think you can do it in a Python one-liner... unless you find a clever way of recursively generating nested reduce functions without declaring them separately (tail recursion).


Gravatar Sum takes an initial value as an optional argument, so you can just do:

>>> nestedList = [[1, 2], [3, 4], [5, 6]]
>>> sum(nestedList, [])
[1, 2, 3, 4, 5, 6]


Gravatar Yeah, I've encountered this problem a few times. I coded up a recursive list flattener that worked fine with small, arbitrarily nested structures, but with larger structures I found myself exceeding the recursion limit. I wrote a new version that uses stack-based recursion and flattens arbitrarily nested instances of dicts, tuples, and lists or any combination thereof. Haven't benchmarked it though.

def flatten(nested):
"""flatten(||) ->

Accepts a list, tuple, or dict that may contain arbitrarily nested instances of
more lists, tuples, and dicts -- Returns a flattened list consisting of
all non-list/non-tuple/non-dict elements contained in the input.
e.g.,
>>> flatten(['a', (1, 'b', [666]), 'XYZ'])
['a', 1, 'b', 666, 'XYZ']
>>> flatten({1: ['a', ('b', {2: 'c'}), 'd'], 2: 'e'})
[1, 'a', 'b', 2, 'c', 'd', 2, 'e']
"""
nested = [ nested ]
flattened = []
while True:
if not nested:
break
else:
last_elt = nested.pop()
if isinstance(last_elt, tuple):
nested.extend( list( last_elt ))
elif isinstance(last_elt, dict):
nested.extend( list( last_elt.items()))
elif isinstance(last_elt, list):
nested.extend( last_elt )
else:
flattened.append( last_elt )
flattened.reverse()
return flattened


Gravatar And here's that code ... indented.

def flatten(nested):
____"""flatten(||) ->

____Accepts a list, tuple, or dict that may contain arbitrarily nested instances of
____more lists, tuples, and dicts -- Returns a flattened list consisting of
____all non-list/non-tuple/non-dict elements contained in the input.
____e.g.,
____>>> flatten(['a', (1, 'b', [666]), 'XYZ'])
____['a', 1, 'b', 666, 'XYZ']
____>>> flatten({1: ['a', ('b', {2: 'c'}), 'd'], 2: 'e'})
____[1, 'a', 'b', 2, 'c', 'd', 2, 'e']
____"""
____nested = [ nested ]
____flattened = []
____while True:
________if not nested:
____________break
________else:
____________last_elt = nested.pop()
____________if isinstance(last_elt, tuple):
________________nested.extend( list( last_elt ))
____________elif isinstance(last_elt, dict):
________________nested.extend( list( last_elt.items()))
____________elif isinstance(last_elt, list):
________________nested.extend( last_elt )
____________else:
________________flattened.append( last_elt )
____flattened.reverse()
____return flattened


Gravatar I could swear there was a built-in called 'flatten' that does exactly what you want.

Anyway, those won't work for nested lists that are arbitrarily deep...try it with [[[1,2],[3],[4]],[5,6,7]]. But maybe you don't care about that (you say list of lists), so nevermind.


Gravatar So I hacked away at it some more, and got this:

def reducer(ls):
for x in filter(lambda a: type(a) == type([]), ls):
ls = ls[:ls.index(x)] + reducer(x) + ls[ls.index(x) + 1:]
return ls

Also not a one-liner, but works on lists of arbitrary depth. IT IS TERRIBLY SLOW. To use benchmarks:

t1 = timeit.Timer("list(itertools.chain(*a))", setup=setup)
t2 = timeit.Timer("reduce(operator.add, a)", setup=setup)
t3 = timeit.Timer("reduce(lambda x, y: x + y, a)", setup=setup)
t4 = timeit.Timer("[item for sublist in a for item in sublist]", setup=setup)
t5 = timeit.Timer("sum(a, [])", setup=setup)
t6 = timeit.Timer("reducer(a)", setup=setup)

for func in (t1, t4, t5, t6):
outlist = func.repeat(number=1000)
average = reduce(lambda x, y: x + y, outlist) / float(len(outlist))
print "Average: %s" % (average)

Average: 0.446666717529
Average: 4.23866677284
Average: 4.29366668065
Average: 1.20166659355
Average: 4.20533331235
Average: 166.907666683

reducer obviously scales poorly, but that's primarily because most test cases are of wide trees where it has to look for branches at each level. If the tree is skinny, reducer does much better:

[quit testing 2 and 3]

Average: 0.0
Average: 0.0
Average: 0.00533334414164
Average: 8.96866671244

Keep in mind that the other 'reducers' only reduce 2 levels into one.


Gravatar Check out the various suggested implementations in this recipe:

http://aspn.activestate.com/ASPN...n/Recipe/ 363051


Gravatar import _tkinter._flatten as flatten


Gravatar Don't think the Tkinter solution will work for IronPython... but thanks.

Actually "sum(nestedList, [])" works *great* for me, we have only have one level of nesting.

Thanks to all those who posted suggestions.


Gravatar generator version:

def flatten_rec(nested):
_ try:
__ it = iter(nested)
_ except TypeError:
__ yield nested
__ return
_ for e in it:
__ for e2 in flatten_rec(e):
___ yield e2

print list(flatten_rec([[1,2,[3,4]],[5,[6,[7]]],8]))


Gravatar One option is a list comprehension with a side effect:

flattenedList = []
[flattenedList.extend(x) for x in nestedList]


Gravatar I find that nested list comprehensions are best formatted over several lines:

flat = [item for sublist in nestedList
                 

for item in subList]


Gravatar Or, even better:

flat = [item for sublist in nestedList
                        for item in subList]


Gravatar Hey, thanks for brining it up! I had this to do several times (probably like 4 or 5 times) and I've been using one of the worst possible ways to do it: reduce(lambda x, y: x + y, a). I'll use itertools.chain now, especially considering the fact that reduce will go away in python 3000.

At first, I was shocked to read that reduce was going a way, but after giving it more thoughts, I think that explicit loops are better for code readability.


Gravatar Often the faster solution:

r = []
for x in nestedList: r.extend(x)


Gravatar What about 2 lines more and keep to keep it readable? After if I want to sacrifice readability over one liners I would use Perl

flattenedList = []
for sublist in nestedlist:
____for item in sublist:
________flattenedList.append(item)


Gravatar Someone asked if it was possible to flatten arbitrarily nested lists as a one-liner in python. Here is one way to do just that:

nestedList = [[1],[[2,[3]],[4]]]
flatList = (lambda L,f:f(L,f))( nestedList, lambda L,f: not isinstance(L,list) and [L] or [x for s in L for x in f(s,f)] )

Of course, you shouldn't use anything like this unless you are entering an obfuscated code contest...


Gravatar numpy.array([[2,3,2],[4,3,2]]).flatten()


Gravatar http://groups.google.com/group/ c...832db53bd2700db


Gravatar Why not a simple minded function like the following:

def flatten(mylist):
...res = []
...for a in mylist:
......if type(a) == type([]) : res = res+flatten(a)
......else: res.append(a)
...return res


Name:

Email:

URL:

Comment:  ? 


 

Commenting by HaloScan