Tag Archives: List

Removing duplicates from a list while preserving order in Python

You can remove duplicates from a list easily by putting the elements in a set and then making a new list from the set’s contents:

def unique(sequence):
    return list(set(sequence))

However, this will put the elements in an arbitrary order. To preserve the order, you can use a set to keep track of which elements you’ve seen while populating the new list with a list comprehension:

def unique(sequence):
    seen = set()
    return [x for x in sequence if not (x in seen or seen.add(x))]

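Note that this relies on the fact that set.add() returns None, which is falsy. For an element that has not been seen yet, x in seen is False, so seen.add(x) runs, records the element, and the whole condition evaluates to True. For an element already in the set, the or short-circuits and the element is skipped.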
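As an alternative to the set-based comprehension, here is a minimal sketch that assumes Python 3.7 or later, where dict preserves insertion order. The name unique_ordered is chosen for illustration only. Like the set-based versions, it requires the elements to be hashable.

```python
def unique_ordered(sequence):
    # dict keys are unique and (since Python 3.7) keep insertion order,
    # so the first occurrence of each element determines its position.
    return list(dict.fromkeys(sequence))

result = unique_ordered([3, 1, 3, 2, 1])
print(result)  # [3, 1, 2]
```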

Counting the items in a list in Python

To count the occurrences of a single item in a list, use the count() method:

from random import randint

dishes = ['spam', 'eggs', 'ham']
# Fill a list with 100 randomly selected dishes
orders = [dishes[randint(0, 2)] for i in range(1, 101)]
# Print the count of spam orders
print("spam: {0}".format(orders.count('spam')))
spam: 31

To get the counts of all items, use a collections.Counter:

from collections import Counter

counter = Counter(orders)
print(counter)
Counter({'ham': 37, 'eggs': 32, 'spam': 31})
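Because the orders above are chosen at random, the exact counts vary from run to run. The sketch below uses a small fixed list (illustrative data, not from the example above) to show two other things a Counter offers: most_common() for the highest counts, and a count of 0 for items that never appear.

```python
from collections import Counter

# A fixed list of orders so the counts are reproducible (illustrative data)
sample_orders = ['spam', 'eggs', 'spam', 'ham', 'spam', 'eggs']
counter = Counter(sample_orders)

top_two = counter.most_common(2)   # [('spam', 3), ('eggs', 2)]
missing = counter['toast']         # 0: missing items do not raise KeyError
print(top_two, missing)
```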

Difference between lists and tuples in Python

Lists and tuples in Python are both sequence types, and share a number of operations common to all sequence types:

  • in
  • +
  • *
  • slices
  • len()
  • min()
  • max()
  • index()
  • count()

However, they are quite different in purpose and in the way they are most commonly used:

  • Lists are mutable, and are generally used to store homogeneous objects (objects of the same type), which are accessed by iteration
  • Tuples are immutable, and are generally used to store heterogeneous objects, which are accessed by unpacking or indexing

This makes tuples more like lightweight objects, which is how they are often used. The namedtuple type takes this further by allowing fields to be accessed by name as well as by index.
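A short sketch of these points, using a hypothetical Point type made up for illustration: a plain tuple is accessed by unpacking or indexing, a namedtuple adds access by field name, and both reject item assignment.

```python
from collections import namedtuple

# A plain tuple: fields are identified by position only
point = (3, 4)
x, y = point          # unpacking
first = point[0]      # indexing

# A namedtuple: same tuple behaviour, plus access by field name
Point = namedtuple('Point', ['x', 'y'])
p = Point(x=3, y=4)
print(p.x, p[1])      # 3 4
px, py = p            # still unpacks like a tuple

# Tuples are immutable: assigning to an element raises TypeError
try:
    point[0] = 10
except TypeError:
    immutable = True
```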

Removing an element from a Python list by index

Method 1: Use del

mylist = list(range(1, 11))
del mylist[4]
print(mylist)
[1, 2, 3, 4, 6, 7, 8, 9, 10]

Method 2: Use pop()

pop() returns the removed element. It takes an optional argument giving the index of the element to remove. If no index is supplied, it removes the last element.

mylist = list(range(1, 11))
print(mylist.pop())
print(mylist)
print(mylist.pop(4))
print(mylist)
10
[1, 2, 3, 4, 5, 6, 7, 8, 9]
5
[1, 2, 3, 4, 6, 7, 8, 9]
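To put the two methods side by side, here is a small sketch, assuming Python 3. It also shows that del accepts a slice, and that an index outside the list raises IndexError.

```python
mylist = list(range(1, 11))

removed = mylist.pop(4)   # pop() hands back the removed element (5)
del mylist[0]             # del removes without returning anything
del mylist[-2:]           # del also accepts a slice: drops the last two items
print(removed, mylist)    # 5 [2, 3, 4, 6, 7, 8]

# An out-of-range index raises IndexError (shown here with pop())
try:
    mylist.pop(100)
except IndexError:
    out_of_range = True
```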

Collecting data into chunks in Python

This is how to split an iterable into fixed-length chunks or blocks.
If the number of items in the iterable is not a multiple of the block size, the last block is padded with a fill value.

Method 1: A generator function

def grouper(iterable, n, fillvalue=None):
    # Needs a sequence (len() and slicing), not an arbitrary iterable
    for i in range(0, len(iterable), n):
        block = list(iterable[i:i + n])
        yield tuple(block + [fillvalue] * (n - len(block)))

Method 2: Use itertools.zip_longest()

This is adapted from the recipes in the itertools documentation. In Python 2 the function is named izip_longest.

from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

Example Program

alphabet = "".join(chr(c) for c in range(65, 91)) # ABC...
print(list(grouper(alphabet, 3)))

Output

[('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'H', 'I'), ('J', 'K', 'L'), ('M', 'N', 'O'), ('P', 'Q', 'R'), ('S', 'T', 'U'), ('V', 'W', 'X'), ('Y', 'Z', None)]
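The two methods differ in what they accept: the slicing-based generator needs an input that supports len() and slicing, while the iterator-based version works on any iterable. The sketch below assumes Python 3, where the function is named zip_longest, and feeds the iterator-based grouper a generator with a custom fill value.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # The same iterator repeated n times, so each output tuple
    # consumes n consecutive items from it.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# Works on a generator, which has no len() and cannot be sliced
squares = (i * i for i in range(5))
chunks = list(grouper(squares, 2, fillvalue=-1))
print(chunks)  # [(0, 1), (4, 9), (16, -1)]
```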

How to flatten a list of lists in Python

If you have a list of lists like this:

lol = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

And you want to flatten it into a single list like this:

[1, 2, 3, 4, 5, 6, 7, 8, 9]

You can use a list comprehension like this:

l = [item for sublist in lol for item in sublist]

Or you can use itertools.chain() like this:

from itertools import chain
l = list(chain(*lol))
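A closely related option is chain.from_iterable(), which takes the outer list directly instead of unpacking it into separate arguments. The sketch below also shows that these approaches flatten one level only. The names flat, nested and one_level are chosen for illustration.

```python
from itertools import chain

lol = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

# from_iterable takes the outer list itself instead of unpacking it into arguments
flat = list(chain.from_iterable(lol))
print(flat)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Only one level is flattened: deeper nested lists are kept as they are
nested = [[1, [2, 3]], [4]]
one_level = list(chain.from_iterable(nested))
print(one_level)  # [1, [2, 3], 4]
```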