Python Collections

Python’s collections module provides a variety of specialized data structures beyond the basic built-in types like list, tuple, dict, and set. These are useful for specific situations where you need additional functionality or more efficient handling of data.
Some of the most commonly used collections are as follows:

1. namedtuple:
A namedtuple is a lightweight, immutable object that behaves like a tuple but has named fields.
It is useful when you want to give meaningful names to the elements of a tuple.

Key Features of namedtuple:
    ● Immutability: Like regular tuples, namedtuple instances are immutable. This means once you create a
       namedtuple, you cannot change its values.
    ● Named Fields: You can access the elements of a namedtuple by name instead of index. This improves code
       readability and makes the code more self-explanatory.

       For example, if the type is named Point and has fields x and y, then
       p = Point(1, 2) makes it clear that 1 is the x coordinate and 2 is the y coordinate of p.
       With a plain tuple, p = (1, 2), nothing tells the reader what 1 and 2 stand for.
    ● Indexing: Despite having named fields, namedtuple still supports indexing. You can access elements using
       indices if needed.

Example:

# Import namedtuple from collections
from collections import namedtuple
# Define a namedtuple with name 'Point' and fields 'x' and 'y'
Point = namedtuple('Point', ['x', 'y'])
# Create an instance of Point
p = Point(1, 2)
print(p)         # Output: Point(x=1, y=2)
print(p.x)       # Output: 1
print(p.y)       # Output: 2
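
The key features listed above (immutability, named fields, and indexing) can be checked with a short sketch. It reuses the Point type from the example; the variable names first, mutated, and moved are only illustrative.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)

# Indexing and unpacking work exactly as with a plain tuple.
first = p[0]
x, y = p

# Assigning to a field raises AttributeError because the instance is immutable.
try:
    p.x = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace returns a new instance; the original is left unchanged.
moved = p._replace(x=10)

print(first, x, y)         # 1 1 2
print(mutated)             # False
print(moved, p)            # Point(x=10, y=2) Point(x=1, y=2)
print(dict(p._asdict()))   # {'x': 1, 'y': 2}
```

When a changed value is needed, _replace is the usual route, since it builds a new tuple instead of modifying the existing one.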

2. defaultdict:
A dictionary that supplies a default value for missing keys instead of raising KeyError.
It is useful when you want to avoid manually checking whether a key exists before adding or updating a dictionary entry.
Consider this list:
fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']

Problem without defaultdict:
Scenario:
Let’s say you want to count how many times each fruit appears in the list, storing the fruit as the key and its count as the value in a regular dictionary. Without defaultdict, you have to manually check whether each word is already in the dictionary:

fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']
# Without defaultdict
word_count = {}  # Initialize an empty regular dictionary
for word in fruits:  # Iterate over the fruits list
    # Check whether the word already exists in word_count
    if word in word_count:
        word_count[word] += 1  # Increment the existing count by 1
    else:
        word_count[word] = 1   # First occurrence: start the count at 1

print(word_count)  # Output: {'apple': 3, 'banana': 2, 'orange': 1}

Example with defaultdict:

from collections import defaultdict

fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']

# With defaultdict
word_count = defaultdict(int)  # Default value is 0 for int

for word in fruits:
    word_count[word] += 1  # No need to check if the word exists!

print(word_count)
# Output: defaultdict(<class 'int'>, {'apple': 3, 'banana': 2,
# 'orange': 1})

from collections import defaultdict

This line imports the defaultdict class from the collections module. defaultdict is a type of dictionary that provides a default value for keys that do not exist. This allows you to avoid KeyError when accessing or modifying dictionary entries.

fruits = ['apple', 'banana', 'apple', 'orange', 'banana', 'apple']

Here, a list called fruits is created, containing several fruit names. Note that some fruits, like 'apple' and 'banana', appear multiple times.

word_count = defaultdict(int) 

A defaultdict named word_count is created. The argument int is the default factory: when a missing key is looked up with square brackets, defaultdict calls int(), which returns 0, and stores that result as the key's value. So accessing word_count[key] for a key that isn’t present automatically creates it with a value of 0.

for word in fruits:

This line starts a for loop that iterates over each element in the fruits list. The variable word will take on the value of each fruit in the list one at a time.

    word_count[word] += 1  # No need to check if the word exists!

Inside the loop, this line increments the count for each fruit in the word_count dictionary.
If word is 'apple', for example, word_count['apple'] is accessed. If 'apple' is not already a key in word_count, it is automatically created with a default value of 0 and then incremented by 1.
This eliminates the need to check whether a word exists in the dictionary before incrementing its value.

print(word_count) 

Finally, this line prints the contents of the word_count dictionary to the console.
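
The default factory is not limited to int. As a minimal sketch, the snippet below uses list as the factory to group words by length, and also shows that only square-bracket access creates a missing key. The words list and the by_length name are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample data, chosen only for illustration.
words = ["fig", "kiwi", "pear", "plum", "apple"]

# list() returns [], so each new key starts with an empty list.
by_length = defaultdict(list)
for word in words:
    by_length[len(word)].append(word)

print(dict(by_length))
# {3: ['fig'], 4: ['kiwi', 'pear', 'plum'], 5: ['apple']}

# Membership tests and .get() do not create missing keys...
print(7 in by_length)      # False
print(by_length.get(7))    # None
print(len(by_length))      # 3

# ...but square-bracket access does, storing the default value.
_ = by_length[7]
print(len(by_length))      # 4
```

Because a plain lookup can add keys as a side effect, it is safer to use in or .get() when you only want to test for a key.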

3. Counter:
A dictionary subclass that counts how often each element appears in a collection. For example, it can count the occurrences of each character in a string.

from collections import Counter
c = Counter('hello world')
print(c)
# Output: Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1,
# 'r': 1, 'd': 1})

Example: Imagine a small grocery store that wants to keep track of how many times
each item has been sold over a week. Using Counter, the store can quickly analyze the
sales data.

from collections import Counter

# List of items sold during the week
sales = [
    'apple', 'banana', 'orange', 'apple',
    'banana', 'apple', 'orange', 'banana',
    'banana', 'orange', 'apple', 'apple'
]

# Use Counter to count the frequency of each item sold
item_count = Counter(sales)

# Print the count of items sold
print(item_count)
# Output: Counter({'apple': 5, 'banana': 4, 'orange': 3})
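
Beyond counting, a Counter offers a few helpers that suit this kind of sales analysis. The sketch below starts from the weekly totals shown in the output above; the second batch of sales, including 'mango', is invented for illustration.

```python
from collections import Counter

# Start from the weekly totals produced in the example above.
item_count = Counter({"apple": 5, "banana": 4, "orange": 3})

# most_common(n) returns the n highest counts as (item, count) pairs.
top_two = item_count.most_common(2)
print(top_two)                     # [('apple', 5), ('banana', 4)]

# A missing item reports 0 instead of raising KeyError, and is not added.
print(item_count["mango"])         # 0
print("mango" in item_count)       # False

# update() adds to the existing counts (hypothetical extra sales).
item_count.update(["orange", "orange", "mango"])
print(item_count["orange"])        # 5
print(sum(item_count.values()))    # 15
```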

Task:
1. Create a namedtuple to store three-dimensional coordinates (x, y, z) and print them.
2. Create a defaultdict and use it to group the names in the following list by their first letter:
names = ['Alice', 'Bob', 'Charlie', 'David', 'Eve']
Expected output:
defaultdict(<class 'list'>, {'A': ['Alice'], 'B': ['Bob'], 'C': ['Charlie'], 'D': ['David'], 'E': ['Eve']})
3. Create a Counter to count each character in the string "Rain, Rain, Go away!"

Frequently Asked Questions

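Collections in Python: collections is a standard-library module that provides specialized container datatypes.

Ordered Collection in Python: OrderedDict maintains the insertion order of its items. Since Python 3.7, the built-in dict also preserves insertion order.

Three Types of Collection: Commonly cited examples are list, tuple, and set; dict is the fourth core built-in collection type.

Collection Functions: The module exposes classes such as Counter and deque, plus the namedtuple() factory function.

Collection of Modules in Python: The collections module is a set of classes and functions for working with containers.

Four Types of Collections in Python: Four commonly used classes are Counter, OrderedDict, defaultdict, and deque.

Collections in Python Dictionary: defaultdict, OrderedDict, and Counter are all dict subclasses.

Dictionary Module in Python: collections offers defaultdict and other dictionary utilities.

All Python Collections: Includes the built-in lists and sets as well as more advanced types like deque.

Is Collections a Package in Python? Yes. It ships with Python as part of the standard library, so nothing needs to be installed.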
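
The answers above mention deque and OrderedDict without showing them. As a brief, hedged sketch of what each one adds over list and dict (the sample values are arbitrary):

```python
from collections import OrderedDict, deque

# deque: efficient appends and pops at both ends.
queue = deque(["a", "b", "c"])
queue.appendleft("start")
queue.append("end")
first = queue.popleft()
last = queue.pop()
print(first, last, list(queue))   # start end ['a', 'b', 'c']

# OrderedDict: remembers insertion order and can reorder keys.
od = OrderedDict([("x", 1), ("y", 2), ("z", 3)])
od.move_to_end("x")
print(list(od))                   # ['y', 'z', 'x']
```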