Search Trees in Python: From Theory to Practical Implementation

Algorithms Search Tree in Python header

Search trees are data structures that organize data in a hierarchical manner. One of the most common types of search trees is the binary search tree (BST), but there are also variations such as balanced search trees (such as AVL and red-black) and multiway search trees (such as B-trees).

[wpda_org_chart tree_id=43 theme_id=50]

Binary Search Tree (BST)

A binary search tree is a type of tree in which each node has at most two children, the left node contains lower values than the parent node, while the right node contains higher values. This property makes efficient searching possible.

There are a number of common operations that apply to binary search trees:

  1. Insertion: Add a new element to the tree respecting the sorting property.
  2. Search: Find an item in the tree.
  3. Deletion: Remove an item from the tree.
  4. Traversals: Visit all nodes of the tree in a certain order (inorder, preorder, postorder).
  5. Minimum and Maximum: Find the minimum and maximum value in the tree.

Here is an example implementation of a binary search tree in Python where these operations are implemented:

class BinarySearchTree:
    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def __init__(self):
        self.root = None

    def insert(self, key):
        self.root = self._insert(self.root, key)

    def _insert(self, root, key):
        if root is None:
            return self.Node(key)

        if key < root.key:
            root.left = self._insert(root.left, key)
        elif key > root.key:
            root.right = self._insert(root.right, key)

        return root

    def search(self, key):
        return self._search(self.root, key)

    def _search(self, root, key):
        if root is None or root.key == key:
            return root
        elif key < root.key:
            return self._search(root.left, key)
        else:
            return self._search(root.right, key)

    def delete(self, key):
        self.root = self._delete(self.root, key)

    def _delete(self, root, key):
        if root is None:
            return root

        if key < root.key:
            root.left = self._delete(root.left, key)
        elif key > root.key:
            root.right = self._delete(root.right, key)
        else:
            if root.left is None:
                return root.right
            elif root.right is None:
                return root.left

            root.key = self._min_value(root.right)
            root.right = self._delete(root.right, root.key)

        return root

    def _min_value(self, root):
        current = root
        while current.left is not None:
            current = current.left
        return current.key

    def inorder_traversal(self):
        result = []
        self._inorder_traversal(self.root, result)
        return result

    def _inorder_traversal(self, root, result):
        if root:
            self._inorder_traversal(root.left, result)
            result.append(root.key)
            self._inorder_traversal(root.right, result)

    def min_value(self):
        return self._min_value(self.root)

    def max_value(self):
        return self._max_value(self.root)

    def _max_value(self, root):
        current = root
        while current.right is not None:
            current = current.right
        return current.key


# Example usage
bst = BinarySearchTree()
data = [50, 30, 70, 20, 40, 60, 80]

for value in data:
    bst.insert(value)

print("Inorder Traversal:", bst.inorder_traversal())
print("Min Value:", bst.min_value())
print("Max Value:", bst.max_value())

# Delete the node with key 30
bst.delete(30)
print("After deleting key 30:", bst.inorder_traversal())

Executing you will get the following result:

Inorder Traversal: [20, 30, 40, 50, 60, 70, 80]
Min Value: 20
Max Value: 80
After deleting key 30: [20, 40, 50, 60, 70, 80]

Now, let’s see an explanation of the main parts of the code:

  • class BinarySearchTree: This is the main class that represents a binary search tree.
  • class Node: This is an internal class that represents a node in the tree.
  • __init__(self): The initialization method of the BinarySearchTree class. Initialize the tree root to None.
  • insert(self, key): Adds a new node to the tree with the specified key.
  • _insert(self, root, key):Private recursive method for inserting a node.
  • search(self, key): Searches for a node with the specified key in the tree.
  • _search(self, root, key): Private recursive method for finding a node.
  • delete(self, key):Delete a node with the specified key from the tree.
  • _delete(self, root, key): Private recursive method for deleting a node.
  • _min_value(self, root): Find the minimum value in the tree starting from a certain node.
  • inorder_traversal(self): Returns a list with in-order traversal of the tree.
  • _inorder_traversal(self, root, result): Private recursive method for in-order traversal.
  • min_value(self): Returns the minimum value present in the tree.
  • max_value(self): Returns the maximum value present in the tree.
  • _max_value(self, root): Find the maximum value in the tree starting from a certain node.

Time Complexity:

Insertion, search, and deletion in a balanced binary search tree have an average time complexity of O(log n), where n is the number of nodes in the tree. Without balancing, the tree can degenerate into a linked list, increasing the complexity to O(n).

Balancing a search tree

The concept of balancing in a binary search tree is crucial to ensure efficient performance in insert, search, and delete operations. A binary search tree is considered balanced when the height difference between the left and right subtrees of each node is bounded and not too large. If a tree is unbalanced, operations can degenerate into linked-list-like performance, increasing time complexity.

Unbalanced Trees:

Binary search trees can become unbalanced in various ways, but one of the most common situations occurs when data is entered in ascending or descending order. Under these circumstances, the tree risks degenerating into a linked-list-like structure, with one subtree long and another virtually non-existent.

An example of an unbalanced tree could be the following:

        1
         \
          2
           \
            3
             \
              4
               \
                5

In this case, the height of the tree is proportional to the number of nodes, and therefore the search, insertion and deletion operations may take a time proportional to the number of nodes in the tree, compromising the efficiency of the algorithm.

Balanced Trees:

A binary search tree is balanced when, for each node, the heights of the left and right subtrees differ by at most 1. There are several balanced tree structures, such as AVL trees, red-black trees, and Splay trees.

Advantages of Balanced Trees:

  1. Smoother performance: Balanced trees ensure that search, insert, and delete operations have consistent performance and do not degenerate in extreme cases.
  2. Guaranteed Time Complexity: In a balanced tree, operations have an average time complexity of O(log n), where n is the number of nodes in the tree.

AVL trees

AVL trees are a specific type of balanced binary search tree. In an AVL, the height difference between the left and right subtrees of each node is limited to -1, 0, or 1. This rule is maintained after each insertion or deletion operation by tree rotations.

Using AVL trees ensures optimal balancing, but rotations may slightly increase the cost of operations compared to other balanced structures.

In Python, libraries like sortedcontainers or bintrees provide efficient implementations of balanced trees. When working with binary search trees, it is important to consider balancing to ensure efficient performance in key operations.

Here is an example implementation of an AVL tree in Python without using libraries. For this example, we will use a Node class to represent the tree nodes and an AVLTree class to handle operations on the AVL tree.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # The height of a new node is 1

class AVLTree:
    def insert(self, root, key):
        if not root:
            return Node(key)

        if key < root.key:
            root.left = self.insert(root.left, key)
        elif key > root.key:
            root.right = self.insert(root.right, key)

        root.height = 1 + max(self._get_height(root.left), self._get_height(root.right))

        balance = self._get_balance(root)

        # Rotations to maintain balance
        if balance > 1:
            if key < root.left.key:
                return self._rotate_right(root)
            else:
                root.left = self._rotate_left(root.left)
                return self._rotate_right(root)

        if balance < -1:
            if key > root.right.key:
                return self._rotate_left(root)
            else:
                root.right = self._rotate_right(root.right)
                return self._rotate_left(root)

        return root

    def delete(self, root, key):
        if not root:
            return root

        if key < root.key:
            root.left = self.delete(root.left, key)
        elif key > root.key:
            root.right = self.delete(root.right, key)
        else:
            if not root.left:
                return root.right
            elif not root.right:
                return root.left

            min_val_node = self._get_min_value_node(root.right)
            root.key = min_val_node.key
            root.right = self.delete(root.right, min_val_node.key)

        root.height = 1 + max(self._get_height(root.left), self._get_height(root.right))

        balance = self._get_balance(root)

# Rotations to maintain balance
        if balance > 1:
            if self._get_balance(root.left) >= 0:
                return self._rotate_right(root)
            else:
                root.left = self._rotate_left(root.left)
                return self._rotate_right(root)

        if balance < -1:
            if self._get_balance(root.right) <= 0:
                return self._rotate_left(root)
            else:
                root.right = self._rotate_right(root.right)
                return self._rotate_left(root)

        return root

    def search(self, root, key):
        if not root or root.key == key:
            return root

        if key < root.key:
            return self.search(root.left, key)
        else:
            return self.search(root.right, key)

    def _get_height(self, node):
        if not node:
            return 0
        return node.height

    def _get_balance(self, node):
        if not node:
            return 0
        return self._get_height(node.left) - self._get_height(node.right)

    def _rotate_right(self, y):
        x = y.left
        T2 = x.right

        x.right = y
        y.left = T2

        y.height = 1 + max(self._get_height(y.left), self._get_height(y.right))
        x.height = 1 + max(self._get_height(x.left), self._get_height(x.right))

        return x

    def _rotate_left(self, x):
        y = x.right
        T2 = y.left

        y.left = x
        x.right = T2

        x.height = 1 + max(self._get_height(x.left), self._get_height(x.right))
        y.height = 1 + max(self._get_height(y.left), self._get_height(y.right))

        return y

    def _get_min_value_node(self, root):
        current = root
        while current.left:
            current = current.left
        return current

    def inorder_traversal(self, root):
        result = []
        self._inorder_traversal(root, result)
        return result

    def _inorder_traversal(self, root, result):
        if root:
            # Traverse the left subtree
            self._inorder_traversal(root.left, result)
            # Append the current node's key to the result
            result.append(root.key)
            # Traverse the right subtree
            self._inorder_traversal(root.right, result)


# Example of use
avl_tree = AVLTree()
root = None
keys = [9, 5, 10, 0, 6, 11, -1, 1, 2]
for key in keys:
    root = avl_tree.insert(root, key)

print("AVL tree after insertion:", avl_tree.inorder_traversal(root))

# Search and delete the node with key 10
key_to_search = 10
found_node = avl_tree.search(root, key_to_search)
if found_node:
    print(f"Element with key {key_to_search} found.")
else:
    print(f"Element with key {key_to_search} not found.")

root = avl_tree.delete(root, key_to_search)
print("AVL tree after deletion of 10:", avl_tree.inorder_traversal(root))

In this example, Node represents the nodes of the tree with the addition of a height field to keep track of the height of each node. The AVLTree class contains methods for inserting, deleting, and searching, as well as auxiliary functions for calculating height, balance, and performing rotations necessary to keep the tree balanced. Finally, a usage example is performed by inserting some keys into the tree, searching and deleting a specific node.

Esxecuting this code you will get the following result:

AVL tree after insertion: [-1, 0, 1, 2, 5, 6, 9, 10, 11]
Element with key 10 found.
AVL tree after deletion of 10: [-1, 0, 1, 2, 5, 6, 9, 11]

Multiway Search Trees

Multiway search trees are a variant of binary search trees in which each node can have more than two children. In a multiway tree, each node can have a variable number of subtrees. This model allows you to organize data more flexibly than binary trees.

Features of Multiway Search Trees:

  1. Variable Number of Children: In a multiway search tree, a node can have zero or more children. While in binary trees each node has at most two children, in multiway trees there is no fixed limit to the number of children of a node.
  2. Hierarchical Structure: Multiway trees maintain a hierarchical structure, where each node is connected to one or more subtrees, each of which is itself a tree.
  3. Flexibility in Data Organization: Data organization in multiway trees can be more flexible than in binary trees. This can be useful in situations where data may naturally be organized into more than two categories or levels.
  4. Search and Insert Operations: Search and insert operations in multiway search trees may vary depending on the specific implementation. However, the fundamental concept is similar to that of binary trees, where it is necessary to traverse the tree in order to correctly locate the desired node.

Here is a simplified example of a multiway search tree with nodes that can have up to three children:

        10
      / | \
     5  20  30
    / \   / | \
   3   7 15 25  35

In this example, the node with key 10 has three children (5, 20, and 30), and each of them can have a variable number of children. The tree can be traversed using approaches similar to those of binary trees.

Multiway search trees are used in situations where the hierarchical structure of the data can be better represented with a variable number of children for each node. However, the choice between a multiway and a binary search tree will depend on the specific needs of the problem and the operations you plan to perform on the trees.

Below you will find an example implementation of a multiway search tree in Python. In this example, we will use a Node class to represent the tree nodes and a MultiwayTree class to handle operations on the tree.

class Node:
    def __init__(self, key):
        self.key = key
        self.children = []

class MultiwayTree:
    def __init__(self):
        self.root = None

    def insert(self, key, parent_key=None):
        new_node = Node(key)

        if not parent_key:
            self.root = new_node
        else:
            parent_node = self._search(self.root, parent_key)
            if parent_node:
                parent_node.children.append(new_node)
            else:
                print(f"Parent with key {parent_key} not found. Inserting at the root.")

    def _search(self, root, key):
        if not root:
            return None

        if root.key == key:
            return root

        for child in root.children:
            result = self._search(child, key)
            if result:
                return result

        return None

    def inorder_traversal(self, root=None):
        if not root:
            root = self.root

        result = []
        self._inorder_traversal(root, result)
        return result

    def _inorder_traversal(self, root, result):
        if root:
            for child in root.children:
                self._inorder_traversal(child, result)
            result.append(root.key)

# Example of use
multiway_tree = MultiwayTree()

# Insert nodes into the tree
multiway_tree.insert(10)
multiway_tree.insert(5, parent_key=10)
multiway_tree.insert(20, parent_key=10)
multiway_tree.insert(3, parent_key=5)
multiway_tree.insert(7, parent_key=5)
multiway_tree.insert(15, parent_key=20)
multiway_tree.insert(25, parent_key=20)
multiway_tree.insert(30, parent_key=10)
multiway_tree.insert(35, parent_key=30)

# Print the in-order traversal of the multiway tree
print("Inorder Traversal of the Multiway Tree:", multiway_tree.inorder_traversal())

Executing the code you will get the following result:

Inorder Traversal of the Multiway Tree: [3, 7, 5, 15, 25, 20, 35, 30, 10]

In this example, the Node class represents the nodes of the tree, with the children field being a list containing the child nodes. The MultiwayTree class handles tree operations, including insertion of new nodes and in-order traversal.

The usage example shows how to create a multiway search tree by inserting some nodes with parent-child relationships and then printing the in-order traversal of the tree. You can customize and extend this example to suit your needs.

Conclusions

Search trees are fundamental data structures in computer science, widely used for efficient management of sorted data. The correct management of their structure is crucial to maintain high performance in search and insertion operations.

Leave a Reply