OlegSchwann June 15, 2019 at 23:31

Go data structures cheat sheet

Some companies interview online writing code. It is required to solve the olympiad problem for speed. In such circumstances, there is no time to see the details of the implementation of data structures - you need to immediately realize the idea. But courses on algorithms and data structures provide examples in either pseudo-code or C ++. More reference solutions to problems are often written in C ++. In preparation for the interview, I made a crib of libraries - analogues of STL containers, so as not to waste precious time searching.

Let's start with the obvious.

Dynamic continuous array

The analogue std::vector.
Supports accessing an element by index for a constant time of several processor cycles. You can increase or decrease capacity. This is the built-in slice:

vector := []int{}

Conveniently, basic concepts are built into the language.

Stack

The analogue std::stack.

An ordered set in which adding new elements and deleting existing ones is done from one end. The element that was placed last (last-in, first-out - LIFO) is removed first from the stack. This is again a walled slice. Snippets are copied from project to project:

// Push
stack = append(stack, value)

// Pop
// Проверка, что len(stack) > 0
stack, value = a[:len(stack)-1], a[len(stack)-1]

The slice operation does not allocate a new memory.

Turn

Analogue std::queueand std::deque.

Queues implement retrieval and addition operations for start and end in constant time. The element that was first placed (first-in, first-out - FIFO) is deleted first from the queue. A buffered channel is a queue on a ring buffer, you can use it when the reader and writer are different goroutines. But there is no separate queue implementation in the standard library. The awesome-go list advises the library https://github.com/gammazero/deque .

import "github.com/gammazero/deque"

Ongoing operations:

func (q *Deque) PushBack(elem interface{})
func (q *Deque) PushFront(elem interface{})
func (q *Deque) PopBack() interface{}
func (q *Deque) PopFront() interface{}
func (q *Deque) Back() interface{}
func (q *Deque) Front() interface{}
func (q *Deque) At(i int) interface{}

Doubly linked list

The analogue std::list.
It consists of elements containing, in addition to their own data, links to the next and previous list item. It is in the standard library:

import "container/list"

As expected, it supports inserts (to the beginning, to the end, before and after the element the pointer is passed to) and delete.

func (l *List) PushBack(v interface{}) *Element
func (l *List) PushFront(v interface{}) *Element
func (l *List) InsertAfter(v interface{}, mark *Element) *Element
func (l *List) InsertBefore(v interface{}, mark *Element) *Element
func (l *List) Remove(e *Element) interface{}

Go does not provide specific syntax for iterators. Therefore, the next / previous element can be obtained from a pointer to any node. These methods do not go bad after adding / removing an item to the list, without surprises.

func (e *Element) Next() *Element
func (e *Element) Prev() *Element

Priority queue

Analog std::priority_queue
Allows you to add items in any order, and get at any time the highest priority of the remaining. It is used, for example, in the algorithm for constructing a minimum spanning tree, when at the next step the algorithm selects the shortest edge of all starting at the vertices already covered at one end.

The standard library has an adapter that turns any sortable container (implements sort.Interface) into a priority queue.

import "container/heap"

This is a classic binary heap . Implements insertion and deletion in O (log n).

func Pop(h Interface) interface{}
func Push(h Interface, x interface{})
func Remove(h Interface, i int) interface{}

Hash table

It is a dictionary and an associative array.

Analog std::unordered_map.

Allows you to add a key-value, delete the value by key and check for the presence of an element for O (1) on average. Obviously map is built into the language:

unorderedMap := make(map[string]int)

The result of make (map) is a pointer, and the way it works is slightly different from standard containers:

// Проверка вхождения:
_, ok := unorderedMap["route"]
// Удаление элемента:
delete(unorderedMap, "route")
// Нахождение длины:
n := len(unorderedMap)

"Runtime / map", unlike std :: unordered_map, takes care of the programmer - it is safe to delete values during iteration.

A bunch of

The analogue std::unordered_set.
Almost the same as a hash table, but without saving the value.
If we need only a quick entry check, then we can use the built-in map again. It is only necessary to specify an empty value to indicate that only the key is important.

var m = make(map[string]struct{})
m["!"] = struct{}{}
_, ok := m["!"] // true

But this implementation does not support complex operators. To merge, intersect, the difference from the box, you need third-party libraries. Most used, judging by the number of stars: https://github.com/deckarep/golang-set

import "github.com/deckarep/golang-set"

The most needed part of the API :

Add(i interface{}) bool
Remove(i interface{})
Cardinality() int // len()
Contains(i ...interface{}) bool
IsSubset(other Set) bool
Intersect(other Set) Set
Union(other Set) Set
Difference(other Set) Set
SymmetricDifference(other Set) Set

Set int

In the experimental part of the standard library, there is an optimized set int that saves every bit.

import "golang.org/x/tools/container/intsets"

It also supports union, intersection, difference of sets.

Binary search trees

Analogs std::setand std::map.
They may seem like newbies to bad hash counterparts: they
also support adding, deleting, and checking for entries, but beyond O (log n).
But trees store nodes sorted by key.

There are no trees in the standard go library, a repository containing AVL, Red-Black and B-trees is widely used .

import "github.com/emirpasic/gods/trees/avltree"

Most commonly used API methods :

func (tree *Tree) Get(key interface{}) (value interface{}, found bool)
func (tree *Tree) Put(key interface{}, value interface{})
func (tree *Tree) Remove(key interface{})
func (tree *Tree) Size() int
func (tree *Tree) Keys() []interface{}
func (tree *Tree) Values() []interface{}
func (tree *Tree) Left() *Node
func (tree *Tree) Right() *Node

There are two particularly important tree methods:

func (tree *Tree) Ceiling(key interface{}) (ceiling *Node, found bool)

returns the smallest existing item larger than the key.

func (tree *Tree) Floor(key interface{}) (floor *Node, found bool)

returns the largest existing element less than a key.

Tasks for this are relatively often found in interviews. In real life it is used in database indexes:

select x from table where x <= $1 limit 1;

If there is an index, it will work for O (log n), for 1 border search in the B-tree.

Bloom filter

But this data structure in stl is not.
Like a hash table, it allows you to check whether an element belongs to a set. But the filter does not store keys when adding, and takes a constant amount of memory. It is possible to receive a false positive response (there is no element in the set, but the data structure reports that it is), but not false negative. It is used as a filter to quickly cut off almost all non-existing keys, saving more expensive verification, for example, reading from a disk or making a request to the database.
There is a third-party library: https://github.com/willf/bloom

import "github.com/willf/bloom"

Not so often used, you can peek at the API .

HyperLogLog

There is no such thing in the standard C ++ library.

Probabilistic data structure. With a small error (≈ 0.4%), it considers the number of unique elements without storing the keys themselves. It provides huge memory savings. If the task is to quickly calculate the number of visitors or requests - HyperLogLog is ideal.

The most popular library for this now.

https://github.com/axiomhq/hyperloglog

import "github.com/axiomhq/hyperloglog"

Sortings

Analogs std::sortand std::stable_sort.
From the consumer point of view, there are only 2 fundamentally different types:
Stable (do not change the order of equal elements [[4, 0], [1, 2], [1, 1], [5, 6]] -> [[1, 2], [1, 1], [4, 0], [5, 6]])
and not stable, which do not guarantee the sequence of other fields.
Both are in the standard library:

func Sort(data Interface)

This is a Hoar quick sort implementation, unstable. But for sections of length <12 heap sorting is used as optimization.

func Stable(data Interface)

Inside it is a merge sort, but for efficiency, when the recursive algorithm reaches blocks of less than 20 elements, insertion sort is used.

These are the classic algorithms working for O (n log n).

If you read it, congratulations. Knowledge of specific APIs helps in solving test problems. (If you worked with something and know the best alternatives - write in the comments.

Tags: