Deep learning operations reinvented (for pytorch, tensorflow, chainer, gluon and others)

davidnvq/einops

 
 


einops


What's new in this fork

This fork adds operations that act on a single dimension.

1. Redimension Functionality

The function redim is added. It rearranges the elements along one specific dimension, for example by chunking or reordering them. The operation to perform is given by the pattern inside the square brackets.

Note that only a single bracketed pattern is supported so far, and the bracketed dimension must contain a single group or element.

def redim(tensor, pattern, **axes_lengths: int):
    """Args:
        tensor: (torch.Tensor or np.ndarray)
        pattern: pattern to redimension, e.g., "B H W [(r g b) -> (b g r)]"
        **axes_lengths: optional lengths of axes, e.g., B=10, H=300, W=600, b=1, g=1, r=1
    Returns:
        a re-dimensioned tensor
    """

Example: Reordering

>>> image = np.random.randn(30, 40, 3) # RGB
# Change RGB -> BGR.
# The lengths of the other axes do not have to be given; if given, they are only used as assertions.
# An element length that is 1, or that can be inferred from the context, can also be omitted.

>>> image = redim(image, "height width [(r g b) -> (b g r)]", height=30, width=40, r=1, g=1, b=1)

Example: chunking

# Split the dataset into a training set and a validation set
>>> train_set = redim(dataset, "[(train valid) -> train] H W", train=800, valid=200)
>>> valid_set = redim(dataset, "[(train valid) -> valid] H W", train=800, valid=200)

# Remove alpha channel
>>> image = np.random.randn(30, 40, 4) # RGBA
>>> image = redim(image, "H W [(rgb a) -> rgb]", rgb=3) # or the below
>>> image = redim(image, "H W [(r g b a) -> (r g b)]")

# Crop the image
>>> image = redim(image, "[(top down) -> top] W", top=20)
>>> image = redim(image, "H [(left right) -> left]", left=10)
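
For comparison, the redim examples above correspond to ordinary slicing along one axis. The sketch below uses plain NumPy only and does not call the fork's redim; it assumes the semantics described in this section (the bracketed group is split into its named parts, and the parts on the right-hand side are kept in the given order). The dataset shape is a made-up placeholder.

```python
import numpy as np

# RGB -> BGR: reverse the three length-1 elements on the last axis
image = np.random.randn(30, 40, 3)
bgr = image[..., ::-1]

# Chunking: the first 800 items are "train", the remaining 200 are "valid"
dataset = np.random.randn(1000, 8, 8)  # hypothetical dataset of 1000 samples
train_set, valid_set = dataset[:800], dataset[800:]

# Remove the alpha channel: keep the first 3 of the 4 channels
rgba = np.random.randn(30, 40, 4)
rgb = rgba[..., :3]

# Crop: keep the top 20 rows, then the left 10 columns
cropped = rgb[:20, :10]
```

The pattern form states which named part is kept, whereas the slice form leaves the reader to work out what index 800 or 3 stands for.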

2. Concatenation Functionality

The function concat is added. It concatenates tensors along one axis.

Note that:

  1. Except for the concatenated axis, the lengths of all other axes must match.
  2. The tensors may have different lengths along the concatenated axis.
  3. Not all dimension lengths need to be specified.

def concat(tensor_list, pattern, **axes_lengths: int):
    """Args:
        tensor_list: (List[torch.Tensor or np.ndarray]) tensors with the same length on every dimension except the concatenated one
        pattern: (str) concatenation pattern, e.g., "batch seq [dx dy dz -> (dx dy dz)]"
        **axes_lengths: optional lengths of axes, e.g., B=10, H=300, W=600
    Returns:
        a concatenated tensor
    """

Example: concatenate

>>> x = torch.randn(2, 10, 512)
>>> y = torch.randn(2, 10, 128)
>>> z = torch.randn(2, 10, 256)
>>> h = concat([x, y, z], "batch seq [dx dy dz -> (dx dy dz)]", batch=2, seq=10, dx=512, dy=128, dz=256)

Note: an ellipsis can be used instead of listing every dimension name along the concatenated axis.

Example: concatenate with an ellipsis
>>> h = concat([x, y, z], "batch seq [... -> ...]")
>>> h = concat([x, y, z], "batch seq [... -> d]")
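
As a point of reference, the concatenation described above should match a plain concatenation along the last axis. This is a minimal NumPy sketch under that assumption; it uses NumPy arrays rather than torch tensors to stay dependency-light and does not call the fork's concat.

```python
import numpy as np

x = np.random.randn(2, 10, 512)
y = np.random.randn(2, 10, 128)
z = np.random.randn(2, 10, 256)

# "batch seq [dx dy dz -> (dx dy dz)]": join the three tensors along the last axis
h = np.concatenate([x, y, z], axis=-1)
print(h.shape)  # (2, 10, 896)
```

The batch and seq axes must agree across all inputs; only the last axis (512, 128, 256) differs and is summed to 896.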

This pull request is based on #56, #50, and #20.

Flexible and powerful tensor operations for readable and reliable code. Supports numpy, pytorch, tensorflow, and others.

Tweets

In case you need convincing arguments for setting aside time to learn about einsum and einops... Tim Rocktäschel, FAIR

Writing better code with PyTorch and einops 👌 Andrej Karpathy, AI at Tesla

Slowly but surely, einops is seeping in to every nook and cranny of my code. If you find yourself shuffling around bazillion dimensional tensors, this might change your life Nasim Rahaman, MILA (Montreal)

Tutorial / Documentation

Tutorials are the most convenient way to see einops in action (and for now they also serve as the documentation).

Installation

Plain and simple:

pip install einops
# to get the single-dimension operations (redim, concat), install this fork from source
git clone https://github.com/davidnvq/einops
cd einops
python setup.py install


API

einops has a minimalistic yet powerful API.

Three operations are provided (the einops tutorial shows that they cover stacking, reshape, transposition, squeeze/unsqueeze, repeat, tile, concatenate, view and numerous reductions).

from einops import rearrange, reduce, repeat
# rearrange elements according to the pattern
output_tensor = rearrange(input_tensor, 't b c -> b c t')
# combine rearrangement and reduction
output_tensor = reduce(input_tensor, 'b c (h h2) (w w2) -> b h w c', 'mean', h2=2, w2=2)
# copy along a new axis 
output_tensor = repeat(input_tensor, 'h w -> h w c', c=3)
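
To make the three patterns above concrete, here is a sketch of plain NumPy calls that should produce the same results. The input shapes are arbitrary choices for illustration, and the einops calls appear only in comments.

```python
import numpy as np

x = np.arange(2 * 3 * 4).reshape(2, 3, 4)      # axes: t b c
# rearrange(x, 't b c -> b c t')
out = np.transpose(x, (1, 2, 0))               # shape (3, 4, 2)

img = np.random.randn(8, 3, 6, 10)             # axes: b c H W
b, c, H, W = img.shape
# reduce(img, 'b c (h h2) (w w2) -> b h w c', 'mean', h2=2, w2=2)
pooled = img.reshape(b, c, H // 2, 2, W // 2, 2).mean(axis=(3, 5))  # b c h w
pooled = np.transpose(pooled, (0, 2, 3, 1))    # b h w c, shape (8, 3, 5, 3)

gray = np.random.randn(6, 10)                  # axes: h w
# repeat(gray, 'h w -> h w c', c=3)
rgb = np.repeat(gray[:, :, None], 3, axis=2)   # shape (6, 10, 3)
```

Each NumPy version needs axis numbers and intermediate shapes that the pattern string expresses by name.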

There are also two corresponding layers (einops keeps a separate version for each framework) with the same API.

from einops.layers.chainer import Rearrange, Reduce
from einops.layers.gluon import Rearrange, Reduce
from einops.layers.keras import Rearrange, Reduce
from einops.layers.torch import Rearrange, Reduce
from einops.layers.tensorflow import Rearrange, Reduce

Layers behave similarly to operations and have the same parameters (with the exception of the first argument, which is passed during call)

layer = Rearrange(pattern, **axes_lengths)
layer = Reduce(pattern, reduction, **axes_lengths)

# apply created layer to a tensor / variable
x = layer(x)

Example of using layers within a model:

# example given for pytorch, but code in other frameworks is almost identical  
from torch.nn import Sequential, Conv2d, MaxPool2d, Linear, ReLU
from einops.layers.torch import Rearrange

model = Sequential(
    Conv2d(3, 6, kernel_size=5),
    MaxPool2d(kernel_size=2),
    Conv2d(6, 16, kernel_size=5),
    MaxPool2d(kernel_size=2),
    # flattening
    Rearrange('b c h w -> b (c h w)'),  
    Linear(16*5*5, 120), 
    ReLU(),
    Linear(120, 10), 
)
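
The 16*5*5 passed to the first Linear layer follows from the spatial size after the convolutions and poolings. The sketch below checks that arithmetic in plain Python. It assumes a 32x32 input image, which is not stated above, and conv_out is a hypothetical helper rather than a library function.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution or pooling window along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32                      # assumed input height and width
size = conv_out(size, 5)       # Conv2d(3, 6, kernel_size=5)        -> 28
size = conv_out(size, 2, 2)    # MaxPool2d(kernel_size=2), stride 2 -> 14
size = conv_out(size, 5)       # Conv2d(6, 16, kernel_size=5)       -> 10
size = conv_out(size, 2, 2)    # MaxPool2d(kernel_size=2), stride 2 -> 5
flat_features = 16 * size * size
print(flat_features)           # 400, i.e. 16*5*5
```

With a different input size, the flattened length changes and the first Linear layer must be adjusted to match.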

Naming

einops stands for Einstein-Inspired Notation for operations (though "Einstein operations" is more attractive and easier to remember).

The notation was loosely inspired by Einstein summation (in particular by the numpy.einsum operation).

Why use einops notation?!

Semantic information (being verbose in expectations)

y = x.view(x.shape[0], -1)
y = rearrange(x, 'b c h w -> b (c h w)')

While these two lines are doing the same job in some context, the second one provides information about the input and output. In other words, einops focuses on interface: what is the input and output, not how the output is computed.

The next operation looks similar:

y = rearrange(x, 'time c h w -> time (c h w)')

but it gives the reader a hint: this is not an independent batch of images we are processing, but rather a sequence (video).

Semantic information makes the code easier to read and maintain.

More checks

Reconsider the same example:

y = x.view(x.shape[0], -1) # x: (batch, 256, 19, 19)
y = rearrange(x, 'b c h w -> b (c h w)')

The second line checks that the input has four dimensions, and you can also pin particular dimensions to specific lengths. Compare that with merely writing comments about shapes: comments are never checked and do not prevent mistakes.

y = x.view(x.shape[0], -1) # x: (batch, 256, 19, 19)
y = rearrange(x, 'b c h w -> b (c h w)', c=256, h=19, w=19)
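
To show what such a check buys you, the sketch below spells out by hand the kind of validation described above. flatten_checked is a hypothetical helper written for illustration in NumPy; it is not part of einops and only approximates the checks the pattern performs.

```python
import numpy as np

def flatten_checked(x, c, h, w):
    """Hypothetical helper: flatten (b, c, h, w) -> (b, c*h*w) with explicit shape checks."""
    if x.ndim != 4:
        raise ValueError(f"expected 4 dimensions, got {x.ndim}")
    if x.shape[1:] != (c, h, w):
        raise ValueError(f"expected (b, {c}, {h}, {w}), got {x.shape}")
    return x.reshape(x.shape[0], -1)

y = flatten_checked(np.zeros((2, 256, 19, 19)), c=256, h=19, w=19)  # passes

# A bare reshape accepts a channels-last tensor without complaint,
# even though the comment next to it would claim (batch, 256, 19, 19).
silent = np.zeros((2, 19, 19, 256)).reshape(2, -1)
```

Passing the channels-last tensor to flatten_checked raises a ValueError, which is the behaviour the axis lengths in the pattern are meant to provide.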

Result is strictly determined

Below are at least two ways to define the space-to-depth operation (spatial blocks are moved into the channel axis)

# space-to-depth
rearrange(x, 'b c (h h2) (w w2) -> b (c h2 w2) h w', h2=2, w2=2)
rearrange(x, 'b c (h h2) (w w2) -> b (h2 w2 c) h w', h2=2, w2=2)

There are at least four more ways to do it. Which one is used by the framework?

These details are usually ignored because they often make no difference, but they can make a big difference (e.g. if you use grouped convolutions in the next stage), so you may want to state the ordering explicitly in your code.
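
The difference between the two patterns above can be seen on a tiny tensor. This NumPy sketch reproduces both orderings with reshape and transpose; the tensor sizes are arbitrary and chosen only to keep the example small.

```python
import numpy as np

b, c, h, w, h2, w2 = 1, 3, 2, 2, 2, 2
x = np.arange(b * c * h * h2 * w * w2).reshape(b, c, h * h2, w * w2)

# split 'b c (h h2) (w w2)' into separate axes: b c h h2 w w2
parts = x.reshape(b, c, h, h2, w, w2)

# 'b (c h2 w2) h w': the original channel index varies slowest in the merged axis
v1 = parts.transpose(0, 1, 3, 5, 2, 4).reshape(b, c * h2 * w2, h, w)
# 'b (h2 w2 c) h w': the original channel index varies fastest in the merged axis
v2 = parts.transpose(0, 3, 5, 1, 2, 4).reshape(b, h2 * w2 * c, h, w)

print(v1.shape == v2.shape)    # True
print(np.array_equal(v1, v2))  # False: same values, different channel order
```

A grouped convolution applied next would see different channels in each group depending on which ordering was used.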

Uniformity

reduce(x, 'b c (x dx) -> b c x', 'max', dx=2)
reduce(x, 'b c (x dx) (y dy) -> b c x y', 'max', dx=2, dy=3)
reduce(x, 'b c (x dx) (y dy) (z dz)-> b c x y z', 'max', dx=2, dy=3, dz=4)

These examples demonstrate that there is no need for separate operations for 1d/2d/3d pooling; all of them are defined in a uniform way.
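
The same uniformity can be imitated in NumPy by splitting each pooled axis into (blocks, window) and reducing over the window axes. max_pool below is a hypothetical helper written for this sketch, limited to non-overlapping windows that divide the axis length evenly.

```python
import numpy as np

def max_pool(x, *windows):
    """Max-pool the trailing len(windows) axes of x with non-overlapping windows."""
    lead = x.shape[: x.ndim - len(windows)]
    split = []
    for size, win in zip(x.shape[len(lead):], windows):
        assert size % win == 0, "axis length must be divisible by its window"
        split += [size // win, win]
    x = x.reshape(*lead, *split)
    # the window axes sit at every second position after the leading axes
    win_axes = tuple(len(lead) + 2 * i + 1 for i in range(len(windows)))
    return x.max(axis=win_axes)

x1 = np.random.randn(2, 3, 8)
x2 = np.random.randn(2, 3, 8, 9)
x3 = np.random.randn(2, 3, 8, 9, 4)
print(max_pool(x1, 2).shape)        # (2, 3, 4)
print(max_pool(x2, 2, 3).shape)     # (2, 3, 4, 3)
print(max_pool(x3, 2, 3, 4).shape)  # (2, 3, 4, 3, 1)
```

The helper needs index bookkeeping that the patterns 'b c (x dx) -> b c x' and its 2d/3d variants state directly.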

Space-to-depth and depth-to-space are defined in many frameworks, but how about width-to-height?

rearrange(x, 'b c h (w w2) -> b c (h w2) w', w2=2)

Framework independent behavior

Even simple functions are defined differently by different frameworks

y = x.flatten() # or flatten(x)

Suppose x's shape was (3, 4, 5), then y has shape ...

  • numpy, pytorch, cupy, chainer: (60,)
  • keras, tensorflow.layers, mxnet and gluon: (3, 20)
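
The two conventions can be reproduced side by side in NumPy. The einops patterns in the comments are the unambiguous way to ask for each result; they are shown for comparison and not executed here.

```python
import numpy as np

x = np.zeros((3, 4, 5))

# collapse everything: rearrange(x, 'a b c -> (a b c)')
flat_all = x.flatten()
print(flat_all.shape)    # (60,)

# keep the first axis: rearrange(x, 'a b c -> a (b c)')
flat_keep = x.reshape(x.shape[0], -1)
print(flat_keep.shape)   # (3, 20)
```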

Independence of framework terminology

Example: tile vs repeat causes lots of confusion. To copy an image along its width:

np.tile(image, (1, 2))    # in numpy
image.repeat(1, 2)        # pytorch's repeat ~ numpy's tile

With einops you don't need to decipher which axis was repeated:

repeat(image, 'h w -> h (tile w)', tile=2)  # in numpy
repeat(image, 'h w -> h (tile w)', tile=2)  # in pytorch
repeat(image, 'h w -> h (tile w)', tile=2)  # in tf
repeat(image, 'h w -> h (tile w)', tile=2)  # in jax
repeat(image, 'h w -> h (tile w)', tile=2)  # in mxnet
... (etc.)
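
A small NumPy example shows why the terminology is easy to mix up: np.tile and np.repeat both double the width but order the copies differently. The corresponding patterns are given in comments only.

```python
import numpy as np

image = np.array([[1, 2],
                  [3, 4]])

# whole rows repeated side by side, like 'h w -> h (tile w)'
tiled = np.tile(image, (1, 2))
# [[1 2 1 2]
#  [3 4 3 4]]

# each element repeated in place, like 'h w -> h (w tile)'
stretched = np.repeat(image, 2, axis=1)
# [[1 1 2 2]
#  [3 3 4 4]]
```

In the pattern form, the position of the new axis inside the parentheses decides which of the two layouts you get.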

Supported frameworks

Einops works with ...

Contributing

Best ways to contribute are

  • spread the word about einops
  • if you like explaining things, alternative tutorials would be very helpful
    • some people grasp einops ideas immediately, while many others need help-by-example
  • translating the examples into languages other than English is also a good idea
  • use einops notation in your papers to strictly define used operations!

Supported python versions

einops works with python 3.5 or later.
