new syntax for transpose #21037

StefanKarpinski · 2017-03-15T16:03:19Z

Now that .op is generally the vectorized form of op, it's very confusing that .' means transpose rather than the vectorized form of ' (adjoint, aka ctranspose). This issue is for discussing alternative syntaxes for transpose and/or adjoint.

The text was updated successfully, but these errors were encountered:

mbauman · 2017-03-15T16:10:59Z

Andreas tried Aᵀ (and maybe Aᴴ) in #19344, but it wasn't very well received. We could similarly pun on ^ with special exponent types T (and maybe H) such that A^T is transpose, but that's rather shady, too. Not sure there are many other good options that still kinda/sorta look like math notation.

StefanKarpinski · 2017-03-15T16:17:26Z

I kind of think that t(A) might be the best, but it's unfortunate to "steal" another one-letter name.

nalimilan · 2017-03-15T16:29:15Z

Moving my comment from the other issue (not that it solves anything, but...):

+1 for using something else than .'.

I couldn't find languages with a special syntax for transposition, except for APL which uses the not-so-obvious ⍉, and Python which uses *X (which would be confusing for Julia). Several languages use transpose(X); R uses t(X). That's not pretty, but it's not worse than .'. At least you're less tempted to use ' by confusing it with .': it would be clear that these are very different operations.

See Rosetta code. (BTW, the Julia example actually illustrates conjugate transpose...)

mauro3 · 2017-03-15T16:38:33Z

Could one of the other ticks be used? ` or "

ararslan · 2017-03-15T17:27:08Z

-100 to changing adjoint, since it's one of the awesome things that makes writing Julia code as clear as writing math, plus conjugate transpose is usually what you want anyway so it makes sense to have an abbreviated syntax for it.

As long as we have the nice syntax for conjugate transpose, a postfix operator for regular transpose seems mostly unnecessary, so just having it be a regular function call seems fine to me. transpose already works; couldn't we just use that? I find the t(x) R-ism unfortunate, as it's not clear from the name what it's actually supposed to do.

Using a different tick would be kind of weird, e.g. A` can look a lot like A' depending on the font, and A" looks too much like A''.

stevengj · 2017-03-16T13:27:33Z

If we make the change in #20978, then a postfix transpose actually becomes more useful than it is now. e.g. if you have two vectors x and y and you want to apply f pairwise on them, you can do e.g. f.(x, y.') ... with #20978, this will be applicable to arrays of arbitrary types.

Honestly, I think our best option is still to leave it as-is. None of the suggestions seem like a clear improvement to me. .' has the advantage of familiarity from Matlab. The . actually is somewhat congruent with dot-call syntax in examples like f.(x, y.'), and suggests (somewhat correctly) that the transpose "fuses" (it doesn't produce a temporary copy thanks to RowVector and future generalizations thereof).

In fact, we could even take it further, and make f.(x, g.(y).') a fusing operation. i.e. we change .' transpose to be non-recursive ala #20978 and we extend its semantics to include fusion with other nested dot calls. (If you want the non-fusing version, you would call transpose.)

StefanKarpinski · 2017-03-16T17:31:07Z

I like that plan a lot, @stevengj.

StefanKarpinski · 2017-03-16T17:49:13Z

One wrinkle: presumably the @. macro does not turn y' into y.' (since that would be wrong). It could, however, turn y' into some kind of fused adjoint operation.

stevengj · 2017-03-16T23:08:03Z

The main problem is finding a clean way to make f.(x, g.(y).') have fusing semantics. One possibility would be to transform it to f.(x, g.(y.')) and hence to broadcast(x,y -> f(x, g(y)), x, y.')?

Note that, for this to work properly, we might need to restore the fallback transpose(x) = x method, in which case we might as well let transpose remain recursive.

StefanKarpinski · 2017-03-17T19:25:50Z

I think deciding whether transpose should be recursive or not is orthogonal to whether we make it participate in dot syntax fusion. The choice of making it non-recursive is not motivated by that.

stevengj · 2017-03-17T19:40:47Z

@StefanKarpinski, if restore a fallback transpose(x) = x, then most of the motivation for changing it to be non-recursive goes away.

jebej · 2017-06-05T13:42:07Z

What's the problem if the fallback is restored but we still have the transpose be non-recursive?

stevengj · 2017-06-05T15:46:10Z

@jebej, recursive transpose is more correct when it is used as a mathematical operation on linear operators. If I remember correctly, the main reason for making it non-recursive was so that we don't have to define the transpose(x) = x fallback, rather than throwing a MethodError.

But it would not be terrible to have the fallback but still be non-recursive.

bkamins · 2017-06-28T12:45:52Z

Let me add two comments (I have looked through the earlier discussion and did not notice them - sorry if I have omitted something):

documentation for permutedims says This is a generalization of transpose for multi-dimensional arrays. Transpose is equivalent to permutedims(A, [2,1]).; it is a bit misleading as it suggests on the first read a generalization of transpose function, which it is not.
How one is supposed to make a transpose of a vector x=["a", "b"]? Actually y=x.' works and creates a new variable but getindex fails on it. AFAIK you have to use reshape(x, 1, :) or much slower hcat(x...) to achieve it but it is unnatural to have a different syntax for Vector (permutedims does not work here).

andreasnoack · 2017-06-28T13:22:46Z

What is your use case for transposing a vector of strings?

bkamins · 2017-06-28T14:05:54Z

Consider the following scenario for instance:

x = ["$(j+i)" for j in 1:3, i in 1:5]
y = ["$i" for i in 5:9]

and I want to append y after the last row of x. And the simplest way is to vcat a transpose of y.

Comes up in practice when incrementally logging text data to a Matrix{String} (I could use Vector{Vector{String}}), but often matrix is more useful (or then again there is a question how to convert Vector{Vector{String}} to Matrix{String} by vertically concatenating consecutive elements).

mbauman · 2017-06-28T15:29:44Z

Another use-case: transposing is the simplest way to make two vectors orthogonal to each other in order to broadcast a function over the cartesian product (f.(v, w.')).

Sacha0 · 2017-09-21T18:14:02Z

Data point: Yesterday I encountered a party confused by the postfix "broadcast-adjoint" operator and why it behaves like transpose. Best!

ttparker · 2017-11-08T17:54:30Z

FWIW, I strongly feel that we should get rid of the .' syntax. As someone more familiar with Julia than with Matlab, I expected it to mean vectorized adjoint and I got really tripped up when it didn't. Julia isn't Matlab and shouldn't be bound by Matlab's conventions - if in Julia, a dot means vectorization of the adjacent function, then this should be consistent across the language and shouldn't randomly have the one horrible exception that .' is formally unrelated to '.

I think it's fine to just have transpose without any special "tick" notation, since the vast majority of the time, it's called on a matrix of real numbers, so ' would be equivalent if you really want to save typing. If we want to make a fusing version of transpose, then I really don't think that .' is the right syntax.

JeffBezanson · 2017-11-08T19:59:28Z

That's a good point. Arguably only the adjoint needs super-compact syntax.

StefanKarpinski · 2017-11-09T19:56:12Z

Let's just call this transpose and deprecate .'. In the future, we can consider if we want .' as pointwise adjoint or if we just want to leave it perma-deprecated to avoid trapping Matlab users.

StefanKarpinski · 2018-08-05T21:37:28Z

It seems confusing that we would have a'' === a but ''(a) === a'. It seems better to use Base.apostrophe as the name instead (or something like that).

ttparker · 2018-08-05T22:25:04Z

Might it be better to split this discussion off into a new Github issue, since it's about ' syntax that's not directly related to matrix transposition?

perrutquist · 2018-08-07T07:20:27Z

Is there an automated way of splitting issues, or should I simply open a new one and link to the discussion here?

ararslan · 2018-08-07T09:25:10Z

The latter

briochemc · 2018-08-28T03:54:03Z

The only situation where a superscript-T notation would really be appropriate is if you have a numerical matrix whose indices you want to permute, but you really don't want the adjoint linear operator. Such situations certainly exist, but may be too rare to justify introducing new syntax.

I guess I'm way too late for the discussion, but I'd like to point one use that I think is worth mentioning: Applying the complex-step differentiation to a real-valued function which has transpose inside of it. (I personally figured out that I needed .' in MATLAB and julia for this particular reason.)

I'll give an example with multiple occurences of transpose (maybe I could avoid doing it this way?)

using LinearAlgebra

# f : Rⁿ → R
#     x  ↦ f(x) = xᵀ * x / 2
f(x) = 0.5 * transpose(x) * x

# Fréchet derivative of f
# Df : Rⁿ → L(Rⁿ, R)
#      x  ↦ Df(x) : Rⁿ → R (linear, so expressed via multiplication)
#                   h  ↦ Df(x)(h) = Df(x) * h
Df(x) = transpose(x) 

# Complex-step method version of Df
function CSDf(x) 
    out = zeros(eltype(x), 1, length(x))
        for i = 1:length(x)
        x2 = copy(x) .+ 0im
        h = x[i] * 1e-50
        x2[i] += im * h
        out[i] = imag(f(x2)) / h
    end
    return out
end

# 2nd Fréchet derivative
# D2f : Rⁿ → L(Rⁿ ⊗ Rⁿ, R)
#       x  ↦ D2f(x) : Rⁿ ⊗ Rⁿ → R (linear, so expressed via multiplication)
#                     h₁ ⊗ h₂ ↦ D2f(x)(h₁ ⊗ h₂) = h₁ᵀ * D2f(x) * h₂
D2f(x) = Matrix{eltype(x)}(I, length(x), length(x))

# Complex-step method version of D2f
function CSD2f(x)
    out = zeros(eltype(x), length(x), length(x))
    for i = 1:length(x)
        x2 = copy(x) .+ 0im
        h = x[i] * 1e-50
        x2[i] += im * h
        out[i, :] .= transpose(imag(Df(x2)) / h)
    end
    return out
end 

# Test on random vector x of size n
n = 5
x = rand(n)
Df(x) ≈ CSDf(x)
D2f(x) ≈ CSD2f(x)

# test that the 1st derivative is correct Fréchet derivative
xϵ = √eps(norm(x))
for i = 1:10
    h = xϵ * randn(n) # random small y
    println(norm(f(x + h) - f(x) - Df(x) * h) / norm(h)) # Fréchet check
end

# test that the 2nd derivative is correct 2nd Fréchet derivative
for i = 1:10
    h₁ = randn(n) # random h₁
    h₂ = xϵ * randn(n) # random small h₂
    println(norm(Df(x + h₂) * h₁ - Df(x) * h₁ - transpose(h₁) * D2f(x) * h₂) / norm(h₂)) # Fréchet check
end
# Because f is quadratic, we can even check that f is equal to its Taylor expansion
h = rand(n)
f(x + h) ≈ f(x) + Df(x) * h + 0.5 * transpose(h) * D2f(x) * h

The point being that f and Df must be defined using transpose and must not use the adjoint.

c42f · 2018-08-29T08:31:32Z

I don't think the complex step method is super relevant in julia. Isn't it a neat hack/workaround to get automatic differentiation in cases where a language supports efficient builtin complex numbers, but an equivalently efficient Dual number type can't be defined? That's not the case in julia, which has really nice automatic differentiation libraries.

briochemc · 2018-08-29T19:40:56Z

I agree about using Dual numbers instead of the complex-step method and that's a very good point you are making (I personally have already replaced all my complex-step-method evaluations with dual-number ones in julia). However, I do think that this is still a valid use case, for demonstration purposes, teaching tricks (see, e.g., Nick Higham talking about the complex-step method at Julia Con 2018), and portability (in other words, I worry that MATLAB's version of the code above using complex numbers would be cleaner).

mattcbro · 2018-08-31T06:02:49Z

Coming from the world of Engineers and possibly Physicists who use complex arrays more than real arrays, not having a transpose operator is a bit of a pain. (Complex phasor representation for a harmonic time dependency is ubiquitous in our field.) I personally would favor the numpy syntax of x.H and x.T, though my only consideration is conciseness .

The density of the transpose operator relative to Hermitian transpose is about 1 to 1 in my code. So the unconjugated transpose is equally important to me. A lot of the use of transpose is to create outer products and to size arrays correctly for interfacing to other code or for matrix multiplication.

I intend for now to simply provide a macro or one character function for the operation, however what is the proper equivalent to the old functionality, transpose() or permutedims()?

andyferris · 2018-08-31T12:28:10Z

transpose is intended for linear algebra and is recursive, and permutedims is for non-recursive arrangement of data of any type.

It’s interesting you say you use transpose as much as adjoint. I used to be the same, but mostly because I tended to make mistakes where my data was real so I tended to transpose but actually the adjoint was the correct operation (generalized to the complex case - adjoint was the right operation for my algorithm). There are (many) valid exceptions, of course.

Jutho · 2018-08-31T12:53:44Z

In everything related to electrodynamics, you often use space-like vectors and want to use vector operations in R^n (typically n=3), i.e. transpose in particular, even though your vectors are complex-valued because you've taken a Fourier transform. It seems like @mattcbro is talking about this kind of applications.

That being said, when reading up on syntax discussions, I am often contemplating that for me, personally, I could not imagine that a slightly more verbose syntax is the thing that slows down my programming speed or efficiency. Thinking about the algorithm itself and the most natural/efficient way of implementing it takes much more time.

stevengj · 2018-08-31T14:08:22Z

In everything related to electrodynamics, you often use space-like vectors and want to use vector operations in R^n (typically n=3), i.e. transpose in particular, even though your vectors are complex-valued because you've taken a Fourier transform.

Not necessarily. Often you want time-average quantities from the Fourier amplitudes, in which case you use the complex dot product, e.g. ½ℜ[𝐄*×𝐇] is the time-average Poynting flux from the complex Fourier components and ¼ε₀|𝐄|² is a time-average vacuum energy density. On the other hand, since the Maxwell operator is (typically) a complex-symmetric operator ("reciprocal"), you often use an unconjugated "inner product" for (infinite-dimensional) algebra on the fields 𝐄(𝐱) etc. over all space.

Jutho · 2018-08-31T14:28:02Z

That's true, I had the word often in the first sentence, but removed it apparently :-).

mattcbro · 2018-08-31T16:42:22Z

Well if you want to go there, Electromagnetic quantities are even more concisely written in a Clifford Algebraic formulation, often called Geometric algebra. Those algebras have multiple automorphisms and antiautomorphisms that play a critical role in the formulation of the theory, especially when considering scattering problems.

These algebras typically have a concise matrix representation and those morphisms are often easily computed via complex transpose, Hermitian transpose and conjugation.

Nevertheless as I stated earlier my primary use of transpose is often to arrange my arrays to interface with other arrays, other code, and to get matrix multiply to work against the correct dimension of a flattened array.

c42f · 2018-08-31T22:36:24Z

I personally would favor the numpy syntax of x.H and x.T

Easy to implement now in 1.0, and should be efficient:

function Base.getproperty(x::AbstractMatrix, name::Symbol)
    if name === :T
        return transpose(x) 
    #elseif name === :H # can also do this, though not sure why we'd want to overload with `'`
    #    return adjoint(x)
    else
        return getfield(x, name)
    end
end

This is surprisingly easy and kind of neat. The downside seems to be that orthogonal uses of getproperty don't compose with each other. So anybody implementing getproperty on their specific matrix type will need to implement the generic behavior by hand.

c42f · 2018-08-31T22:57:14Z

orthogonal uses of getproperty don't compose

Hmm. I wonder if that implies x.T "should" have been lowered to getproperty(x, Val(:T)). I shudder to think what that would do to the poor compiler though.

andyferris · 2018-09-01T04:47:42Z

I’m sure everyone has their opinion - but to me it’s almost a feature that it’s hard to build a generic interface out of dot syntax. Don’t get me wrong, it’s a really great feature and wonderful for defining named tuple-like structs, and so-on.

(It’s also possible to add a Val dispatch layer to your types pretty easily).

mattcbro · 2018-09-01T06:10:19Z

@c42f 's code works like a charm. Unfortunately for me I'm trying to write code that works on versions 0.64 and up, which forces me to use either transpose or my own defined function T(A) = transpose(A). Perhaps a macro would have been a little cleaner and slightly more efficient.

c42f · 2018-09-01T11:09:27Z

To be clear, I'm not suggesting defining this particular getproperty is a good idea for user code. It's just likely to break things in the longer term ;-) Though perhaps one day we'll have a good enough feel for the consequences that we could have x.T defined in Base.

But in a general sense, I do wonder why this kind of property use for defining "getters" in generic interfaces is actually bad. For example, generic field getter functions currently have a whopping big namespace problem which is simply solved by judicious use of getproperty. It's far nicer to write x.A than to write MyModule.A(x), some longer ugly function name like get_my_A(x), or to export the extremely generic name A from a user module. The only problem as I see it, is the expected ability to override the meaning of .B for subtypes independently of .A being defined generically on a super type. Hence the half serious comment about Val.

StefanKarpinski · 2018-09-10T15:20:25Z

Funny idea:

julia> x'̄
ERROR: syntax: invalid character "̄"

The character looks kind of like a T but it's actually a ' with a bar over it. Not sure if serious...

KristofferC · 2018-09-10T15:30:16Z

mbauman · 2018-09-10T15:32:47Z

Yeah, looks like that on GitHub to me, too. But it's an overbar. Copy and paste into my terminal shows:

Too clever and cute. I still do like the combining characters, though, and I think 'ᵀ is nice.

johnalx · 2020-04-28T13:02:13Z

-100 to changing adjoint, since it's one of the awesome things that makes writing Julia code as clear as writing math, plus conjugate transpose is usually what you want anyway so it makes sense to have an abbreviated syntax for it.

There is a certain arrogance to a statement like this. Consider that some finite proportion of developers explicitly do not want adjoint() but need transpose().

Case and point for us working with symbolic calculations for modeling the default ' operator would cause for example the pseudo-inverse (A'*A)\(A*b) or a quadratic form v'*A*v to return erroneously long and complex results that cannot be reduced.

Maybe the solution is some kind of compiler directive declaring the meaning of '.

StefanKarpinski mentioned this issue Mar 15, 2017

Taking matrix transposes seriously #20978

Closed

ararslan added design Design of APIs or of the language itself linear algebra Linear algebra speculative Whether the change will be implemented is speculative labels Mar 15, 2017

mkborregaard referenced this issue in JuliaPlots/Plots.jl Jun 10, 2017

Fix some deprecations

3a2ee0f

stevengj mentioned this issue Jun 28, 2017

Transpose not working after upgrading to 0.6 JuliaPy/PyCall.jl#407

Closed

JeffBezanson added the triage This should be discussed on a triage call label Nov 8, 2017

JeffBezanson added the parser Language parsing and surface syntax label Nov 8, 2017

StefanKarpinski removed the triage This should be discussed on a triage call label Nov 9, 2017

perrutquist mentioned this issue Aug 7, 2018

Allow operator suffixes to the postfix apostrophe #28494

Closed

stevengj mentioned this issue Jan 28, 2020

Unicode modifiers for adjoint operator #34507

Closed

MasonProtter mentioned this issue Apr 25, 2020

new postfix transpose() operator different from ' #35593

Closed

jebej mentioned this issue Sep 26, 2020

Small feature request: hvec function #37768

Open

simeonschaub mentioned this issue Oct 16, 2020

add 'ᵀ postfix operator for transpose #38062

Merged

JeffBezanson closed this as completed in #38062 Oct 23, 2020

jessebett mentioned this issue Dec 5, 2020

Parenthesis Postfix Function Syntax Allowing (x,y)f #38704

Closed

dnadlinger mentioned this issue Mar 17, 2021

'ᵀ as syntax for transpose #40070

Closed

new syntax for transpose #21037

new syntax for transpose #21037

Comments

StefanKarpinski commented Mar 15, 2017

mbauman commented Mar 15, 2017

StefanKarpinski commented Mar 15, 2017

nalimilan commented Mar 15, 2017

mauro3 commented Mar 15, 2017

ararslan commented Mar 15, 2017

stevengj commented Mar 16, 2017 • edited

StefanKarpinski commented Mar 16, 2017

StefanKarpinski commented Mar 16, 2017

stevengj commented Mar 16, 2017 • edited

StefanKarpinski commented Mar 17, 2017

stevengj commented Mar 17, 2017

jebej commented Jun 5, 2017

stevengj commented Jun 5, 2017 • edited

bkamins commented Jun 28, 2017

andreasnoack commented Jun 28, 2017

bkamins commented Jun 28, 2017 • edited by andreasnoack

mbauman commented Jun 28, 2017

Sacha0 commented Sep 21, 2017

ttparker commented Nov 8, 2017 • edited

JeffBezanson commented Nov 8, 2017

StefanKarpinski commented Nov 9, 2017

StefanKarpinski commented Aug 5, 2018

ttparker commented Aug 5, 2018

perrutquist commented Aug 7, 2018

ararslan commented Aug 7, 2018

briochemc commented Aug 28, 2018 • edited

c42f commented Aug 29, 2018

briochemc commented Aug 29, 2018 • edited

mattcbro commented Aug 31, 2018

andyferris commented Aug 31, 2018 • edited

Jutho commented Aug 31, 2018

stevengj commented Aug 31, 2018 • edited

Jutho commented Aug 31, 2018

mattcbro commented Aug 31, 2018

c42f commented Aug 31, 2018 • edited

c42f commented Aug 31, 2018

andyferris commented Sep 1, 2018

mattcbro commented Sep 1, 2018

c42f commented Sep 1, 2018

StefanKarpinski commented Sep 10, 2018

KristofferC commented Sep 10, 2018

mbauman commented Sep 10, 2018 • edited

johnalx commented Apr 28, 2020 • edited

stevengj commented Mar 16, 2017 •

edited

stevengj commented Mar 16, 2017 •

edited

stevengj commented Jun 5, 2017 •

edited

bkamins commented Jun 28, 2017 •

edited by andreasnoack

ttparker commented Nov 8, 2017 •

edited

briochemc commented Aug 28, 2018 •

edited

briochemc commented Aug 29, 2018 •

edited

andyferris commented Aug 31, 2018 •

edited

stevengj commented Aug 31, 2018 •

edited

c42f commented Aug 31, 2018 •

edited

mbauman commented Sep 10, 2018 •

edited

johnalx commented Apr 28, 2020 •

edited