Understanding the ConvLayer Schedule #8196

Answered by abadams
FabianSchuetze asked this question in Q&A
Consider a matrix multiply C_ik = sum_j A_ij * B_jk. Note that the value of B you use in the inner loop doesn't depend on i, so you want to unroll over i so that you can load a value of B once and then use it multiple times. Similarly, the value of A you use doesn't depend on k, so you also want to unroll over k a bit for the same reason. For an n x m tile of the output, this means each step of the inner loop over j does O(n + m) loads and O(n * m) FMAs. Many other sane-looking schedules will do two loads per FMA. You want n and m to be as large as possible to maximize the number of FMAs per load. You stop when you either run out of registers and start spilling to the stack (introducing more loads and stores), or when you become limited by t…
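To make the load counting concrete, here is a minimal sketch in plain C++ (not Halide, and not the actual ConvLayer schedule). The names `TI`, `TK`, `matmul_tiled`, `matmul_naive` and `loads_per_fma` are made up for illustration, and the 4 x 4 tile size is an arbitrary assumption rather than a tuned value. The fixed-size inner loops are the ones you would expect the compiler to unroll fully, keeping `acc` in registers.

```cpp
#include <vector>

// Hypothetical tile sizes: n = TI rows of the output, m = TK columns.
constexpr int TI = 4;
constexpr int TK = 4;

// Reference: C[i][k] = sum_j A[i][j] * B[j][k], all matrices row-major.
// Written this way, every FMA needs one load of A and one load of B.
void matmul_naive(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, int n_i, int n_j, int n_k) {
    for (int i = 0; i < n_i; ++i)
        for (int k = 0; k < n_k; ++k) {
            float sum = 0.0f;
            for (int j = 0; j < n_j; ++j)
                sum += A[i * n_j + j] * B[j * n_k + k];
            C[i * n_k + k] = sum;
        }
}

// Tiled version. Assumes n_i % TI == 0 and n_k % TK == 0 to keep the sketch short.
void matmul_tiled(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, int n_i, int n_j, int n_k) {
    for (int i0 = 0; i0 < n_i; i0 += TI) {
        for (int k0 = 0; k0 < n_k; k0 += TK) {
            float acc[TI][TK] = {};  // the TI x TK output tile
            for (int j = 0; j < n_j; ++j) {
                float a[TI], b[TK];
                // TI + TK loads per step of j ...
                for (int di = 0; di < TI; ++di) a[di] = A[(i0 + di) * n_j + j];
                for (int dk = 0; dk < TK; ++dk) b[dk] = B[j * n_k + (k0 + dk)];
                // ... feeding TI * TK FMAs: each a[di] is reused across k,
                // each b[dk] is reused across i.
                for (int di = 0; di < TI; ++di)
                    for (int dk = 0; dk < TK; ++dk)
                        acc[di][dk] += a[di] * b[dk];
            }
            for (int di = 0; di < TI; ++di)
                for (int dk = 0; dk < TK; ++dk)
                    C[(i0 + di) * n_k + (k0 + dk)] = acc[di][dk];
        }
    }
}

// Loads per FMA for an n x m tile: (n + m) / (n * m).
double loads_per_fma(int n, int m) {
    return static_cast<double>(n + m) / (static_cast<double>(n) * m);
}
```

With a 1 x 1 tile the ratio is 2 loads per FMA, which is the "sane-looking" case above; with a 4 x 4 tile it drops to 0.5. Whether a given tile actually stays in registers depends on the target and the compiler, so treat the sizes as something to measure, not a recommendation.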

Answer selected by FabianSchuetze