What is the meaning of the following code in "MPViT-main\semantic_segmentation\configs\_base_\models"? #13

Open
zhouqunbing opened this issue Jul 28, 2022 · 1 comment

Comments

@zhouqunbing
Which module in the paper does this code correspond to exactly? And why does it involve relative position encoding?
import torch
import torch.nn as nn
from einops import rearrange


class ConvRelPosEnc(nn.Module):
    """Convolutional relative position encoding."""

    def __init__(self, Ch, h, window):
        """Initialization.

        Ch: Channels per head.
        h: Number of heads.
        window: Window size(s) in convolutional relative positional encoding.
                It can have two forms:
                1. An integer window size, which assigns all attention heads
                   the same window size in ConvRelPosEnc.
                2. A dict mapping window size to #attention head splits
                   (e.g. {window size 1: #attention head split 1,
                          window size 2: #attention head split 2}).
                   It will apply different window sizes to
                   the attention head splits.
        """
        super().__init__()

        if isinstance(window, int):
            # Set the same window size for all attention heads.
            window = {window: h}
            self.window = window
        elif isinstance(window, dict):
            self.window = window
        else:
            raise ValueError()

        self.conv_list = nn.ModuleList()
        self.head_splits = []
        for cur_window, cur_head_split in window.items():
            dilation = 1  # Use dilation=1 by default.
            padding_size = (cur_window + (cur_window - 1) *
                            (dilation - 1)) // 2
            cur_conv = nn.Conv2d(
                cur_head_split * Ch,
                cur_head_split * Ch,
                kernel_size=(cur_window, cur_window),
                padding=(padding_size, padding_size),
                dilation=(dilation, dilation),
                groups=cur_head_split * Ch,  # depthwise convolution
            )
            self.conv_list.append(cur_conv)
            self.head_splits.append(cur_head_split)
        self.channel_splits = [x * Ch for x in self.head_splits]

    def forward(self, q, v, size):
        """Forward function."""
        B, h, N, Ch = q.shape
        H, W = size

        # We don't use CLS_TOKEN.
        q_img = q
        v_img = v

        # Shape: [B, h, H*W, Ch] -> [B, h*Ch, H, W].
        v_img = rearrange(v_img, "B h (H W) Ch -> B (h Ch) H W", H=H, W=W)
        # Split according to channels.
        v_img_list = torch.split(v_img, self.channel_splits, dim=1)
        conv_v_img_list = [
            conv(x) for conv, x in zip(self.conv_list, v_img_list)
        ]
        conv_v_img = torch.cat(conv_v_img_list, dim=1)
        # Shape: [B, h*Ch, H, W] -> [B, h, H*W, Ch].
        conv_v_img = rearrange(conv_v_img, "B (h Ch) H W -> B h (H W) Ch", h=h)

        EV_hat_img = q_img * conv_v_img
        EV_hat = EV_hat_img
        return EV_hat
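The core idea can be shown without the class machinery. The following is a minimal sketch (shapes are hypothetical, and a single window size of 3 is assumed, i.e. the `isinstance(window, int)` branch): reshape `v` back into a 2-D feature map, apply a depthwise convolution, reshape to token form, and gate element-wise with `q`.

```python
import torch
import torch.nn as nn

B, h, H, W, Ch = 2, 8, 7, 7, 24          # batch, heads, spatial size, channels per head
q = torch.randn(B, h, H * W, Ch)
v = torch.randn(B, h, H * W, Ch)

# One depthwise conv shared by all heads (window=3 -> padding=1),
# equivalent to window={3: h} in ConvRelPosEnc.
conv = nn.Conv2d(h * Ch, h * Ch, kernel_size=3, padding=1, groups=h * Ch)

# [B, h, H*W, Ch] -> [B, h*Ch, H, W], same as the rearrange in forward().
v_img = v.transpose(2, 3).reshape(B, h * Ch, H, W)
conv_v = conv(v_img).reshape(B, h, Ch, H * W).transpose(2, 3)

out = q * conv_v                          # the EV_hat term in the code
print(out.shape)                          # torch.Size([2, 8, 49, 24])
```

Because the convolution is depthwise (`groups = h * Ch`), each channel only aggregates its own spatial neighborhood, which is what makes this a position-dependent (relative) bias rather than full channel mixing.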
@1787648106

It corresponds to the factorized self-attention computation in the Transformer.
Separately: when running in a multi-GPU environment, has anyone hit torch.distributed.elastic.multiprocessing.api:failed (exitcode: -9) at every second validation, after which the program terminates?
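For context, a rough sketch of how a factorized-attention block of this kind (as in CoaT-style attention) combines the two terms; the exact scaling and composition in MPViT may differ, and `crpe` here is a stand-in for the output of `ConvRelPosEnc(q, v, size)`:

```python
import torch
import torch.nn.functional as F

B, h, N, Ch = 2, 8, 49, 24
q = torch.randn(B, h, N, Ch)
k = torch.randn(B, h, N, Ch)
v = torch.randn(B, h, N, Ch)
crpe = torch.zeros(B, h, N, Ch)   # stand-in for ConvRelPosEnc(q, v, size)

# Factorized attention: softmax(k) over the token axis, then a global
# [Ch, Ch] context matrix k^T v applied to q.
# Cost is O(N * Ch^2) instead of the O(N^2 * Ch) of standard attention.
k_softmax = F.softmax(k, dim=2)                 # normalize over tokens
context = k_softmax.transpose(2, 3) @ v         # [B, h, Ch, Ch]
factor_att = q @ context                        # [B, h, N, Ch]

scale = Ch ** -0.5
out = scale * factor_att + crpe                 # factorized attention + CRPE
print(out.shape)                                # torch.Size([2, 8, 49, 24])
```

This is why the relative position encoding appears here: the factorized term alone is position-agnostic, so the depthwise-conv term re-injects local spatial structure.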
