about "self.offsets" some questions #15

Open
CallmeChenChen opened this issue Feb 3, 2020 · 3 comments

Comments

@CallmeChenChen

Dear DaLao:
What is the purpose of "self.offsets"?

    self.offsets = np.array((0, *np.cumsum(field_dims)[:-1]), dtype=np.long)

    def forward(self, x):
        x = x + x.new_tensor(self.offsets).unsqueeze(0)

@KwangKa

KwangKa commented Apr 22, 2021

It holds the feature index offset of each field.

@cpy18727

cpy18727 commented Nov 14, 2021

Dear DaLao: what's the function of "self.offsets"?

    self.offsets = np.array((0, *np.cumsum(field_dims)[:-1]), dtype=np.long)

    def forward(self, x):
        x = x + x.new_tensor(self.offsets).unsqueeze(0)

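    # e.g. field_dims = [2, 3, 4, 5], offsets = [0, 2, 5, 9]
    # These are index offsets.
    # All features share a single Embedding table,
    # so in the actual table: rows 0-1  belong to feature X0, i.e. field_dims[0]
    #                         rows 2-4  belong to feature X1, i.e. field_dims[1]
    #                         rows 5-8  belong to feature X2, i.e. field_dims[2]
    #                         rows 9-13 belong to feature X3, i.e. field_dims[3]
    # However, the values in x passed to forward(self, x) only index into
    # each feature's own vocabulary.
    # For example: if X1 takes the value 0, its row in the Embedding table
    # is offsets[1] + 0 = 2 + 0 = 2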
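
The mapping described above can be sketched with NumPy alone. This is only an illustration under assumptions, not the repository's code: a plain array `table` stands in for the shared embedding table, `embed_dim` and the batch `x` are made up for the example, and `np.int64` is used in place of the `np.long` dtype from the quoted source.

```python
import numpy as np

# Field sizes taken from the example in the comment above.
field_dims = [2, 3, 4, 5]

# First row of each field in the shared table: [0, 2, 5, 9].
offsets = np.array((0, *np.cumsum(field_dims)[:-1]), dtype=np.int64)

# Stand-in for the shared embedding table (assumption: embed_dim = 4).
# It has one row per feature value across all fields: 2 + 3 + 4 + 5 = 14 rows.
embed_dim = 4
table = np.arange(sum(field_dims) * embed_dim, dtype=np.float32).reshape(-1, embed_dim)

# Hypothetical batch of two samples; each column is a per-field local index.
x = np.array([[0, 0, 0, 0],
              [1, 2, 3, 4]], dtype=np.int64)

# Broadcasting adds the offsets row to every sample, which is the role
# unsqueeze(0) plays in the quoted forward().
global_idx = x + offsets[np.newaxis, :]

# Look up one embedding row per field: shape (batch, num_fields, embed_dim).
embedded = table[global_idx]
```

Here the first sample maps to rows 0, 2, 5, 9 (the first row of each field) and the second to rows 1, 4, 8, 13 (the last row of each field), so no two fields ever share a row.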

@tsWen0309

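There is a small error in the details of the explanation above.

  • If field_dims = [5, 10, 5], then np.cumsum(field_dims)[:-1] = [5, 15], which is why the source code prepends a zero with (0, *np.cumsum(field_dims)[:-1]), giving offsets = [0, 5, 15].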
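
The "drop the last cumulative sum and prepend a zero" step can also be checked without NumPy. The sketch below uses `itertools.accumulate` as a stand-in for `np.cumsum`; the helper name `field_offsets` is hypothetical and does not appear in the source.

```python
from itertools import accumulate


def field_offsets(field_dims):
    """Return the first global index of each field (hypothetical helper)."""
    # Running totals, e.g. [5, 15, 20] for [5, 10, 5].
    totals = list(accumulate(field_dims))
    # Dropping the last total and prepending 0 turns "end of field i"
    # into "start of field i".
    return [0] + totals[:-1]


field_dims = [5, 10, 5]
offsets = field_offsets(field_dims)

# Each field owns the half-open index range [offset, offset + dim).
ranges = [(o, o + d) for o, d in zip(offsets, field_dims)]
```

With these field sizes the ranges are (0, 5), (5, 15) and (15, 20): they are contiguous and do not overlap, which is what lets every field index into one shared table.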
