Bug report: feature leakage in DSIN? #29

Open
zjpf opened this issue Jan 20, 2022 · 0 comments
The bug is at line 52 of 2_gen_dsin_input.py: last_sess_idx = i. When a user has no session with more than 2 behaviors, last_sess_idx equals len(user_hist_session[user]) - 1 instead of 0. As a result, when line 56 locates the user's previous 4 sessions, it takes the 4 most recent sessions rather than the 4 sessions before the current one. Some samples therefore use features from after the label time.

For example, the 3 samples in raw_sample with user_id=11 and time=1494226737 are affected:

11,1494226737,302383,430548_1007,1,0
11,1494226737,598359,430548_1007,1,0
11,1494226737,684497,430548_1007,1,0
11,1494419569,427488,430548_1007,1,0
11,1494419569,611964,430548_1007,1,0
11,1494419569,739213,430548_1007,1,0
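
A minimal sketch of the behavior reported here, not the actual code from 2_gen_dsin_input.py. It assumes each session can be reduced to a list of behavior timestamps ordered oldest first, and that a sample uses one current session plus up to 4 earlier ones. The names pick_sessions, leaks, default_idx and max_sessions are hypothetical. It compares the two starting values of last_sess_idx discussed above on a toy history where every session is later than the sample.

```python
def pick_sessions(sessions, sample_time, default_idx, max_sessions=5):
    """Return the sessions used as features for one sample.

    sessions: one user's sessions, oldest first; each is a list of
    behavior timestamps (a simplification of user_hist_session[user]).
    default_idx: value last_sess_idx holds if no session qualifies.
    """
    picked = []
    last_sess_idx = default_idx
    # Walk from newest to oldest looking for the "current" session:
    # the first one with more than 2 behaviors before sample_time.
    for i in reversed(range(len(sessions))):
        seen = [t for t in sessions[i] if t < sample_time]
        if len(seen) > 2:
            picked.append(seen)
            last_sess_idx = i
            break
    # Take up to max_sessions - 1 sessions preceding last_sess_idx.
    for k in range(1, max_sessions):
        if last_sess_idx - k >= 0:
            picked.append(sessions[last_sess_idx - k])
    return picked


def leaks(picked, sample_time):
    """True if any picked behavior happened at or after sample_time."""
    return any(t >= sample_time for sess in picked for t in sess)


# Toy history (made-up timestamps): all sessions are later than the sample.
sessions = [[100, 101, 102, 103], [200, 201, 202], [300, 301, 302]]
sample_time = 50

# Default described in this issue: index of the newest session.
buggy = pick_sessions(sessions, sample_time, default_idx=len(sessions) - 1)
# Default suggested in this issue: 0, so no earlier session is picked.
fixed = pick_sessions(sessions, sample_time, default_idx=0)

print(leaks(buggy, sample_time))  # True
print(leaks(fixed, sample_time))  # False
```

When a qualifying session does exist, both defaults give the same result, so changing the initial value only affects samples like the user_id=11 ones quoted above.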
