[Feature]: Split CJK names #2624

ZnqbuZ · 2023-08-18T15:06:09Z

Debug log ID

NA

What happened?

I believe many people would like to keep a Chinese name as a whole:

instead of splitting it:

Still, we want the first & last names to be correctly capitalized, e.g. "杨莲亭" has pinyin "yang lian ting", and should be capitalized like "YangLianting".

Usually, hacks like auth.substring(1,1).clean.capitalize + auth.substring(2).clean.capitalize do the trick, since most Chinese surnames are just 1 character.

However, there are some surnames consisting of 2 characters. For example, “东方不败” has pinyin "dong fang bu bai", and should be capitalized like "DongfangBubai" rather than "DongFangbubai", which the formula I mentioned will give.
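To make the difference concrete, here is a minimal sketch of the capitalization step, assuming the pinyin syllables are already available one per character. `capitalizePinyinName` and its parameters are hypothetical names for illustration, not part of Better BibTeX.

```javascript
// Hypothetical helper: join pinyin syllables into "SurnameGivenname".
// `syllables` holds one pinyin syllable per character of the name;
// `surnameLength` is the number of characters that belong to the surname.
function capitalizePinyinName(syllables, surnameLength) {
  const capitalize = (s) => s.charAt(0).toUpperCase() + s.slice(1).toLowerCase();
  const surname = syllables.slice(0, surnameLength).join('');
  const given = syllables.slice(surnameLength).join('');
  return capitalize(surname) + capitalize(given);
}

capitalizePinyinName(['yang', 'lian', 'ting'], 1);      // 'YangLianting'
capitalizePinyinName(['dong', 'fang', 'bu', 'bai'], 1); // 'DongFangbubai' (surname cut too short)
capitalizePinyinName(['dong', 'fang', 'bu', 'bai'], 2); // 'DongfangBubai'
```

The only thing that changes between the wrong and the right result is the surname length, which is what the dictionary lookup has to supply.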

I tried using jieba, but it seems to think of a name as one word. Please correct me if that's not the case.

So, to achieve this, I wrote a simple JavaScript snippet to split Chinese names. It extracts the first 2 characters of a name, and looks them up in a dictionary, to see if the chars constitute a surname. Currently, it supports Simplified & Traditional Chinese. Korean / Japanese could also be supported once someone gives me a list of Korean / Japanese compound surnames.

I hope you could consider adding this function.

function splitName(name, lang) {
    var compoundSurnames = {
        'zh-hans': ['阿单', '阿跌', '阿贺', '阿会', '阿里', '阿仑', '阿罗', '阿热', '哀骀', '艾岁', '安迟', '安都', '安国', '安金', '安陵', '安平', '安期', '安丘', '安是', '安阳', '奥敦', '奥鲁', '奥屯', '阿史那', '巴公', '拔拔', '拔列', '拔略', '拔也', '把利', '罢敌', '白马', '白狄', '白公', '白侯', '白鹿', '白鸾', '白冥', '白男', '白象', '白亚', '白乙', '白石', '百里', '柏常', '柏侯', '柏高', '班丘', '阪泉', '阪上', '鲍丘', '鲍俎', '苞丘', '卑梁', '卑徐', '北方', '北宫', '北郭', '北海', '北旄', '北门', '北比', '北丘', '北人', '北唐', '北堂', '北乡', '北殷', '北野', '北城', '北关', '北辰', '北山', '倍俟', '奔水', '逼阳', '比丘', '比人', '闭珊', '辟闾', '宾牟', '并官', '波斯', '拨略', '薄奚', '薄野', '伯比', '伯夫', '伯常', '伯成', '伯德', '伯封', '伯丰', '伯高', '伯昏', '伯暋', '伯夏', '伯有', '伯州', '伯宗', '不第', '不戴', '驳马', '薄姑', '薄奚', '薄野', '卜成', '卜梁', '卜马', '步叔', '步扬', '步温', '池张', '陈一', '曹牟', '曹丘', '常涛', '长鱼', '车非', '成功', '成公', '成阳', '乘马', '叱卢', '丑门', '樗里', '穿封', '淳于', '单于', '答禄', '达勃', '达步', '达奚', '登徒', '邓陵', '第一', '第二', '第三', '第四', '第五', '第六', '第七', '第八', '地连', '地伦', '东方', '东里', '东南', '东宫', '东门', '东乡', '东丹', '东郭', '东陵', '东关', '东闾', '东阳', '东野', '东莱', '豆卢', '斗于', '都尉', '独孤', '端木', '段干', '多子', '尔朱', '方雷', '丰将', '封人', '封父', '夫蒙', '夫余', '浮丘', '富察', '傅其', '傅余', '棼冒', '蚡冒', '范姜', '干已', '高车', '高陵', '高堂', '高阳', '高辛', '皋落', '哥舒', '盖楼', '庚桑', '梗阳', '宫孙', '公羊', '公良', '公孙', '公罔', '公西', '公冶', '公敛', '公梁', '公输', '公上', '公山', '公户', '公玉', '公仪', '公仲', '公甲', '公坚', '公宾', '公伯', '公祖', '公乘', '公晰', '公族', '姑布', '古口', '古龙', '古赖', '古孙', '穀梁', '谷浑', '瓜田', '关龙', '鲑阳', '归海', '虢射', '函治', '韩余', '罕井', '浩生', '浩星', '纥骨', '纥奚', '纥于', '贺陈', '贺拨', '贺兰', '贺楼', '赫连', '赫王', '黑齿', '黑肱', '侯冈', '呼延', '壶丘', '呼衍', '斛律', '胡非', '胡母', '胡毋', '胡林', '忽仑', '皇甫', '皇父', '花裳', '火拔', '胡桃', '兀官', '吉白', '即墨', '季夙', '季瓜', '季连', '季融', '季孙', '季尹', '茄众', '蒋丘', '姜匀', '金齿', '晋楚', '京城', '经孙', '泾阳', '九百', '九方', '九吾', '睢鸠', '沮渠', '巨母', '勘阻', '渴侯', '渴单', '可汗', '空桐', '崆峒', '空桑', '空相', '昆吾', '老阳', '郎佳', '乐羊', '荔菲', '栎阳', '梁丘', '梁由', '梁余', '梁垣', '陵阳', '伶舟', '冷沦', '令狐', '柳下', '龙丘', '龙藤', '卢妃', '卢蒲', '鲁步', '甪里', '陆费', '闾丘', '禄阁', '马矢', '马师', '麦丘', '麦卢', '茅夷', '蒙山', '孟孙', '弥牟', '密革', '密茅', '墨夷', '墨台', '万俊', '昌顿', '慕容', '木门', 
'木易', '万俟', '孟玄', '纳喇', '那拉', '纳兰', '南宫', '南郭', '南门', '南荣', '南离', '宁李', '欧侯', '欧阳', '逄门', '盆成', '彭祖', '平陵', '平宁', '破丑', '仆固', '濮阳', '浦思', '漆雕', '奇介', '綦母', '綦毋', '綦连', '祁连', '乞伏', '绮里', '千代', '千乘', '勤宿', '青阳', '丘丽', '丘陵', '曲沃', '屈侯', '屈突', '屈男', '屈卢', '屈同', '屈门', '屈引', '七七', '壤驷', '扰龙', '容成', '汝嫣', '萨孤', '三饭', '三闾', '三州', '桑丘', '商瞿', '上官', '尚方', '少师', '少施', '少室', '少叔', '少正', '社南', '社北', '申屠', '申徒', '沈犹', '神农', '胜屠', '石作', '石雨', '石牛', '侍其', '士季', '士弱', '士孙', '士贞', '叔敖', '叔梁', '叔孙', '叔先', '叔促', '水丘', '司城', '司空', '司寇', '司鸿', '司马', '司徒', '司士', '似和', '素和', '素黎', '夙沙', '孙阳', '索阳', '索卢', '沈江', '沓卢', '太史', '太叔', '太阳', '淡台', '唐山', '堂溪', '陶丘', '同蹄', '统奚', '秃发', '涂钦', '屠岸', '吐火', '吐贺', '吐万', '吐罗', '吐缶', '吐难', '吐缶', '吐浑', '吐奚', '吐和', '屯浑', '脱脱', '秃发', '拓拨', '拓跋', '澹台', '谭刘', '太宰', '完颜', '王孙', '王官', '王人', '王刘', '王子', '微生', '尾勺', '温孤', '温稽', '闻人', '屋户', '巫马', '巫许', '吾丘', '无庸', '无钩', '无终', '五鹿', '五鸠', '武安', '吴刘', '王黄', '息夫', '西陵', '西乞', '西钥', '西乡', '西门', '西周', '西郭', '西方', '西野', '西宫', '戏阳', '瑕吕', '霞露', '夏侯', '鲜虞', '鲜于', '鲜阳', '咸丘', '相里', '解枇', '谢丘', '新垣', '辛垣', '信都', '信平', '修鱼', '徐辜', '徐吾', '徐藤', '徐离', '宣于', '轩辕', '轩丘', '阏氏', '延陵', '罔法', '铅陵', '羊角', '耶律', '叶阳', '伊祁', '伊耆', '猗卢', '义渠', '邑由', '意如', '因孙', '银齿', '尹文', '雍门', '游水', '由吾', '右师', '有莘', '宥连', '于陵', '虞丘', '盂丘', '宇文', '尉迟', '乐羊', '乐正', '运龙', '运期', '宰父', '辗迟', '湛卢', '臧孙', '章仇', '仉督', '长孙', '长儿', '张廖', '张简', '真鄂', '正令', '执头', '中央', '中长', '中行', '中野', '中英', '中梁', '中垒', '钟离', '钟吾', '终黎', '终葵', '仲孙', '仲长', '周阳', '周氏', '周生', '朱阳', '诸葛', '主父', '颛孙', '颛顼', '訾辱', '淄丘', '子言', '子人', '子服', '子家', '子桑', '子南', '子叔', '子车', '子阳', '宗伯', '宗正', '宗政', '尊卢', '昨和', '左人', '左丘', '左师', '左行', '佐南'],
        'zh-hant': ['阿單', '阿跌', '阿賀', '阿會', '阿裏', '阿侖', '阿羅', '阿熱', '哀駘', '艾歲', '安遲', '安都', '安國', '安金', '安陵', '安平', '安期', '安丘', '安是', '安陽', '奧敦', '奧魯', '奧屯', '阿史那', '巴公', '拔拔', '拔列', '拔略', '拔也', '把利', '罷敵', '白馬', '白狄', '白公', '白侯', '白鹿', '白鸞', '白冥', '白男', '白象', '白亞', '白乙', '白石', '百裏', '柏常', '柏侯', '柏高', '班丘', '阪泉', '阪上', '鮑丘', '鮑俎', '苞丘', '卑梁', '卑徐', '北方', '北宮', '北郭', '北海', '北旄', '北門', '北比', '北丘', '北人', '北唐', '北堂', '北鄉', '北殷', '北野', '北城', '北關', '北辰', '北山', '倍俟', '奔水', '逼陽', '比丘', '比人', '閉珊', '辟閭', '賓牟', '並官', '波斯', '撥略', '薄奚', '薄野', '伯比', '伯夫', '伯常', '伯成', '伯德', '伯封', '伯豐', '伯高', '伯昏', '伯暋', '伯夏', '伯有', '伯州', '伯宗', '不第', '不戴', '駁馬', '薄姑', '薄奚', '薄野', '蔔成', '蔔梁', '蔔馬', '步叔', '步揚', '步溫', '池張', '陳一', '曹牟', '曹丘', '常濤', '長魚', '車非', '成功', '成公', '成陽', '乘馬', '叱盧', '醜門', '樗裏', '穿封', '淳於', '單於', '答祿', '達勃', '達步', '達奚', '登徒', '鄧陵', '第一', '第二', '第三', '第四', '第五', '第六', '第七', '第八', '地連', '地倫', '東方', '東裏', '東南', '東宮', '東門', '東鄉', '東丹', '東郭', '東陵', '東關', '東閭', '東陽', '東野', '東萊', '豆盧', '鬥於', '都尉', '獨孤', '端木', '段幹', '多子', '爾朱', '方雷', '豐將', '封人', '封父', '夫蒙', '夫余', '浮丘', '富察', '傅其', '傅余', '棼冒', '蚡冒', '範姜', '幹已', '高車', '高陵', '高堂', '高陽', '高辛', '臯落', '哥舒', '蓋樓', '庚桑', '梗陽', '宮孫', '公羊', '公良', '公孫', '公罔', '公西', '公冶', '公斂', '公梁', '公輸', '公上', '公山', '公戶', '公玉', '公儀', '公仲', '公甲', '公堅', '公賓', '公伯', '公祖', '公乘', '公晰', '公族', '姑布', '古口', '古龍', '古賴', '古孫', '穀梁', '谷渾', '瓜田', '關龍', '鮭陽', '歸海', '虢射', '函治', '韓余', '罕井', '浩生', '浩星', '紇骨', '紇奚', '紇於', '賀陳', '賀撥', '賀蘭', '賀樓', '赫連', '赫王', '黑齒', '黑肱', '侯岡', '呼延', '壺丘', '呼衍', '斛律', '胡非', '胡母', '胡毋', '胡林', '忽侖', '皇甫', '皇父', '花裳', '火拔', '胡桃', '兀官', '吉白', '即墨', '季夙', '季瓜', '季連', '季融', '季孫', '季尹', '茄眾', '蔣丘', '姜勻', '金齒', '晉楚', '京城', '經孫', '涇陽', '九百', '九方', '九吾', '睢鳩', '沮渠', '巨母', '勘阻', '渴侯', '渴單', '可汗', '空桐', '崆峒', '空桑', '空相', '昆吾', '老陽', '郎佳', '樂羊', '荔菲', '櫟陽', '梁丘', '梁由', '梁余', '梁垣', '陵陽', '伶舟', '冷淪', '令狐', '柳下', '龍丘', '龍藤', '盧妃', '盧蒲', '魯步', '甪裏', '陸費', '閭丘', '祿閣', '馬矢', '馬師', '麥丘', '麥盧', '茅夷', '蒙山', '孟孫', '彌牟', '密革', '密茅', '墨夷', '墨臺', '萬俊', '昌頓', '慕容', '木門', 
'木易', '萬俟', '孟玄', '納喇', '那拉', '納蘭', '南宮', '南郭', '南門', '南榮', '南離', '寧李', '歐侯', '歐陽', '逄門', '盆成', '彭祖', '平陵', '平寧', '破醜', '仆固', '濮陽', '浦思', '漆雕', '奇介', '綦母', '綦毋', '綦連', '祁連', '乞伏', '綺裏', '千代', '千乘', '勤宿', '青陽', '丘麗', '丘陵', '曲沃', '屈侯', '屈突', '屈男', '屈盧', '屈同', '屈門', '屈引', '七七', '壤駟', '擾龍', '容成', '汝嫣', '薩孤', '三飯', '三閭', '三州', '桑丘', '商瞿', '上官', '尚方', '少師', '少施', '少室', '少叔', '少正', '社南', '社北', '申屠', '申徒', '沈猶', '神農', '勝屠', '石作', '石雨', '石牛', '侍其', '士季', '士弱', '士孫', '士貞', '叔敖', '叔梁', '叔孫', '叔先', '叔促', '水丘', '司城', '司空', '司寇', '司鴻', '司馬', '司徒', '司士', '似和', '素和', '素黎', '夙沙', '孫陽', '索陽', '索盧', '沈江', '沓盧', '太史', '太叔', '太陽', '淡臺', '唐山', '堂溪', '陶丘', '同蹄', '統奚', '禿發', '塗欽', '屠岸', '吐火', '吐賀', '吐萬', '吐羅', '吐缶', '吐難', '吐缶', '吐渾', '吐奚', '吐和', '屯渾', '脫脫', '禿發', '拓撥', '拓跋', '淡臺', '譚劉', '太宰', '完顏', '王孫', '王官', '王人', '王劉', '王子', '微生', '尾勺', '溫孤', '溫稽', '聞人', '屋戶', '巫馬', '巫許', '吾丘', '無庸', '無鉤', '無終', '五鹿', '五鳩', '武安', '吳劉', '王黃', '息夫', '西陵', '西乞', '西鑰', '西鄉', '西門', '西周', '西郭', '西方', '西野', '西宮', '戲陽', '瑕呂', '霞露', '夏侯', '鮮虞', '鮮於', '鮮陽', '鹹丘', '相裏', '解枇', '謝丘', '新垣', '辛垣', '信都', '信平', '修魚', '徐辜', '徐吾', '徐藤', '徐離', '宣於', '軒轅', '軒丘', '閼氏', '延陵', '罔法', '鉛陵', '羊角', '耶律', '葉陽', '伊祁', '伊耆', '猗盧', '義渠', '邑由', '意如', '因孫', '銀齒', '尹文', '雍門', '遊水', '由吾', '右師', '有莘', '宥連', '於陵', '虞丘', '盂丘', '宇文', '尉遲', '樂羊', '樂正', '運龍', '運期', '宰父', '輾遲', '湛盧', '臧孫', '章仇', '仉督', '長孫', '長兒', '張廖', '張簡', '真鄂', '正令', '執頭', '中央', '中長', '中行', '中野', '中英', '中梁', '中壘', '鐘離', '鐘吾', '終黎', '終葵', '仲孫', '仲長', '周陽', '周氏', '周生', '朱陽', '諸葛', '主父', '顓孫', '顓頊', '訾辱', '淄丘', '子言', '子人', '子服', '子家', '子桑', '子南', '子叔', '子車', '子陽', '宗伯', '宗正', '宗政', '尊盧', '昨和', '左人', '左丘', '左師', '左行', '佐南']
    }
    // slice() replaces the deprecated substr(); behavior is unchanged here.
    var splitIndex = compoundSurnames[lang].includes(name.slice(0, 2)) ? 2 : 1;
    return [name.slice(0, splitIndex), name.slice(splitIndex)];
}

splitName('东方不败', 'zh-hans') gives ['东方', '不败'], which should eventually be capitalized to ['Dongfang', 'Bubai'].


github-actions · 2023-08-18T15:06:22Z

Hello there @ZnqbuZ,

Hope you're doing well! @retorquere is here to help you get the most out of your experience with Better BibTeX. To make sure he can assist you effectively, he kindly asks for your cooperation in providing a debug log – it's like giving him the key to understanding and solving the puzzle!

Getting your debug log is a breeze and will save us both time. Trust me, it's way quicker than discussing why it's important. 😃

How to Share Your Debug Log:

1. If the issue involves specific references or exports, just right-click on the relevant item(s) and choose "Better BibTeX -> Submit Better BibTeX debug log" from the menu.
2. For other issues, follow these simple steps:
   - Restart Zotero with debugging enabled (Help -> Debug Output Logging -> Restart with logging enabled).
   - Reproduce the problem.
   - Select "Send Better BibTeX debug report..." from the help menu.

Once you hit that submit button, you'll get a special red debug ID. Just share that with @retorquere in this issue thread. If the question is regarding an export, don't forget to include what you see exported and what you expected.

By sharing your debug log, you're giving @retorquere a clearer picture of your setup and the items causing the issue. It's like a superhero cape for him – he can swoop in and tackle the problem much faster.

We totally get that your time is valuable, and we appreciate your effort in helping @retorquere help you. You might be surprised at how much this simple step speeds up the whole process.

Thanks a bunch!

retorquere · 2023-08-18T15:39:18Z

A debug log is not "not applicable" here. A debug log per point 1 gives me the entry we're discussing here -- I cannot enter Chinese names myself.

ZnqbuZ · 2023-08-18T15:53:25Z

Sorry. I've sent a log with ID YeGr1kqXOgnV-6U3RYALN

ZnqbuZ · 2023-08-18T16:02:44Z

A log with more examples ZAVVH2PE-apse/6.7.112-6 was sent.

retorquere · 2023-08-18T16:05:01Z

Thank you.

retorquere · 2023-08-18T16:15:33Z

Does that mean that there is a definitive list of compound Chinese family names, and that they all consist of two characters?

@duncdrum, can I ask you to jump in? I mean no offence @ZnqbuZ but I don't know anything about Chinese so if there's anything to discuss I need to have others involved.

ZnqbuZ · 2023-08-18T16:35:00Z

Does that mean that there is a definitive list of compound Chinese family names, and that they all consist of two characters?

Yes, for all modern names and almost all ancient names. I got the list from Wikipedia, and I'm pretty sure it contains all compound surnames used in the last 150 years. Actually, only 81 of them are still in use today. However, Chinese history is so long (~5000 years) that I doubt a complete list exists.

I guess you could add a filter and let users choose if they want to use it. And store the name list in configuration so users can modify it.

After some investigation, I found that surnames of 3 characters seem to have existed 2000-3000 years ago. I don't think it's likely that anyone bearing one is the author of any document...
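A minimal sketch of the configurable variant suggested above, assuming the surname list comes from user configuration. `makeNameSplitter` is a hypothetical name, not an existing Better BibTeX function. It tries the longest known prefix first, so any 3-character surnames in the list are matched too, and it always leaves at least one character for the given name.

```javascript
// Hypothetical factory: builds a splitter from a user-editable surname list.
function makeNameSplitter(compoundSurnames) {
  const known = new Set(compoundSurnames);
  // Longest surname in the list, measured in code points.
  const maxLength = Math.max(1, ...compoundSurnames.map((s) => Array.from(s).length));

  return function split(name) {
    const chars = Array.from(name); // iterate by code point, not UTF-16 unit
    // Longest candidate first; keep at least one character for the given name.
    for (let n = Math.min(maxLength, chars.length - 1); n >= 2; n--) {
      const prefix = chars.slice(0, n).join('');
      if (known.has(prefix)) return [prefix, chars.slice(n).join('')];
    }
    // No compound surname matched: assume a single-character surname.
    return [chars.slice(0, 1).join(''), chars.slice(1).join('')];
  };
}

const splitZhHans = makeNameSplitter(['东方', '欧阳', '阿史那']);
splitZhHans('东方不败'); // ['东方', '不败']
splitZhHans('杨莲亭');   // ['杨', '莲亭']
```

Using a Set makes the lookup independent of the list length, which matters little for a few hundred entries but keeps the lookup cheap if users extend the list.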

retorquere · 2023-08-19T01:34:10Z

I tried using jieba, but it seems to think of a name as one word. Please correct me if that's not the case.

Jieba puts spaces between the characters which makes each character a "word" for the citekey formatter.

ZnqbuZ · 2023-08-19T01:47:15Z

I tried using jieba, but it seems to think of a name as one word. Please correct me if that's not the case.

Jieba puts spaces between the characters which makes each character a "word" for the citekey formatter.

I think jieba's handling of names is expected.

By the way, I observed some strange behaviours in capitalization of Chinese titles, which should be jieba's problem. Are you still using js-jieba? It seems to be outdated. I wonder what prevents you from using a C library? Have you considered using WASM?

retorquere · 2023-08-19T02:33:02Z

By the way, I observed some strange behaviours in capitalization of Chinese titles, which should be jieba's problem. Are you still using js-jieba? It seems to be outdated.

Still using js-jieba, indeed

I wonder what prevents you from using a C library?

Using C code in Zotero extensions is not trivial. It's not work I'm keen to pick up.

Have you considered using WASM?

I've looked into it briefly but I'd only consider it if there was a clean javascript wrapper for an already-compiled wasm binary. I don't want to get into a whole new programming language for this.

retorquere · 2023-08-19T02:42:11Z

The wrappers that do exist either assume node as an environment, where they use node-specific libraries like fs or stream to load wasm from disk, or web, where they assume the wasm will be served from a http(s) server. Zotero is a weird mix of both environments that no library understands. Pure-JS libraries usually run just fine. Anything that's not requires working around the library -- I have monkey-patches in place to reroute the jieba dictionary loading for example. I'll take a look at jieba-wasm.
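For reference, the standard WebAssembly JavaScript API itself does not care where the bytes come from, so a loader that avoids both fs and fetch can be sketched as below. `instantiateFromBytes` is a hypothetical helper, and how the bytes are read inside Zotero is left to the caller; this is only a sketch, not how any existing jieba wrapper works.

```javascript
// Hypothetical loader: the caller supplies the bytes however the host allows.
function instantiateFromBytes(bytes, imports = {}) {
  // validate() reports whether this engine accepts the binary at all,
  // e.g. it returns false for features the engine does not implement.
  if (!WebAssembly.validate(bytes)) {
    throw new Error('engine rejected the wasm binary (unsupported feature or corrupt file)');
  }
  const compiled = new WebAssembly.Module(bytes); // synchronous compile
  return new WebAssembly.Instance(compiled, imports);
}

// Smallest valid module: the "\0asm" magic number followed by version 1.
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
const instance = instantiateFromBytes(emptyModule);
```

For a large binary, the asynchronous `WebAssembly.instantiate(bytes, imports)` is usually the better choice; the synchronous form is used here only to keep the sketch short.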

retorquere · 2023-08-19T08:48:13Z

Can you see whether https://www.npmjs.com/package/jieba-wasm offers different cutting modes for cn and tw (jieba-js offers use of jieba-zh-tw and jieba-zh-cn as cutting modes)?

retorquere · 2023-08-19T08:50:29Z

Also what the different cut functions and their parameters mean?

retorquere · 2023-08-19T09:24:16Z

Is there also a full list of single-character Chinese family names?

retorquere · 2023-08-19T09:29:50Z

jieba puts spaces between the characters which makes each character a "word" for the citekey formatter.

this is incorrect. auth.jieba cuts up auth using whatever rules jieba applies -- I know absolutely nothing about Chinese, so I don't know what jieba does either. It is auth.ideographs that puts spaces between the characters which makes each character a "word" for the citekey formatter.
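As an illustration of the per-character spacing described here (not Better BibTeX's actual code; `spaceOutCharacters` is a made-up name):

```javascript
// Illustration only: put a space between every character of the input.
function spaceOutCharacters(text) {
  return Array.from(text).join(' '); // Array.from iterates by code point
}

spaceOutCharacters('东方不败'); // '东 方 不 败'
```

Once the name looks like this, anything that works word by word sees four words, so the boundary between surname and given name is lost.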

retorquere · 2023-08-19T09:50:36Z

ZAVVH2PE-apse does not contain samples, logs with samples have -refs- in the debug log ID. See point 1. above.

retorquere · 2023-08-19T09:53:58Z

Can you export the items from YeGr1kqXOgnV-6U3RYALN to RDF and attach them to this issue? YeGr1kqXOgnV-6U3RYALN does contain items but I cannot import them.

ZnqbuZ · 2023-08-19T10:45:43Z

Can you see whether https://www.npmjs.com/package/jieba-wasm offers different cutting modes for cn and tw (jieba-js offers use of jieba-zh-tw and jieba-zh-cn as cutting modes)?

I'm creating a testing environment.

Is there also a full list of single-character Chinese family names?

Yes, there is, but do we really need it? I mean modern Chinese family names are either 2 chars or 1 char - so we just need a function to ensure that content in author field is truly a Chinese name, basically a utf8 range checker is ok. I can write it if needed.
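A rough sketch of such a range checker, assuming it is enough to accept the main CJK ideograph blocks (Extension A, Unified Ideographs, Compatibility Ideographs, Extension B). `isHanName` is a hypothetical name and the block list is not exhaustive.

```javascript
// Hypothetical check: true when every character of `name` falls in one of
// the listed CJK ideograph blocks. The `u` flag makes the regex work on
// code points, which the \u{...} range for Extension B requires.
function isHanName(name) {
  return /^[\u3400-\u4DBF\u4E00-\u9FFF\uF900-\uFAFF\u{20000}-\u{2A6DF}]+$/u.test(name);
}

isHanName('东方不败');  // true
isHanName('Wallace');   // false
isHanName('東方 不敗'); // false (contains a space)
```

Only names that pass this check would be handed to the surname splitter; everything else goes through the normal name handling.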

jieba puts spaces between the characters which makes each character a "word" for the citekey formatter.

this is incorrect. auth.jieba cuts up auth using whatever rules jieba applies -- I know absolutely nothing about Chinese, so I don't know what jieba does either. It is auth.ideographs that puts spaces between the characters which makes each character a "word" for the citekey formatter.

It's hard for a segmentation library to deal with names. By analogy, it may cut "WallaceGoodman" into "Wall Ace Good Man".

ZAVVH2PE-apse does not contain samples, logs with samples have -refs- in the debug log ID. See point 1. above.

I'm sorry. I'm creating a testing library. Soon it will be uploaded.

retorquere · 2023-08-19T10:47:59Z

Yes, there is, but do we really need it? I mean modern Chinese family names are either 2 chars or 1 char - so we just need a function to ensure that content in author field is truly a Chinese name, basically a utf8 range checker is ok. I can write it if needed.

No need, that's already in my current tests.

I'm creating a testing library. Soon it will be uploaded.

Thanks.

retorquere · 2023-11-22T19:23:58Z

Just got back from zotero-dev that it might be because Z6 only supports the MVP specification.

Is it possible to compile jieba-rs to stay within that spec (until Zotero 7 goes GA)?

ZnqbuZ · 2023-11-22T19:53:27Z

Not a single name is correctly parsed... How did you use the lib?

Sorry -- I was just using auth. But auth.jieba returns all-lowercase now. What should I get instead of changsunwuji?

It should be ZhangsunWuji - I have written the right version in a field of the items, maybe the title; I forget which.

ZnqbuZ · 2023-11-22T20:00:43Z

Just got back from zotero-dev that it might be because Z6 only supports the MVP specification.

Is it possible to compile jieba-rs to stay within that spec (until Zotero 7 goes GA)?

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research.

retorquere · 2023-11-22T20:06:49Z

It should be ZhangsunWuji - I have written the right version in a field of the items, maybe the title; I forget which.

@book{changsunwuji::PossiblywrongpinyinHansAuthor:ZhangsunWuji,
@book{changsunwuji::PossiblywrongpinyinHantAuthor:ZhangsunWuji,
@book{dongfangbubai::HansAuthor:DongfangBubai,
@book{dongfangbubai::HantAuthor:DongfangBubai,
@book{linghuchong::HansAuthor:LinghuChong,
@book{linghuchong::HantAuthor:LinghuChong,
@book{murongfu::HansAuthor:MurongFu,
@book{murongfu::HantAuthor:MurongFu,
@book{ouyangfeng::HansAuthor:OuyangFeng,
@book{ouyangfeng::HantAuthor:OuyangFeng,
@book{renwohang::HansAuthor:RenWoxing,
@book{renwohang::HantAuthor:RenWoxing,
@book{shangguanyun::HansAuthor:ShangguanYun,
@book{shangguanyun::HantAuthor:ShangguanYun,
@book{simayi::HansAuthor:SimaYi,
@book{simayi::HantAuthor:SimaYi,
@book{simazhongda::HansAuthor:SimaZhongda,
@book{simazhongda::HantAuthor:SimaZhongda,
@book{weichirong::PossiblywrongpinyinHansAuthor:YuchiRong,
@book{weichirong::PossiblywrongpinyinHantAuthor:YuchiRong,
@book{xiahoudun::HansAuthor:XiahouDun,
@book{xiahoudun::HantAuthor:XiahouDun,
@book{yanglianting::HansAuthor:YangLianting,
@book{yanglianting::HantAuthor:YangLianting,
@book{zhugekongming::HansAuthor:ZhugeKongming,
@book{zhugekongming::HantAuthor:ZhugeKongming,
@book{zhugeliang::HansAuthor:ZhugeLiang,
@book{zhugeliang::HantAuthor:ZhugeLiang,

ZnqbuZ · 2023-11-22T20:09:39Z

Just got back from zotero-dev that it might be because Z6 only supports the MVP specification.

Is it possible to compile jieba-rs to stay within that spec (until Zotero 7 goes GA)?

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research

It should be ZhangsunWuji - I have written the right version in a field of the items, maybe the title; I forget which.

@book{changsunwuji::PossiblywrongpinyinHansAuthor:ZhangsunWuji,
@book{changsunwuji::PossiblywrongpinyinHantAuthor:ZhangsunWuji,
@book{dongfangbubai::HansAuthor:DongfangBubai,
@book{dongfangbubai::HantAuthor:DongfangBubai,
@book{linghuchong::HansAuthor:LinghuChong,
@book{linghuchong::HantAuthor:LinghuChong,
@book{murongfu::HansAuthor:MurongFu,
@book{murongfu::HantAuthor:MurongFu,
@book{ouyangfeng::HansAuthor:OuyangFeng,
@book{ouyangfeng::HantAuthor:OuyangFeng,
@book{renwohang::HansAuthor:RenWoxing,
@book{renwohang::HantAuthor:RenWoxing,
@book{shangguanyun::HansAuthor:ShangguanYun,
@book{shangguanyun::HantAuthor:ShangguanYun,
@book{simayi::HansAuthor:SimaYi,
@book{simayi::HantAuthor:SimaYi,
@book{simazhongda::HansAuthor:SimaZhongda,
@book{simazhongda::HantAuthor:SimaZhongda,
@book{weichirong::PossiblywrongpinyinHansAuthor:YuchiRong,
@book{weichirong::PossiblywrongpinyinHantAuthor:YuchiRong,
@book{xiahoudun::HansAuthor:XiahouDun,
@book{xiahoudun::HantAuthor:XiahouDun,
@book{yanglianting::HansAuthor:YangLianting,
@book{yanglianting::HantAuthor:YangLianting,
@book{zhugekongming::HansAuthor:ZhugeKongming,
@book{zhugekongming::HantAuthor:ZhugeKongming,
@book{zhugeliang::HansAuthor:ZhugeLiang,
@book{zhugeliang::HantAuthor:ZhugeLiang,

Yes, and those words after the last colons are correctly capitalized.
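The two "Possiblywrongpinyin" entries (changsun vs. Zhangsun, weichi vs. Yuchi) are surnames whose characters are read differently in a name than in ordinary words. One possible way to handle them, sketched under the assumption that some general transliterator already exists, is a small override table consulted before the fallback. `surnameReadings` and `surnamePinyin` are hypothetical names, and the table only contains the two cases from the list above.

```javascript
// Hypothetical override table: surname -> pinyin reading when used as a surname.
const surnameReadings = {
  '长孙': 'zhangsun', '長孫': 'zhangsun',
  '尉迟': 'yuchi', '尉遲': 'yuchi',
};

// `fallback` stands in for whatever general transliterator is in use.
function surnamePinyin(surname, fallback) {
  return Object.prototype.hasOwnProperty.call(surnameReadings, surname)
    ? surnameReadings[surname]
    : fallback(surname);
}

surnamePinyin('长孙', () => 'changsun'); // 'zhangsun'
surnamePinyin('东方', () => 'dongfang'); // 'dongfang'
```

This only covers surnames; a given name such as the one behind renwohang vs. RenWoxing would need a different approach.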

retorquere · 2023-11-22T20:11:04Z

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research

How do I build the package?

ZnqbuZ · 2023-11-22T20:13:09Z

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research

How do I build the package?

Sorry, what do you mean?

retorquere · 2023-11-22T20:13:28Z

Yes, and those words after the last colons are correctly capitalized.

But that's how they come out of auth.jieba. I'm not applying lower.

ZnqbuZ · 2023-11-22T20:15:37Z

Yes, and those words after the last colons are correctly capitalized.

But that's how they come out of auth.jieba. I'm not applying lower.

Have you used splitName in spellnames? Does it give the right answers for these names?

retorquere · 2023-11-22T20:15:53Z

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research

How do I build the package?

Sorry, what do you mean?

I've cloned WasmJieba, I just wanted to see if I could help with the compilation.

ZnqbuZ · 2023-11-22T20:27:56Z

Honestly, I'm not quite sure about how to do it now... Maybe I can manage it after several days of research

How do I build the package?

Sorry, what do you mean?

I've cloned WasmJieba, I just wanted to see if I could help with the compilation.

My build command is wasm-pack build --target web --out-dir pkg/web/pkg --out-name wasmjieba-web, where wasm-pack can be installed by cargo.

ZnqbuZ · 2023-11-22T20:31:34Z

Besides, in .cargo/config.toml the target should be wasm32-unknown-unknown.

ZnqbuZ · 2023-11-22T21:17:08Z

I believe it must have to do with wasm-opt. Could you try this unoptimized debug version and see if it works?

retorquere · 2023-11-22T21:29:25Z

I believe it must have to do with wasm-opt. Could you try this unoptimized debug version and see if it works?

CompileError: at offset 679499: bad type

retorquere · 2023-11-22T21:34:42Z

Does work on Zotero 7.

ZnqbuZ · 2023-11-22T21:34:49Z

I believe it must have to do with wasm-opt. Could you try this unoptimized debug version and see if it works?

CompileError: at offset 679499: bad type

OK... no idea what this means, so maybe it's not related to wasm-opt. I'm going to dig into the Rust WASM tooling and try that old Firefox tomorrow. Tell me if you want more explanation of the compilation.

retorquere · 2023-11-22T21:58:38Z

What is the full contents of the config.toml?

ZnqbuZ · 2023-11-22T22:00:48Z

config.zip
