This repository has been archived by the owner on Jan 3, 2024. It is now read-only.

returns nothing for Thai #44

Open
garfieldnate opened this issue Sep 23, 2018 · 3 comments

Comments

@garfieldnate

>>> from wiktionaryparser import WiktionaryParser
>>> parser = WiktionaryParser()
>>> word = parser.fetch('ฉลาด')
>>> word
[]

The page is clearly there on the website: https://en.wiktionary.org/wiki/%E0%B8%89%E0%B8%A5%E0%B8%B2%E0%B8%94. I'm trying to scrape the pronunciations.
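As a quick sanity check that the URL above really is the page for this word, here is a minimal sketch using only the standard library. It assumes the usual `https://en.wiktionary.org/wiki/<title>` URL pattern and simply percent-encodes the Thai word; it does not touch the network or wiktionaryparser.

```python
from urllib.parse import quote, unquote

word = 'ฉลาด'

# Percent-encode the UTF-8 bytes of the word to build the page URL.
url = 'https://en.wiktionary.org/wiki/' + quote(word)
print(url)

# Decoding the last path segment gives the original word back.
print(unquote(url.rsplit('/', 1)[1]) == word)
```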

@Surkal

Surkal commented Sep 28, 2018

The language is English by default. Pass the language explicitly:

parser.fetch('ฉลาด', language='thai')

@garfieldnate
Author

Ah, that gets it. The info returned is not quite right, though:

[
    {
        'etymology': 'From Khmer ឆ្លាត (chlaat, “clever”). Compare Lao ສະຫລາດ (sa lāt).\n',
        'definitions': [
            {
                'partOfSpeech': 'adjective',
                'text': ['ฉลาด • (chà-làat) (abstract noun ความฉลาด)', 'clever; smart; intelligent.'],
                'relatedWords': [],
                'examples': []
            }
        ], 
        'pronunciations': {
            'text': ['From Khmer ឆ្លាត (chlaat, “clever”). Compare Lao ສະຫລາດ (sa lāt).\n'], 
            'audio': []
        }
    }, 
    {
        'etymology': '', 
        'definitions': [
            {
                'partOfSpeech': 'noun', 
                'text': ['ฉลาด • (chà-làat)', 'Alternative form of สลาด (slàat)'], 
                'relatedWords': [], 
                'examples': []
            }
        ], 
        'pronunciations': {
            'text': ['From Khmer ឆ្លាត (chlaat, “clever”). Compare Lao ສະຫລາດ (sa lāt).\n'], 
            'audio': []
        }
    }
]

The etymology is in the pronunciation text, and the pronunciation is missing altogether.
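For anyone who wants to detect this symptom in their own results, here is a rough sketch of a check, assuming the result structure shown above. `suspicious_pronunciations` is a hypothetical helper, not part of wiktionaryparser, and the sample data is trimmed down from the output above.

```python
ETYMOLOGY = 'From Khmer ឆ្លាត (chlaat, “clever”). Compare Lao ສະຫລາດ (sa lāt).\n'

def suspicious_pronunciations(entries):
    """Return indices of entries whose pronunciation text is empty or
    repeats an etymology string found anywhere in the result."""
    # Collect every non-empty etymology, since the second entry above
    # carries the first entry's etymology in its pronunciation text.
    etymologies = {e.get('etymology', '').strip() for e in entries}
    etymologies.discard('')

    flagged = []
    for i, entry in enumerate(entries):
        texts = [t.strip() for t in entry.get('pronunciations', {}).get('text', [])]
        if not texts or any(t in etymologies for t in texts):
            flagged.append(i)
    return flagged

# Trimmed copy of the result above (definitions omitted).
sample = [
    {'etymology': ETYMOLOGY, 'pronunciations': {'text': [ETYMOLOGY], 'audio': []}},
    {'etymology': '', 'pronunciations': {'text': [ETYMOLOGY], 'audio': []}},
]
print(suspicious_pronunciations(sample))
```

Both entries get flagged here, which matches what I'm seeing: neither one has real pronunciation data.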

@suyashb95
Owner

Yeah well, the format of the pronunciations for this word is different from most other words. I'm still working on it.
