Data issue: CAS# 16949-15-8 #28

Open
ljn917 opened this issue Oct 15, 2019 · 2 comments

Comments

ljn917 commented Oct 15, 2019

Hi,

It looks like the data for CAS# 16949-15-8 is not correct. As this shows, CAS# 16949-15-8 is LiBH4, but I got the following output. It looks like the hydrogens are dropped incorrectly.

>>> import thermo
>>> a = thermo.chemical.Chemical('16949-15-8')
>>> a.smiles
'[Li+].[B-]'
>>> a.rho
537.840616966581

Thanks

@alexchandel
Contributor

The problem is line 68158 of chemical identifier.tsv. PubChem gave a mismatched formula, weight, and SMILES. (Don't expect accuracy from the govt.) The SMILES should be [Li+].[BH4-].
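A mismatch like this can be caught by comparing the element counts implied by the SMILES against those in the formula. The following is only a minimal sketch: the function names are hypothetical and not part of thermo, the expected formula LiBH4 is taken from the issue description, and the SMILES reader handles nothing beyond dot-separated bracket atoms such as the ones in this row.

```python
import re
from collections import Counter

# One bracket atom, e.g. [Li+], [BH4-], [B-]: optional isotope, element symbol,
# optional explicit hydrogen count, optional charge. Bonds, rings, and
# organic-subset atoms outside brackets are deliberately NOT supported.
_BRACKET_ATOM = re.compile(r"\[\d*([A-Z][a-z]?)(?:H(\d*))?[+-]*\d*\]")


def element_counts_from_bracket_smiles(smiles):
    """Count elements in a SMILES made only of dot-separated bracket atoms."""
    counts = Counter()
    for fragment in smiles.split("."):
        match = _BRACKET_ATOM.fullmatch(fragment)
        if match is None:
            raise ValueError(f"unsupported SMILES fragment: {fragment!r}")
        symbol, h_count = match.group(1), match.group(2)
        counts[symbol] += 1
        if h_count is not None:
            # A bare "H" inside the brackets means one hydrogen.
            counts["H"] += int(h_count) if h_count else 1
    return counts


def element_counts_from_formula(formula):
    """Count elements in a flat formula such as 'LiBH4' (no parentheses)."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = Counter()
    for symbol, n in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(n) if n else 1
    return counts


def smiles_matches_formula(smiles, formula):
    """True when both representations contain the same atoms."""
    return element_counts_from_bracket_smiles(smiles) == element_counts_from_formula(formula)


print(smiles_matches_formula("[Li+].[B-]", "LiBH4"))    # SMILES from the reported output
print(smiles_matches_formula("[Li+].[BH4-]", "LiBH4"))  # corrected SMILES
```

A real check over the whole table would need a proper SMILES parser; the point here is only that the dropped hydrogens are detectable from the row itself.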

@alexchandel
Contributor

It gets better. PubChem has 5 separate "compound" entries, all claiming to be lithium borohydride:

  1. https://pubchem.ncbi.nlm.nih.gov/compound/11996612 (non-existent)
  2. https://pubchem.ncbi.nlm.nih.gov/compound/20722760 (non-existent)
  3. https://pubchem.ncbi.nlm.nih.gov/compound/4148881
  4. https://pubchem.ncbi.nlm.nih.gov/compound/139038538 (high-P polymorph)
  5. https://pubchem.ncbi.nlm.nih.gov/compound/139046170 (high-P polymorph)

Wikipedia cites No. 3, as does the ChemSpider entry with the same CAS number.

(There are also duplicated sodium aluminum hydride entries, one showing net charge and the other formal charge.)

@CalebBell I recommend deleting the CID# 20722760 row altogether, and adding a row for CID# 4148881 with the CAS# 16949-15-8.

Separately, given the number of errors and duplicates in PubChem, a chemical identifiers duplicate.tsv database should be created to alias the various duplicate CIDs.
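As a rough illustration of what such an alias table could look like, here is a minimal sketch. The two-column layout (`duplicate_cid`, `canonical_cid`), the function names, and the choice to alias only the two non-existent CIDs to CID 4148881 are all assumptions made for the example, not an existing thermo format.

```python
import csv
import io

# Hypothetical contents of the alias file: each duplicate CID points at the
# CID that should be treated as canonical. Column names are illustrative.
ALIAS_TSV = (
    "duplicate_cid\tcanonical_cid\n"
    "11996612\t4148881\n"
    "20722760\t4148881\n"
)


def load_cid_aliases(handle):
    """Read a tab-separated alias table into a {duplicate: canonical} dict."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {int(row["duplicate_cid"]): int(row["canonical_cid"]) for row in reader}


def canonical_cid(cid, aliases):
    """Follow aliases until reaching a CID that is not itself aliased."""
    seen = set()
    while cid in aliases and cid not in seen:  # the seen set guards against cycles
        seen.add(cid)
        cid = aliases[cid]
    return cid


aliases = load_cid_aliases(io.StringIO(ALIAS_TSV))
print(canonical_cid(20722760, aliases))
print(canonical_cid(4148881, aliases))
```

Keeping the aliases in a separate file would leave the main identifier table with one row per compound, while lookups by a stale CID could still resolve.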
