Angle brackets importing incorrectly from UTF-8 .bib file when BBT is enabled #2814

Closed
jcblum opened this issue Mar 14, 2024 · 4 comments

jcblum commented Mar 14, 2024

Debug log ID

Y9WEUTMI-refs-euc/6.7.165-6

What happened?

When BBT is enabled, importing from a UTF-8 encoded .bib file results in left and right angle brackets < (U+003C) and > (U+003E) turning into ¡ (U+00A1) and ¿ (U+00BF) characters. If I import from the same file with BBT disabled, the brackets import correctly.

As far as I can tell, the problem seems to be limited to < and >. I don't have a lot of other "special" characters in the references I am importing, but I added some accented characters and curly quotes to my test file, and they were unharmed by importing with BBT enabled.

Here's the test file that I used to import the reference selected in the debug log:
(I had to change the file extension to .txt for GitHub to accept it, but made no other edits).
import_encoding_test.txt

Thanks for all the hard work you put into developing Better BibTeX (one of my most relied-upon plugins!), and apologies in advance if this turns out to be some mistake on my part that I've overlooked 😅.

@retorquere
Owner

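I'm afraid the stock BibTeX importer has it wrong. In LaTeX's default OT1 font encoding, a literal < or > in text mode is typeset as ¡ or ¿, so what BBT imports matches what LaTeX would actually print. Try compiling this: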

\documentclass{article}

\usepackage[backend=biber, style=alphabetic]{biblatex}

\begin{filecontents}{\jobname.bib}
@Article{WOS:000290839600002,
  AUTHOR = {Willis, Jessica E. and Stewart-Clark, Sarah and Greenwood, Spencer J. and Davidson, Jeff and Quijon, Pedro},
  JOURNAL = {Aquatic Invasions},
  MONTH = {MAR},
  NUMBER = {1},
  PAGES = {7--16},
  TITLE = {A PCR-based assay to facilitate early detection of <i>Diplosoma listerianum</i> in Atlantic Canada},
  VOLUME = {6},
  YEAR = {2011},
  ABSTRACT = {The reçent detectíon of the invasive colonial tünicate <i>Diplosoma listerianum</i> in Havre-Aubert, Magdalen Islands (Québec, Canada) in 2008, “prompted” the ‘development’ of a molecular assay as a method to detect and monitor for the potential invasion of this species in Prince Edward Island.},
}
\end{filecontents}
\addbibresource{\jobname.bib}

\begin{document}
\cite{WOS:000290839600002}
\printbibliography
\end{document}

@retorquere
Owner

retorquere commented Mar 14, 2024

If you want italicized text, LaTeX offers \emph{Diplosoma listerianum} or \textit{Diplosoma listerianum}.

@jcblum
Author

jcblum commented Mar 14, 2024

Ah, OK. In this case, I'm importing references from Web of Science that came with the <i> tags already embedded (I realize that BibTeX is not an ideal data interchange format, but there are certain fields we need that WoS frustratingly won't output in other formats, and we're dealing with a result set that is too large to use Zotero's connector).

Looks like I can just pre-process the .bibs to use the proper LaTeX tags, which is OK since I already have to pre-process them for other WoS-induced reasons. Thanks!
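The pre-processing step mentioned above could be sketched roughly as follows. This is a minimal illustration, not something posted in the thread: the function name `html_italics_to_latex` is hypothetical, and it assumes the Web of Science export only contains simple, non-nested `<i>...</i>` pairs.

```python
import re

# Assumption: tags are plain <i>...</i> pairs, never nested and never
# carrying attributes. Anything more complex needs a real HTML parser.
ITALIC_TAG = re.compile(r"<i>(.*?)</i>", flags=re.IGNORECASE | re.DOTALL)


def html_italics_to_latex(bib_text: str) -> str:
    # Hypothetical helper: wrap each tagged span in the LaTeX italic command.
    # A function is used as the replacement so the backslash in the
    # LaTeX command is inserted literally instead of being parsed by re.
    return ITALIC_TAG.sub(lambda m: r"\textit{" + m.group(1) + "}", bib_text)


if __name__ == "__main__":
    sample = "TITLE = {Detection of <i>Diplosoma listerianum</i> in Atlantic Canada},"
    print(html_italics_to_latex(sample))
    # prints: TITLE = {Detection of \textit{Diplosoma listerianum} in Atlantic Canada},
```

Running this over the whole .bib text before import would leave every other character untouched; any stray < or > that is not part of an `<i>` pair would still need separate handling.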

@retorquere
Owner

> I realize that BibTeX is not an ideal data interchange format

It can be a pretty good data exchange format; it's just that so many institutions put out crap software that produces stuff only vaguely resembling BibTeX 😉

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 23, 2024