TITLE: Grabbing BibTeX from a DOI
DATE: 2021-08-08
AUTHOR: John L. Godlee
====================================================================


There's a website called doi2bib.org that takes a DOI and returns a
BibTeX entry. I have been using it for a while to quickly get
references while writing my PhD thesis. The website uses the DOI
proxy server REST API in the background, so I figured it wouldn't
be too hard to use curl directly on the API to do the same thing in
the terminal, saving me from opening my web browser. The curl
request below works well, where $1 is the DOI. The Accept header
asks for BibTeX rather than the article's landing page, and -L
follows the redirects issued by the proxy.

 [doi2bib.org]: https://www.doi2bib.org/

   curl -LH "Accept: application/x-bibtex" "https://doi.org/$1"
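
Since $1 only makes sense inside a script or a shell function, here
is a minimal sketch of one way to wrap the request. The names
doi2bib and normalise_doi are made up for illustration, and the
prefix stripping is an addition that the one-liner above does not
do: it assumes you sometimes paste a full resolver URL or a
lowercase "doi:" prefix instead of a bare DOI.

```shell
#!/usr/bin/env sh

# Hypothetical helper: strip a leading resolver URL or "doi:" prefix
# so that bare DOIs, "doi:" URIs and full URLs are all accepted.
normalise_doi() {
    printf '%s\n' "$1" |
        sed -E 's#^(https?://(dx\.)?doi\.org/|doi:)##'
}

# Hypothetical wrapper: request a BibTeX entry for the DOI in $1.
# -s hides the progress meter, -L follows the resolver's redirects.
doi2bib() {
    curl -sLH "Accept: application/x-bibtex" \
        "https://doi.org/$(normalise_doi "$1")"
}
```

After sourcing the file, `doi2bib 10.1000/182` and
`doi2bib https://doi.org/10.1000/182` should send the same request.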

In a similar vein, I wrote a script that grabs DOIs from a PDF,
where $1 is the path to the PDF. I used the regex for DOIs provided
in a blog post on CrossRef, which apparently matches 74.4 million
of 74.9 million registered DOIs. The script returns only the first
DOI in the PDF, because that's most often the DOI of the article
itself, rather than a DOI for one of the references in the article.

 [blog post on CrossRef]:
https://www.crossref.org/blog/dois-and-matching-regular-expressions/

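   #!/usr/bin/env sh

   # Print the first DOI found in the PDF given as $1.
   # Needs pdftotext and a grep with -P (PCRE) support, e.g. GNU grep.
   pdftotext "$1" - |
       grep -ioP '\b(10\.\d{4,9}/[-._;()/:A-Z0-9]+)\b' |
       head -n 1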
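
The two pieces can be chained to go from a PDF straight to a BibTeX
entry. What follows is a rough sketch under a few assumptions
rather than a tested tool: the function names first_doi and pdf2bib
are hypothetical, and the DOI match is swapped from grep -P to
grep -E so that it also runs where grep lacks PCRE support. Without
\b available, a sed step trims trailing punctuation instead, and a
"10." preceded by another digit is not rejected as it would be
with \b.

```shell
#!/usr/bin/env sh

# Hypothetical helper: print the first DOI-like string on stdin.
# Uses extended regex (-E) rather than -P; sed removes trailing
# non-alphanumeric characters such as a sentence-ending full stop.
first_doi() {
    grep -ioE '10\.[0-9]{4,9}/[-._;()/:A-Z0-9]+' |
        sed 's/[^A-Za-z0-9]*$//' |
        head -n 1
}

# Hypothetical wrapper: take the first DOI in the PDF given as $1
# and request its BibTeX entry from the DOI resolver.
pdf2bib() {
    doi=$(pdftotext "$1" - | first_doi)
    [ -n "$doi" ] || { echo "No DOI found in $1" >&2; return 1; }
    curl -sLH "Accept: application/x-bibtex" "https://doi.org/$doi"
}
```

Keeping first_doi separate from pdftotext means the matching can be
checked on plain text, for example with
`echo "doi:10.1234/abc.def." | first_doi`.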