
I'm working on an R Markdown (Rmd) document with references spread across several BibTeX .bib databases. The YAML header includes:

---
title: "title"
author: "me"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
  bookdown::word_document2:
    reference_docx: StylesTemplate.docx
    number_sections: false
bibliography:
  - "`r system('kpsewhich graphics.bib', intern=TRUE)`"
  - "`r system('kpsewhich statistics.bib', intern=TRUE)`"
  - "`r system('kpsewhich timeref.bib', intern=TRUE)`"
  - "`r system('kpsewhich Rpackages.bib', intern=TRUE)`"
csl: apa.csl
---

I am stymied as to how to get the following references to sort in the correct author-year order. The first two are out of order.


I am aware that pandoc-citeproc with a .csl file attempts to disambiguate authors when there are different spellings, but I checked my .bib files and all of these Tukey publications have one of:

author = {John W. Tukey}
author = {Tukey, John W.}

so they should be considered the same.
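One way to double-check this across all the .bib files is to tabulate the distinct author values. The following is only a minimal sketch: it assumes each `author` field sits on a single line (as in the entries shown here), and the function name `list_tukey_authors` is made up for illustration.

```shell
#!/bin/sh
# list_tukey_authors: hypothetical helper, not part of any existing tool.
# Prints each distinct author value that mentions Tukey, with a count,
# across the .bib files given as arguments.
# Assumption: every author field is written on one line.
list_tukey_authors() {
  # Keep only author lines mentioning Tukey (case-insensitive, no file names).
  grep -h -i -E '^[[:space:]]*author[[:space:]]*=.*tukey' "$@" |
    # Drop the "author =" prefix and any trailing comma or whitespace.
    sed -E 's/^[[:space:]]*[Aa][Uu][Tt][Hh][Oo][Rr][[:space:]]*=[[:space:]]*//; s/,?[[:space:]]*$//' |
    # Remove the BibTeX delimiters so {..} and ".." forms compare equal.
    tr -d '{}"' |
    sort | uniq -c
}
```

Calling it as `list_tukey_authors graphics.bib statistics.bib timeref.bib Rpackages.bib` (with the paths resolved by `kpsewhich`) should print one line per distinct spelling, so any unexpected variant would stand out.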

The first 4 references in my BibTeX files are:

@InProceedings{Tukey:1975:picturing,
  author    = {John W. Tukey},
  booktitle = {Proceedings of the International Congress of Mathematicians, Vancouver},
  title     = {Mathematics and the picturing of data},
  year      = {1975},
  pages     = {523--531},
  volume    = {2},
}

@Techreport{Tukey:1993:TR,
  author      = "John W. Tukey",
  title       = "Exploratory Data Analysis: Past, Present, and Future",
  institution = "Department of Statistics, Princeton University",
  year        = "1993",
  number      = "No. 302",
  month       = apr,
  url         = "https://apps.dtic.mil/dtic/tr/fulltext/u2/a266775.pdf",
}

@Article{Tukey:59,
  author  = {John W. Tukey},
  journal = {Technometrics},
  title   = {A Quick, Compact, Two Sample Test to {Duckworth's} Specifications},
  year    = {1959},
  pages   = {31--48},
  volume  = {1},
  doi     = {10.2307/1266308},
  url     = {https://www.jstor.org/stable/1266308},
}

@article{Tukey:1962,
    Author = {John W. Tukey},
    Journal = {The Annals of Mathematical Statistics},
    Number = {1},
    Pages = {1--67},
    Publisher = {Institute of Mathematical Statistics},
    Title = {The Future of Data Analysis},
    Url = {http://www.jstor.org/stable/2237638},
    Volume = {33},
    Year = {1962},
    }

I see minor differences in formatting, but these should not affect pandoc-citeproc sorting.

Is this perhaps a bug in pandoc-citeproc or is there something I can do in my .bib files to avoid this?

I'm running R 4.1.3 under RStudio 2022.02.1, with pandoc 2.17.1.1.

Update

I re-ran this using the chicago-author-date.csl style. All the references now sort correctly, so there must be something peculiar to the apa.csl style. I'd still prefer the apa.csl style, so I'd like to understand why the two styles differ.
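In a CSL style, the order of the reference list is set by the `<sort>` element inside `<bibliography>`, so comparing that block between the two styles may help narrow down where they diverge. This is a rough sketch rather than a diagnosis: the helper name `bib_sort_keys` is hypothetical, and it assumes the opening and closing tags each sit on their own line in the style file.

```shell
#!/bin/sh
# bib_sort_keys: hypothetical helper, not part of pandoc or any CSL tooling.
# Prints the <sort>...</sort> block found inside the <bibliography> element
# of a CSL style file, i.e. the keys that determine reference-list order.
# Assumption: <bibliography>, <sort> and their closing tags are each on
# their own line.
bib_sort_keys() {
  # First restrict to the <bibliography> element, because <citation>
  # can carry its own, unrelated <sort> block.
  sed -n '/<bibliography/,/<\/bibliography>/p' "$1" |
    sed -n '/<sort>/,/<\/sort>/p'
}

# Usage, assuming both style files are in the working directory:
#   bib_sort_keys apa.csl > apa-sort.txt
#   bib_sort_keys chicago-author-date.csl > chicago-sort.txt
#   diff apa-sort.txt chicago-sort.txt
```

Any sort key that appears in one style but not the other, especially one that refers to a macro, would be the next thing to inspect in the style file.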


Rob
user101089
  • The different spellings *should* be considered the same – but did you actually try to make the spellings identical? This could pin down the problem. Moreover, did you try compiling a regular (and minimal) TEX file with these references? This could rule out that the issue is related to knitr/pandoc/.... – CL. May 30 '22 at 09:21
  • Yes, I used `grep -i "author.*Tukey"` on each of my `.bib` files and resolved all of these to be the same, "John W. Tukey", or "Tukey, John W." But the difference between the result of `apa.csl` and `chicago-author-date.csl` tells me that there is something there causing what I get. – user101089 Jun 03 '22 at 17:17
