I love to use LaTeX for typesetting my papers. The flexibility of the environment and the crisp beauty of the final product... I think anyone who uses it regularly knows what I'm gushing about. While working on a recent paper, however, I was frustrated by the prospect of customizing bibliography and citation styles in LaTeX. I care about fine-grained control of my bibliography. I wondered if there was another sensitive soul out there who felt the same way and had decided to create a better solution.

Here, I'll describe an alternative to the standard bibliography environment that allows full style customization without sacrificing the raw power of LaTeX. I'll also comment briefly on some other options I've seen and tried. I take for granted that a BibTeX library is already available, as creating one is outside the scope of this article. I will mention that my favorite reference manager, Mendeley, can export its database as a BibTeX library.

The Problem

LaTeX's default bibliography environment is simple enough to invoke. For a complete description, see Stefan Kottwitz's book [1]. In-line citations are written with the \cite{} command, as in the example below.

Previous studies have demonstrated that collective efficacy \cite{Morenoff2001} and prejudice \cite{Sampson2004} shape neighborhoods more strongly than their physical make-up and condition.

The final and only other requirement is to add the bibliography and point to a BibTeX reference database. In the example below, it would be named myrefs.bib.

\bibliographystyle{alpha}
\bibliography{myrefs}

After typesetting the document once, it is necessary to invoke the bibtex program. It translates the references we have cited in our document from our BibTeX library (myrefs.bib) into a thebibliography environment, which is put in place of the \bibliography command. In short, we typeset once, call bibtex, and then typeset twice more.

bibtex my_document
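
Putting the pieces together, a minimal sketch of a complete document might look like the following. The sentence, the citation keys, and the file names come from the examples above; the article class and the use of pdflatex as the typesetting engine are assumptions.

```latex
% my_document.tex: a minimal sketch; the article class is an assumption.
% Build sequence (assuming pdflatex as the typesetting engine):
%   pdflatex my_document   % writes the cited keys to my_document.aux
%   bibtex my_document     % reads the .aux file, writes my_document.bbl
%   pdflatex my_document   % pulls in the generated bibliography
%   pdflatex my_document   % resolves the in-line citation labels
\documentclass{article}

\begin{document}

Previous studies have demonstrated that collective efficacy
\cite{Morenoff2001} and prejudice \cite{Sampson2004} shape neighborhoods
more strongly than their physical make-up and condition.

\bibliographystyle{alpha}
\bibliography{myrefs} % expects myrefs.bib to sit next to this file

\end{document}
```

Note that bibtex is given the base name of the document, not the .tex file, because it reads the auxiliary file that the first typesetting pass produces.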

After typesetting (twice), we see in-line citations and our bibliography. The bibliography style we specified, alpha, is formatted such that the in-line citation labels are a combination of a shortened author name and publication year, the bibliography is sorted by author name, and square brackets surround the labels. But what if you want something different?

Well, there are four default styles. The other three, as described by Kottwitz [1], are listed below. ShareLaTeX has a list of eight, including the four discussed here, with example output.

  • plain: Arabic numbers for the labels, sorted according to the names of the authors. The number is written in square brackets which also appear with \cite.
  • unsrt: No sorting. Entries appear in the order in which they were cited in the text; otherwise it looks like plain.
  • abbrv: Like plain, but first names and other field entries are abbreviated.

The bibtex program figures out how to style your citations and bibliography from a specification that resides in a *.bst file (e.g., there is a plain.bst for the built-in plain style). You could write one of these yourself, perhaps using one of the built-ins as a template, but the postfix language they are written in can be very difficult to read and write.

It seems that others have recognized the need for more flexible customization and, to be fair, there are other options out there in the form of preprocessors and LaTeX packages. This TeX StackExchange article does a good job of summarizing their trade-offs. The only alternative among the existing packages and programs that I've tried is natbib, which is a good start but doesn't completely solve the problem of fully customizable citations and bibliographies. In the end, one still has to provide a *.bst file with natbib.
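
To illustrate that last point, here is a minimal sketch of natbib in use. The citation keys and myrefs.bib are reused from the earlier example; the article class and the round and authoryear package options are assumptions chosen for illustration, and plainnat is one of the styles that ships with natbib.

```latex
% A minimal natbib sketch; the class and package options are assumptions.
\documentclass{article}
\usepackage[round,authoryear]{natbib} % round parentheses, author-year labels

\begin{document}

% \citet gives a textual citation, \citep a parenthetical one.
\citet{Morenoff2001} examined collective efficacy, and prejudice has
also been shown to shape neighborhoods \citep{Sampson2004}.

% natbib controls the in-line citation commands, but the layout of the
% bibliography itself still comes from a .bst file.
\bibliographystyle{plainnat}
\bibliography{myrefs}

\end{document}
```

The in-line citations gain some flexibility here, but anything the chosen *.bst file doesn't offer would still mean editing or writing a *.bst file.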

A Solution

I should mention that my high expectations of full customization were shaped by my previous experience with R Markdown and knitr. With R Markdown, formatting for bibliographies and in-line citations can be specified by the Citation Style Language (CSL); see also this reference. I have a vivid memory of first seeing the Zotero repository of CSL stylesheets: all 7,438 of them. Clearly, 7,438 options are better than 4 (or 8), right? If the numbers don't convince you, just open up a CSL stylesheet; it's an XML variant, making it much easier to read and write than *.bst files.

So, CSL works out of the box in R Markdown. Great! And knitr, with help from Pandoc, enables R Markdown documents to be serialized to a wide variety of formats (e.g., PDF, HTML, Microsoft Word). But what if you want to use the full range of TeX commands and environments available through third-party packages? Some people might also object to using R Markdown to write their paper anyway, particularly if they don't use R or Markdown.

I'm not one of those people, but I do want to use raw TeX sometimes. So, I started looking into Pandoc to preprocess my TeX documents. Pandoc supports CSL and can pass raw TeX input through to the LaTeX typesetting program. By chaining Pandoc and LaTeX, with a custom CSL file to my liking, I can fully customize my bibliography and citation styles without sacrificing the full range of TeX features available in third-party packages.

As an example, here is the References section of a paper I wrote. I wanted hanging indents in my bibliography (one last, obsessive detail to achieve my vision for a bibliography), so I have invoked \setlength, which is built into LaTeX, and \hangparas, which requires the hanging package. The last line looks a lot like it did before, right? I just point to my BibTeX bibliography database.

\section{References}
\setlength{\parindent}{0pt} % Reset indentation for references...
\hangparas{32pt}{1}
\bibliography{/home/arthur/library.bib}

To typeset this with Pandoc, I have a little shell script that encapsulates the variety of options available with that program. I locate my BibTeX library for Pandoc with the --bibliography option and I tell it how I'd like my bibliography and citations formatted, in CSL, with the --csl option.

pandoc \
-M author="K. Arthur Endsley" \
-M date="February 2, 2015" \
-f latex+raw_tex -N -R \
--smart \
--include-in-header=header.tex \
--bibliography=/home/arthur/library.bib \
--csl=citation_style.csl \
--template=template.latex \
-o MyPaper.pdf MyPaper.tex

Note that I have a custom LaTeX template and a header TeX file that load some packages and set up my document. These are important, as \usepackage and some other commands can only be used in the preamble—they can't go inside your input TeX file to Pandoc. To get an idea of what I mean, see my input TeX file (MyPaper.tex) below:

\title{Assessment of Urban Change through a Land-Cover Change Proxy at the Neighborhood Scale with Subpixel Measurements from Satellite Remote Sensing}
\author{K. Arthur Endsley}

\begin{document}

\maketitle

\section{Background}

Neighborhood change manifests in changes in the physical environment...

Anything else has to go in the template or in the header.
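
For instance, a header file for this setup might look like the sketch below. This is a hypothetical minimum rather than the actual header.tex used for the paper: it assumes that only the hanging package is needed for the References section shown earlier, and that the custom template has a place for the header includes that Pandoc passes along.

```latex
% header.tex: a hypothetical sketch, not the actual file used for the paper.
% Pandoc makes the contents of this file available to the template, which
% is expected to place them in the preamble, before \begin{document}.
\usepackage{hanging} % provides \hangparas, used in the References section
% Any other package the document relies on would be loaded here as well.
```

Keeping package loading in this file means the input TeX file can stay focused on content, while the template handles the overall document setup.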

References

  1. LaTeX: Beginner's Guide by Stefan Kottwitz