Citation Keys

Generating citekeys for your items

The BibTeX citation keys generated by the standard Zotero exporters are always generated at export time, using an algorithm that usually generates unique keys. For serious LaTeX users, “usually” presents the following problems:

  • If a non-unique key is generated, which one gets postfixed with a distinguishing character is essentially non-deterministic.
  • The keys are always auto-generated, so if you correct a typo in the author name or title, the key will change.
  • You can’t see the citation keys until you export them.

For a LaTeX author, the citation keys have their own meaning, fully separate from the other entry data, even if people usually pick a naming scheme related to that data. As the citation key is the piece of data that connects your document to your bibliography, it is something you want to have control over. BBT offers you this control:

  • Stable citation keys, without key clashes. BBT generates citation keys that take into account other existing keys in your library in a deterministic way, regardless of what part of your library you export, or the order in which you do it.
  • BBT is conservative about citation key changes, and allows you to fix keys to any value of your choosing.
  • Generate citation keys from JabRef(-ish) patterns.

You can also

  • Drag and drop LaTeX citations using these keys to your favorite LaTeX editor
  • Show your citation keys in the item list view.

Set your own, fixed citation keys

By default, BBT generates the citation key from the item information, and this key may change when you edit the item. Such keys are called dynamic keys. In contrast, fixed keys are marked with a pushpin in the item list view and in the item details to distinguish them from dynamic keys.

You can fix the citation key (called pinning in BBT) for an item by adding the text Citation Key: <your citekey> anywhere in the extra field of the item on a line of its own. You can generate a pinned citation key by selecting one or more items, right-clicking, and selecting Generate BibTeX key, which will add the current citation key to the extra field, thereby pinning it.
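To illustrate the convention described above, the following Python sketch looks for a pinned key in an item's extra field. It is not BBT's implementation; the function name pinned_key and the case-insensitive matching of the Citation Key: label are assumptions made for this example.

```python
import re
from typing import Optional

# A line that consists only of "Citation Key: <key>".
# Case-insensitive matching is an assumption of this sketch.
PINNED = re.compile(r"^\s*citation key\s*:\s*(\S+)\s*$", re.IGNORECASE | re.MULTILINE)


def pinned_key(extra: str) -> Optional[str]:
    """Return the pinned citation key found in the extra field, or None."""
    match = PINNED.search(extra)
    return match.group(1) if match else None
```

An item whose extra field has no such line would be treated as having a dynamic key.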

Drag and drop/hotkey citations

You can drag and drop citations into your LaTeX/Markdown/Orgmode editor, and it will add a proper \cite{citekey}/[@citekey]/[[zotero://select...][@citekey]]. The cite command is configurable for LaTeX by setting the config option in the preferences. Do not include the leading backslash.
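As a rough picture of what ends up in your editor, the Python sketch below renders a citekey in the LaTeX and Markdown forms mentioned above. The function name format_citation and its parameters are hypothetical; the Orgmode form is left out because it needs the item's select link.

```python
def format_citation(citekey: str, style: str = "latex", command: str = "cite") -> str:
    """Render a citekey for a LaTeX or Markdown document (illustrative only)."""
    if style == "latex":
        # The command is given without the leading backslash, as in the preferences.
        return "\\" + command + "{" + citekey + "}"
    if style == "markdown":
        return "[@" + citekey + "]"
    raise ValueError("unsupported style in this sketch: " + style)
```

With command set to autocite, the LaTeX form would read \autocite{citekey}.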

This feature requires a one-time setup: choose the Quick Copy format under the Citation keys preferences for BBT, and go to Zotero preferences, tab Export, under Default Output Format, select “Better BibTeX Quick Copy: [format you just selected]”.

Find duplicate keys through integration with Report Customizer

The plugin will generate BibTeX comments to show whether a key conflicts and with which entry. BBT integrates with Zotero: Report Customizer to display the BibTeX key, plus any conflicts between keys, in the Zotero report.

Configurable citekey generator

BBT also implements a citekey generator for those entries that don’t have a citekey set explicitly. The formatter pattern language used to follow the JabRef key formatting syntax, but now uses a JavaScript-ish format. You can set your generator pattern in the Better BibTeX preferences (you can get there via the Zotero preferences, or by clicking the Better BibTeX “Preferences” button in the addons pane).

Better BibTeX knows four kinds of “things” to build the citekey from:

  1. “functions”, which produce text based on the item the key is being constructed from, e.g. shorttitle. Even though these are largely case insensitive, they must start with a lowercase letter.
  2. “field access”, which returns direct text from the Zotero item fields; these again are largely case insensitive, but they must start with an uppercase letter.
  3. “filters”, which are actions that act on the text returned from functions, field access, or a subformula like (auth + title || year). These are fully case insensitive, and you can chain them together, each acting on the output of the previous filter.
  4. bare strings (text quoted in single or double quotes)

There are 5 ways you can build subformulae:

  1. composition: (auth + title)
  2. alternates: (auth || title) (use the first thing that returns any text, so auth if that returns text, otherwise title). Note that test filters like .len will skip to the next formula if they fail; see also sequences below.
  3. ternaries: (auth ? year : title) (if auth returns any text, use year, otherwise use title). Ternary operators have the format condition ? output_if_true : output_if_false, and you can use them like an if-else statement.
  4. sequences: (auth, shorttitle.len, '', title, 'hi'). The first element of the sequence that returns a non-empty output is used. Tests like .len that would usually skip to the next formula don’t do so in a sequence; they just return empty output if they fail. The exception is the last element in the sequence: if that is a test and it fails, the current formula is skipped and the next one is started. You would usually want a non-test or a fixed value there.
  5. length tests: (auth + title) > 0 and auth > 0 are shorthand for (auth + title).len and auth.len, respectively.

These can be combined, e.g. (auth || shorttitle || year) ? (auth + title) : (year || title), but subformulae cannot appear in parameters, so title.select(auth ? 3 : 4) is not valid. Filters (explained below) can be applied to subformulae: (title || auth).lower checks whether the title function produces output (i.e. is not empty). If it does, the output of title is used; otherwise, the formula uses the output of auth. The filter then converts the output of (title || auth) to lowercase.
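The combination rules can be mimicked with ordinary string operations. The sketch below is Python, not the BBT formula language, and models only composition, alternates, ternaries and a filter applied to a subformula; the item values are made up, and an empty string stands for a function that returns no text.

```python
# Made-up item: no author, so auth returns no text.
item = {"auth": "", "title": "AnAwesomePaper", "year": "2020"}
auth, title, year = item["auth"], item["title"], item["year"]

composition = auth + title            # (auth + title)
alternate = auth or title             # (auth || title): first one that returns text
ternary = year if auth else title     # (auth ? year : title)
filtered = (title or auth).lower()    # (title || auth).lower
```

Because auth is empty here, the alternate and the ternary both fall through to the title.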

You can also explicitly test whether a formula part is non-empty, and jump to the next formula if it is empty:

title.lower.len + year; auth + year

which would have the formula evaluate whether the title function returns a non-empty text; if this condition is not met, formula evaluation jumps to the next formula, auth + year. You can also test for a minimal length using e.g.

title > 1 + year | auth + year

which is shorthand for

title.len('>',1) + year | auth + year

which checks whether the title output is longer than 1 character.

The default key pattern is auth.lower + shorttitle(3,3) + year. If you have papers that use keys generated by the standard Zotero Bib(La)TeX exporters, you may want to use zotero.clean instead to ease migration from those existing exports.

The pattern auth.lower + shorttitle(3,3) + year means:

  1. the last name of the first author without spaces, in lowercase because of the .lower filter,
  2. the first 3 words of the title, with the first 3 of those capitalized (shorttitle(n, m) takes the first n words, default 3, and capitalizes the first m, default 0),
  3. the year of publication, if any,
  4. a letter postfix (a, b, c, etc.) in case of a clash (this part is always added; you can’t disable it, although you can change it to Zotero-style numeric).
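The four parts above can be sketched in Python as follows. This is a simplified illustration, not BBT's code: the function names are hypothetical, the title handling ignores skip words and transliteration, and the assumption that the first clashing key receives the letter a is made for the example only.

```python
import re
from typing import Iterable


def base_key(last_name: str, title: str, year: str, n: int = 3, m: int = 3) -> str:
    """Approximate auth.lower + shorttitle(n, m) + year."""
    words = re.findall(r"\w+", title)[:n]
    # Capitalize the first m of the selected words.
    words = [w.capitalize() if i < m else w for i, w in enumerate(words)]
    return last_name.replace(" ", "").lower() + "".join(words) + year


def disambiguate(key: str, taken: Iterable[str]) -> str:
    """Append a letter postfix if the key clashes with an existing key."""
    taken = set(taken)
    if key not in taken:
        return key
    for letter in "abcdefghijklmnopqrstuvwxyz":
        if key + letter not in taken:
            return key + letter
    raise ValueError("more than 26 clashes are not handled in this sketch")
```

For a given set of existing keys the result is always the same, which is the property that makes the keys stable across exports.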

Changing a pattern will only affect items created/changed after you changed the pattern; existing keys are not automatically regenerated when you change the pattern. If you want your keys to update after a pattern change you will have to select your items, right-click, and select Refresh. This will not affect keys you have pinned.

If you want to get fancy, you can set multiple patterns separated by a semicolon (;) or vertical bar (|); the first pattern that yields a non-empty string is applied. If all of them return an empty string, a random key will be generated.

An example application for this behavior is to use the tex.shortauthor from the extra field when defined to generate short citation keys for entries with long group author names, but to default to auth.lower otherwise:

extra('tex.shortauthor').transliterate.clean.lower.len + year; auth.lower + year
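The fallback behavior of this pattern can be sketched in Python. The names Skip, require_len and first_key are hypothetical, items are modelled as plain dictionaries, and transliterate/clean are left out; the sketch only shows how a failed length test abandons one formula and moves on to the next.

```python
from typing import Callable, Optional, Sequence


class Skip(Exception):
    """Raised by a test (such as a length check) to abandon the current formula."""


def require_len(text: str, minimum: int = 0) -> str:
    # Pass the text through only if it is longer than `minimum` characters.
    if len(text) > minimum:
        return text
    raise Skip()


def first_key(formulas: Sequence[Callable[[dict], str]], item: dict) -> Optional[str]:
    """Return the output of the first formula that yields a non-empty string."""
    for formula in formulas:
        try:
            key = formula(item)
        except Skip:
            continue
        if key:
            return key
    return None  # this is the point where a random key would be generated


formulas = [
    # roughly: extra('tex.shortauthor').lower.len + year
    lambda it: require_len(it.get("shortauthor", "").lower()) + it["year"],
    # roughly: auth.lower + year
    lambda it: it["auth"].lower() + it["year"],
]
```

An item with a short author set takes the first formula; an item without one falls through to the second.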

You can add a verbatim text by just including it in single or double quotes:

extra('tex.shortauthor').transliterate.clean.lower.len + year; 'default' + auth.lower + year

Formulas have some ternary and or-style choice support; you can use them in formulas instead of a function, but not in parameters; you can for example use

(title ? title : auth).lower + year

or

(title || auth).lower + year

instead of

title.len + year | auth + year

and you can test for length of subsections; what you would previously do with

auth + title + len + year

to jump to the next formula if auth and title were both empty is now

(auth + title).len + year

Generating citekeys

To generate your citekeys, you use a formula composed of functions and filters. Broadly, functions grab text from your item, and filters transform that text. Note that the formula syntax has changed from a bracketed format to a javascript-ish format. The old syntax was getting harder to maintain and its inflexibility prevented new extensions to the functions being implemented cleanly. The old syntax still works but will be translated to the new format automatically.

Below you will find a full list of the functions and filters you can use, in the new format only, sorry. You can still use these in the old syntax, but there they support only positional parameters; I generally recommend using the new syntax with named parameters.

Note: a number of functions below talk about the author’s lastname; you can read that as “when available”. If you have the name as a single-field name (for entities like International Business Machines or Aristoteles), Zotero doesn’t have a last name, and the full single-field name is taken instead.

Functions

auth(n: number = 0, m: number = 1, creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false)

The first n (default: all) characters of the mth (default: first) author's last name.

authAuthEa(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = '.')

The last name of the first two authors, and ".ea" if there are more than two.

authEtAl(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = ' ')

The last name of the first author, and the last name of the second author if there are two authors or "EtAl" if there are more than two. This is similar to authEtal2. The difference is that the authors are not separated by "." and, in case of more than 2 authors, "EtAl" is appended instead of ".etal".

authEtal2(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = '.')

The last name of the first author, and the last name of the second author if there are two authors or ".etal" if there are more than two.

authForeIni(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*')

The given-name initial of the first author.

authIni(n: number = 0, creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = '.')

The beginning of each author's last name, using no more than n characters (0 = all).

authorIni(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = '.')

The first 5 characters of the first author's last name, and the last name initials of the remaining authors.

authorLast(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false)

The last name of the last author.

authorLastForeIni(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*')

The given-name initial of the last author.

authorsAlpha(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = ' ')

Corresponds to the BibTeX style "alpha". One author: first three letters of the last name. Two to four authors: first letters of the last names concatenated. More than four authors: first letters of the last names of the first three authors concatenated, with "+" at the end.
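The three cases can be written out as a short Python sketch. It works on a plain list of last names and is only an illustration of the rule as stated; the function name authors_alpha is hypothetical and the creator, initials and sep parameters are ignored.

```python
from typing import List


def authors_alpha(last_names: List[str]) -> str:
    """Label in the style of BibTeX "alpha", following the rule described above."""
    if not last_names:
        return ""
    if len(last_names) == 1:
        return last_names[0][:3]
    if len(last_names) <= 4:
        return "".join(name[:1] for name in last_names)
    # More than four authors: initials of the first three, then "+".
    return "".join(name[:1] for name in last_names[:3]) + "+"
```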

authorsn(n: number = 0, creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = ' ')

The last names of the first n (default: all) authors.

authshort(creator: ("author" | "editor" | "translator" | "collaborator" | "*") = '*', initials: boolean = false, sep: string = '.')

The last name if one author/editor is given; the first character of up to three authors' last names if more than one author is given. A plus character is added, if there are more than three authors.

creators(n: (number | [ number, number ]) = 0, type: ("*" | Creator | Creator[] | (Creator | "*")[][]) = [['primary', 'editor', 'translator', '*']], name: `sprintf-style format template` = '%(f)s', etal: string = '', sep: string = ' ', min: number = 0, max: number = 0)

Author/editor information.

Creator is one of: artist, attorneyAgent, author, bookAuthor, cartographer, castMember, commenter, composer, contributor, cosponsor, counsel, director, editor, guest, interviewee, interviewer, inventor, performer, podcaster, presenter, producer, programmer, recipient, reviewedAuthor, scriptwriter, seriesEditor, sponsor, translator, wordsBy

creatortypes(match?: RegExp)

This will return a comma-separated list of creator type information for all creators on the item in the form <1 or 2><creator-type>, where 1 or 2 denotes a 1-part or 2-part creator, and creator-type is one of the creator types listed under creators above, or primary for the primary creator-type of the Zotero item under consideration. The list is prefixed by the item type, so might look like audioRecording:2performer,2performer,1composer.

date(format: string = '%Y-%m-%d')

The date of the publication

extra(variable: string)

A pseudo-field from the extra field. For example, if you have Original date: 1970 in your extra field, you can get it as extra(originalDate); if you have tex.shortauthor: APA, you can get it with extra('tex.shortauthor'). Any tex. field will be picked up; the other fields can be selected from this list of key names.

field(name: string)

Gets the value of the item field

firstpage

The number of the first page of the publication (Caution: this will return the lowest number found in the pages field, since BibTeX allows 7,41,73--97 or 43+.)
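The caution above amounts to taking the smallest number in the pages field. A minimal Python sketch of that idea, with the hypothetical name first_page:

```python
import re


def first_page(pages: str) -> str:
    """Return the lowest number found in a pages field, or '' if there is none."""
    numbers = [int(n) for n in re.findall(r"\d+", pages)]
    return str(min(numbers)) if numbers else ""
```

For a field such as 7,41,73--97 this picks 7, and for 43+ it picks 43.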

group(name: string)

Tests whether the item is in the given group library

infix(format: string = '%(a)s', start: number = 0)

a pseudo-function that sets the citekey disambiguation infix using an sprintf-js format spec for when a key is generated that already exists. The infix character appears at the place of this function in the formula instead of at the end (as postfix does). You must include exactly one of the placeholders %(n)s (number), %(a)s (alpha, lowercase) or %(A)s (alpha, uppercase). For the rest of the disambiguator you can use things like padding and extra text as sprintf-js allows. With start set to 1 the disambiguator is always included, even if there is no need for it when no duplicates exist. The default format is %(a)s.

inspireHep

Fetches the key from inspire-hep based on DOI or arXiv ID

item(id: ("key" | "id") = 'key')

returns the internal item ID/key

journal(abbrev: ("off" | "abbrev" | "auto" | "abbrev+auto" | "full") = 'abbrev+auto')

returns the journal abbreviation or, if not found, the journal title. If 'automatic journal abbreviation' is enabled in the BBT settings, it will use the same abbreviation filter Zotero uses in the wordprocessor integration. You might want to use the abbr filter on this. Abbreviation behavior can be specified as abbrev+auto (the default), which uses the explicit journal abbreviation if present and tries the automatic abbreviator if not (if auto-abbrev is enabled in the preferences); auto (skip the explicit journal abbreviation even if present); abbrev (no auto-abbrev even if it is enabled in the preferences); or full/off (no abbreviation).

keyword(n: number)

Tag number n. Mostly for legacy compatibility

language(...name: string[])

Tests whether the item has the given language set, and skips to the next pattern if not

lastpage

The number of the last page of the publication (See the remark on firstpage)

library

Tests whether the item is in the user library

month

the month of the publication

origdate

the original date of the publication

origyear

the original year of the publication

postfix(format: `sprintf-style format template` = '%(a)s', start: number = 0)

a pseudo-function that sets the citekey disambiguation postfix using an sprintf-js format spec for when a key is generated that already exists. Does not add any text to the citekey otherwise. You must include exactly one of the placeholders %(n)s (number), %(a)s (alpha, lowercase) or %(A)s (alpha, uppercase). For the rest of the disambiguator you can use things like padding and extra text as sprintf-js allows. With start set to 1 the disambiguator is always included, even if there is no need for it when no duplicates exist. The default format is %(a)s.
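Python's % operator happens to accept the same named-placeholder notation, which makes it easy to see what a format spec produces. The sketch below is an approximation only: BBT uses sprintf-js, the function name render_postfix is hypothetical, and the choice to count duplicates from 1 for %(n)s and from a for %(a)s is an assumption of this example.

```python
def render_postfix(fmt: str, index: int) -> str:
    """Render the disambiguator for the index-th duplicate (0-based, at most 26)."""
    letter = chr(ord("a") + index)
    values = {"n": index + 1, "a": letter, "A": letter.upper()}
    # Named placeholders such as %(a)s pick their value from the dictionary.
    return fmt % values
```

A format like -%(n)s would therefore yield postfixes such as -1, -2, -3 under these assumptions.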

shorttitle(n: number = 3, m: number = 0)

The first n (default: 3) words of the title, apply capitalization to first m (default: 0) of those.

shortyear

The last 2 digits of the publication year

title

Capitalize all the significant words of the title, and concatenate them. For example, An awesome paper on JabRef will become AnAwesomePaperJabref

type(...allowed: string[])

Without arguments, returns the item type. When arguments are passed, tests whether the item is of any of the given types, and skips to the next pattern if not, e.g. type(book) + veryshorttitle | auth + year.

veryshorttitle(n: number = 1, m: number = 0)

The first n words of the title, apply capitalization to first m of those

year

The year of the publication

zotero

Generates citation keys as the stock Zotero Bib(La)TeX export does. Note that this pattern inherits all the problems of the original Zotero citekey generation -- you should really only use this if you have existing papers that rely on this behavior.

Note: All auth... functions will fall back to editors if no authors are present on the item.

Note: The functions above used to have the clean function automatically applied to them. This is no longer the case, so if you have CJK authors/titles and you want to manipulate them (using e.g. capitalize), you may have to use transliterate on them first, e.g. authEtal2.transliterate.capitalize + year + shorttitle(3, 3).

Direct access to unprocessed fields

The above functions all retrieve information stored in the item’s fields and process it in some way. If you don’t want this, you can instead call field contents without any processing. To access Zotero fields, refer to them as given in the table below:

AbstractNote AccessDate AdminFlagJM AdoptionDateJM
AlbumJM ApplicationNumber Archive ArchiveCollectionJM
ArchiveIDZ ArchiveLocation ArtworkMedium ArtworkSize
AssemblyNumberJM Assignee AudioFileType AudioRecordingFormat
AuthorityZ BillNumber BlogTitle BookAbbreviationJM
BookTitle CallNumber CaseName Code
CodeNumber CodePages CodeVolume Committee
Company ConferenceDateJM ConferenceName Country
Court DOI Date DateAmendedJM
DateDecided DateEnacted DictionaryTitle Distributor
DivisionJM DocketNumber DocumentNameJM DocumentNumber
Edition EncyclopediaTitle EpisodeNumber FilingDate
FirstPage FormatZ ForumTitle GazetteFlagJM
Genre History ISBN ISSN
IdentifierZ Institution InterviewMedium Issue
IssueDate IssuingAuthority JournalAbbreviation JurisdictionJM
Label Language LegalStatus LegislativeBody
LetterType LibraryCatalog ManuscriptType MapType
Medium MeetingName MeetingNumberJM NameOfAct
Network NewsCaseDateJM NumPages Number
NumberOfVolumes OpeningDateJM OpusJM OrganizationZ
OriginalDateJM Pages ParentTreatyJM PatentNumber
Place PostType PresentationType PriorityDateJM
PriorityNumbers ProceedingsTitle ProgramTitle ProgrammingLanguage
PublicLawNumber PublicationDateJM PublicationNumberJM PublicationTitle
Publisher References RegnalYearJM RegulationTypeJM
RegulatoryBodyJM ReignJM ReleaseJM ReportNumber
ReportType Reporter ReporterVolume RepositoryZ
RepositoryLocationZ ResolutionLabelJM Rights RunningTime
Scale Section Series SeriesNumber
SeriesText SeriesTitle Session SessionTypeJM
ShortTitle SigningDateJM Status Studio
Subject SupplementNameJM System ThesisType
Title TreatyNumberJM Type University
Url VersionNumber VideoRecordingFormat Volume
VolumeTitleJM WebsiteTitle WebsiteType YearAsVolumeJM

(fields marked Z are only available in Zotero, fields marked with JM are only available in Juris-M).

Filters

abbr(chars: number = 1)

Abbreviates the text. Only the first character and subsequent characters following white space will be included.
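A minimal Python sketch of this filter, assuming that chars is the number of leading characters kept from each whitespace-separated word (the description above only covers the default of 1):

```python
def abbr(text: str, chars: int = 1) -> str:
    """Keep the first `chars` characters of every whitespace-separated word."""
    return "".join(word[:chars] for word in text.split())
```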

acronym(list: string = 'acronyms', reload: boolean = false, passthrough: boolean = false)

Does an acronym lookup for the text.

alphanum

clears out everything but unicode alphanumeric characters (unicode character classes L and N)

ascii

removes all non-ascii characters

capitalize

uppercases the first letter of each word

clean

transliterates the citation key and removes unsafe characters

condense(sep: string = '')

replaces spaces in the value passed in. You can specify what to replace them with by adding it as a parameter, e.g. .condense('_') will replace spaces with underscores. Equivalent to .replace(/\s+/g, sep).
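The stated equivalence translates directly to Python's re module. This is a sketch of the behavior, not BBT's code:

```python
import re


def condense(text: str, sep: str = "") -> str:
    """Replace every run of whitespace with `sep`."""
    # A function replacement keeps `sep` literal, even if it contains backslashes.
    return re.sub(r"\s+", lambda _match: sep, text)
```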

default(text: string)

Returns the given text if no output was generated

discard

discards the input

find(match: (string | RegExp), passthrough: boolean = false)

Finds a text in the string and returns it.

formatDate(format: string = '%Y-%m-%d')

formats the date by replacing the y, m and d placeholders in the format (as in the default, '%Y-%m-%d')

ideographs

Treats ideographs as individual words

jieba(mode?: ("cn" | "tw" | "hant"))

word segmentation for Chinese items. Uses substantial memory, and adds about 7 seconds to BBT's startup time; must be enabled under Preferences -> Better BibTeX -> Advanced -> Citekeys

kuromoji

word segmentation for Japanese items. Uses substantial memory; must be enabled under Preferences -> Better BibTeX -> Advanced -> Citekeys

len(relation: ("=" | "<" | ">" | "<=" | "!=" | ">=") = '>', length: number = 0)

If the length of the output does not match the given number, skip to the next pattern.
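The relation parameter can be read as an ordinary comparison between the length of the text and the given number. In the Python sketch below, returning None stands in for "skip to the next pattern"; the function name len_filter is hypothetical.

```python
import operator
from typing import Optional

RELATIONS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def len_filter(text: str, relation: str = ">", length: int = 0) -> Optional[str]:
    """Return the text if len(text) <relation> length holds, otherwise None."""
    return text if RELATIONS[relation](len(text), length) else None
```

With the defaults, the filter simply passes through any non-empty text.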

localTime

transforms date/time to local time. Mainly useful for dateAdded and dateModified as it requires an ISO-formatted input.

lower

Forces the text inserted by the field marker to be in lowercase. For example, auth.lower expands to the last name of the first author in lowercase.

match(match: (string | RegExp), clean: boolean = false)

If the output does not match the given string/regex, skip to the next pattern.

nopunct(dash: string = '-')

Removes punctuation

nopunctordash

Removes punctuation and word-connecting dashes. alias for nopunct(dash='')

numeric

returns the value if it's an integer

pinyin

transliterates the citation key to pinyin

postfix(postfix: string)

postfixes with its parameter, so .postfix('_') will add an underscore to the end if, and only if, the value it is supposed to postfix isn't empty.

prefix(prefix: string)

prefixes with its parameter, so .prefix('_') will add an underscore to the front if, and only if, the value it is supposed to prefix isn't empty.
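The "if, and only if" condition of these two filters is easy to state in code. A Python sketch with hypothetical function names:

```python
def prefix(value: str, text: str) -> str:
    """Put `text` in front of `value`, but only when `value` is not empty."""
    return text + value if value else value


def postfix(value: str, text: str) -> str:
    """Put `text` behind `value`, but only when `value` is not empty."""
    return value + text if value else value
```

This is what keeps a separator from showing up in the key when, say, the year is missing.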

replace(find: (string | RegExp), replace: string)

replaces text, for the text to match you can pass either:

  • a string: .replace('.etal','&etal') which will match case-insensitive, so will replace .EtAl with &etal.
  • javascript regular expression: .replace(/[.]etal/ig, '&etal')
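The two ways of matching can be sketched in Python as follows. The function name replace_text is hypothetical, and the sketch assumes that a plain string replaces every case-insensitive occurrence, which the description above does not spell out.

```python
import re
from typing import Pattern, Union


def replace_text(text: str, find: Union[str, Pattern[str]], replacement: str) -> str:
    """Replace a plain string (case-insensitively) or a compiled regular expression."""
    if isinstance(find, str):
        # Escape the string so characters like "." are matched literally.
        find = re.compile(re.escape(find), re.IGNORECASE)
    # A function replacement keeps `replacement` literal.
    return find.sub(lambda _match: replacement, text)
```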

select(start: number = 1, n?: number)

selects words from the value passed in. The format is select(start,number) (1-based), so select(1,4) or select(n=4) would select the first four words. If n is not given, all words from start to the end are selected.
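Because the word positions are 1-based, a small Python sketch may help with the off-by-one arithmetic. It splits on whitespace and is only an illustration; the function name select_words is hypothetical.

```python
from typing import Optional


def select_words(text: str, start: int = 1, n: Optional[int] = None) -> str:
    """Select n words beginning at the 1-based position `start` (all if n is None)."""
    words = text.split()
    begin = start - 1
    end = None if n is None else begin + n
    return " ".join(words[begin:end])
```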

skipwords(nopunct: boolean = false)

filters out common words like 'of', 'the', … the list of words can be seen and changed by going into about:config under the key extensions.zotero.translators.better-bibtex.skipWords as a comma-separated, case-insensitive list of words.

If you want to strip words like 'Jr.' from names, you could use something like Auth.nopunct.skipwords.fold after adding jr to the skipWords list. Note that this filter is always applied with nopunct on if you use title (which is different from Title) or shorttitle.

substring(start: number = 1, n?: number)

substring(start,n) selects n (default: all) characters starting at start

transliterate(mode?: ("minimal" | "tw" | "arabic" | "chinese" | "german" | "japanese" | "mongolian" | "russian" | "uk" | "ru" | "mn" | "ar" | "de" | "ja" | "zh" | "zh-hant" | "ukranian"))

transliterates the citation key. If you don't specify a mode, the mode is derived from the item language field

upper

Forces the text inserted by the field marker to be in uppercase. For example, auth.upper expands to the last name of the first author in uppercase.

Usage note: the filters condense, skipwords, capitalize and select rely on whitespace for word handling. Most functions strip whitespace and thereby make these filters sort of useless. You will in general want to use the fields from the table above, which give you the values from Zotero without any changes. The fields marked JM are only available in Juris-M.