This is an in-progress collection of hints for writing and typesetting in the Capra research group.
The source is on GitHub, so you can contribute or fork your own style guide there.
The BibTeX code you get from the ACM Digital Library or IEEE Xplore or whatever is usually terrible.
You need to fix it up manually:
Remove all the useless fields like publisher and keywords. For inproceedings (conference) entries, keep only author, title, booktitle, and year.
Edit the conference name (booktitle) to be less rambly. Remove stuff like Proceedings of the 32nd ACM SIGPLAN Conference on... and use something succinct like Programming Language Design and Implementation (PLDI). Include popular abbreviations in parentheses to help readers skim. When submitting to a venue with unjust page limit rules that include references, consider using the abbreviation by itself to save space (and as a form of protest).
Check the capitalization in the title, and surround any text that must keep its capitalization with curly braces. For example, use {PRIMES} is in {P} to make sure BibTeX doesn’t render it as Primes is in p.
If you find something on arXiv, always look for a real publication first before resorting to an arXiv citation.
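As a rough sketch of the cleanup steps above, here is what a trimmed conference entry might look like. The paper, its authors, the year, and the citation key are all made up for illustration; only the choice of fields, the short booktitle, and the braces in the title follow the advice above. The filecontents wrapper is just one way to keep the example in a single compilable file.

```latex
% Hypothetical entry: the paper, authors, year, and key are placeholders.
% Only author, title, booktitle, and year survive the cleanup.
\begin{filecontents}{refs.bib}
@inproceedings{monster:cookies,
  author    = {Cookie Monster and Oscar Grouch},
  title     = {Compiling {P}ython to {GPU} Kernels},
  booktitle = {Programming Language Design and Implementation (PLDI)},
  year      = {2020},
}
\end{filecontents}

\documentclass{article}
\begin{document}

We build on the work of Monster and Grouch~\cite{monster:cookies}.

% The plain style lowercases title words, so the braces around
% P and GPU are what keep those letters capitalized.
\bibliographystyle{plain}
\bibliography{refs}

\end{document}
```

Compiling this takes the usual LaTeX, BibTeX, LaTeX, LaTeX sequence.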
Citations are not nouns. For example, this is wrong:
We build on the work of [32].
The right way is to either name the system or the authors:
We build on Terra [32].
We build on the work of Cookie Monster et al. [32].
The natbib package for LaTeX defines the \citet macro, which automatically
inserts the authors’ names along with the citation.
It provides several other useful macros for citations.
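A minimal sketch of how \citet and its parenthetical sibling \citep might be used, assuming natbib's numbers option to match the bracketed style in the examples above. The cited paper, the system name CookieJIT, and the key are placeholders, and the bibliography is written by hand only so that the sketch compiles without a separate .bib file.

```latex
\documentclass{article}
\usepackage[numbers]{natbib}  % numeric citations, like the [32] examples

\begin{document}

% \citet is a textual citation: author names followed by the number.
We build on the work of \citet{monster:cookies}.

% \citep is a parenthetical citation: just the bracketed number.
% CookieJIT is a made-up system name.
We build on CookieJIT~\citep{monster:cookies}.

% Hand-written bibliography with a placeholder entry.
% The bracketed label gives natbib the short author list and the year.
\begin{thebibliography}{1}
\bibitem[Monster et~al.(2020)]{monster:cookies}
Cookie Monster, Oscar Grouch, and Big Bird.
\newblock A hypothetical paper title.
\newblock In \emph{Programming Language Design and Implementation (PLDI)}, 2020.
\end{thebibliography}

\end{document}
```

In a real paper, the bibliography would normally come from BibTeX with a natbib-compatible style such as plainnat rather than from hand-written \bibitem lines.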
The abbreviation e.g. means “for example,” and i.e. means something like “in other words.”
Do not italicize them, and always follow them with a comma.
Both need some kind of separating punctuation preceding them, such as a comma, an opening parenthesis, or an em dash.
For example:
They forgot the most important thing in the world, i.e., breakfast.
There would be no egg-based foods (e.g., omelettes or quiches) today.
In TeX, three hyphens (---) become an em dash (—).
You can use em dashes, sparingly, in place of parentheses or to evoke a conversational pause.
See Eddie Kohler’s advice on how to use them.
Do not put spaces around an em dash.
Use an en dash, not a hyphen, in numeric ranges like 4–10.
Also use an en dash between words that are balanced with each other but not part of a compound word.
Common examples include hardware–software, producer–consumer, and things named after multiple people, like Curry–Howard or Lucas–Kanade.
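Here is a small sketch of how the dashes look in TeX source. The three-hyphen em dash is described above; the two-hyphen input for an en dash is standard TeX behavior rather than something this guide spells out. The single-hyphen line is included only for contrast, and the sentences themselves are invented.

```latex
\documentclass{article}
\begin{document}

% Three hyphens make an em dash, with no spaces around it.
The cache---small as it is---absorbs most of the misses.

% Two hyphens make an en dash: numeric ranges and balanced pairs.
We sweep over 4--10 threads and study hardware--software co-design.
The Curry--Howard correspondence links proofs and programs.

% A single hyphen stays an ordinary hyphen.
The system needs off-chip DRAM to meet its real-time deadline.

\end{document}
```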
Use hyphens in compound modifier phrases when they help clarify which words go together.
Specifically, hyphenate compound modifiers that come before the noun they modify, except when they consist of an adverb ending in -ly.
Do not hyphenate compound modifiers that come after the noun they modify unless you have a really good reason to.
For example, you need a hyphen in language-based security, off-chip DRAM, and real-time deadline, but not in this accelerator is fixed function.
On the other hand, fully connected layer does not need a hyphen, even though the modified noun layer comes last, because the -ly suffix in fully makes it easy to see how to parse the phrase.
Some phrases can act either as modifiers that need hyphens or as nouns that do not.
The phrase state of the art is a common bugaboo.
A reference to something in the state of the art does not need hyphens, but a state-of-the-art accelerator does.
You often want to put math, code, or other notation in the flow of prose.
Do it like this:
Introductory sentence, ending with a colon:
%
\[ e = m \times c^2 \]
%
More explanation here.
The text leading up to the notation should give enough context so that the reader knows why they are about to see an equation.
It should call out the key insight they should look for while trying to understand the math or listing.
The text afterward should provide justification and explain details that make sense after seeing the notation.
Above and below the math or listing, use an empty TeX comment line (%) to avoid starting a new paragraph while still making the TeX look readable.
For math, be sure to use display-mode math macros like \[ x \] or align*.
Use align* (instead of several \[ \] equations in a row) when you have multiple lines.
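A sketch of the multi-line case might look like the following. It assumes the amsmath package, which provides align*, and the algebra is only a stand-in for whatever derivation you need to show. The same empty comment lines keep the display inside the surrounding paragraph.

```latex
\documentclass{article}
\usepackage{amsmath}  % provides the align* environment

\begin{document}

Expanding the square gives the familiar three-term form:
%
\begin{align*}
  (a + b)^2 &= (a + b)(a + b) \\
            &= a^2 + ab + ba + b^2 \\
            &= a^2 + 2ab + b^2
\end{align*}
%
The last step uses commutativity of multiplication to merge the two cross terms.

\end{document}
```

Each & marks the alignment point, so the equals signs line up across rows.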
If you have to list items that are only a sentence or two, consider inlining them into a paragraph, following these rules. (1) There’s no need for fancy marker words like firstly and secondly. (2) Instead, use numerals in parentheses at the beginning of each item.
Avoid the passive voice as much as reasonable.
When you say the data is converted, for example, your writing will be clearer if it is more specific about who or what does the conversion: the system, the algorithm, the user, the server, the authors, etc.
An imperfect way to tell whether a sentence is in the passive voice is to try adding …by space aliens to the end.
If that works, you probably want to add a subject for your verb.
(Credit to Melissa O’Neill for this trick.)
Sometimes, rewriting a sentence to avoid the passive voice makes it worse.
Give it an earnest try, but give up if the alternative seems bad.
To reference a section, use Section~\ref{...}. Capitalize Section and use
a tilde to make a nonbreaking space. Do the same for tables and figures. Always use Section, even when it’s a subsection or subsubsection.
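The referencing rules above can be sketched as follows. The label names (sec:eval, sec:method, tab:config) and the table contents are placeholders chosen for this example, not conventions the guide prescribes.

```latex
\documentclass{article}
\begin{document}

\section{Evaluation}
\label{sec:eval}

\subsection{Methodology}
\label{sec:method}

% Capitalized name, then a tilde so the number never wraps onto a new line.
Section~\ref{sec:eval} describes our experiments.
% Still "Section", even though the target is a subsection.
Section~\ref{sec:method} explains the setup.
% Tables and figures follow the same pattern.
Table~\ref{tab:config} lists the machine configuration.

\begin{table}
  \centering
  \begin{tabular}{lr}
    Cores & 8 \\
    Memory (GB) & 32 \\
  \end{tabular}
  \caption{Placeholder machine configuration.}
  \label{tab:config}  % the label goes after the caption
\end{table}

\end{document}
```

As with any cross-reference, LaTeX needs two passes before the numbers resolve.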
When explaining figures or tables that show results, refer to them early in the paragraph, usually in the first sentence, and then explain the contents in more detail. For example:
Figure 9 shows the execution time for each benchmark relative to an unmodified execution. The geometric mean slowdown is 8%. The worst slowdown is streamcluster, which is 31% slower with debugging enabled.
It’s usually best to put the figure or table reference right at the beginning of the sentence and to follow it with an active verb.
For example, prefer “Figure N shows X” to “X can be found in Figure N” or “As shown in Figure N, X.”
Use run time as a noun for the time when execution happens, as in the error arises at run time.
Use run-time as an adjective phrase before the noun, as in the system’s run-time behavior.
Use runtime as the noun that is shorthand for runtime system, as in we designed a compiler and runtime.
(Credit to James Wilcox’s recollection of Mike Ernst.)
Use the same rules for compile time and compile-time (but compiletime is not a thing).
Writing in “academic mode” can tempt you to use phrases that are more complicated than they need to be.
Try to keep things simple, even if it means sounding informal.
Here are some find/replace patterns for simplifying language:
which means that → so
gives X the ability to → lets X
allows X to → lets X
is different → differs
is built on → builds on
it is observed that X → just X or, if necessary, we observe that X
in order to → to
so as to → to
as can be seen in the figure, X → the figure shows that X or just X
In TeX, use booktabs and its \toprule, \midrule, and \bottomrule lines instead of \hline.
Do not put horizontal lines between every row (but a line between the header row and the rest of the rows is fine).
Do not use vertical lines.
Right-align columns that contain numbers.
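Put together, a table following these rules might look something like this sketch. The benchmark names and every number are invented placeholders; only the structure is the point.

```latex
\documentclass{article}
\usepackage{booktabs}  % provides \toprule, \midrule, and \bottomrule

\begin{document}

\begin{table}
  \centering
  % "l" left-aligns the names; "r" right-aligns the numeric columns.
  % There is no "|" in the column spec, so there are no vertical lines.
  \begin{tabular}{lrr}
    \toprule
    Benchmark & Time (s) & Slowdown (\%) \\
    \midrule
    benchmark-a & 12.4 & 3 \\
    benchmark-b & 107.9 & 31 \\
    benchmark-c & 8.0 & 12 \\
    \bottomrule
  \end{tabular}
  \caption{Placeholder results; every number here is invented.}
\end{table}

\end{document}
```

The only rules are the three booktabs lines: one above the header, one below it, and one at the bottom.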
It can be tempting to chain together two sentences by starting the second one with this as a noun.
For example:
Because the oven was broken, we microwaved the hot pocket. This resulted in a quicker but soggier treat.
This pattern can make the second sentence harder to read because the referent can be ambiguous (at first glance, this might be the hot pocket, the oven, or the microwave).
Insert a clarifying noun, such as this technique in the sentence above.
Use a × symbol, not the letter x, when writing about dimensions (“a 4×2 grid”) or factors (“a speedup of 2.3× over the baseline”).
In TeX, you can use $\times$ to get the symbol.
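For instance, the two uses above could be typeset roughly like this. The sentences are invented, and putting the numbers inside math mode along with the symbol is one reasonable choice, not a rule from this guide.

```latex
\documentclass{article}
\begin{document}

% Dimensions: the whole expression goes in math mode.
The accelerator is organized as a $4 \times 2$ grid of tiles.

% Factors: the symbol follows the number directly.
We measure a speedup of $2.3\times$ over the baseline.

\end{document}
```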