When a model is ran say every few months and a report has to be produced based on the model results, a copy-paste like modus operandi can easily cause synchronization errors: the wrong text is inserted not referring to the current results but to an older simulation result. I recently had a discussion with someone from a government agency who experienced exactly this problem.
A similar thing can and does happen when documenting a piece of software: examples, code fragments or screen dumps refer to older versions of the program.
R and Rstudio provide tools to create dynamic documents from R using a nice Rstudio environment. We could have said in the document something like:
A cost savings of `r round(100*(oldcost-cost)/oldcost,1)`% has been achieved.
which would generate:
A cost savings of 25.4% has been achieved.
Here we evaluated an R expression using data from the R session associated with the R markdown document.
Literate Programming
This is the grand-daddy of R Markdown. LP (Literate Programming) was meant to document source code, or rather have a single design document that could spit out either TeX for typesetting the document or Pascal (and later C) source code (4,5). Actually in some respects this tool was very flexible: things could be documented out-of-order. The name web was used for this. Knuth emphasized the usefulness of this: the order of thinking and writing about code is different than the order of generated source code. R Markdown prefers code chunks to be in order.
R users will appreciate the assignment symbol in LP:
R Markdown
The markup language to write documents in R is simple and somewhat restricted (6). Actually that may be a good thing. Too many ways to change the lay out of a document often leads to a somewhat less clean, more convoluted final product. Writing documentation while writing code is a big step forward I think. It makes it more pleasant to use and also has a beneficial impact on the code quality (explaining what you try to do helps to focus the mind and gets things better organized). I have heard people advancing the argument that documentation should be written before the code is written, but that looks to me not always a feasible or easy approach.
Recently better support for large documents (e.g. books) has been added in the form of being able to work sub-documents (sections or chapters) individually. Some books produced this way are shown at https://bookdown.org/.
Output formats
The main output formats of R Markdown are HTML, LaTeX, and MS Word. MS Word is especially important in non-academic institutions as Word and more general Office is a main work horse there. Click on the figures to enlarge.
Here is HTML, LaTeX and Word output of the same small example document.
Tables
Outputting tables is somewhat of a challenge. The easiest is to use the function kable. The LaTex and Word output is reasonable, but the HTML output needs some adjustment:
Books
References
- Yihui Xie, Dynamic Documents with R and knitr, 2nd ediition, Chapmand and Hall/CRC, 2015.
- Yihui Xie, bookdown, Authoring Books and Technical Documents with R Markdown, Chapman and Hall/CRC, 2016. Also available through: https://bookdown.org/yihui/bookdown/
- Christopher Gandrud, Reproducible Research with R and RStudio, Chapman and Hall/CRC, 2015
- Donald Knuth, Literate Programming, http://www.literateprogramming.com/knuthweb.pdf, paper on LP (LP means Literate Programming in this context and not Linear Programming).
- Donald Knuth, Literate Programming, CSLI publications, 1992 (the book)
- https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf a short document summarizing R Markdown.
No comments:
Post a Comment