
Showing posts with label r markdown. Show all posts

Tuesday, April 14, 2015

Breaking Up With PowerPoint

I’m breaking up with PowerPoint. I’ve known this day would come for a while now, but it’s shocking that it’s finally here. There are academic arguments for abandoning it, but none really compelled me. The honest truth is that I’ve finally found something better.

Two weeks ago I did something new: I wrote and delivered a presentation on graph theory and interactive data analysis to a mixed crowd of upper-year undergrads and grad students. What was special about this presentation was that PowerPoint was nowhere to be seen, not a familiar Microsoft trapping in sight. This presentation was a beamer presentation. I wrote it in R Markdown supplemented with some raw LaTeX, and now I don’t think I can ever go back. I’m breaking up with PowerPoint, and I think you should too.

Background

If you’re not yet familiar with R Markdown, I recommend you go back and read my introduction to Knitr and R Markdown. It has become so natural to do the bulk of my work without leaving the comfortable confines of R Studio that I keep looking for more tasks I can do without switching software. Presentations were a logical next step. Previously I had dipped my toes into authoring presentations with markdown. I used ioslides (another presentation format offered by R Markdown), but was unsatisfied with the level of customization I could achieve (with my primitive knowledge of JavaScript). So I tried again, but this time with beamer. Beamer, for those who haven’t spent much time swimming in the LaTeX pools, is a convenient package for rendering LaTeX code as PDF slide decks. I first encountered beamer when I originally tried learning LaTeX, but I never had enough time or drive to master it. Now, with the added ease offered by R Markdown, I decided to give it another shot.

These are the results:

The code is available from my GitHub.

Other than the relatively uninspiring title page, the document came out beautifully. Figures rendered wonderfully, code integrated seamlessly into the slides, and the sub-sectioning felt natural. I can’t wait to write more like it. I recommend you quickly scroll through the document to see just how simple it turned out to be (after the code headers).

Getting Started

I won’t lie, there were quite a few gotchas[1] along the way, but you get the opportunity to learn from my mistakes. To start a beamer presentation in R Studio, create a new markdown document as I discussed in the post about markdown, but instead of choosing the default settings, click the panel labelled Presentation, then select Beamer and click OK.

R Studio throws in some demonstration slides to give you a taste of how to make your presentation. You can go ahead and delete them (though keep the YAML block at the very top, the stuff enclosed by ---), because I’ll walk you through how to write a really simple presentation.

First Gotcha, YAML Headers, and Themes

A problem I ran into (and haven’t yet done my due diligence and reported) was that I couldn’t resize the code in my document. As there isn’t a burgeoning community of R Markdown -> Knitr -> Beamer users, tracking down which component of the pipeline isn’t working right and finding a fix is challenging. I found references to a workaround by Yihui Xie (the creator of Knitr) for getting the code to the right size, but it didn’t work for me, and supposedly it is no longer necessary anyway. He was using Knitr -> Beamer, so the issue could be in R Markdown. I created a workaround that made the code font smaller but left the output font gargantuan, which was sufficient for my purposes. You can grab the modified template I used from my GitHub by running:

library(RCurl)
gistUrl <- "https://gist.githubusercontent.com/cfhammill/b5ba7767d7729bd676a2/raw/987d43694eda1fc263efdd38af03f846db80e690/resizeTemplate.beamer"

write(getURL(gistUrl), "resizeTemplate.beamer") 

Then you can add template: resizeTemplate.beamer to your YAML header. If you’re interested in using a theme to beautify your document, you can add that in the header as well:

---
title: "A title"
author: "Your Name"
date: "Today's date"
output:
  beamer_presentation:
    theme: "Boadilla"
    template: resizeTemplate.beamer
---

I used the Boadilla theme, but there are many others to choose from. To find the theme that’s right for you, check out the gallery by Ian Blaines to see one presentation rendered in many different themes.

Slides

Once that is set up, you can start writing your presentation. By default, a new slide begins at every level 2 header or horizontal rule (a line of dashes). To create two slides (plus your title slide), you can add the following code to get a titled slide and an untitled slide:

## Slide 1

Some Slide Contents!

------------------

Untitled slide 2

Images

Next thing you might want to try is to add some images into your documents.

To add pictures, you can use the default markdown code:

![](path/to/pic.png)

But I found myself unsatisfied with the default sizing and positioning. I wanted a centred picture of a certain size. To achieve that I needed to write some raw LaTeX:

\centering \scalebox{0.45}{\includegraphics{path/to/pic.png}}

\centering indicates that the line should be centred, and since LaTeX treats included graphics as large characters, that will centre your image. The \scalebox command resizes the image as you’d expect (with numbers larger than one expanding it). All in all, not too complicated.

Bullet Points and Sequence

To have a series of bullet points in your slide, you just need to create a bulleted list the default markdown way:

# Bulleted Slide

- Isn't
- This 
- Easy

And you’ll get a nice bulleted slide. If you’re like me and want some but not all of your bullet points to come in sequentially, you can add incremental: true to your YAML block under beamer_presentation:, but I found it easier to leave that out and specify manually where I’d like my bullets to be sequential. To force sequential bullets (or, if incremental is true, to force static bullets, which isn’t documented on R Markdown’s webpage), you just need to add a greater-than sign before the bullet.

# Sequential Bullet Slide

>- Wait for it
>- .
>- ..
>- ...
>- Point!

Images can be made sequential too by putting them in a sequential bullet.

Bullet Spacing

This applies to line spacing in general: markdown ignores extra white space by default, so trying to force extra space between points isn’t as easy as one might hope (although there is probably a way to do it with your LaTeX header or YAML header). The solution I found was to manually include LaTeX line breaks:

# Spaced Out Bullets

- Point 1 \newline
- Point 2 \newline
- Point 3 \newline

This is useful if you, like me, try to keep text to a minimum, so that using white space effectively is key.

Resizing Font

To resize font in your document you can use LaTeX’s font sizing commands, e.g. \large{your text}, \Large{your text}, \tiny{your text}, etc.

This was useful for me to make better use of the slide space with sparse text (lots of line spacing and a bigger font), and for emphasis without using headers which can trigger some unwanted stylistic changes.

Outro

With that, you now know about as much as I do about creating presentations with Beamer via Knitr via R Markdown. It’s pretty straightforward, and if you ever do presentations that involve code, equations, and figures, I can’t recommend it enough. I hope you’re inspired to try your next presentation without PowerPoint.

-Chris

Bonus trick for those interested: in the presentation, the red X and green check mark were made using grid graphics directly from within R. I previously wrote a little about using ggplot2 in unexpected ways, and this used some of those lessons. By using the grid package directly you can draw whatever you like on a plot canvas. Check out the presentation code for how I did it.
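
For a rough idea of what drawing directly with grid can look like, here is a minimal sketch. It is not the code from the presentation: the helper names (make_cross, make_check, draw_symbol), the coordinates, and the colours are all made up for illustration.

```r
library(grid)

# Hypothetical helper: a red X built from two diagonal line segments.
# Coordinates are in "npc" units (0 to 1 across the current viewport).
make_cross <- function(lwd = 8) {
  segmentsGrob(
    x0 = c(0.2, 0.2), y0 = c(0.2, 0.8),
    x1 = c(0.8, 0.8), y1 = c(0.8, 0.2),
    gp = gpar(col = "red", lwd = lwd)
  )
}

# Hypothetical helper: a green check mark built from one three-point line.
make_check <- function(lwd = 8) {
  linesGrob(
    x = c(0.2, 0.4, 0.8), y = c(0.5, 0.2, 0.8),
    gp = gpar(col = "darkgreen", lwd = lwd)
  )
}

# Draw a symbol on a fresh page of the current graphics device.
draw_symbol <- function(grob) {
  grid.newpage()
  grid.draw(grob)
  invisible(grob)
}

cross <- make_cross()
check <- make_check()
```

Calling draw_symbol(cross) or draw_symbol(check) inside a code chunk should then place the symbol on the slide like any other plot.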


  1. “Gotcha” is a programming term for a little irksome quirk of a language or tool that causes it to perform in unexpected or counterintuitive ways.

Tuesday, March 3, 2015

Getting Started With R Markdown

Intro

For today’s post, I’d like to continue on the R Markdown theme started in my last post, and give a brief introduction to authoring documents using R Markdown and Knitr. If you’re completely new to Knitr and R Markdown I recommend reading the “Back to basics” paragraph in my last post to get a feel for the context and purposes of these tools.

These tools fill essentially the same niche for the R community as IPython notebooks do in the Python community. My experience with IPython notebooks is minimal relative to my markdown experience, so I don’t feel qualified to compare the two, but I will say that IPython notebooks aren’t the only way to share code and rationale all in one document.

My affection for R Markdown and Knitr is strong, but there were some growing pains. When I started learning R Markdown, I had a tiny bit of experience with Knitr already but nothing resembling expertise. The resultant challenge was having to learn both tools concurrently, flipping back and forth between documentation that, at the time, felt more like a technical showcase. I don’t want others to have to share that struggle, so here’s my attempt at streamlining the plunge into “reproducible research”[1].

Sources and Suggestions

Before we jump into authoring our first (second, thirtieth) document using R Markdown, I feel I need to pay homage to the materials I learned from. If a topic is not covered in this introduction you will (hopefully) be able to find it in one of these sources.

  • R Studio’s R Markdown Documentation: This page is the official documentation for R Markdown. It is a treasure trove of information on how to achieve different stylistic outcomes using R Markdown.
  • Yihui Xie’s Knitr Documentation: This page is the official documentation for Knitr (in essence R Markdown is just a convenient interface to Knitr). This page is relatively comprehensive but I found it hard to use.
  • Pandoc’s Documentation: R Markdown uses the markdown conventions of pandoc, so if something is absent from the R Markdown documentation, it’s worth examining the pandoc documentation.

These three sources (plus lots of trial and error) are sufficient to learn how to produce high quality documents with R Markdown. I also implore you to use R Studio when writing your documents. The people behind R Studio created R Markdown, and because of their deep interest in the format they have provided many conveniences you’d miss if you tried authoring from your text editor and command line. These instructions were written assuming you’re using R Studio. If you aren’t, you will have to determine some of the housekeeping steps for yourself.

Getting Started

Alright! Now let’s get authoring! The first thing you’re going to want to do is create a new file in R Studio. If you click the new file button, you’ll notice that there are options other than R Script for the type of file to create. Below the R Script option is the R Markdown option; choose that one. This lets you give your document a title, add your name, and choose the output format. By default Knitr knits (creates the final document) to .html, but it can also create .pdf and Word (.docx) files if you’d like. R Studio puts some demonstration markdown in the document. You can go ahead and erase that, as we’ll be writing our document from scratch. Now save this document somewhere; by default the document knits to the same directory as the file. The file now exists as a .Rmd, which is pretty easy to remember and easy to keep separate from your R scripts.
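
If you’d rather drive things from the R console than from R Studio’s buttons, a rough sketch looks like this. The file name and its contents are made up for illustration, and this only runs the Knitr step (.Rmd to .md), so treat it as a sketch of the idea rather than a replacement for the Knit button.

```r
library(knitr)

# Write a tiny, made-up R Markdown document into a temporary folder.
doc_dir <- tempfile("rmd-demo")
dir.create(doc_dir)
rmd_file <- file.path(doc_dir, "first-document.Rmd")
writeLines(c(
  "# My title",
  "",
  "Some text, followed by a code chunk.",
  "",
  "```{r}",
  "1 + 1",
  "```"
), rmd_file)

# Knit it: the chunk is run and its output is woven into a markdown file.
md_file <- knit(rmd_file,
                output = file.path(doc_dir, "first-document.md"),
                quiet = TRUE)

# Read the result back in to see what Knitr produced.
md_text <- readLines(md_file)
```

For the full trip from .Rmd to .html, rmarkdown::render() on the same file is the console equivalent, provided pandoc is available.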

Give it a title!

R Markdown supports a couple of different heading schemes, the two I use rely on the number sign, the equal sign, and the dash. To give your document a title try one of the following:

# My title

or

My title
=========

Both work, but note that there can’t be a space between the beginning of the line and the number sign or equal signs, otherwise the special meaning of those characters is lost.

To add sub-headings, either use

## My sub-heading

or

My sub-heading
---------------

At first I tended to use the number sign method, but now I tend to use the underlining method. The advantage of the number sign method is that you can have more than two heading levels (just keep adding number signs to represent deeper levels). I find I rarely go deeper than two sub-headings, and the underlining method looks neater as you’re writing.

Write a paragraph

The first thing to note about writing text paragraphs is that R Markdown does not honour your newline characters as you might imagine it would. Because scripting windows differ in size, R Studio will auto-wrap long lines for you, and if you choose to manually break your lines, that will be ignored when rendering the final document. To instruct R Markdown to honour a line break, either place two spaces at the end of the preceding line, or add a full empty line between the old and new paragraphs. Using more than one empty line to divide paragraphs when writing is perfectly acceptable, and R Markdown will ignore the additional newlines. This behaviour stems from markdown’s original mission: to produce documents that are readable as plain text and that produce well-formatted documents when converted to other document types, so facilities exist to make the plain text look nicer (lots of stylistic white space).

So go ahead and write a little bit about what you’ve learned about R Markdown so far (or whatever you want), then hit the Knit HTML button at the top of the scripting window. Now have a look at your beautiful handiwork. Pretty snazzy, right? Once you’ve patted yourself on the back, let’s move on to the next step: adding some code!

Add some code

R Markdown has two primary modes of displaying code: inline code and code chunks. Inline code is primarily for small snippets, like variable names and parameter assignments. To denote an inline code segment, surround it with backticks (`, not ’).

The more interesting way to display code is code chunks, which allow code, results, and figures to be presented together in text. In all honesty, if R Markdown were just a tool for making writing HTML a lot easier, it probably wouldn’t have earned this post. Of course the markdown family is incredibly useful if you publish a lot of web content that you want to make available quickly, but the real beauty of R Markdown is its ability to display and execute code from a variety of different languages. Just last week I posted my adventure into adding IPython as an interpreter (engine) for Knitr, but you can include code from Python, Haskell, Bash, C, and Fortran, just to name a few[2,3].

To add a code “chunk” into a document, you create a fenced code block like so

```{r}
# My fantastic R code!
```

What R Markdown (and eventually Knitr) does with your code chunk is highly customizable. You can have code that is shown but isn’t executed, code that is executed but isn’t shown, code that has its results saved so that it doesn’t have to be recalculated if you change the document, and a myriad of other potential customizations. So I think now is as good a time as any to introduce chunk options.

Customize your code chunks

One of the cryptic things I had to figure out when starting R Markdown was what all of these chunk options do, and how I use them. I’m going to cover five of the basic ones that will get a lot of use, and a few that may be useful more occasionally.

  • eval: This option controls whether your carefully crafted code is executed while the document is being knit. I’m often including demonstration code that doesn’t need to be run each time (or may not work out of context), so I’ll set eval = FALSE. eval is TRUE by default.
  • echo: This option controls whether to display the code. I’ll often have a set-up chunk at the beginning of the document to do things like set my working directory, load packages, and import data. These aren’t interesting to a reader, so I set echo = FALSE. echo is TRUE by default.
  • results: This option controls whether to output the results of running the code into the document. Sometimes hiding the results of executed code makes sense. I often use results = FALSE (equivalent to results = "hide") with echo = FALSE when my set-up chunk has output.
  • cache: This option allows Knitr to remember the results of executing a code chunk so that unless the code changes it doesn’t need to be re-run. This is useful when executing code that takes a while to run. R Markdown typically involves a lot of guess and check, so even a few minutes of execution time can make the authoring process unpleasant. For those code chunks set cache = TRUE; it is FALSE by default.
  • engine: This option allows you to choose another language engine to execute the code, perfect for presenting or comparing another language in your document. Just set engine = "name_of_engine" in the code chunk.
  • message, warning, and error: These options let you control whether Knitr should report messages, warnings, and errors in the document. Messages and warnings are shown by default, and setting message = FALSE or warning = FALSE suppresses them. error works a little differently: with error = FALSE, knitting stops at the first error instead of printing the error into the document.

To set these options on a chunk-by-chunk basis, just include all your options in the curly braces at the top of the chunk like so

```{r, eval = FALSE, echo = TRUE}
#Some code to be shown but not run
```

If you find yourself using the exact same option in all of your code chunks, you can set document-wide defaults. These defaults will be overridden by individual chunk options, so you are not locked in even if you do set a global option. As I mentioned above, I like to have a set-up chunk at the top of my document performing all the preparation necessary for my code to run smoothly, and this is the ideal place to set global options. Global options are set with opts_chunk$set(option = value).
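
To see per-chunk options and global defaults interact without clicking through R Studio, here’s a small sketch that knits a made-up document held in a character vector. The chunk contents are invented purely for illustration.

```r
library(knitr)

# Global default: hide the code in every chunk unless a chunk says otherwise.
opts_chunk$set(echo = FALSE)

# A made-up three-chunk document, one line per element.
src <- c(
  "```{r}",
  "a <- 1 + 1  # hidden by the global default, but still run",
  "a",
  "```",
  "",
  "```{r, echo = TRUE}",
  "b <- a * 10  # shown, because the chunk option wins",
  "b",
  "```",
  "",
  "```{r, echo = TRUE, eval = FALSE}",
  "stop('shown but never run')",
  "```"
)

# Knit the text directly; the result comes back as markdown text.
out <- knit(text = src, quiet = TRUE)

# Put the defaults back so later documents in this session are unaffected.
opts_chunk$restore()

cat(out, sep = "\n")
```

In the printed markdown, the first chunk should contribute only its result, the second both code and result, and the third only its code.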

Sample set-up chunk

I highly recommend including a set-up chunk in all of your documents. This is an ideal place to put system-specific code because, in most cases, no one but you needs to know where on your computer your files are located, no one but you needs to know what global Knitr options you used, et cetera.

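```{r}
# Point R at the project folder and load the prepared data
setwd("my/working/directory")
myData <- readRDS("somePreparedData.rds")

# Load Knitr so opts_chunk is available, then hide code by default
library(knitr)
opts_chunk$set(echo = FALSE)
```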

This will set your working directory, pull in some data, and tell Knitr not to bother showing the code (this is useful if you want to use R Markdown to keep figures with text, without the cumbersome code). Unfortunately you do have to tell R to load the Knitr package: the code gets executed in its own environment and needs to be made aware of the opts_chunk object in Knitr.

Did I mention figures?!

Another supremely useful feature of R Markdown is the ability to generate and keep your figures in the document with your code and writing. R Markdown is pretty smart about figuring out what to do with plots. The following graph was created inside a normal code chunk, with no specific options set:

x <- seq(-1,1, .01)
plot(x, x^2, type = "l", ylim = c(0,1.6), lwd = 3)
points(c(-.65, .65), c(1.5,1.5), cex = 3)
points(0, .9, pch = 2, cex = 2)

And now I’ll always remember how to draw a creepy smiley face using base R’s plotting functionality. Perhaps you can see why suppressing the code for an intricate figure may be useful.

Math

Often you’ll want to present equations alongside your code. To do this, R Markdown allows you to write LaTeX directly in your document. The math is rendered with either LaTeX or MathJax, depending on your output format. LaTeX code can be specified as either inline or block, just like code. An inline math segment begins and ends with a dollar sign ($...$), and block LaTeX uses two dollar signs ($$...$$).
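
As a quick illustration (the formula is just an arbitrary example, the sample mean), inline and block math might look like this in the source of your document:

```latex
The sample mean $\bar{x}$ is defined as

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$
```

The first line renders the symbol within the sentence, while the double-dollar version sets the equation on its own line.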

Sharing your work (on the internet)

When I create an HTML report using R Markdown, one of two things happens. The first and easiest is that I scp the HTML file produced by Knit HTML onto the server where my personal webpage lives. Otherwise, if I’m publishing to Blogger (where this blog is hosted), I go through the rigmarole of copying the source code, removing extraneous tags, and pasting it into the HTML window of a post on Blogger. Additionally, you could share the file with the intended audience directly. In any case, once you’ve created the HTML file, what you choose to do with it is up to you.

Take-Aways

So I hope this tutorial gave you a little bit of a primer on how to use R Markdown to keep your code, writing, and figures all together in one place, and I hope the benefit of doing so is apparent. I’d like to leave you with a few things:

  • Author documents with R Markdown. It’s easy, relatively painless, and helps keep scatter-brains like me organized. I report to my bosses for two of my jobs with a web page I make with R Markdown. It allows them to see relevant snippets of my code and my analysis of what’s happening.
  • Use a set-up chunk at the beginning of your document (have a look at the sample above). It will make your document more compact and cleaner if all the mechanical stuff happens at the beginning, behind the scenes.
  • Know where to look for help. See the resources I posted at the top of this post, and especially be aware that pandoc’s documentation has things that the R Markdown documentation missed.
  • If your header/table/etc. isn’t rendering properly, make sure it isn’t just an issue with a space after a newline. That space can deactivate all kinds of special meaning, so look out.
  • Use R Markdown as an excuse to write often about things that interest you. It’s a great way to keep sharp.

If you have any questions please feel free to post them below, I’ll try my best to answer them. Also this document was made with R Markdown so I’ve made the .rmd that produced this post available for inspection.

Happy authoring!

Chris


  1. Here I fall victim to using the buzzword, but it’s easier than “code/document integration” or anything else I could think of. Let it be known that I think these tools are for more than just making research reproducible: they also facilitate sharing ideas and authoring documents in an age where code can be as important as words.

  2. You can see the full list of available engines with library(knitr); names(knit_engines$get())

  3. I just noticed there’s no lisp engine! That needs to change