Writing a Book Using R

Think faster than “How to write a book in 28 days”. With the freely available R language you can create a book in less than 28 seconds. Unfortunately, you still have to write the text and do the programming. What you can do is integrate the R code and text into the same files, then generate the figures and LaTeX text together. This adds a lot of flexibility and organization for highly technical productions, and avoids the hassle of cross-referencing.

In my book, “Niche Modeling”, which has finally been sent to the publishers, I incorporated many tables and figures of results on circularity and reconstructions published here over the last 6 months, almost all generated on the fly from R data structures using Sweave and xtable. A push of a button runs all the R scripts for generating plots and tables and outputting LaTeX.

There were some technical issues and last-minute formatting glitches, though, that I want to document here for posterity.

Organizing

The starting point for organizing a multipart book was this short note: LaTeX Files for a Book or Thesis. However, I put all 12 chapters in subdirectories, and included the chapters from a master.tex file in the parent directory. I also used a single chapter.tex file, so I could easily work on one chapter at a time.
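The layout described above can be sketched with a few lines of R that write a master file pulling in one file per chapter subdirectory. The directory names (chap01 to chap12), the file name chapter.tex inside each one, and the book document class are assumptions made for illustration, not the names used for the actual book.

```r
# Sketch: generate a master.tex that includes one chapter file per subdirectory.
# Directory and file names are hypothetical.
chapter_dirs <- sprintf("chap%02d", 1:12)

master_lines <- c(
  "\\documentclass{book}",
  "\\begin{document}",
  # one \include per chapter, e.g. \include{chap01/chapter}
  sprintf("\\include{%s/chapter}", chapter_dirs),
  "\\end{document}"
)

# Written to a temporary directory here so the sketch has no side effects
master_file <- file.path(tempdir(), "master.tex")
writeLines(master_lines, master_file)
```

Generating the master file from a list of directories keeps the chapter order in one place, so adding or reordering a chapter is a one-line change.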

Sweave

Sweave is an R tool that allows ‘literate programming’, that is, integrating code and documentation. For example, code blocks are included in the LaTeX source as shown below, and the figure is then referred to in the text as Figure~\ref{fig1}. On running Sweave, the figure is generated as a PostScript file, and the appropriate LaTeX for inserting and referencing it is added to the document. This saves a lot of annoying cross-referencing.

\begin{figure}
<<fig=TRUE>>=
... R code ...
@
\caption{This is a figure}
\label{fig1}
\end{figure}

I also needed to include an option to write figures to another directory to keep them from cluttering up the chapter directories.

\SweaveOpts{prefix.string=../figs/chap12}

Sweave is run with the command Sweave("script.R").
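To make the pieces above concrete, here is a minimal sketch that writes a tiny Sweave source file, weaves it, and sends the figure to a separate directory through prefix.string. The file name demo.Rnw, the figs directory, and the plot itself are made up for the example, and everything happens in a temporary directory.

```r
# Sketch: weave a tiny document and send its figure to a separate directory.
# File and directory names (demo.Rnw, figs/) are made up for this example.
work <- file.path(tempdir(), "sweave-demo")
dir.create(file.path(work, "figs"), recursive = TRUE, showWarnings = FALSE)
old_wd <- setwd(work)

writeLines(c(
  "\\documentclass{article}",
  "\\SweaveOpts{prefix.string=figs/demo}",
  "\\begin{document}",
  "See Figure~\\ref{fig1}.",
  "\\begin{figure}",
  "<<fig=TRUE, echo=FALSE>>=",
  "plot(1:10)",
  "@",
  "\\caption{This is a figure}",
  "\\label{fig1}",
  "\\end{figure}",
  "\\end{document}"
), "demo.Rnw")

Sweave("demo.Rnw")                      # writes demo.tex in the working directory
tex <- readLines("demo.tex")
figure_files <- list.files("figs")      # figure files produced by the chunk
include_line <- grep("includegraphics", tex, value = TRUE)

setwd(old_wd)
print(include_line)
print(figure_files)
```

The target directory has to exist before weaving, which is why the sketch creates figs first. The woven demo.tex still needs to be compiled with LaTeX to produce the final document.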

Here is where the problems started. The publisher required all fonts to be embedded in the PDF file, including those in the figures. The default font in R figures is Helvetica, which is not available for embedding by the LaTeX compiler. I had to use ps.options(family="NimbusSan") to specify another font.
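As a rough illustration of the font switch, the sketch below sets the family for the PostScript device, draws a throwaway plot to an EPS file, and lists the lines in that file that mention the font. The file name and plot are placeholders, and the device settings are the usual ones for a single-page EPS, not necessarily what was used for the book.

```r
# Sketch: switch the PostScript device to an embeddable font family.
ps.options(family = "NimbusSan")        # applies to later postscript() devices
family_in_use <- ps.options()$family

eps_file <- file.path(tempdir(), "font-demo.eps")
postscript(eps_file, onefile = FALSE, horizontal = FALSE, paper = "special",
           width = 5, height = 4)
plot(1:10, main = "Font check")
dev.off()

# List the font-related lines R wrote into the file
font_lines <- grep("NimbusSan", readLines(eps_file), value = TRUE)
print(font_lines)

ps.options(reset = TRUE)                # restore the defaults
```

R also provides embedFonts(), which calls Ghostscript on a finished figure file. It is left out here because it needs Ghostscript to be installed.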

Embed fonts with ghostscript

Normally I used iTeXMac for compiling LaTeX files. For the final preparation I also had to compile with pdflatex and then make sure all the fonts were embedded, using the following Ghostscript command.

pdflatex master
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=master2.pdf master.pdf
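The same step can be driven from R, which is convenient if the rest of the build is already an R script. This is only a sketch: the helper names gs_embed_args and embed_pdf_fonts are hypothetical, and the two extra options (-dPDFSETTINGS=/prepress and -dEmbedAllFonts=true) are assumptions added here because they ask Ghostscript to embed every font. They were not part of the command used above.

```r
# Sketch: build (and optionally run) a Ghostscript call that rewrites a PDF
# with fonts embedded. Helper names are hypothetical.
gs_embed_args <- function(infile, outfile) {
  c("-dNOPAUSE", "-dBATCH", "-sDEVICE=pdfwrite",
    "-dPDFSETTINGS=/prepress",     # assumption: preset not used in the post
    "-dEmbedAllFonts=true",        # assumption: ask for every font to be embedded
    paste0("-sOutputFile=", outfile),
    infile)
}

embed_pdf_fonts <- function(infile, outfile, gs = Sys.which("gs")) {
  # Do nothing if Ghostscript or the input file is missing
  if (!nzchar(gs) || !file.exists(infile)) {
    message("Ghostscript or input file not found; nothing done")
    return(invisible(FALSE))
  }
  status <- system2(gs, shQuote(gs_embed_args(infile, outfile)))
  invisible(status == 0)
}

# Show the command line that would be run for the files named above
gs_args <- gs_embed_args("master.pdf", "master2.pdf")
cat("gs", gs_args, "\n")
```

On Windows the Ghostscript executable usually has a different name, so the gs argument would need to be set explicitly.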

To check that the fonts are embedded, open the file in Acrobat and look at “Document Properties” under “Fonts”. All the fonts should say “Embedded Subset”. After doing this there was still a single font not embedded, called R1002 or R1004 depending on how I compiled it, and I could not find any information about it. The publisher's technical person found it was due to a single apostrophe in a code listing! Something to watch out for.

The publishers also required that I use their style file. This style used the LaTeX directive \tabletitle{} instead of \caption{} for table captions. Because I was using the R package xtable to generate the tables, I couldn't change the captions by hand. xtable is really useful, producing nicely formatted LaTeX for R data structures like data frames, model output, and time series. But I had to change the xtable code where it writes \caption{ so that it writes \tabletitle{ instead, and also set it to write captions at the top of the table block by default, not below.
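An alternative to editing the xtable source, sketched here and not what was done for the book, is to post-process the LaTeX that xtable produces. The helper use_tabletitle is hypothetical. The xtable part only runs if the package is installed, and it relies on the caption.placement and print.results arguments of print.xtable.

```r
# Sketch: rewrite \caption{ to \tabletitle{ in generated LaTeX.
# use_tabletitle is a hypothetical helper, not part of xtable.
use_tabletitle <- function(latex) {
  gsub("\\caption{", "\\tabletitle{", latex, fixed = TRUE)
}

if (requireNamespace("xtable", quietly = TRUE)) {
  tab <- xtable::xtable(head(cars), caption = "Speed and stopping distance")
  # caption.placement = "top" puts the caption above the tabular block;
  # print.results = FALSE returns the LaTeX as a string instead of printing it
  latex <- print(tab, caption.placement = "top", print.results = FALSE)
  cat(use_tabletitle(latex))
}

# The substitution on its own, independent of xtable
example <- use_tabletitle("\\caption{Speed and stopping distance}")
print(example)
```

Post-processing keeps the installed package untouched, so an xtable upgrade does not silently undo the change.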

Another issue was that the R code listings would exceed the page width. I found that reducing the width of the console window also shortened the line breaks in output strings written to the LaTeX files.
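The same effect can be obtained without resizing a window by setting R's width option, which controls where printed output wraps. The sketch below is an assumption about how one might do this in a script. The value 60 and the printed vector are arbitrary.

```r
# Sketch: narrow the output width so printed results break earlier.
old <- options(width = 60)          # 60 is an arbitrary illustrative value
wrapped <- capture.output(print(seq(1000, 40000, by = 1000)))
options(old)                        # restore the previous width

cat(wrapped, sep = "\n")
longest_line <- max(nchar(wrapped))
```

In a Sweave file, a call like options(width = 60) could go in an early chunk with echo=FALSE so that it affects all later output without appearing in the book.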

That is about it for the moment. I wish I had another 3 months to fiddle with the figures and explain things more. But I have to get it in or it won’t be published this year.

Thoughts on “Writing a Book Using R”

  1. This is a very useful summary of useful tools, some quite familiar to me and others brand new. Rather than books, it seems that the R-preprocessing paradigm is most appropriate for dynamic web pages, say ones that provide standardized analyses of some current financial data. There would be natural hooks via nice tools like MySQL, PHP, and Ruby on Rails. We are sure to see lots more of this, and I will pass your essay along to students and friends.

  2. Yes Michael, that is the approach I have always tried to use. See here where I have used R, PHP and shell scripts.

    There are advantages: the figures are always current, and people can replicate your analysis easily via the web.

    There are disadvantages: with R in particular, I could find no way for it to output incrementally. It writes to a buffer and dumps the buffer when it finishes. So it is unsuitable as a top-level script for writing a web page, which works better when written incrementally so that it can be viewed in stages, but R can be called as a sub-script.

    There are other disadvantages too, and some of the newer systems like AJAX may be better, but it's actually hard to find something that integrates analysis with web output well.

    Regards

  3. Very helpful. But the embedding fonts part will not work for those “common fonts” such as Times or Symbol. Some publishers ask for embedding “all” fonts.
