Workflow w/ reports package

NOTE: THIS IS NOW A PACKAGE SEE THIS LINK FOR DETAILS

Let me start with a video for people who just want to see what I’m demo-ing first:

I’ve been interested in speeding up my workflow lately and have spent a lot of time on it. People have already tried to tackle this in R.  This blog post covers many aspects of workflow and increasing productivity.  John Myles White has tackled this problem and created the ProjectTemplate package.  The idea is terrific, but R users are so varied in their workflows that it’s difficult to make one workflow template for everyone.  I’ve given up on that.  Instead I propose:

1. The R community modularize workflow into field dependent pieces.

For instance, in qdap, an R package for quantitative discourse analysis, I’ve added a workflow template that people in my field should find suitable.  However, I intentionally left the report writing part underdeveloped because I plan to add the reports package as a piece of the workflow.  While my entire workflow is likely only useful to discourse analysts, the reports section is much more generalizable.  In this way we build a workflow from modular pieces.

2. Make the pieces flexible (within reason).

For example, in the beta version of reports I have added the ability for users to submit templates via doc_temp (not sure how well this will work), which provides a template that alters the documents that the new_report template will generate. The doc_temp function is similar to package.skeleton.  The functionality will be similar to the way CRAN or CTAN house packages, with the templates library housed within the package, provided it doesn’t get too large. Submissions still need to conform to a standard (the within reason part), though users may choose to keep their templates local.

3. Use existing tools (powerful, flexible and efficient).

R has had some great developments in tools; combined with LaTeX, they can really speed up workflow: RStudio, knitr, MiKTeX/TeX Live, bibtex, knitcitations and of course R, to name a few.  By utilizing all these tools we maximize productivity because we’re not going to multiple places and reloading libraries and user-defined functions.  As an example, R bloggers Daniel Lüdecke and Andrew Landgraf recently discussed custom functions that they use frequently.  By placing these in the extra_functions.R script and then opening the project with RStudio, the project’s .Rprofile will source these functions automatically. Better still, if these are constantly used functions that don’t yet have a package home, the user can supply the path(s) to new_report and the code will be added automatically to the report project’s .Rprofile for sourcing.
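
To make the .Rprofile idea concrete, here is a minimal sketch of what such a project-level .Rprofile could look like. It is an assumption for illustration, not the file the reports package actually writes; the helper name source_if_present is hypothetical, and only the script name extra_functions.R comes from the description above.

```r
# Hypothetical project-level .Rprofile (assumed contents, not generated by reports).
# R runs this file when a session starts in the project directory, e.g. when
# the project is opened in RStudio.

# Source every script in `paths` that exists; silently skip the rest.
source_if_present <- function(paths, envir = globalenv()) {
  found <- paths[file.exists(paths)]
  for (p in found) {
    sys.source(p, envir = envir)  # evaluate the script in `envir`
  }
  invisible(found)                # return the scripts that were sourced
}

# extra_functions.R is the script named in the post; more paths could be added.
source_if_present("extra_functions.R")
```

With a file like this in place, any function defined in extra_functions.R would be available as soon as the project opens, without a manual source() call.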

The idea is to generate a template that is fast and flexible which keeps everything for a report housed in one place.  In this way the report framework of the reports package can be added as a piece to the rest of your workflow.

Trying the reports package

#INSTALLING
library(devtools)
# current devtools expects "user/repo"; older versions used install_github("reports", "trinker")
install_github("trinker/reports")

#GETTING STARTED
library(reports)
# setwd("~/your/favorite/directory/here")
new_report("New")

#PLAY AROUND A BIT
templates()   #current internally housed templates

new_report("new proj2", templates(FALSE)[2]) #quantitative Rnw
new_report("new proj3", templates(FALSE)[3]) #qualitative docx
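
To see what a call like the ones above produced, base R can list the generated files. This is a small sketch that assumes new_report("New") created a folder named "New" in the working directory; the helper name list_report_files is hypothetical and not part of reports.

```r
# Hypothetical helper: list every file inside a generated report folder.
list_report_files <- function(dir) {
  if (!dir.exists(dir)) {
    stop("No such directory: ", dir)
  }
  # all.files = TRUE so that dotfiles such as .Rprofile are included
  list.files(dir, recursive = TRUE, all.files = TRUE)
}

# Assumes new_report("New") created "New" in the current working directory.
if (dir.exists("New")) print(list_report_files("New"))
```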

I encourage you to view the intro video, look at the help manual, check out the HTML5 introductory slides and just play with the reports package a bit.  I want your feedback so I can make a tool that helps others in their workflow. If your comments are more substantial, please use the GitHub issue tracker.

About tylerrinker

Data Scientist, open-source developer, #rstats enthusiast, #dataviz geek, and #nlp buff

13 Responses to Workflow w/ reports package

  1. tylerrinker says:

    For people wanting to add to this project, I have a current problem: http://stackoverflow.com/q/15042418/1000343. I’d like to use this info to create a function that easily sends the local report repo to GitHub without first creating the repo in the cloud. It works on Linux but I can’t get it to work from Windows yet. I’d appreciate help on this task.

  2. Jean-Pierre Gattuso says:

    I get several errors during the installation

    —————————————————————————–
    install_github("reports", "trinker")
    Installing github repo(s) reports/master from trinker
    Installing reports.zip from https://github.com/trinker/reports/archive/master.zip
    Installing reports
    Installing dependencies for reports:
    qdap
    Installing package(s) into ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library’
    (as ‘lib’ is unspecified)
    also installing the dependency ‘openNLP’

    packages ‘openNLP’, ‘qdap’ are available as source packages but not as binaries

    '/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD build \
    '/private/var/folders/7l/8xl7yqb12m30589vzz5gwz340000gp/T/RtmpzWTr7o/reports-master' \
    --no-manual --no-resave-data

    * checking for file ‘/private/var/folders/7l/8xl7yqb12m30589vzz5gwz340000gp/T/RtmpzWTr7o/reports-master/DESCRIPTION’ … OK
    * preparing ‘reports’:
    * checking DESCRIPTION meta-information … OK
    * checking for LF line-endings in source and make files
    * checking for empty or unneeded directories
    * looking to see if a ‘data/datalist’ file should be added
    * building ‘reports_0.1.0.tar.gz’

    '/Library/Frameworks/R.framework/Resources/bin/R' --vanilla CMD INSTALL \
    '/var/folders/7l/8xl7yqb12m30589vzz5gwz340000gp/T//RtmpzWTr7o/reports_0.1.0.tar.gz' \
    --library='/Library/Frameworks/R.framework/Versions/2.15/Resources/library' \
    --with-keep.source

    ERROR: dependency ‘qdap’ is not available for package ‘reports’
    * removing ‘/Library/Frameworks/R.framework/Versions/2.15/Resources/library/reports’
    Error: Command failed (1)
    In addition: Warning message:
    packages ‘openNLP’, ‘qdap’ are not available (for R version 2.15.2)

  3. Ulf says:

    1) Thx for your efforts…the video and everything else really look promising!
    2) I’d like to report a probable bug (maybe it is just my own fault)…installing all the required software, including your package, does not seem to cause any problem, but if I try to start a new_report() then the following error message comes up (translated from German, not exactly literally):

    error in y[[2]] : indices out of range
    additional: warning:
    In is.na(text.var) :
    is.na() applied to -(list or vector) of type ‘NULL’

    Any ideas what this can mean? I am no good at debugging (nor a beginner..), so maybe you know what this is about?

    Thank you in advance!

    • tylerrinker says:

      @Ulf Thank you for the feedback. I’m not sure. There shouldn’t be an issue, though this may be the cause (fixed it already). Try re-downloading/installing and test again. If this does not solve the issue, please post the full code and system info via sessionInfo() to the reports package’s issues page.

      • Ulf says:

        Sorry for posting here again, but I do not have a GitHub account (yet)…when re-installing, now this comes up…:

        * installing *source* package ‘reports’ …
        ** R
        Error in .install_package_code_files(".", instdir) :
        files in ‘Collate’ field missing from ‘C:/Users/Ulf/AppData/Local/Temp/RtmpeqIfYm/R.INSTALL524666719aa/reports/R’:
        CA.R
        CW.R
        GQ.R
        LL.R
        ERROR: unable to collate and parse R files for package ‘reports’
        * removing ‘C:/Users/Ulf/Documents/R/win-library/2.15/reports’

        As requested, here is the sessionInfo():

        R version 2.15.2 (2012-10-26)
        Platform: x86_64-w64-mingw32/x64 (64-bit)

        locale:
        [1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252
        [4] LC_NUMERIC=C LC_TIME=German_Germany.1252

        attached base packages:
        [1] stats graphics grDevices utils datasets methods base

        other attached packages:
        [1] devtools_1.1

        loaded via a namespace (and not attached):
        [1] digest_0.6.3 evaluate_0.4.3 httr_0.2 memoise_0.1 parallel_2.15.2 RCurl_1.95-3
        [7] stringr_0.6.2 tools_2.15.2 whisker_0.1

        Hope that it helps…

  4. tylerrinker says:

    Give it another try. Thanks for being patient.

    • Ulf says:

      NICE! Now everything works fine…!

      And maybe this could be a hint for you why it originally didn’t work to start a new report: There was a message when installing “reports” the first time, i.e., “method ‘show’ is temporarily hidden from package:rjava” (maybe not exactly this phrase, but nearly), which was now missing.

      Thanks a lot! 🙂

  5. tylerrinker says:

    @Ulf that message occurred because I loaded qdap, which calls rJava, which has a function named show. This is the same name as show in the methods package from the base install; when both are loaded there’s a conflict and thus you get the message, but it is not a cause for concern. I believe the original problem was because of a single ampersand and is now fixed. The collate issue occurred because of some naming issues on the GitHub repo (I renamed files from lower case to capital and the old files of the same name were not removed). I removed all the contents on GitHub and started from scratch and all is well. I appreciate your feedback. It was helpful.

  6. Pingback: Simplify frequency plots with ggplot in R #rstats | Strenge Jacke!

  7. Pingback: reports 0.1.2 Released | TRinker's R Blog

  8. Pingback: Writing a MS-Word document using R (with as little overhead as possible) | R-statistics blog
