---
title: "_systemPipeRdata_: NGS workflow templates and sample data" 
author: "Author: Thomas Girke (thomas.girke@ucr.edu)"
date: "Last update: `r format(Sys.time(), '%d %B, %Y')`" 
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 3
    fig_caption: yes

fontsize: 14pt
bibliography: bibtex.bib
---
<!--
%\VignetteEngine{knitr::rmarkdown}
%\VignetteIndexEntry{Overview Vignette}
%\VignetteDepends{methods}
%\VignetteKeywords{compute cluster, pipeline, reports}
%\VignettePackage{systemPipeRdata}
-->

<!---
- Compile from command-line
echo "rmarkdown::render('systemPipeRdata.Rmd')" | R --slave; R CMD Stangle systemPipeRdata.Rmd

- Commit to github
git commit -am "some edits"; git push -u origin master

- To customize font size and other style features, add this line to output section in preamble:  
    css: style.css
-->

```{r style, echo = FALSE, results = 'asis'}
BiocStyle::markdown()
options(width=100, max.print=1000)
knitr::opts_chunk$set(
    eval=as.logical(Sys.getenv("KNITR_EVAL", "TRUE")),
    cache=as.logical(Sys.getenv("KNITR_CACHE", "TRUE")))
```

```{r setup, echo=FALSE, message=FALSE, warning=FALSE}
suppressPackageStartupMessages({
    library(systemPipeR)
    library(systemPipeRdata)
    library(BiocGenerics)
})
```

Note: the most recent version of this vignette can be found <a href="https://htmlpreview.github.io/?https://github.com/tgirke/systemPipeRdata/blob/master/vignettes/systemPipeRdata.html">here</a> and a short overview slide show [here](https://htmlpreview.github.io/?https://github.com/tgirke/systemPipeR/blob/master/inst/extdata/slides/systemPipeRslides.html).

# Introduction
[_`systemPipeRdata`_](https://github.com/tgirke/systemPipeRdata) is a helper package that generates next-generation sequencing (NGS) workflow templates with a single command. The templates are intended for use with its parent package [_`systemPipeR`_](http://www.bioconductor.org/packages/devel/bioc/html/systemPipeR.html) [@Girke2014-oy], an environment for building *end-to-end* analysis pipelines with automated report generation for NGS applications such as RNA-Seq, Ribo-Seq, ChIP-Seq, VAR-Seq and many others. The directory structure of the workflow templates and the sample data used by _`systemPipeRdata`_ are described [here](http://bioconductor.org/packages/release/bioc/vignettes/systemPipeR/inst/doc/systemPipeR.html#load-sample-data-and-workflow-templates).

<div align="right">[Back to Table of Contents]()</div>

# Getting Started

## Installation
The R software for using _`systemPipeRdata`_ can be downloaded from [CRAN](http://cran.at.r-project.org). The _`systemPipeRdata`_ package can be installed from within R as follows: 
```{r install, eval=FALSE}
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") # Installs BiocManager, the successor of biocLite, if it is missing
BiocManager::install("systemPipeRdata") # Installs from Bioconductor
BiocManager::install("tgirke/systemPipeRdata", build_vignettes=TRUE, dependencies=TRUE) # Installs from github (requires the remotes package)
```
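
After installation, it can be useful to confirm that both packages are available before continuing. The following is a minimal sketch using only base R; the variable names `pkgs` and `installed` are illustrative and not part of either package.

```{r check_installation, eval=FALSE}
# Packages this vignette relies on
pkgs <- c("systemPipeR", "systemPipeRdata")

# requireNamespace() returns TRUE if a package can be loaded and FALSE otherwise;
# vapply() collects one logical value per package, named by package
installed <- vapply(pkgs, requireNamespace, logical(1), quietly=TRUE)
installed
```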
<div align="right">[Back to Table of Contents]()</div>

## Loading package and documentation

```{r load_systemPipeRdata, eval=TRUE}
library("systemPipeRdata") # Loads the package
```
```{r documentation_systemPipeRdata, eval=FALSE}
library(help="systemPipeRdata") # Lists package info
vignette("systemPipeRdata") # Opens vignette
```
<div align="right">[Back to Table of Contents]()</div>

## Generate workflow template
Load one of the available NGS workflows into your current working directory. The following does this for the _`varseq`_ template. The name of the resulting workflow directory can be specified with the _`mydirname`_ argument. The default _`NULL`_ uses the name of the chosen workflow. An error is issued if a directory with the same name already exists at that path.
```{r generate_workenvir, eval=FALSE}
genWorkenvir(workflow="varseq", mydirname=NULL)
setwd("varseq")
```

On Linux and OS X systems the same can be achieved from the command-line of a terminal with the following commands.
```{.sh generate_workenvir_from_shell, eval=FALSE}
$ echo 'library(systemPipeRdata); genWorkenvir(workflow="varseq", mydirname=NULL)' | R --slave
$ cd varseq
$ R
```
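
Because _`genWorkenvir`_ stops with an error when the target directory already exists, one option is to choose an unused directory name first. The sketch below uses only base R; _`unusedDirname`_ is a hypothetical helper written for this example and is not part of _`systemPipeRdata`_.

```{r unused_dirname, eval=FALSE}
# Hypothetical helper (not part of systemPipeRdata): returns 'base' if no
# directory of that name exists under 'parent', otherwise base_1, base_2, ...
unusedDirname <- function(base, parent=getwd()) {
    candidate <- base
    i <- 0
    while (dir.exists(file.path(parent, candidate))) {
        i <- i + 1
        candidate <- paste0(base, "_", i)
    }
    candidate
}

# Demonstration in a temporary parent directory
parent <- tempfile("workflows_")
dir.create(parent)
unusedDirname("varseq", parent)            # "varseq": the name is still free
dir.create(file.path(parent, "varseq"))
unusedDirname("varseq", parent)            # "varseq_1": the name is now taken
```

With the package loaded, the result could then be passed on, for example as `genWorkenvir(workflow="varseq", mydirname=unusedDirname("varseq"))`.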

The workflow templates generated by _`genWorkenvir`_ contain the following preconfigured directory structure:
<br></br>
```{r workflow_template_structure, eval=FALSE}
workflow_name/            # *.Rnw/*.Rmd scripts and targets file
                param/    # parameter files for command-line software 
                data/     # inputs e.g. FASTQ, reference, annotations
                results/  # analysis result files
```
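
To confirm that a generated workflow directory has this layout, a quick check along the following lines may help. This is a sketch using only base R; _`checkWorkenvir`_ is a hypothetical helper, and the three subdirectory names are taken from the structure shown above.

```{r check_workflow_structure, eval=FALSE}
# Hypothetical helper (not part of systemPipeRdata): reports for each expected
# subdirectory whether it exists under 'path'
checkWorkenvir <- function(path) {
    expected <- c("param", "data", "results")
    present <- dir.exists(file.path(path, expected))
    names(present) <- expected
    present
}

# Returns a named logical vector; all FALSE if 'varseq' has not been generated
checkWorkenvir("varseq")
```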


<div align="right">[Back to Table of Contents]()</div>

## Run workflows
Next, run the chosen sample workflow from within R by executing the code provided in the corresponding _`*.Rnw`_ template file. If preferred, the corresponding _`*.Rmd`_ or _`*.R`_ versions can be used instead. Alternatively, an entire workflow can be run from start to finish with a single command by executing _`'make -B'`_ from the command-line within the workflow directory (here _`'varseq'`_). More detailed information on running and customizing [_`systemPipeR`_](http://www.bioconductor.org/packages/devel/bioc/html/systemPipeR.html) workflows is available in its overview vignette [here](http://www.bioconductor.org/packages/devel/bioc/vignettes/systemPipeR/inst/doc/systemPipeR.html). This vignette can also be opened from R with the following command. 
```{r load_systemPipeR, eval=TRUE}
library("systemPipeR") # Loads systemPipeR, which needs to be installed from Bioconductor
```
```{r documentation_systemPipeR, eval=FALSE}
vignette("systemPipeR", package = "systemPipeR")
```
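
Before running a workflow, it may help to see which template scripts a workflow directory provides and whether the file needed by _`'make -B'`_ is present. The following base R sketch illustrates this; both helper functions are hypothetical, and the file name `Makefile` is an assumption about how the template names that file.

```{r list_workflow_scripts, eval=FALSE}
# Hypothetical helper (not part of systemPipeR or systemPipeRdata): lists the
# *.Rnw, *.Rmd and *.R scripts at the top level of a workflow directory
listWorkflowScripts <- function(path=".") {
    list.files(path, pattern="\\.(Rnw|Rmd|R)$")
}

# Hypothetical helper: reports whether 'path' contains a file named 'Makefile'
# (assumed name), which 'make -B' would need
hasMakefile <- function(path=".") {
    file.exists(file.path(path, "Makefile"))
}

# Run from within the workflow directory, e.g. after setwd("varseq")
listWorkflowScripts(".")
hasMakefile(".")
```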
<div align="right">[Back to Table of Contents]()</div>

## Return paths to sample data
The location of the sample data provided by _`systemPipeRdata`_ can be returned as a _`list`_.
```{r return_samplepaths, eval=TRUE}
pathList()
```
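
The returned paths can be checked before they are used in a workflow. The sketch below tests whether each entry of such a list exists on the current system; _`pathsExist`_ is a hypothetical helper, and the code makes no assumption about the names of the list components.

```{r check_samplepaths, eval=FALSE}
# Hypothetical helper (not part of systemPipeRdata): returns TRUE for each list
# entry whose path or paths all exist on disk
pathsExist <- function(paths) {
    vapply(paths, function(p) all(file.exists(p)), logical(1))
}

# Apply the check to the sample data paths if the package is installed
if (requireNamespace("systemPipeRdata", quietly=TRUE)) {
    pathsExist(systemPipeRdata::pathList())
}
```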
<div align="right">[Back to Table of Contents]()</div>

# Version information

```{r sessionInfo}
sessionInfo()
```
<div align="right">[Back to Table of Contents]()</div>

# Funding
This project was supported by funds from the National Institutes of Health (NIH) and the National Science Foundation (NSF).

<div align="right">[Back to Table of Contents]()</div>


# References