Introduction to pipeR

pipeR provides a high-performance pipeline operator and a Pipe object.

%>>%

Pipe as first-argument to function

By default, the pipe operator %>>% inserts the object on its left-hand side as the first argument of the function on its right-hand side. In other words, x %>>% f(a=1) is transformed to and evaluated as f(.,a=1), where . takes the value of x. The right-hand side can be either a function call, e.g. plot() or plot(col="red"), or a function name, e.g. log or plot.

rnorm(100) %>>% 
  plot

rnorm(100) %>>% 
  plot()

rnorm(100) %>>% 
  plot(col="red")

rnorm(100) %>>% 
  sample(size=100,replace=FALSE) %>>% 
  hist
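
The rewriting rule described above can also be checked without plotting. The following is a minimal sketch, assuming pipeR is installed; the vector x and the choice of sqrt() and round() are illustrative and do not come from the original examples.

```r
library(pipeR)

# Illustrative input (an assumption, not from the original text)
x <- c(1.234, 5.678, 9.1011)

# A function name on the right-hand side: x becomes its only argument
by_name <- x %>>% sqrt

# A function call on the right-hand side: x is inserted as the first
# argument, so this is evaluated as round(x, digits = 1)
by_call <- x %>>% round(digits = 1)

# Both forms should agree with the ordinary nested calls
identical(by_name, sqrt(x))
identical(by_call, round(x, digits = 1))
```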

You can write commands in a chain (or pipeline) like this:

rnorm(10000,mean=10,sd=1) %>>%
  sample(size=100,replace=FALSE) %>>%
  log %>>%
  diff %>>%
  plot(col="red",type="l")

In some cases, the next function needs the piped object as its first argument and also uses it elsewhere in the call. In such cases, you can use . directly to represent the piped object within that call.

rnorm(100) %>>%
  plot(col="red",main=sprintf("Number of points: %d",length(.)))
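
The same pattern can be tried on a value that is easy to inspect. This is only a sketch under the rule stated above; the vector x and the use of head() are illustrative assumptions.

```r
library(pipeR)

# Illustrative input (an assumption, not from the original text)
x <- c(3, 1, 2)

# x is piped in as the first argument of head(), and . refers to x again
# inside the n argument, so this is evaluated as head(x, n = length(x) - 1)
all_but_last <- x %>>% head(n = length(.) - 1)
```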

*Notice: a function referenced through a namespace must be written as a call with parentheses, like x %>>% base::mean().

Pipe as . to expression

If a function name or call directly follows %>>%, first-argument piping is used. If the operator is followed by braces ({}), the inner expression is evaluated with . representing the piped object.

rnorm(100) %>>% 
  { plot(.) }

rnorm(100) %>>% 
  { plot(., col="red") }

rnorm(100) %>>% 
  { sample(., size=length(.)*0.5) }

mtcars %>>% {
  lm(mpg ~ cyl + disp, data=.) %>>% 
  summary
}

rnorm(100) %>>% {
  par(mfrow=c(1,2))
  hist(.,main="hist")
  plot(.,col="red",main=sprintf("%d",length(.)))
}
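
For a version of the brace form whose result can be checked numerically, the sketch below may help. The inputs and the names avg and rng are illustrative assumptions. It relies on ordinary R behaviour, in which a braced block evaluates to its last expression.

```r
library(pipeR)

# Inside braces nothing is inserted automatically; . has to be written
# wherever the piped object is needed
avg <- 1:10 %>>% { sum(.) / length(.) }

# Braces may hold several statements; the block evaluates to the last one
rng <- c(4, 8, 15) %>>% {
  lo <- min(.)
  hi <- max(.)
  hi - lo
}
```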

Pipe by lambda expression

It can be confusing to see multiple . symbols in the same context, and in some cases they represent different things within the same expression. Even though such an expression usually still works, it may not be a good idea to leave it that way. Here is an example:

mtcars %>>%
  { lm(mpg ~ ., data=.) } %>>%
  summary

The code above works correctly even though the two dots in the second line have different meanings. . in the formula mpg ~ . represents all variables other than mpg in the data frame mtcars; . in data=. represents mtcars itself. One way to reduce the ambiguity is to use a lambda expression, which names the piped object on the left of ~ or -> and specifies the expression to evaluate on the right.

%>>% assumes that a lambda expression follows when the next expression is enclosed in parentheses (). Following the description above, the lambda expression can be written as (x ~ expression) or (x -> expression), where x is the name given to the piped object.

The previous example can be rewritten with a lambda expression like this:

mtcars %>>%
  (df -> lm(mpg ~ ., data=df)) %>>%
  summary
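
The ~ form of the lambda expression can be sketched in the same way. The example below assumes the ~ form behaves as described above and compares the piped fit with a direct call; the names n_coef and n_coef_direct are illustrative.

```r
library(pipeR)

# Lambda form with ~ : df names the piped data frame, so the only . left
# is the formula shorthand for "all other columns"
n_coef <- mtcars %>>%
  (df ~ lm(mpg ~ ., data = df)) %>>%
  coef %>>%
  length

# The same model fitted without piping, for comparison
n_coef_direct <- length(coef(lm(mpg ~ ., data = mtcars)))
```

Because df is an ordinary name, it can be chosen to describe the data, which keeps longer pipelines readable.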

Pipe object

Pipe() creates a Pipe object whose built-in symbols are designed for building a pipeline.

Pipe as first-argument to a function:

Pipe(rnorm(100,mean=10))$
  log()$
  diff()$
  plot(col="red")

Pipe(1:10)$
  fun(x -> x + rnorm(1))$
  mean() []
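
A deterministic variant of the Pipe examples above can be compared against the equivalent nested call. This is a sketch only: it assumes that the trailing [] returns the plain value held by the Pipe object, as its use in the last example suggests, and the input vector is illustrative.

```r
library(pipeR)

# Each $f() call passes the stored value as the first argument of f and
# continues the pipeline; the trailing [] is assumed to return the final
# value rather than a Pipe object
steps <- Pipe(c(1, 2, 4, 8))$
  log()$
  diff() []

# The equivalent nested call
steps_direct <- diff(log(c(1, 2, 4, 8)))

all.equal(steps, steps_direct)
```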