pipeR provides a high-performance pipeline operator and a lightweight Pipe() function based on a set of simple and intuitive rules, making command chaining explicit, readable, and fast.
%>>%
The pipe operator %>>% by default inserts the object on the left-hand side as the first argument of the function on the right-hand side. In other words, x %>>% f(a=1) is transformed to and evaluated as f(.,a=1), where . takes the value of x. It accepts both a function call, e.g. plot() or plot(col="red"), and a function name, e.g. log or plot.
rnorm(100) %>>%
plot
rnorm(100) %>>%
plot()
rnorm(100) %>>%
plot(col="red")
rnorm(100) %>>%
sample(size=100,replace=FALSE) %>>%
hist
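To make the rewriting rule concrete, here is a minimal base-R sketch of first-argument piping. The operator name %toy% is hypothetical and this is not pipeR's implementation; it only mimics the rule stated above (turn x %toy% f(a=1) into f(., a=1) with . bound to x).

```r
# Toy first-argument pipe (hypothetical name `%toy%`; not part of pipeR).
`%toy%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)  # capture the right-hand side unevaluated
  if (is.symbol(rhs_expr)) {
    # Bare function name, e.g. sqrt: build the call sqrt(.)
    call <- as.call(list(rhs_expr, quote(.)))
  } else {
    # Function call, e.g. f(a = 1): build f(., a = 1)
    parts <- as.list(rhs_expr)
    call <- as.call(c(parts[1], list(quote(.)), parts[-1]))
  }
  # Evaluate the rewritten call where `.` holds the left-hand side
  env <- new.env(parent = parent.frame())
  assign(".", lhs, envir = env)
  eval(call, env)
}

root   <- 16 %toy% sqrt                           # same as sqrt(16)
joined <- c(1, 2, 3) %toy% paste(collapse = "-")  # same as paste(c(1, 2, 3), collapse = "-")
total  <- c(1, 4, 9) %toy% sqrt %toy% sum         # same as sum(sqrt(c(1, 4, 9)))
```

Because the right-hand side is captured unevaluated, both a function name and a function call can be handled by the same operator.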
You can write commands in a chain (or pipeline) like this:
rnorm(10000,mean=10,sd=1) %>>%
sample(size=100,replace=FALSE) %>>%
log %>>%
diff %>>%
plot(col="red",type="l")
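For comparison, the sketch below writes the same computation as a pipeline and as nested calls. It assumes pipeR is installed; the input vector is made up for illustration.

```r
library(pipeR)

x <- c(1, 10, 100, 1000)  # illustrative data

# Pipeline form: reads left to right, one step per line.
piped <- x %>>%
  log10 %>>%
  diff %>>%
  sum

# Equivalent nested form: reads inside out.
nested <- sum(diff(log10(x)))

# Both forms compute the same value.
same <- isTRUE(all.equal(piped, nested))
```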
In some cases, the next function takes the piped object as its first argument and also needs it elsewhere in the call. In that case, you can use . directly to represent the piped object within that call.
rnorm(100) %>>%
plot(col="red",main=sprintf("Number of points: %d",length(.)))
*Notice: a function call in a namespace must end with parentheses, like x %>>% base::mean().
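A small non-plotting sketch of both points, assuming pipeR is installed (the data are illustrative):

```r
library(pipeR)

x <- c(3, 1, 2, 5)

# x is inserted as the first argument of head(); `.` inside the call is x too,
# so this evaluates head(x, n = length(x) - 1).
most <- x %>>% head(n = length(.) - 1)

# A namespaced function is written as a call, with parentheses.
avg <- x %>>% base::mean()
```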
. to expression
If a function name or call directly follows %>>%, it means first-argument piping. If the operator is followed by braces ({}), the inner expression is evaluated with . representing the piped object.
rnorm(100) %>>%
{ plot(.) }
rnorm(100) %>>%
{ plot(., col="red") }
rnorm(100) %>>%
{ sample(., size=length(.)*0.5) }
mtcars %>>% {
lm(mpg ~ cyl + disp, data=.) %>>%
summary
}
rnorm(100) %>>% {
par(mfrow=c(1,2))
hist(.,main="hist")
plot(.,col="red",main=sprintf("%d",length(.)))
}
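The braces form also returns a value, which the sketch below checks without plotting. It assumes pipeR is installed, and relies on ordinary R semantics for { }, where the value of the last statement is the value of the block.

```r
library(pipeR)

x <- c(2, 4, 6, 8)

# Single expression: `.` is x wherever it appears.
avg <- x %>>% { sum(.) / length(.) }

# Several statements: the last one supplies the result of the pipe.
spread <- x %>>% {
  lo <- min(.)
  hi <- max(.)
  hi - lo
}
```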
It can be confusing to see multiple . symbols in the same context. In some cases, they may represent different things in the same expression. Even though the expression usually still works, it may not be a good idea to leave it that way. Here is an example:
mtcars %>>%
{ lm(mpg ~ ., data=.) } %>>%
summary
The code above works correctly even though the two dots in the second line have different meanings. The . in the formula mpg ~ . represents all variables other than mpg in the data frame mtcars, while the . in data=. represents mtcars itself. One way to reduce ambiguity is to use a lambda expression that names the piped object on the left of ~ or -> and specifies the expression to evaluate on the right.
%>>% assumes that a lambda expression follows when the next expression is enclosed in parentheses (). The lambda expression can take the following forms:
- expr, where . is by default used to represent the piped object.
- x -> expr or x ~ expr, where expr is evaluated with x representing the piped object.
The previous example can be rewritten with a lambda expression like this:
mtcars %>>%
(df -> lm(mpg ~ ., data=df)) %>>%
summary
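The sketch below exercises the x ~ expr form only, and checks the result against the plain call. It assumes pipeR is installed; the names x and df are arbitrary choices for the lambda argument.

```r
library(pipeR)

# Name the piped object x, keep the even values, then sum them.
evens_total <- 1:10 %>>%
  (x ~ x[x %% 2 == 0]) %>>%
  sum

# Name the piped data frame df so the formula's `.` is the only dot left.
coefs <- mtcars %>>%
  (df ~ lm(mpg ~ ., data = df)) %>>%
  coef

# Reference result computed without a pipeline.
direct <- coef(lm(mpg ~ ., data = mtcars))
```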
If a name is enclosed within () following %>>%, like x %>>% (name), the operator will extract the element named name from x.
mtcars %>>%
(mpg)
mtcars %>>%
(lm(mpg ~ ., data = .)) %>>%
summary() %>>%
(coefficients)
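A checkable sketch of element extraction, assuming pipeR is installed. The list person and its element names are made up for illustration.

```r
library(pipeR)

person <- list(id = 7, langs = c("R", "C"))  # illustrative list

# (langs) extracts person$langs, which is then piped on to length().
n_langs <- person %>>% (langs) %>>% length

# The same works for a data frame column.
avg_mpg <- mtcars %>>% (mpg) %>>% mean
```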
Extraction works not only for lists and data frames but also for vectors, environments, and S4 objects.
To evaluate an expression within the piped object when it is a list or environment, with() can be helpful.
list(a = 1, b = 2) %>>%
with(a+2*b)
This method does not work for vectors or S4 objects, however.
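As a sketch of these cases, assuming pipeR is installed: with() handles the list and the environment, and for a named vector one possible workaround (an illustration, not something the package prescribes) is a lambda expression that indexes by name.

```r
library(pipeR)

# List: with() evaluates the expression using the list's elements.
from_list <- list(a = 1, b = 2) %>>% with(a + 2 * b)

# Environment: the same call works.
env <- new.env()
env$a <- 1
env$b <- 2
from_env <- env %>>% with(a + 2 * b)

# Named vector: index by name inside a lambda expression instead of with().
from_vec <- c(a = 1, b = 2) %>>% (v ~ v[["a"]] + 2 * v[["b"]])
```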
Pipe() creates a Pipe object whose built-in symbols are designed for building pipelines:
- $ chains functions by first-argument piping and always returns a Pipe object.
- .(...) evaluates an expression with . or by lambda expression, or simply extracts a named element. The usage is exactly the same as x %>>% (...).
- $value or [] extracts the final value of the Pipe object.
Pipe as the first argument to a function:
Pipe(rnorm(100,mean=10))$
log()$
diff()$
plot(col="red")
Pipe(1:10)$
.(x -> x + rnorm(1))$
mean() []
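A deterministic variant of the chain above, assuming pipeR is installed. It uses the x ~ expr lambda form and $value to pull out the result.

```r
library(pipeR)

total <- Pipe(1:10)$
  .(x ~ x * 2)$   # lambda expression: double every element
  sum()$          # first-argument piping: sum(<doubled values>)
  value           # extract the final value from the Pipe object
```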
Pipe is lazily evaluated. Consider working with ggvis.
p1 <- Pipe(mtcars)$
ggvis(~ mpg, ~ wt)
The plot is not evaluated until p1 is called or a further Pipe is evaluated.
p1$layer_points() []
p1$layer_bars() []
A Pipe can also be stored as a function.
f1 <- Pipe(rnorm(100))$plot
f1(col="red")
f1(col="green")
When the arguments are supplied, plot() is evaluated. Although Pipe is lazy, its value is determined at first evaluation.
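The same pattern can be tried without a graphics device. This is a sketch assuming pipeR is installed and that, as described above, $ returns a function whose call yields a Pipe object; the name first_n is hypothetical.

```r
library(pipeR)

# Store the pending head() step as a function; nothing is computed yet.
first_n <- Pipe(c(5, 3, 8, 1))$head

# Supplying arguments evaluates head() on the stored value.
two   <- first_n(n = 2)$value
three <- first_n(n = 3)$value
```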