This document provides some examples of how to specify nodal
attributes and their transformations in ergm terms. For the
R help on the topic, see ?nodal_attributes and for help on
implementing terms that use this interface, see
API?nodal_attributes.
It is sometimes desirable to specify a transformation of a nodal
attribute as a covariate in a model term. Most ergm terms
now support a new Tidyverse-inspired user interface to do so. Arguments
using this interface are typically called attr,
attrs, by, or on and are
interpreted depending on their type:
Character string: Extract the vertex attribute with this name.
Character vector of length > 1: Extract the vertex attributes and paste them together, separated by dots, if the term expects categorical attributes, and (typically) combine them into a covariate matrix if it expects quantitative attributes.
Function: The function is called on the LHS network and is expected to return a vector or matrix of appropriate dimension. (Shorter vectors and matrix columns will be recycled as needed.)
Formula: Borrowing the interface from {tidyverse}, the expression on the
right-hand side of the formula is evaluated in an environment of the vertex
attributes of the network and is expected to return a vector or matrix of
appropriate dimension. (Shorter vectors and matrix columns will be
recycled as needed.) Within this expression, the network itself is
accessible as either . or .nw. For example, with the faux.mesa.high
network used below,
nodecov(~abs(Grade-mean(Grade))/network.size(.)) would
return the absolute difference of each actor’s “Grade” attribute from
its network-wide mean, divided by the network size.
AsIs object created by I(): Use as is, checking only for correct length and type, with optional
attribute "name" indicating the predictor’s name.
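As a minimal sketch of the formula and AsIs forms (assuming the ergm package and its bundled faux.mesa.high data; the names g, absdev, s_formula, and s_asis are illustrative only), the formula from the example above can be compared with the same covariate computed by hand and passed through I():

```r
library(ergm)
data(faux.mesa.high)

# Formula form: Grade is looked up among the vertex attributes,
# and . refers to the network itself.
s_formula <- summary(faux.mesa.high ~
                       nodecov(~abs(Grade - mean(Grade)) / network.size(.)))

# AsIs form: compute the same covariate by hand and pass it as is,
# labelling it through the "name" attribute.
g <- faux.mesa.high %v% "Grade"
absdev <- structure(I(abs(g - mean(g)) / network.size(faux.mesa.high)),
                    name = "absdev")
s_asis <- summary(faux.mesa.high ~ nodecov(I(absdev)))

# The two statistics should agree up to floating-point error.
all.equal(unname(s_formula), unname(s_asis))
```

Only the coefficient names should differ between the two forms, since the formula is labelled by its expression and the AsIs vector by its "name" attribute.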
Any of these arguments may also be wrapped in
COLLAPSE_SMALLEST(attr, n, into), a convenience function
that will transform the attribute by collapsing the smallest
n categories into one, naming it according to the
into argument. Note that into must be of the
same type (numeric, character, etc.) as the vertex attribute in
question. This is compatible with using magrittr’s pipes
for improved readability, i.e.,
attr %>% COLLAPSE_SMALLEST(n, into). This is illustrated
in the next section.
Taking the faux.mesa.high dataset’s actor attribute
Grade, representing the grade of the student, we can
evaluate, in three equivalent ways, the linear effect of grade on overall
activity of an actor:
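One way to obtain these three equivalent statistics, assuming the ergm package and its bundled faux.mesa.high data, is to pass Grade as a character string, as a formula, and as a function of the network (the variable names are illustrative):

```r
library(ergm)
data(faux.mesa.high)

# Character string: name of the vertex attribute.
s_name <- summary(faux.mesa.high ~ nodecov("Grade"))
# Formula: evaluated among the vertex attributes.
s_formula <- summary(faux.mesa.high ~ nodecov(~Grade))
# Function: called on the network.
s_fun <- summary(faux.mesa.high ~ nodecov(function(nw) nw %v% "Grade"))

s_name
s_formula
s_fun
```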
## nodecov.Grade
## 3491
## nodecov.Grade
## 3491
## nodecov.nw%v%"Grade"
## 3491
Taking advantage of nodecov’s new ability to take
matrix-valued arguments, we might also evaluate a polynomial effect of
Grade:
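A call of the following shape, using the attribute specification quoted in the warning below and assuming the ergm package and its faux.mesa.high data, evaluates the linear and quadratic terms together (s_poly is an illustrative name):

```r
library(ergm)
data(faux.mesa.high)

# cbind() returns a two-column matrix; nodecov() then yields one
# statistic per column. A warning about column names is expected here.
s_poly <- summary(faux.mesa.high ~ nodecov(~cbind(Grade, Grade^2)))
s_poly
```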
## Warning: In term 'nodecov' in package 'ergm': Attribute specification
## '~cbind(Grade, Grade^2)' is a matrix with some column names set and others
## not; you may need to set them manually. See example(nodal_attributes) for
## more information.
## nodecov.cbind(Grade,Grade^2).1 nodecov.cbind(Grade,Grade^2).2
## 3491 31123
Notice the warning. It arises from the way cbind()
assigns column names: the name of the second column will be blank unless
we set it directly, in which case it can be anything:
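The naming behaviour can be seen in base R on a small vector (x and the result names are illustrative); the three calls leave the second column unnamed, name it x2, and name it x^2, respectively:

```r
x <- 1:2

unnamed <- cbind(x, x^2)            # second column name is blank
named <- cbind(x, x2 = x^2)         # second column named "x2"
backtick <- cbind(x, `x^2` = x^2)   # backticks allow a non-syntactic name

unnamed
named
backtick
```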
## x
## [1,] 1 1
## [2,] 2 4
## x x2
## [1,] 1 1
## [2,] 2 4
## x x^2
## [1,] 1 1
## [2,] 2 4
As the warning suggested, we can ensure that all columns have names, in which case they are not replaced with numbers:
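A sketch of such a call, assuming the ergm package and its faux.mesa.high data and choosing Grade2 as the name of the squared column to match the output below:

```r
library(ergm)
data(faux.mesa.high)

# Both columns are named, so no warning is expected and the
# column names are used in the statistic labels.
s_named <- summary(faux.mesa.high ~ nodecov(~cbind(Grade, Grade2 = Grade^2)))
s_named
```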
## nodecov.Grade nodecov.Grade2
## 3491 31123
General functions, such as stats::poly(), can also be
used:
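For instance, assuming the ergm package and its faux.mesa.high data, an orthogonal polynomial of degree 2 in Grade might be requested as follows (s_orth is an illustrative name):

```r
library(ergm)
data(faux.mesa.high)

# stats::poly() returns a matrix with one column per degree.
s_orth <- summary(faux.mesa.high ~ nodecov(~poly(Grade, 2)))
s_orth
```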
## nodecov.poly(Grade,2).1 nodecov.poly(Grade,2).2
## -2.412818 4.974174
We can even pass a random nodal covariate. Notice that setting an attribute “name” gives it a label:
randomcov <- structure(I(rbinom(network.size(faux.mesa.high),1,0.5)), name="random")
summary(faux.mesa.high~nodefactor(I(randomcov)))
## nodefactor.random.1
## 197
For categorical attributes, to select which levels are of interest
and their ordering, use the argument levels. Selection of
nodes (from the appropriate vector of nodal indices) is likewise handled
as the selection of levels, using the argument nodes. For
mixing matrix effects such as nodemix() and
mm(), first, the nodal attribute selection is performed
using levels, then the cells of the resulting mixing matrix
can be selected using levels2. These arguments are
interpreted as follows:
AsIs object created by I(): Use the given list of levels as is.
Numeric or logical vector: Used for indexing of a list of all possible levels (typically, unique
values of the attribute) in default order (typically lexicographic). In
particular, levels=TRUE will retain all levels. Negative
values exclude. Another special value is LARGEST, which
will refer to the most frequent category, so, say, to set such a
category as the baseline, pass levels=-LARGEST. In
addition, LARGEST(n) will refer to the n
largest categories. SMALLEST works analogously. Note that
if there are ties in frequencies, they will be broken arbitrarily. To
specify numeric or logical levels literally, wrap in
I().
NULL: (Not recommended.) Retain all possible levels; usually equivalent to
passing TRUE. Note that this is not the same as
passing a numeric or logical vector of length 0, which will be
interpreted as excluding all levels.
Character vector: Use as is.
Function: The function is called on the list of unique values of the attribute, the values of the attribute themselves, and the network itself, depending on its arity. Its return value is interpreted as above.
Formula: The expression on the right-hand side of the formula is evaluated in
an environment in which the network itself is accessible as
.nw, the list of unique values of the attribute as
. or as .levels, and the attribute vector
itself as .attr. Its return value is interpreted as
above.
Note that levels or nodes often has a
default that is sensible for the term in question.
base and keep arguments: Earlier versions of many of the terms had arguments base
and keep for selecting categorical attribute levels. They
have been superseded since {ergm} 3.9.4 and will be removed soon.
In general, keep = X can be replaced by
levels = X, as both typically have the same semantics.
The effect of base is the opposite of that of
levels, nodes, and levels2:
levels specifies which levels to include, whereas
base specifies which levels to exclude. Thus, if
X is not 0 or NULL, base = X can
be replaced with levels = -X. If X is 0 or
NULL, base = X means to include all levels, so
it should be replaced with levels = TRUE.
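The translation rules above can be sketched as follows, assuming the ergm package and its faux.mesa.high data. Only the levels form is actually called; the superseded spellings appear in comments, and the variable names are illustrative:

```r
library(ergm)
data(faux.mesa.high)

# Formerly base = 1: exclude the first level (Grade 7).
s_drop_first <- summary(faux.mesa.high ~ nodefactor("Grade", levels = -1))

# Formerly base = 0: include all levels.
s_all <- summary(faux.mesa.high ~ nodefactor("Grade", levels = TRUE))

# Formerly keep = 2:3: retain only the second and third levels.
s_keep <- summary(faux.mesa.high ~ nodematch("Grade", diff = TRUE, levels = 2:3))

s_drop_first
s_all
s_keep
```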
Returning to the faux.mesa.high example, and treating
Grade as a categorical variable, we can use a number of
combinations:
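Calls of the following kind, assuming the ergm package and its faux.mesa.high data, correspond to the first few outputs below: the term's default level selection, all levels retained, a frequency table of grades, and the most frequent grade excluded as the baseline (variable names are illustrative):

```r
library(ergm)
data(faux.mesa.high)

# Default: the term's own default level selection.
s_default <- summary(faux.mesa.high ~ nodefactor(~Grade))

# Retain all levels.
s_all <- summary(faux.mesa.high ~ nodefactor(~Grade, levels = TRUE))

# Frequencies of each grade among the actors.
grade_counts <- table(faux.mesa.high %v% "Grade")

# Use the most frequent grade as the baseline.
s_no_largest <- summary(faux.mesa.high ~ nodefactor(~Grade, levels = -LARGEST))

s_default
s_all
grade_counts
s_no_largest
```

In this dataset the most frequent grade is also the first level, which is why the default and -LARGEST selections coincide.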
## nodefactor.Grade.8 nodefactor.Grade.9 nodefactor.Grade.10
## 75 65 36
## nodefactor.Grade.11 nodefactor.Grade.12
## 49 28
## nodefactor.Grade.7 nodefactor.Grade.8 nodefactor.Grade.9
## 153 75 65
## nodefactor.Grade.10 nodefactor.Grade.11 nodefactor.Grade.12
## 36 49 28
##
## 7 8 9 10 11 12
## 62 40 42 25 24 12
## nodefactor.Grade.8 nodefactor.Grade.9 nodefactor.Grade.10
## 75 65 36
## nodefactor.Grade.11 nodefactor.Grade.12
## 49 28
# Collapse the smallest two grades (11 and 12) into a new category, 99.
library(magrittr) # For the %>% operator.
summary(faux.mesa.high~nodefactor((~Grade) %>% COLLAPSE_SMALLEST(2, 99)))
## nodefactor.Grade.8 nodefactor.Grade.9 nodefactor.Grade.10
## 75 65 36
## nodefactor.Grade.99
## 77
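Mixing statistics such as those in the next three outputs can be requested with mm(). The sketch below, assuming the ergm package and its faux.mesa.high data, shows mixing between lower and upper grades via a logical expression, then mixing among grades 7 and 8 only, selected first by literal value and then by index (variable names are illustrative):

```r
library(ergm)
data(faux.mesa.high)

# Mixing between lower (< 10) and upper (>= 10) grades.
s_upper <- summary(faux.mesa.high ~ mm(~Grade >= 10))

# Grades 7 and 8 only, given literally; I() prevents the numbers
# from being interpreted as indices.
s_literal <- summary(faux.mesa.high ~ mm("Grade", levels = I(c(7, 8))))

# The same selection by position in the list of levels.
s_index <- summary(faux.mesa.high ~ mm("Grade", levels = 1:2))

s_upper
s_literal
s_index
```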
## mm[Grade>=10=FALSE,Grade>=10=TRUE] mm[Grade>=10=TRUE,Grade>=10=TRUE]
## 27 43
## mm[Grade=7,Grade=8] mm[Grade=8,Grade=8]
## 0 33
## mm[Grade=7,Grade=8] mm[Grade=8,Grade=8]
## 0 33
# or using levels2 (see ?mm) to filter the combinations of levels,
summary(faux.mesa.high~mm("Grade",
                          levels2=~sapply(.levels,
                                          function(l)
                                            l[[1]]%in%c(7,8) && l[[2]]%in%c(7,8))))
## mm[Grade=7,Grade=7] mm[Grade=7,Grade=8] mm[Grade=8,Grade=8]
## 75 0 33
Generally, levels2= selects from among the combinations
of levels selected by levels=. Here are some examples,
using the attribute Sex (which has two levels):
# Here is the full list of combinations of sexes in an undirected network:
summary(faux.mesa.high~mm("Sex", levels2=TRUE))
## mm[Sex=F,Sex=F] mm[Sex=F,Sex=M] mm[Sex=M,Sex=M]
## 82 71 50
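Subsets of these three cells can be selected by index. As a sketch, assuming the ergm package and its faux.mesa.high data, the first three calls below are different ways of keeping only the second cell, and the last keeps all but the second (variable names are illustrative):

```r
library(ergm)
data(faux.mesa.high)

# Keep only the second combination, by positive index,
s_second <- summary(faux.mesa.high ~ mm("Sex", levels2 = 2))
# by excluding the first and third,
s_excl <- summary(faux.mesa.high ~ mm("Sex", levels2 = -c(1, 3)))
# or by a logical vector.
s_logical <- summary(faux.mesa.high ~ mm("Sex", levels2 = c(FALSE, TRUE, FALSE)))

# Keep all but the second combination.
s_not_second <- summary(faux.mesa.high ~ mm("Sex", levels2 = -2))

s_second
s_excl
s_logical
s_not_second
```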
## mm[Sex=F,Sex=M]
## 71
## mm[Sex=F,Sex=M]
## 71
## mm[Sex=F,Sex=M]
## 71
## mm[Sex=F,Sex=F] mm[Sex=M,Sex=M]
## 82 50
# We can select via a mixing matrix: (Network is undirected and
# attributes are the same on both sides, so we can use either M or
# its transpose.)
(M <- matrix(c(FALSE,TRUE,FALSE,FALSE),2,2))
## [,1] [,2]
## [1,] FALSE FALSE
## [2,] TRUE FALSE
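A sketch of how such a matrix might be passed, assuming the ergm package and its faux.mesa.high data and assuming that levels2 accepts a logical matrix as the comment above describes; using M and its transpose in two separate terms should give the same statistic twice (s_both is an illustrative name):

```r
library(ergm)
data(faux.mesa.high)

# Logical mixing matrix selecting a single off-diagonal cell.
M <- matrix(c(FALSE, TRUE, FALSE, FALSE), 2, 2)

# The network is undirected, so M and t(M) select the same cell.
s_both <- summary(faux.mesa.high ~ mm("Sex", levels2 = M) +
                    mm("Sex", levels2 = t(M)))
s_both
```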
## mm[Sex=F,Sex=M] mm[Sex=F,Sex=M]
## 71 71
## mm[Sex=F,Sex=M]
## 71
# Or, select by specific attribute value combinations, though note the
# names 'row' and 'col' and the order for undirected networks:
summary(faux.mesa.high~mm("Sex",
                          levels2 = I(list(list(row="M",col="M"),
                                           list(row="M",col="F"),
                                           list(row="F",col="M")))))
## Warning: In term 'mm' in package 'ergm': Selected cells '[M,F]' are
## redundant (below the diagonal) in the mixing matrix and will have count 0.
## mm[Sex=M,Sex=M] mm[Sex=M,Sex=F] mm[Sex=F,Sex=M]
## 50 0 71
The attribute specification of the mm() term can be a two-sided
formula with a different attribute on each side:
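As a sketch, assuming the ergm package and its faux.mesa.high data, a cross-tabulation of ties by Grade on one side and Race on the other, with every cell retained, might be requested like this (s_cross is an illustrative name):

```r
library(ergm)
data(faux.mesa.high)

# Left-hand side gives the row attribute, right-hand side the column
# attribute; levels2 = TRUE keeps every cell of the mixing matrix.
s_cross <- summary(faux.mesa.high ~ mm(Grade ~ Race, levels2 = TRUE))
s_cross
```

With six grades and five race categories, this yields 30 statistics, one per cell.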
## mm[Grade=7,Race=Black] mm[Grade=8,Race=Black] mm[Grade=9,Race=Black]
## 1 6 5
## mm[Grade=10,Race=Black] mm[Grade=11,Race=Black] mm[Grade=12,Race=Black]
## 4 7 3
## mm[Grade=7,Race=Hisp] mm[Grade=8,Race=Hisp] mm[Grade=9,Race=Hisp]
## 92 15 28
## mm[Grade=10,Race=Hisp] mm[Grade=11,Race=Hisp] mm[Grade=12,Race=Hisp]
## 10 19 14
## mm[Grade=7,Race=NatAm] mm[Grade=8,Race=NatAm] mm[Grade=9,Race=NatAm]
## 37 53 25
## mm[Grade=10,Race=NatAm] mm[Grade=11,Race=NatAm] mm[Grade=12,Race=NatAm]
## 16 15 10
## mm[Grade=7,Race=Other] mm[Grade=8,Race=Other] mm[Grade=9,Race=Other]
## 0 0 1
## mm[Grade=10,Race=Other] mm[Grade=11,Race=Other] mm[Grade=12,Race=Other]
## 0 0 0
## mm[Grade=7,Race=White] mm[Grade=8,Race=White] mm[Grade=9,Race=White]
## 23 1 6
## mm[Grade=10,Race=White] mm[Grade=11,Race=White] mm[Grade=12,Race=White]
## 6 8 1
# It is possible to have collapsing functions in the formula; note
# the parentheses around "~Race": this is because the formula
# operator (~) has lower precedence than the pipe (%>%):
summary(faux.mesa.high~mm(Grade~(~Race) %>% COLLAPSE_SMALLEST(3,"BWO"), levels2=TRUE))
## mm[Grade=7,Race=BWO] mm[Grade=8,Race=BWO] mm[Grade=9,Race=BWO]
## 24 7 12
## mm[Grade=10,Race=BWO] mm[Grade=11,Race=BWO] mm[Grade=12,Race=BWO]
## 10 15 4
## mm[Grade=7,Race=Hisp] mm[Grade=8,Race=Hisp] mm[Grade=9,Race=Hisp]
## 92 15 28
## mm[Grade=10,Race=Hisp] mm[Grade=11,Race=Hisp] mm[Grade=12,Race=Hisp]
## 10 19 14
## mm[Grade=7,Race=NatAm] mm[Grade=8,Race=NatAm] mm[Grade=9,Race=NatAm]
## 37 53 25
## mm[Grade=10,Race=NatAm] mm[Grade=11,Race=NatAm] mm[Grade=12,Race=NatAm]
## 16 15 10