In this vignette, we will walk you through how to create a
caugi, query it, modify it, and then compare it to other
caugis.

The caugi object

You can create a caugi graph object using the
caugi() function along with infix operators to define
edges. Let’s create a directed acyclic graph (DAG) with 4 nodes and 4
edges.
cg <- caugi(
  A %-->% B %-->% C + D,
  A %-->% C,
  class = "DAG"
)
cg
#> <caugi object; 4 nodes, 4 edges; simple: TRUE; built: TRUE; ptr=0x55af8cf148c0>
#> graph_class: DAG
#> nodes: A, B, C, D
#> edges: A-->B, B-->C, B-->D, A-->C

The call above may look unusual at first. To
clarify, A %-->% B creates the edge -->
from A to B. The syntax
B %-->% C + D is equivalent to B %-->% C
and B %-->% D. Notice that the printout lists the nodes
and the edges, which are stored in cg@nodes
and cg@edges. Besides these, a
caugi holds several other properties. Let’s go through
them.

ptr

This is the pointer to the Rust object that caugi
utilizes for performance.

simple

This indicates whether the graph is simple, that is, whether it has at
most one edge between any pair of nodes. Let’s try to create a non-simple graph:
caugi(A %-->% B, B %-->% A)
#> Error in graph_builder_add_edges(b, as.integer(unname(id[edges$from])), : parallel edges not allowed in simple graphs

This cannot be done unless you initialize the graph with
simple = FALSE. Note that, currently, every supported
graph class requires simple = TRUE, except the class
UNKNOWN.
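
To make the notion of a simple graph concrete, the following base R sketch checks an edge list for parallel edges without using caugi. The helper has_parallel_edges() and the data-frame layout with from and to columns are illustrative assumptions, not part of the package.

```r
# Hypothetical helper (not part of caugi): TRUE if any unordered pair of
# nodes is joined by more than one edge.
has_parallel_edges <- function(edges) {
  # Order the endpoints of each edge so that A-->B and B-->A share one key.
  key <- paste(pmin(edges$from, edges$to), pmax(edges$from, edges$to), sep = "|")
  anyDuplicated(key) > 0
}

# Edge list of the DAG created earlier in this vignette.
dag_edges <- data.frame(
  from = c("A", "B", "B", "A"),
  to   = c("B", "C", "D", "C"),
  stringsAsFactors = FALSE
)

# The two edges that triggered the error above.
two_cycle <- data.frame(
  from = c("A", "B"),
  to   = c("B", "A"),
  stringsAsFactors = FALSE
)

has_parallel_edges(dag_edges)  # FALSE
has_parallel_edges(two_cycle)  # TRUE
```

The check treats A-->B and B-->A as edges between the same pair of nodes, which matches the situation reported by the error message above.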

Querying a caugi

We can query the caugi graph object with the built-in
queries provided in the package. For example, we can ask for the
descendants of all the parents of the node C.
The output of such a query is a list of named character vectors. How
come? The parents of C are A and B,
so the list holds one named vector per parent, and each vector
contains the descendants of that parent node.
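
The query described above can be reproduced by hand on the edge list of cg. The sketch below uses only base R; the helpers parents_of() and descendants_of() are hypothetical stand-ins written for this illustration, not the package's own query functions.

```r
# Edge list of cg: A-->B, B-->C, B-->D, A-->C.
edges <- data.frame(
  from = c("A", "B", "B", "A"),
  to   = c("B", "C", "D", "C"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: nodes with an edge pointing into `node`.
parents_of <- function(edges, node) {
  sort(unique(edges$from[edges$to == node]))
}

# Hypothetical helper: all nodes reachable from `node` along directed edges.
descendants_of <- function(edges, node) {
  found <- character(0)
  frontier <- node
  while (length(frontier) > 0) {
    children <- unique(edges$to[edges$from %in% frontier])
    frontier <- setdiff(children, found)  # only nodes not seen before
    found <- c(found, frontier)
  }
  sort(found)
}

# One named entry per parent of C, holding that parent's descendants.
pa <- parents_of(edges, "C")
result <- lapply(setNames(pa, pa), function(p) descendants_of(edges, p))
result
```

The result is a list with one element named A (containing B, C, D) and one named B (containing C, D), which mirrors the list-of-named-vectors shape described above.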

Modifying a caugi

Let’s try to modify the graph from before, so we get a new DAG.
cg_modified <- cg |>
  remove_edges(A %-->% B, B %-->% C + D) |>
  add_edges(B %-->% A, D %-->% C)
cg_modified
#> <caugi object; 4 nodes, 3 edges; simple: TRUE; built: FALSE; ptr=NULL>
#> graph_class: DAG
#> nodes: A, B, C, D
#> edges: A-->C, B-->A, D-->C

Would you like to add nodes? Then use add_nodes().
Now that we have two different graphs, we can use different metrics to measure the difference between them. Two such metrics are the adjustment identification distance (AID) and the structural Hamming distance (SHD).
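
As a rough illustration of what the SHD measures, the base R sketch below counts the node pairs whose edge differs between the two graphs in this vignette. It assumes the convention in which a reversed edge counts as a single difference; the convention used by the package may differ. The names adjacency() and shd_sketch() are hypothetical and exist only for this example. The AID is considerably more involved and is not sketched here.

```r
nodes <- c("A", "B", "C", "D")

# Hypothetical helper: 0/1 adjacency matrix with m[from, to] = 1.
adjacency <- function(from, to, nodes) {
  m <- matrix(0L, length(nodes), length(nodes), dimnames = list(nodes, nodes))
  m[cbind(from, to)] <- 1L  # index by (row name, column name) pairs
  m
}

# Hypothetical helper: number of unordered node pairs whose edge differs.
shd_sketch <- function(m1, m2) {
  # A pair {i, j} differs if either direction of its edge differs.
  differs <- (m1 != m2) | (t(m1) != t(m2))
  sum(differs[upper.tri(differs)])
}

# cg:          A-->B, B-->C, B-->D, A-->C
g1 <- adjacency(c("A", "B", "B", "A"), c("B", "C", "D", "C"), nodes)
# cg_modified: A-->C, B-->A, D-->C
g2 <- adjacency(c("A", "B", "D"), c("C", "A", "C"), nodes)

shd_sketch(g1, g2)  # 4
```

The four differing pairs are A and B (reversed), B and C (removed), B and D (removed), and C and D (added); the edge A-->C is shared by both graphs.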
You have now created a graph, inspected it, modified it, and measured the difference between the two graphs – both structurally and interventionally.
For further reading, we recommend the vignettes
vignette("package_use") and
vignette("performance") to see how to use
caugi in your own packages, and to see how
caugi performs compared to other graph packages in R. For
the interested reader, we also recommend
vignette("motivation"), which goes deeper into the
motivation behind caugi and what we aspire to do with
caugi.

built

This indicates whether the graph has been “built” on the Rust
side. This is important, as the Rust object may not agree with the R
object if built = FALSE. For example, the printout of cg_modified
above shows built: FALSE and ptr=NULL.

name_index_map

The name_index_map is a hashmap that takes node names as
keys and returns zero-based indices. It is used to convert node
names to indices for Rust calls, as the Rust backend uses
zero-based indices.
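
The idea of a name-to-index lookup can be sketched in base R with a named integer vector. This is only an illustration of the mapping, assuming indices follow the order of the node names; it is not how the package stores name_index_map, and name_to_index is a hypothetical name.

```r
nodes <- c("A", "B", "C", "D")

# R is one-based, so subtract 1 to obtain zero-based indices.
name_to_index <- setNames(seq_along(nodes) - 1L, nodes)

name_to_index[["C"]]                 # 2
unname(name_to_index[c("A", "D")])   # 0 3
```

Looking indices up by name in this way avoids off-by-one mistakes when values cross from one-based R code to a zero-based backend.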

.state

This is the internal state of the caugi graph object. It
lets you modify the caugi object in R by recording the
modifications you make, so that the graph does not have to be
rebuilt in Rust each time.
The most important takeaway about the state is that you should
avoid modifying the state directly. Instead, use the
verbs, such as add_edges() and remove_edges().