Network Analysis in R, Part 2: Functions of the graph Package and Their Use

2023-05-16

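Creating networks

Custom networks

graphNEL is probably the most widely used network class in Bioconductor. The constructor function of the same name builds a custom network: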

  library(graph)
  library(Rgraphviz)  # supplies the plot() method for graph objects
  str(graphNEL)

#> function (nodes = character(), edgeL = list(), edgemode = "undirected")

  nds <- letters[1:3]
  gx1 <- graphNEL(nodes = nds, edgemode = "undirected")
  plot(gx1)

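nodes: node names, given as a character vector (numeric values, for example, are not accepted)
edgeL: the edge list. Recent versions allow it to be empty, which was not always the case
edgemode: either "undirected" (the default) or "directed"

The network created above has nodes but no edges. When the edges are known in advance they can be written directly into the edgeL argument, but that is tedious. Here they are added with addEdge instead: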

  gx1 <- addEdge("a", "b", gx1)
  gx1 <- addEdge("a", "c", gx1)
  plot(gx1)
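
For comparison, the same network can be built in one step through edgeL. This is a minimal sketch, assuming edgeL accepts a named list in the format returned by edges(), with one character vector of neighbours per node. For an undirected network each edge is listed at both of its endpoints. The names eL and gx_el are only illustrative.

```r
library(graph)

nds <- letters[1:3]
# One entry per node; each undirected edge appears at both endpoints.
eL <- list(a = c("b", "c"), b = "a", c = "a")
gx_el <- graphNEL(nodes = nds, edgeL = eL, edgemode = "undirected")

edges(gx_el)     # adjacency list of the new network
numEdges(gx_el)  # counts the edges a-b and a-c
```

Writing the list by hand scales poorly, which is why addEdge is usually more convenient for small edits.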

Directed networks work the same way, except that each edge has a direction:

  gx2 <- graphNEL(nodes = nds, edgemode = "directed")
  gx2 <- addEdge("a", "b", gx2)
  gx2 <- addEdge("a", "c", gx2)
  plot(gx2)
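
The two addEdge calls can also be combined, since from and to accept vectors that are matched element by element. The sketch below additionally passes the optional weights argument; the weight values 2 and 5 are arbitrary and only serve the illustration.

```r
library(graph)

g <- graphNEL(nodes = letters[1:3], edgemode = "directed")
# from and to are paired element-wise: a -> b and a -> c
g <- addEdge(from = c("a", "a"), to = c("b", "c"), graph = g, weights = c(2, 5))

edges(g)        # out-edges per node
edgeWeights(g)  # weights stored alongside the edges
```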

Nodes can be added to an existing network:

  gx1

#> A graphNEL graph with undirected edges
#> Number of Nodes = 3 
#> Number of Edges = 2

  gx1 <- addNode("d", gx1)
  gx1

#> A graphNEL graph with undirected edges
#> Number of Nodes = 4 
#> Number of Edges = 2

Random networks

randomEGraph(V, p, edges)
randomGraph(V, M, p, weights=TRUE)
randomNodeGraph(nodeDegree)

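randomEGraph generates Erdős-Rényi random networks (node set and number of edges specified) and Gilbert random networks (node set and the probability of an edge between any two nodes specified):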

  set.seed(1314)
  gx1 <- randomEGraph(letters[1:8], p=0.3)
  gx2 <- randomEGraph(letters[1:8], edges=6)
  gx1

#> A graphNEL graph with undirected edges
#> Number of Nodes = 8 
#> Number of Edges = 13

  gx2

#> A graphNEL graph with undirected edges
#> Number of Nodes = 8 
#> Number of Edges = 6

Both results are graphNEL objects, that is, networks represented by a node (N) list and an edge (E) list (L). They can be drawn directly with plot:

  par(mfrow = c(1, 2))
  plot(gx1)
  plot(gx2)
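
The difference between the two modes of randomEGraph shows up when several networks are drawn. The following sketch repeats each call five times and records the edge counts; the seed and the helper names n_fixed and n_prob are arbitrary choices.

```r
library(graph)

set.seed(2023)
V <- letters[1:8]

# edges = 6: the number of edges is fixed in advance
n_fixed <- replicate(5, numEdges(randomEGraph(V, edges = 6)))
# p = 0.3: each of the choose(8, 2) = 28 node pairs is linked with probability 0.3
n_prob <- replicate(5, numEdges(randomEGraph(V, p = 0.3)))

n_fixed  # expected to be 6 for every draw
n_prob   # generally differs between draws; 28 * 0.3 = 8.4 on average
```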

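Editing networks

Editing a network mostly means editing its nodes, its edges, and the data attached to them. Networks can also be merged.

Editing one element at a time

The functions for adding nodes and edges were shown above. Each has a matching removal method:

removeNode: removes a node together with its incident edges
removeEdge: removes an edge and keeps its nodes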

  showMethods(removeNode, classes = "graphNEL")

#> Function: removeNode (package graph)
#> node="character", object="graphNEL"

  showMethods(removeEdge, classes = "graphNEL")

#> Function: removeEdge (package graph)
#> from="character", to="character", graph="graphNEL"

  gxx <- removeNode("g", gx1)
  gxx <- removeEdge("a", "c", gxx)
  gx1

#> A graphNEL graph with undirected edges
#> Number of Nodes = 8 
#> Number of Edges = 13

  gxx

#> A graphNEL graph with undirected edges
#> Number of Nodes = 7 
#> Number of Edges = 9
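
The drop from 13 to 9 edges follows from the adjacency list of gx1: node g has three neighbours (a, b, f), so removing it deletes three edges, and removeEdge deletes one more. A smaller, self-contained sketch of the same behaviour on a triangle (the object names are illustrative):

```r
library(graph)

g <- graphNEL(nodes = c("a", "b", "c"), edgemode = "undirected")
g <- addEdge(c("a", "a", "b"), c("b", "c", "c"), g)  # triangle with 3 edges

g_node <- removeNode("c", g)       # drops c together with a-c and b-c
g_edge <- removeEdge("a", "b", g)  # drops only a-b, all nodes stay

c(numNodes(g_node), numEdges(g_node))
c(numNodes(g_edge), numEdges(g_edge))
```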

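Editing groups of elements

combineNodes: merges (collapses) several nodes into one. In effect it deletes the given nodes and their edges, then adds one new node, with new edges, to stand in for them
- Nodes and edges may carry other attribute data, but this function only handles edge weights. Use it with care when other attributes are present
clearNode: removes all edges incident to the given node and keeps the node itself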

  gxx <- clearNode("d", gx1)
  par(mfrow = c(1,2))
  plot(gx1)
  plot(gxx)
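
The effect of clearNode can also be checked without a plot. The sketch below uses a small hand-built network in which d is the centre of a star; the layout of the example is an arbitrary choice.

```r
library(graph)

g <- graphNEL(nodes = letters[1:4], edgemode = "undirected")
g <- addEdge(c("a", "b", "c"), c("d", "d", "d"), g)  # star centred on d
g <- addEdge("a", "b", g)

g_clear <- clearNode("d", g)

nodes(g_clear)   # d is still listed
degree(g_clear)  # d has no incident edges left
```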

  gxx <- combineNodes(c("a", "b", "c"), gx1, newName = "S")
  str(edgeWeights(gxx))

#> List of 6
#>  $ d: Named num [1:3] 1 1 3
#>   ..- attr(*, "names")= chr [1:3] "f" "h" "S"
#>  $ e: Named num 1
#>   ..- attr(*, "names")= chr "S"
#>  $ f: Named num [1:3] 1 1 1
#>   ..- attr(*, "names")= chr [1:3] "d" "g" "h"
#>  $ g: Named num [1:2] 1 2
#>   ..- attr(*, "names")= chr [1:2] "f" "S"
#>  $ h: Named num [1:3] 1 1 2
#>   ..- attr(*, "names")= chr [1:3] "d" "f" "S"
#>  $ S: Named num [1:4] 3 2 2 1
#>   ..- attr(*, "names")= chr [1:4] "d" "g" "h" "e"

  par(mfrow = c(1,2))
  plot(gx1)
  plot(gxx)
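
In the edgeWeights output above, the weight 3 between S and d reflects that a, b and c were all adjacent to d in gx1, so three unit weights were summed. A minimal sketch of the same collapse on a hand-built network, assuming the default behaviour of summing weights; the label "AB" is arbitrary.

```r
library(graph)

g <- graphNEL(nodes = letters[1:4], edgemode = "undirected")
g <- addEdge(c("a", "a", "b", "c"), c("b", "d", "d", "d"), g)  # all weights are 1

# Merge a and b into a single node labelled "AB".
g_ab <- combineNodes(c("a", "b"), g, newName = "AB")

nodes(g_ab)
edgeWeights(g_ab)$AB  # a-d and b-d collapse into one AB-d edge
```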

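Editing attributes

Nodes, edges, and the network itself carry built-in or default attribute data, and further attributes can be attached. The following functions retrieve node and edge attribute data:

nodeDataDefaults
nodeData
edgeDataDefaults
edgeData

Taking edgeData as an example: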

  edgeDataDefaults(gx1)

#> $weight
#> [1] 1

  head(edgeData(gx1), 2)

#> $`a|c`
#> $`a|c`$weight
#> [1] 1
#> 
#> 
#> $`a|d`
#> $`a|d`$weight
#> [1] 1

  head(edgeData(gx1, attr = "weight"), 2)

#> $`a|c`
#> [1] 1
#> 
#> $`a|d`
#> [1] 1

  edgeData(gx1, "f", "h")

#> $`f|h`
#> $`f|h`$weight
#> [1] 1

  edgeData(gx1, "f", "h", "weight")

#> $`f|h`
#> [1] 1

NOTE: graphNEL objects have a default weight attribute, and the dedicated helper edgeWeights retrieves it.

The same functions can also set (assign) attribute values:

  edgeData(gx1, "f", "h", "weight") <- 9
  edgeData(gx1, "f", "h", "weight")

#> $`f|h`
#> [1] 9
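
Attributes other than weight follow the same pattern, with one extra step: a default has to be registered before the attribute can be set on individual nodes or edges. The sketch below is illustrative only; the attribute names "type" and "color" and their values are made up.

```r
library(graph)

g <- graphNEL(nodes = c("a", "b", "c"), edgemode = "undirected")
g <- addEdge(c("a", "b"), c("b", "c"), g)

# Register defaults first; "type" and "color" are hypothetical attribute names.
nodeDataDefaults(g, attr = "type") <- "unknown"
edgeDataDefaults(g, attr = "color") <- "black"

# Override the default for one node and one edge.
nodeData(g, n = "a", attr = "type") <- "hub"
edgeData(g, from = "a", to = "b", attr = "color") <- "red"

nodeData(g, attr = "type")   # a is "hub", the others keep the default
edgeData(g, attr = "color")  # a|b is "red", the others keep the default
```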

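Set operations on networks

intersection and union return the intersection and the union of two networks, but both require the two networks to have identical node sets, which limits their usefulness.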
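
A minimal sketch of both operations on two hand-built networks that share the node set a, b, c. The object names are illustrative.

```r
library(graph)

nds <- c("a", "b", "c")
g1 <- addEdge(c("a", "a"), c("b", "c"), graphNEL(nds, edgemode = "undirected"))
g2 <- addEdge(c("a", "b"), c("b", "c"), graphNEL(nds, edgemode = "undirected"))

g_int <- intersection(g1, g2)  # edges present in both networks: a-b
g_uni <- union(g1, g2)         # edges present in either network: a-b, a-c, b-c

edgeNames(g_int)
edgeNames(g_uni)
```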

join also merges two networks and does not require identical node sets, so it may be more widely applicable:

  str(join)

#> Formal class 'standardGeneric' [package "methods"] with 8 slots
#>   ..@ .Data     :function (x, y)  
#>   ..@ generic   : chr "join"
#>   .. ..- attr(*, "package")= chr "graph"
#>   ..@ package   : chr "graph"
#>   ..@ group     : list()
#>   ..@ valueClass: chr(0) 
#>   ..@ signature : chr [1:2] "x" "y"
#>   ..@ default   : NULL
#>   ..@ skeleton  : language (function (x, y)  stop("invalid call in method dispatch to 'join' (no default method)", domain = NA))(x, y)

join is an S4 generic and, judging by its arguments, operates on only two objects at a time:

  nodes(gx1)

#> [1] "a" "b" "c" "d" "e" "f" "g" "h"

  nodes(gx2)

#> [1] "a" "b" "c" "d" "e" "f" "g" "h"

  gxx <- join(gx1, gx2)
  nodes(gxx)

#> [1] "a" "b" "c" "d" "e" "f" "g" "h"

  par(mfrow = c(1,3))
  plot(gx1); plot(gx2); plot(gxx)

Note: by default, attribute data are reset in the joined network:

  edgeData(gx1, "f", "h", "weight")

#> $`f|h`
#> [1] 9

  edgeData(gxx, "f", "h", "weight")

#> $`f|h`
#> [1] 1
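
Since gx1 and gx2 happen to share all their nodes, the example above does not show join on differing node sets. The sketch below joins two small networks that overlap only in node c, then writes one weight back by hand, which is one possible way to carry an attribute across the join. All object names and the weight 9 are illustrative.

```r
library(graph)

g1 <- graphNEL(nodes = c("a", "b", "c"), edgemode = "undirected")
g1 <- addEdge(c("a", "b"), c("b", "c"), g1)
edgeData(g1, "a", "b", "weight") <- 9

g2 <- graphNEL(nodes = c("c", "d"), edgemode = "undirected")
g2 <- addEdge("c", "d", g2)

gj <- join(g1, g2)
nodes(gj)  # the shared node c appears once

# Copy the weight of a-b from g1 into the joined network.
edgeData(gj, "a", "b", "weight") <- edgeData(g1, "a", "b", "weight")[[1]]
edgeData(gj, "a", "b", "weight")
```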

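Network properties

Beyond the data held in nodeData and edgeData, the network itself and its nodes and edges have many other properties. Knowing the functions that expose them is very useful when working with complex networks later on.

Names and lists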

  nodes(gx1)

#> [1] "a" "b" "c" "d" "e" "f" "g" "h"

  edges(gx1)

#> $a
#> [1] "c" "d" "g" "h"
#> 
#> $b
#> [1] "d" "g"
#> 
#> $c
#> [1] "a" "d" "e" "h"
#> 
#> $d
#> [1] "a" "b" "c" "f" "h"
#> 
#> $e
#> [1] "c"
#> 
#> $f
#> [1] "d" "g" "h"
#> 
#> $g
#> [1] "a" "b" "f"
#> 
#> $h
#> [1] "a" "c" "d" "f"

nodes returns a character vector
edges returns a list

  edgeNames(gx1)

#>  [1] "a~c" "a~d" "a~g" "a~h" "b~d" "b~g" "c~d" "c~e" "c~h" "d~f" "d~h" "f~g" "f~h"

Edge names separate the two nodes with a tilde ("~"), unlike other functions such as edgeData, which use "|".
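
When the output of edgeNames is needed as an index into edgeData, the separator can be swapped with a plain string replacement. A minimal sketch on a hand-built network (the names en and keys are illustrative):

```r
library(graph)

g <- graphNEL(nodes = c("a", "b", "c"), edgemode = "undirected")
g <- addEdge(c("a", "b"), c("b", "c"), g)

en <- edgeNames(g)                       # "~"-separated names
keys <- sub("~", "|", en, fixed = TRUE)  # "|"-separated, as used by edgeData

# Look up the weights through the converted names.
edgeData(g, attr = "weight")[keys]
```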

Properties of the network

  edgemode(gx1)

#> [1] "undirected"

  numNodes(gx1)

#> [1] 8

  numEdges(gx1)

#> [1] 13

  degree(gx1)

#> a b c d e f g h 
#> 4 2 4 5 1 3 3 4

  aveNumEdges(gx1)

#> [1] 1.625

  numNoEdges(gx1)

#> [1] 0
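
These numbers are related. In the output above the degrees sum to 4 + 2 + 4 + 5 + 1 + 3 + 3 + 4 = 26, twice the 13 edges, and aveNumEdges is 13 / 8 = 1.625. The sketch below checks the same relations on a small hand-built network with one isolated node.

```r
library(graph)

g <- graphNEL(nodes = letters[1:4], edgemode = "undirected")
g <- addEdge(c("a", "a", "b"), c("b", "c", "c"), g)  # d stays isolated

deg <- degree(g)
sum(deg) / 2 == numEdges(g)                  # each edge is counted at both ends
aveNumEdges(g) == numEdges(g) / numNodes(g)  # edges per node
numNoEdges(g) == sum(deg == 0)               # nodes without any edge
```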

Predicate functions

Tests on the network

  isConnected(gx1)

#> [1] TRUE

  isDirected(gx1)

#> [1] FALSE

Test whether two nodes are adjacent

  isAdjacent(gx1, "a", "b")

#> [1] FALSE

Query functions

  set.seed(1234)
  gxx <- randomEGraph(letters[1:6], p=0.3)
  plot(gxx)

adj: returns the nodes adjacent to each given node
acc: returns the nodes reachable from each given (start) node, together with the distance to each

  adj(gxx, c("a", "b"))

#> $a
#> [1] "b" "e"
#> 
#> $b
#> [1] "a" "e"

  acc(gxx, c("a", "b"))

#> $a
#> b c e f 
#> 1 3 1 2 
#> 
#> $b
#> a c e f 
#> 1 3 1 2

Both adj and acc are very useful.
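
The distances reported by acc are easiest to read on a path. The following sketch builds the path a - b - c - d plus an isolated node e; the layout is an arbitrary choice for illustration.

```r
library(graph)

# Path a - b - c - d plus an isolated node e
g <- graphNEL(nodes = letters[1:5], edgemode = "undirected")
g <- addEdge(c("a", "b", "c"), c("b", "c", "d"), g)

adj(g, "b")  # direct neighbours of b
acc(g, "a")  # nodes reachable from a, with the number of steps to each
```

The isolated node e cannot be reached from a, so it should not appear in the acc result.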

mostEdges: returns the node with the most edges

  mostEdges(gxx)

#> [1] "e"

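Other functions

complement: returns the complement of a given network (graph), that is, the complete graph minus the original. For any two nodes:
- if they are connected in the original graph, the edge is removed
- if they are not connected in the original graph, an edge is added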

  par(mfrow = c(1, 2))
  plot(gx1)
  plot(complement(gx1))

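connComp: returns the list of connected components
DFS: depth-first search. The dfs function in RBGL is the better choice for this
ugraph: converts a directed network to an undirected one (same nodes, possibly fewer edges)
reverseEdgeDirections: reverses the direction of every edge in a directed network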
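
A short sketch of these helpers on two hand-built networks, one undirected and one directed. The networks are arbitrary and chosen so that the effects are easy to verify by hand.

```r
library(graph)

g <- graphNEL(nodes = letters[1:4], edgemode = "undirected")
g <- addEdge(c("a", "b"), c("b", "c"), g)  # d is isolated

connComp(g)                            # one vector of node names per component
numEdges(g) + numEdges(complement(g))  # the two together form the complete graph
choose(numNodes(g), 2)                 # edge count of that complete graph

gd <- graphNEL(nodes = c("a", "b", "c"), edgemode = "directed")
gd <- addEdge(c("a", "b", "a"), c("b", "a", "c"), gd)  # a <-> b and a -> c

numEdges(gd)                      # a -> b and b -> a are counted separately
numEdges(ugraph(gd))              # the reciprocal pair becomes one undirected edge
edges(reverseEdgeDirections(gd))  # c -> a replaces a -> c
```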

Visualization

The plotting facilities involve many small details. It is enough to learn the parts you need.

graph.par
parRenderInfo
nodeRenderInfo
edgeRenderInfo
graphRenderInfo

  str(graph.par())

#> List of 3
#>  $ nodes:List of 11
#>   ..$ col      : chr "black"
#>   ..$ fill     : chr "transparent"
#>   ..$ textCol  : chr "black"
#>   ..$ fontsize : num 14
#>   ..$ lty      : num 1
#>   ..$ lwd      : num 1
#>   ..$ label    : NULL
#>   ..$ fixedsize: logi FALSE
#>   ..$ shape    : chr "circle"
#>   ..$ iwidth   : num 0.75
#>   ..$ iheight  : num 0.5
#>  $ edges:List of 6
#>   ..$ col     : chr "black"
#>   ..$ lty     : num 1
#>   ..$ lwd     : num 1
#>   ..$ textCol : chr "black"
#>   ..$ cex     : num 1
#>   ..$ fontsize: num 14
#>  $ graph:List of 9
#>   ..$ laidout   : logi FALSE
#>   ..$ recipEdges: chr "combined"
#>   ..$ main      : chr ""
#>   ..$ sub       : chr ""
#>   ..$ cex.main  : num 1.2
#>   ..$ cex.sub   : num 1
#>   ..$ label     : NULL
#>   ..$ col.main  : chr "black"
#>   ..$ col.sub   : chr "black"
