Code and results of the dissertation

0.1 Objective
0.2 Symptom label legend
- 0.2.1 Hamilton
- 0.2.2 BDI
0.3 Item frequencies of the scales
- 0.3.1 Hamilton
- 0.3.2 BDI
0.4 Estimation of the symptom networks
- 0.4.1 Analysis strategy 1
- 0.4.2 Analysis strategy 2
0.5 Stability (network accuracy)
0.6 Detection of symptom communities
- 0.6.1 Hamilton
- 0.6.2 BDI
0.7 Explained variance (predictability)
- 0.7.1 Strategy 1
- 0.7.2 Strategy 2
0.8 Network comparison test

#load("../cache/session_networks_suicide_attempt2.RData")

bdi_env <- new.env()
hdrs_env <- new.env()
eocl <- new.env()

bdiins_env <- new.env()
hdrsmeins_env <- new.env()

load("../session/session_cliquepercolation_clustering.RData", envir = eocl)
load("../session/session_bdi_networks.RData", envir = bdi_env)
load("../session/session_hdrs_networks.RData", envir = hdrs_env)

load("../session/session_hdrs_common_merged_ins_networks.RData", envir = hdrsmeins_env)
load("../session/session_bdi_common_one_ins_networks.RData", envir = bdiins_env)

0.1 Objective

This document gives further detail on how the analyses behind the dissertation's results were conducted.

A link to the dissertation will be made available soon.

0.2 Symptom label legend

0.2.1 Hamilton

hdrs_env$labs_df

0.2.2 BDI

bdi_env$labs_df

0.3 Item frequencies of the scales

0.3.1 Hamilton

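library(purrr)  # provides map_df()

# Tabulate the response frequencies of the 17 Hamilton items (columns 1 to 17 of symptoms_df)
hamilton_freq <- as.data.frame(map_df(symptoms_df[, 1:17], table))
rownames(hamilton_freq) <- hdrs_env$labs_df$nodes_labs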

hamilton_freq

0.3.2 BDI

bdi_freq <- as.data.frame(map_df(symptoms_df[, 18:ncol(symptoms_df)], table))
rownames(bdi_freq) <- bdi_env$labs_df$nodes_labs

bdi_freq

0.4 Estimation of the symptom networks

For both analysis strategies, the networks were estimated with the command below, where df is the data set converted to z-scores.

estimateNetwork(df, default = "EBICglasso")
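
The z-score conversion itself is not shown in this report. A minimal sketch of one common way to do it, assuming base R's scale() and a made-up data frame (raw_df and its item columns are hypothetical, not the dissertation data):

```r
# Toy data standing in for the item scores (hypothetical values)
set.seed(1)
raw_df <- data.frame(
  item1 = sample(0:4, 50, replace = TRUE),
  item2 = sample(0:4, 50, replace = TRUE),
  item3 = sample(0:2, 50, replace = TRUE)
)

# scale() centres each column on its mean and divides by its standard deviation
z_df <- as.data.frame(scale(raw_df))

round(colMeans(z_df), 10)  # column means are 0 up to floating-point error
apply(z_df, 2, sd)         # column standard deviations are 1
```

A data frame prepared this way could then be passed as df to estimateNetwork(); whether the original analysis used scale() or another routine is an assumption here.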

0.4.1 Analysis strategy 1

0.4.1.1 Hamilton

summary(hdrs_env$model_net)

## 
## === Estimated network ===
## Number of nodes: 17 
## Number of non-zero edges: 38 / 136 
## Mean weight: 0.02099407 
## Network stored in object$graph 
##  
## Default set used: EBICglasso 
##  
## Use plot(object) to plot estimated network 
## Use bootnet(object) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9 (3), 432-441.
##  Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. 
##  Friedman, J. H., Hastie, T., & Tibshirani, R. (2014). glasso: Graphical lasso estimation of gaussian graphical models. Retrieved from https://CRAN.R-project.org/package=glasso
##  Epskamp, S., Cramer, A., Waldorp, L., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48 (1), 1-18.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

0.4.1.2 BDI

summary(bdi_env$model_net)

## 
## === Estimated network ===
## Number of nodes: 21 
## Number of non-zero edges: 105 / 210 
## Mean weight: 0.02905073 
## Network stored in object$graph 
##  
## Default set used: EBICglasso 
##  
## Use plot(object) to plot estimated network 
## Use bootnet(object) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9 (3), 432-441.
##  Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. 
##  Friedman, J. H., Hastie, T., & Tibshirani, R. (2014). glasso: Graphical lasso estimation of gaussian graphical models. Retrieved from https://CRAN.R-project.org/package=glasso
##  Epskamp, S., Cramer, A., Waldorp, L., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48 (1), 1-18.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

0.4.2 Analysis strategy 2

0.4.2.1 Hamilton

summary(hdrsmeins_env$model_net)

## 
## === Estimated network ===
## Number of nodes: 9 
## Number of non-zero edges: 16 / 36 
## Mean weight: 0.0397573 
## Network stored in object$graph 
##  
## Default set used: EBICglasso 
##  
## Use plot(object) to plot estimated network 
## Use bootnet(object) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9 (3), 432-441.
##  Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. 
##  Friedman, J. H., Hastie, T., & Tibshirani, R. (2014). glasso: Graphical lasso estimation of gaussian graphical models. Retrieved from https://CRAN.R-project.org/package=glasso
##  Epskamp, S., Cramer, A., Waldorp, L., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48 (1), 1-18.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

0.4.2.2 BDI

summary(bdiins_env$model_net)

## 
## === Estimated network ===
## Number of nodes: 9 
## Number of non-zero edges: 30 / 36 
## Mean weight: 0.06876034 
## Network stored in object$graph 
##  
## Default set used: EBICglasso 
##  
## Use plot(object) to plot estimated network 
## Use bootnet(object) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Friedman, J. H., Hastie, T., & Tibshirani, R. (2008). Sparse inverse covariance estimation with the graphical lasso. Biostatistics, 9 (3), 432-441.
##  Foygel, R., & Drton, M. (2010). Extended Bayesian information criteria for Gaussian graphical models. 
##  Friedman, J. H., Hastie, T., & Tibshirani, R. (2014). glasso: Graphical lasso estimation of gaussian graphical models. Retrieved from https://CRAN.R-project.org/package=glasso
##  Epskamp, S., Cramer, A., Waldorp, L., Schmittmann, V. D., & Borsboom, D. (2012). qgraph: Network visualizations of relationships in psychometric data. Journal of Statistical Software, 48 (1), 1-18.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.
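
The "non-zero edges: x / y" lines in the four summaries can be read as network densities. The sketch below recomputes the denominators and the resulting densities; the node and edge counts are copied from the output above, while the data frame and its column names are introduced here only for illustration.

```r
# Node and non-zero edge counts copied from the summary() output above
nets <- data.frame(
  network = c("Hamilton (strategy 1)", "BDI (strategy 1)",
              "Hamilton (strategy 2)", "BDI (strategy 2)"),
  nodes = c(17, 21, 9, 9),
  nonzero_edges = c(38, 105, 16, 30)
)

# An undirected network with p nodes has p * (p - 1) / 2 possible edges
nets$possible_edges <- nets$nodes * (nets$nodes - 1) / 2

# Density: share of possible edges that were estimated as non-zero
nets$density <- nets$nonzero_edges / nets$possible_edges

nets
```

The possible-edge counts reproduce the 136, 210, 36 and 36 printed by summary(). The strategy 2 BDI network is the densest (30 of 36 edges) and the strategy 1 Hamilton network the sparsest (38 of 136).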

0.5 Stability (network accuracy)

bootnet(model_net, nBoots = 2500, caseN = 1000, type = "case", nCores = 2, statistics = c("Strength"))
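
The idea behind a case-dropping bootstrap can be illustrated in base R. This is only a conceptual sketch and not what bootnet does internally: it uses simulated data, and it replaces the EBICglasso network with a plain correlation matrix, so "strength" here is the sum of absolute correlations. All names (toy_df, strength_of, drop_props) are hypothetical.

```r
set.seed(1234)

# Simulated data standing in for the symptom scores (hypothetical)
n <- 300
p <- 6
loadings <- seq(0.2, 0.9, length.out = p)
common <- rnorm(n)
toy_df <- as.data.frame(sapply(loadings, function(l) l * common + rnorm(n)))

# Stand-in for strength: sum of absolute correlations with all other variables
strength_of <- function(d) {
  r <- abs(cor(d))
  diag(r) <- 0
  colSums(r)
}

full_strength <- strength_of(toy_df)

# For each proportion of dropped cases, draw subsets without replacement and
# correlate the subset strengths with the full-sample strengths
drop_props <- c(0.1, 0.3, 0.5, 0.7)
stability <- sapply(drop_props, function(prop) {
  keep_n <- round(n * (1 - prop))
  cors <- replicate(200, {
    idx <- sample(seq_len(n), keep_n)
    cor(full_strength, strength_of(toy_df[idx, ]))
  })
  mean(cors)
})

data.frame(dropped = drop_props, mean_cor_with_full = stability)
```

If the ordering of nodes by strength is stable, the mean correlation stays high even when a large share of cases is dropped.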

0.6 Detection of symptom communities

0.6.1 Hamilton

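library(qgraph)             # qgraph()
library(CliquePercolation)  # cpThreshold(), cpPermuteEntropy(), cpAlgorithm(), cpColoredGraph()

# Build a qgraph object from the estimated weight matrix
hdrs_w <- qgraph(hdrs_env$model_net$graph)

# Entropy of the weighted clique percolation solution (k = 3) for each intensity I
hdrs_thresholds <- cpThreshold(hdrs_w, method = "weighted", k.range = 3,
                           I.range = c(seq(0.38, 0.01, by = -0.01)),
                           threshold = "entropy")

# Permutation test on the entropy values
set.seed(1234)
hdrs_permute <- cpPermuteEntropy(hdrs_w, cpThreshold.object = hdrs_thresholds,
                            n = 100, interval = 0.95)

# Final clique percolation solution with I = 0.04
hdrs_results <- cpAlgorithm(hdrs_w, k = 3, method = "weighted", I = 0.04)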

hdrs_clique <- cpColoredGraph(hdrs_w, list.of.communities = hdrs_results$list.of.communities.numbers, layout="spring", theme='colorblind',
                    vsize=10, cut=0, border.width=1.5, labels = hdrs_env$labs_df$nodes_labs,
                     border.color='black', legend.cex=.37,
                     edge.width = 2, title ="Clique Percolation (optimizing entropy)")

0.6.2 BDI

bdi_w <- qgraph(bdi_env$model_net$graph) 

bdi_thresholds <- cpThreshold(bdi_w, method = "weighted", k.range = 3,     
                               I.range = c(seq(0.2, 0.01, by = -0.01)), 
                               threshold = "entropy")

set.seed(1234)
bdi_permute <- cpPermuteEntropy(bdi_w, cpThreshold.object = bdi_thresholds,
                                 n = 100, interval = 0.95)

# Intensity with the highest entropy
best_i <- bdi_thresholds$Intensity[which.max(bdi_thresholds$Entropy.Threshold)]
best_i # best value of I = 0.1

bdi_thresholds[which.max(bdi_thresholds$Entropy.Threshold), ]

bdi_results <- cpAlgorithm(bdi_w, k = 3, method = "weighted", I = best_i) # final clique percolation solution

bdi_clique <- cpColoredGraph(bdi_w, list.of.communities = bdi_results$list.of.communities.numbers, layout="spring", theme='colorblind',
                              vsize=8, cut=0, border.width=1.5, labels = bdi_env$labs_df$nodes_labs,
                              border.color='black', legend.cex=.37,
                              edge.width = 2, title ="Clique Percolation (optimizing entropy)")
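
To give an intuition for why the intensity with the highest entropy is preferred, the sketch below computes the Shannon entropy of a partition from its community sizes. It is a generic illustration: the helper partition_entropy is hypothetical, and how CliquePercolation handles shared or isolated nodes when computing Entropy.Threshold is not shown in this report and may differ.

```r
# Shannon entropy (in bits) of a partition, computed from the community sizes
partition_entropy <- function(sizes) {
  p <- sizes / sum(sizes)
  -sum(p * log2(p))
}

partition_entropy(c(21))        # one community holding every node
partition_entropy(c(7, 7, 7))   # three equally sized communities
partition_entropy(c(19, 1, 1))  # one dominant community plus two tiny ones
```

Entropy is zero when all nodes fall into a single community and increases as nodes are spread more evenly over several communities, which is why it penalises solutions dominated by one large community.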

0.7 Explained variance (predictability)

The commands below were run for each network.

library(bootnet)

# Fit a mixed graphical model on the full data and compute node-wise prediction errors
model_mgm <- estimateNetwork(df, default = "mgm")

pred_mgm <- predict(object = model_mgm$results, data = df, errorCon = c("RMSE", "R2"), errorCat = c("CC", "nCC"))
pred_mgm$errors

# Split into training (80%) and test (20%) sets
set.seed(1234)
train_index <- sample(x = seq_len(nrow(df)), size = 0.8 * nrow(df))
train_index

df_train <- df[train_index, ]
df_test <- df[-train_index, ]
dim(df_test)

# Fit on the training set only, then predict the training set and the held-out test set
model_mgm_train <- estimateNetwork(df_train, default = "mgm")
pred_mgm_train <- predict(object = model_mgm_train$results, data = df_train, errorCon = c("RMSE", "R2"), errorCat = c("CC", "nCC"))
pred_mgm_test <- predict(object = model_mgm_train$results, data = df_test, errorCon = c("RMSE", "R2"), errorCat = c("CC", "nCC"))

mgm_tt_rsquared <- data.frame("Variable" = pred_mgm_train$errors$Variable, 
                              "All_R2" = pred_mgm$errors$R2,
                              "Train_R2" = pred_mgm_train$errors$R2, 
                              "Test_R2" = pred_mgm_test$errors$R2)

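The comparison between Train_R2 and Test_R2 can be illustrated with a single node and ordinary linear regression. This is a simplified sketch on simulated data: mgm fits regularised node-wise regressions, so lm() is only a stand-in, and toy_df and r_squared are hypothetical names.

```r
set.seed(1234)

# Simulated continuous data (hypothetical); x3 depends on x1 and x2
n <- 200
toy_df <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy_df$x3 <- 0.5 * toy_df$x1 + 0.3 * toy_df$x2 + rnorm(n)

# 80/20 split, mirroring the split used above
train_index <- sample(seq_len(n), size = 0.8 * n)
toy_train <- toy_df[train_index, ]
toy_test <- toy_df[-train_index, ]

# Regress one node on the others, using the training set only
fit <- lm(x3 ~ x1 + x2, data = toy_train)

# R2 = 1 - residual sum of squares / total sum of squares
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted)^2) / sum((observed - mean(observed))^2)
}

train_r2 <- r_squared(toy_train$x3, predict(fit, newdata = toy_train))
test_r2 <- r_squared(toy_test$x3, predict(fit, newdata = toy_test))

c(train = train_r2, test = test_r2)
```

A test R2 that is much lower than the training R2 suggests that the explained variance reported for the full sample is optimistic.
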
0.7.1 Strategy 1

0.7.1.1 Hamilton

hdrs_env$model_mgm

## 
## === Estimated network ===
## Number of nodes: 17 
## Number of non-zero edges: 21 / 136 
## Mean weight: 0.01827155 
## Network stored in x$graph 
##  
## Default set used: mgm 
##  
## Use plot(x) to plot estimated network 
## Use bootnet(x) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Jonas M. B. Haslbeck, Lourens J. Waldorp (2016). mgm: Structure Estimation for Time-Varying Mixed Graphical Models in high-dimensional Data arXiv preprint:1510.06871v2 URL http://arxiv.org/abs/1510.06871v2.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

hdrs_env$mgm_tt_rsquared

0.7.1.2 BDI

bdi_env$model_mgm

## 
## === Estimated network ===
## Number of nodes: 21 
## Number of non-zero edges: 59 / 210 
## Mean weight: 0.02270954 
## Network stored in x$graph 
##  
## Default set used: mgm 
##  
## Use plot(x) to plot estimated network 
## Use bootnet(x) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Jonas M. B. Haslbeck, Lourens J. Waldorp (2016). mgm: Structure Estimation for Time-Varying Mixed Graphical Models in high-dimensional Data arXiv preprint:1510.06871v2 URL http://arxiv.org/abs/1510.06871v2.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

bdi_env$mgm_tt_rsquared

0.7.2 Strategy 2

0.7.2.1 Hamilton

hdrsmeins_env$model_mgm

## 
## === Estimated network ===
## Number of nodes: 9 
## Number of non-zero edges: 10 / 36 
## Mean weight: 0.03816103 
## Network stored in x$graph 
##  
## Default set used: mgm 
##  
## Use plot(x) to plot estimated network 
## Use bootnet(x) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Jonas M. B. Haslbeck, Lourens J. Waldorp (2016). mgm: Structure Estimation for Time-Varying Mixed Graphical Models in high-dimensional Data arXiv preprint:1510.06871v2 URL http://arxiv.org/abs/1510.06871v2.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

hdrsmeins_env$mgm_tt_rsquared

0.7.2.2 BDI

bdiins_env$model_mgm

## 
## === Estimated network ===
## Number of nodes: 9 
## Number of non-zero edges: 16 / 36 
## Mean weight: 0.0478196 
## Network stored in x$graph 
##  
## Default set used: mgm 
##  
## Use plot(x) to plot estimated network 
## Use bootnet(x) to bootstrap edge weights and centrality indices 
## 
## Relevant references:
## 
##      Jonas M. B. Haslbeck, Lourens J. Waldorp (2016). mgm: Structure Estimation for Time-Varying Mixed Graphical Models in high-dimensional Data arXiv preprint:1510.06871v2 URL http://arxiv.org/abs/1510.06871v2.
##  Epskamp, S., Borsboom, D., & Fried, E. I. (2016). Estimating psychological networks and their accuracy: a tutorial paper. arXiv preprint, arXiv:1604.08462.

bdiins_env$mgm_tt_rsquared

0.8 Network comparison test

library(NetworkComparisonTest)
nct_result <- readRDS(file = "../cache/nct_meins_result.rds")

options(max.print = 1000)
nct_result

## 
##  NETWORK INVARIANCE TEST 
##  Test statistic M:  0.176638 
##  p-value 0.0028 
## 
##  GLOBAL STRENGTH INVARIANCE TEST 
##  Global strength per group:  1.431263 2.475372 
##  Test statistic S:  1.044109 
##  p-value 0.0556 
## 
##  EDGE INVARIANCE TEST 
## 
##     Var1  Var2 p-value
## 10  Mood Guilt   0.060
## 19  Mood  Suic   0.669
## 20 Guilt  Suic   0.000
## 28  Mood   Ins   0.490
## 29 Guilt   Ins   0.616
## 30  Suic   Ins   0.704
## 37  Mood  WkAc   0.704
## 38 Guilt  WkAc   0.297
## 39  Suic  WkAc   0.060
## 40   Ins  WkAc   0.111
## 46  Mood SomGI   0.704
## 47 Guilt SomGI   0.501
## 48  Suic SomGI   0.704
## 49   Ins SomGI   0.602
## 50  WkAc SomGI   0.067
## 55  Mood   Lib   0.490
## 56 Guilt   Lib   1.000
## 57  Suic   Lib   0.061
## 58   Ins   Lib   0.490
## 59  WkAc   Lib   1.000
## 60 SomGI   Lib   0.058
## 64  Mood   Hyp   0.061
## 65 Guilt   Hyp   0.270
## 66  Suic   Hyp   1.000
## 67   Ins   Hyp   0.419
## 68  WkAc   Hyp   0.058
## 69 SomGI   Hyp   0.768
## 70   Lib   Hyp   1.000
## 73  Mood   LsW   0.540
## 74 Guilt   LsW   0.282
## 75  Suic   LsW   0.768
## 76   Ins   LsW   0.704
## 77  WkAc   LsW   0.160
## 78 SomGI   LsW   0.058
## 79   Lib   LsW   1.000
## 80   Hyp   LsW   1.000
## 
##  CENTRALITY INVARIANCE TEST 
##  
##        strength
## Mood  0.0954000
## Guilt 0.0000000
## Suic  0.1644000
## Ins   0.0954000
## WkAc  0.6880000
## SomGI 0.4433143
## Lib   0.1644000
## Hyp   0.0954000
## LsW   0.5319000
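
The two global test statistics in the output above can be reproduced by hand. The sketch below uses two small made-up weight matrices (net_a and net_b are hypothetical) to show how M and S are usually defined, and then checks the reported S against the two global strength values printed above. The p-values come from the permutation procedure in NetworkComparisonTest, which is not reproduced here.

```r
# Two small symmetric weight matrices standing in for the two groups (hypothetical values)
net_a <- matrix(c(0,   0.2,  0.1,
                  0.2, 0,    0,
                  0.1, 0,    0), nrow = 3, byrow = TRUE)
net_b <- matrix(c(0,   0.3,  0,
                  0.3, 0,    0.25,
                  0,   0.25, 0), nrow = 3, byrow = TRUE)

# Global strength: sum of absolute edge weights, counting each edge once
global_strength <- function(w) sum(abs(w[upper.tri(w)]))

# S: absolute difference in global strength between the two networks
S <- abs(global_strength(net_a) - global_strength(net_b))

# M: largest absolute difference between corresponding edge weights
M <- max(abs(net_a - net_b)[upper.tri(net_a)])

c(S = S, M = M)

# Consistency check with the global strengths reported in the output above
abs(1.431263 - 2.475372)
```

The last line returns 1.044109, which matches the test statistic S in the output.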