My dataframe looks like this.

Word1    Word2    Count
-----------------------
a        b        4
c        a        2
b        c        1

I want the following result.

from     to       count
-----------------------
1        3        4
2        1        2
3        2        1

I know I can get this easily with as_tbl_graph(df), but I want to produce it using only base R code. How can I create an identical result without other packages such as igraph, ggraph, or tidyverse?

jjw

1 Answer

You can convert the values to factors and then to integers:

# Collect the levels in order of first appearance. Taking them from both columns
# ensures a word that occurs only in Word2 does not become NA.
lvls <- unique(c(as.character(df$Word1), as.character(df$Word2)))

df$Word1 <- factor(df$Word1, levels = lvls) # convert both columns to factors sharing these levels
df$Word2 <- factor(df$Word2, levels = lvls)

df$Word1 <- as.integer(df$Word1)            # converting a factor to integer keeps only the level IDs
df$Word2 <- as.integer(df$Word2)

df
#>   Word1 Word2 Count
#> 1     1     3     4
#> 2     2     1     2
#> 3     3     2     1
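
The same mapping can also be sketched with match(), which returns the position of each value in a lookup vector and so skips the factor round trip. The helper name edges_to_ids and the example object df_example are hypothetical, and the output column names from, to, and count are assumptions chosen to mirror the table in the question:

```r
# Hypothetical helper (not part of the original answer): base R only.
edges_to_ids <- function(edges) {
  # Node labels in order of first appearance, taken from both columns
  nodes <- unique(c(as.character(edges$Word1), as.character(edges$Word2)))
  data.frame(
    from  = match(edges$Word1, nodes),  # position of each Word1 label in nodes
    to    = match(edges$Word2, nodes),  # position of each Word2 label in nodes
    count = edges$Count
  )
}

# Example data matching the question
df_example <- data.frame(
  Word1 = c("a", "c", "b"),
  Word2 = c("b", "a", "c"),
  Count = c(4, 2, 1),
  stringsAsFactors = FALSE
)

res <- edges_to_ids(df_example)
res
#>   from to count
#> 1    1  3     4
#> 2    2  1     2
#> 3    3  2     1
```

Because the function returns a new data.frame, the original edge list keeps its labels.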

igraph, tidygraph, etc. also keep a second data.frame that holds the level names (i.e., the node descriptions). We can create it from the levels saved earlier:

df_nodes <- data.frame(names = lvls, stringsAsFactors = FALSE)
df_nodes
#>   names
#> 1     a
#> 2     c
#> 3     b
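
A node table like this also lets you translate the integer IDs back into labels by plain indexing. The following is a minimal sketch; node_names, edge_ids, and edge_labels are hypothetical names, and the values are the example data from the question typed in by hand:

```r
# Node labels in ID order (ID 1 = "a", ID 2 = "c", ID 3 = "b"); assumed example data
node_names <- c("a", "c", "b")

# Edge list expressed as integer IDs
edge_ids <- data.frame(
  from  = c(1L, 2L, 3L),
  to    = c(3L, 1L, 2L),
  count = c(4, 2, 1)
)

# Indexing the label vector by ID recovers the original words
edge_labels <- data.frame(
  Word1 = node_names[edge_ids$from],
  Word2 = node_names[edge_ids$to],
  Count = edge_ids$count,
  stringsAsFactors = FALSE
)
edge_labels
```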

data

df <- read.csv(text = "Word1,Word2,Count
a,b,4
c,a,2
b,c,1", stringsAsFactors = FALSE)
JBGruber