How to compute in a binary matrix in R

Question

Here's a problem I haven't been able to solve.

Suppose we have the following code:

## A data frame named a    
a <- data.frame(A = c(0,0,1,1,1), B = c(1,0,1,0,0), C = c(0,0,1,1,0), D = c(0,0,1,1,0), E = c(0,1,1,0,1))
## 1st function: builds all pairwise combinations of the column names of a; the output is a character vector named items2
items2 <- c()
countI <- 1 
while(countI <= ncol(a)){
        for(i in countI){
                countJ <- countI + 1
                while(countJ <= ncol(a)){
                        for(j in countJ){
                                items2 <- c(items2, paste(colnames(a[i]), colnames(a[j]), collapse = '', sep = ""))
                        }
                        countJ <- countJ + 1
                }
                countI <- countI + 1
        }
}

And here's the code I'm trying to fix (the output should be a numeric vector called count_1):

## 2nd function
colnames(a) <- NULL ## just for facilitating the calculation
count_1 <- numeric(ncol(a)*2)
countI <- 1
while(countI <= ncol(a)){
        for(i in countI){
                countJ <- countI + 1
                while(countJ <= ncol(a)){
                        for(j in countJ){
                                s <- a[, i]
                                p <- a[, j]
                                count_1[i*2] <- as.integer(s[i] == p[j] & s[i] == 1)
                        }
                        countJ <- countJ + 1
                }
                countI <- countI + 1
        }
}

But when I run this code in the RStudio console, it returns an unexpected result:

 count_1
 [1] 0 0 0 0 0 1 0 1 0 0

However, I am expecting the following result:

count_1
[1] 1 2 2 2 1 1 1 1 2 1

For a more detailed explanation, see this image on Dropbox: https://www.dropbox.com/s/5ylt8h8wx3zrvy7/IMAG1074.jpg?dl=0

To explain a little more: I posted the 1st function only as an example of the kind of output I'm looking for. What I want from the 2nd function is, for each pair of columns, the number of rows in which both columns equal 1. The counter starts at 0 and is incremented by 1 for every row where both columns of the pair (AB, for example) are 1. This is repeated for every pair of columns (AB, AC, AD, AE, BC, BD, BE, CD, CE, and then DE); the number of pairs is n!/(2!(n-2)!). For example, suppose I have the following data frame:

a =

A B C D E
0 1 0 0 0
0 0 0 0 1
1 1 1 1 1
1 0 0 1 0
1 0 1 0 1

Then the number of rows in which both columns equal 1 is counted for each pair of columns, starting with the first two. (Note that I set colnames(a) <- NULL just to simplify the work and make it clearer.)

0 1 0 0 0
0 0 0 0 1
1 1 1 1 1
1 0 0 1 0
1 0 1 0 1

### Example 1: #####################################################

so from here I put (for columns A and B (AB))

s <- a[, i]
## s is equal to
## [1] 0 0 1 1 1
p <- a[, j]
## p is equal to
## [1] 1 0 1 0 0

Then I count the positions where both vectors equal 1, i.e. where a[, i] == 1 & a[, j] == 1 (the element-wise &, since && does not compare whole vectors). For this example the result is [1] 1

### Example 2: #####################################################

From here I put (for columns A and D (AD))

s <- a[, i]
## s is equal to
## [1] 0 0 1 1 1
p <- a[, j]
## p is equal to
## [1] 0 0 1 1 0

Then I count the positions where both vectors equal 1, i.e. where a[, i] == 1 & a[, j] == 1 (the element-wise &, since && does not compare whole vectors). For this example the result is [1] 2

And so on, I'll have a numeric vector named count_1 equal to:

[1] 1 2 2 2 1 1 1 1 2 1

where each element of count_1 corresponds to one pair of columns (the column names are shown here only for reference):

AB AC AD AE BC BD BE CD CE DE
 1  2  2  2  1  1  1  1  2  1
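The counting rule described above can be sketched with two plain nested for loops. This is only an illustration under stated assumptions: it uses the data frame as displayed in the table above (where column C differs from the data.frame() call at the top of the question), and the running index k is a hypothetical helper that does not appear in the original code.

```r
# Data frame as displayed in the table above (assumption: column C is
# 0,0,1,0,1 here, unlike the data.frame() call at the top of the question).
a <- data.frame(A = c(0, 0, 1, 1, 1),
                B = c(1, 0, 1, 0, 0),
                C = c(0, 0, 1, 0, 1),
                D = c(0, 0, 1, 1, 0),
                E = c(0, 1, 1, 0, 1))

n <- ncol(a)
count_1 <- integer(choose(n, 2))  # one slot per pair of columns
k <- 1                            # hypothetical running index into count_1
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    # Compare the two whole columns element-wise, then count the rows
    # in which both are 1.
    count_1[k] <- sum(a[, i] == 1 & a[, j] == 1)
    k <- k + 1
  }
}
count_1
# [1] 1 2 2 2 1 1 1 1 2 1
```

Compared with the 2nd function in the question, this compares entire columns rather than the single elements s[i] and p[j], and it writes each pair's count to its own position instead of to count_1[i*2].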

Please explain what your code is supposed to do. – zx8754 Nov 23 '17 at 20:32

score 0 · Accepted Answer · edited Jun 20 '20 at 09:12


Not clear what you're after at all.

As to the first code chunk, that is some ugly R coding involving a whole bunch of unnecessary while/for loops.

You can get the same result items2 in one single line.

items2 <- sort(toupper(unlist(sapply(1:4, function(i)
    sapply(5:(i+1), function(j)
        paste(letters[i], letters[j], sep = ""))))));
items2;
# [1] "AB" "AC" "AD" "AE" "BC" "BD" "BE" "CD" "CE" "DE"
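As a possible alternative (not part of the original answer), base R's combn() can build the same pairs directly from the column names, so nothing is hard-coded to five columns or to the letters A to E. A minimal sketch, assuming the data frame a from the top of the question:

```r
# Data frame from the top of the question
a <- data.frame(A = c(0, 0, 1, 1, 1),
                B = c(1, 0, 1, 0, 0),
                C = c(0, 0, 1, 1, 0),
                D = c(0, 0, 1, 1, 0),
                E = c(0, 1, 1, 0, 1))

# combn() enumerates every 2-element combination of the column names and
# applies paste(..., collapse = "") to each pair.
items2 <- combn(colnames(a), 2, paste, collapse = "")
items2
# [1] "AB" "AC" "AD" "AE" "BC" "BD" "BE" "CD" "CE" "DE"
```

Because combn() walks the pairs in the order (1,2), (1,3), ..., no sorting step is needed afterwards.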

As to the second code chunk, please explain what you're trying to calculate. It's likely that these while/for loops are as unnecessary as in the first case.

Update

Note that this is based on a as defined at the beginning of your post. Your expected output is based on a different a, which you changed further down the post.

There is no need for a for/while loop; each of the two "functions" can be written as a one-liner.

# Your sample dataframe a
a <- data.frame(A = c(0,0,1,1,1), B = c(1,0,1,0,0), C = c(0,0,1,1,0), D = c(0,0,1,1,0), E = c(0,1,1,0,1))

# Function 1
items2 <- toupper(unlist(sapply(1:(ncol(a) - 1), function(i) sapply(ncol(a):(i+1), function(j)
        paste(letters[i], letters[j], sep = "")))));
# Function 2
count_1 <- unlist(sapply(1:(ncol(a) - 1), function(i) sapply(ncol(a):(i+1), function(j)
        sum(a[, i] + a[, j] == 2))));

# Add names and sort
names(count_1) <- items2;
count_1 <- count_1[order(names(count_1))];
# Output
count_1;
#AB AC AD AE BC BD BE CD CE DE
# 1  2  2  2  1  1  1  2  1  1
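Another way to get the same counts, offered here only as a sketch and not as part of the original answer, is a matrix cross-product. It relies on the assumption that a contains only 0s and 1s, in which case entry [i, j] of t(m) %*% m is the number of rows where columns i and j are both 1.

```r
# Data frame from the top of the question
a <- data.frame(A = c(0, 0, 1, 1, 1),
                B = c(1, 0, 1, 0, 0),
                C = c(0, 0, 1, 1, 0),
                D = c(0, 0, 1, 1, 0),
                E = c(0, 1, 1, 0, 1))

m  <- as.matrix(a)
co <- crossprod(m)  # same as t(m) %*% m: pairwise co-occurrence counts

# co is symmetric, so reading its lower triangle column by column yields
# the pairs in the order AB, AC, AD, AE, BC, BD, BE, CD, CE, DE.
count_1 <- co[lower.tri(co)]
names(count_1) <- combn(colnames(a), 2, paste, collapse = "")
count_1
# AB AC AD AE BC BD BE CD CE DE
#  1  2  2  2  1  1  1  2  1  1
```

The diagonal of co holds the number of 1s in each single column, which is why only the off-diagonal triangle is extracted.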

edited Jun 20 '20 at 09:12

Community


answered Nov 23 '17 at 21:41

Maurits Evers


In fact, the first function (code) is just a part of the whole code that I am working on for my project. What you did will not help me to achieve my goal. However, it adds to my knowledge, and thanks a lot :) – Fouzi TAKELAIT Nov 23 '17 at 21:57
@zx8754 and I asked you to clearly state (in *words*) how `count_1` is being calculated. Why do you post code chunk 1 if it doesn't help you "to achieve your goal"? What is its relevance? The linked fuzzy image doesn't really help. So again, at this point, it remains **entirely unclear** what you're trying to calculate, and what the two code chunks do. That usually leads to people who are generally interested in helping losing interest very quickly. – Maurits Evers Nov 23 '17 at 22:24
I hope these changes will help to understand the problem I am facing. Thanks in advance – Fouzi TAKELAIT Nov 23 '17 at 23:14
Dataframe `a` as declared in the beginning doesn't match your `a` further down. Therefore your expected output doesn't correspond to the original `a`. Either way, I think I understand the question now; please see my updated answer. No need for `for`/`while` loops. Both functions are one-liners. – Maurits Evers Nov 23 '17 at 23:41
Please, I want the output of count_1 to be only the following vector: # 1 2 2 2 1 1 1 2 1 1. Thanks in advance. – Fouzi TAKELAIT Nov 23 '17 at 23:52
`count_1` *is* that vector! You can strip the names with `as.numeric(count_1)`. – Maurits Evers Nov 23 '17 at 23:55
@FouziTAKELAIT No worries, glad to help & good luck with your project. – Maurits Evers Nov 24 '17 at 00:04
Excuse me, please, I can't find the data frame **a** in the items2 function. This way, it will be difficult for me to implement this in my project, because I have a dynamic matrix whose ncols and nrows I can set as needed. – Fouzi TAKELAIT Nov 24 '17 at 00:08
See my updated code. You just needed to adjust the column indices, e.g. `sapply(1:4, ...)` becomes `sapply(1:(ncol(a) - 1), ...)` and so on. – Maurits Evers Nov 24 '17 at 00:19
I see, that's right. I really appreciate your help. – Fouzi TAKELAIT Nov 24 '17 at 00:22
Please, could you review this question for me? https://stackoverflow.com/q/47478507/7916257 . I have spent 15 days trying to solve this but I couldn't find a solution. This is a part of my project I'm trying to solve as a project assignment at university. – Fouzi TAKELAIT Nov 24 '17 at 19:15
