Questions tagged [tapply]

tapply is a function in the R programming language for applying a function to subsets of a vector.

The vector is broken into subsets, potentially of different lengths (a ragged array), based on the values of one or more other vectors. Each of these grouping vectors is either already a factor or is coerced to a factor by as.factor. A function is applied to each of the subsets. tapply then returns either an array or a list, depending on the output of the function: an array when the function returns a single value for every subset, and a list otherwise.
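The behaviour described above can be sketched in a few lines of R. This is only an illustrative example: the vectors `weight`, `school` and `sex` and their values are made up here and do not come from any of the questions below.

```r
# Hypothetical data (names and values invented for illustration).
weight <- c(52, 61, 48, 70, 66, 59)
school <- c("A", "B", "A", "C", "B", "A")  # character vector, coerced to a factor
sex    <- c("f", "m", "m", "f", "f", "m")

# One value per group: the result is a named one-dimensional array.
group_means <- tapply(weight, school, mean)
print(group_means)

# Several values per group: the result is a list, one element per group.
group_ranges <- tapply(weight, school, range)
print(group_ranges[["A"]])

# Two grouping vectors: the result is a two-dimensional array.
# Combinations with no observations (here school "C", sex "m") are NA.
by_school_sex <- tapply(weight, list(school, sex), sum)
print(by_school_sex)
```

The groups are ragged on purpose (three values for "A", two for "B", one for "C"), and the last call shows why empty factor combinations appear as NA rather than 0 in the result.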

354 questions
0
votes
1 answer

Why does tapply take the subset as NA and not exclude them totally

I have a question. I want to make a barplot with the mean and error bars, grouped by two factors. To get the mean and the standard errors I used the function tapply. However, for one of the factors I want to drop one level. So what I…
Marinka
0
votes
2 answers

How do I convert table formats in R

Specifically, I used the following setup: newdata <- tapply(mydata(#), list(mydata(X), mydata(Y)), sum) I have a table that is currently laid out as follows: X= State, Y= County within State, #= a numerical total of something __ Y1 Y2…
0
votes
1 answer

How to use tapply() within a for loop and print output in R?

I am using tapply() to apply a function to my data Myrepfun <- function(x,n){ nstudents <- replicate(1000,sum(sample(x, size=n,replace=TRUE))) quantile(nstudents,probs=0.95) } tapply(weight,schoolcode,Myrepfun,n=2) I would like to use this…
user1407670
-1
votes
1 answer

tapply ()? Equal lengths?

Good afternoon, I'm trying to apply the tapply function to obtain means across the different treatment groups (the 'Placebo' one and the 'Active' one) of the following dataset: > str(dat_long) 'data.frame': 1500 obs. of 7 variables: …
12666727b9
-1
votes
1 answer

Error in tapply() output return a vector of factors

When reading a data frame into R with read.csv() and using tapply() to compute treatment means, the result is a vector of factors. cfpr<-read.csv("E:/temp/vars.csv",sep=";") cfpr ano mes usd_brl x y 1 2014 5 2.221 181.83…
-1
votes
1 answer

A condition to all variable in r

I want to make a table consisting of 0 and 1. If a variable is larger than 0, it will be 1, otherwise 0. As the dataset has over 1,000 columns, I should probably use the 'sapply' function for this. How do I write the code?
jhyeon
-1
votes
2 answers

Run a function by groups

I'm currently working on removing outliers and I'm using Klodian Dhana's function on outlier subject (https://datascienceplus.com/identify-describe-plot-and-removing-the-outliers-from-the-dataset/#comment-3592066903). My dataset consists of 95000…
-1
votes
1 answer

Box plot error when adding mean

I realize this question has been asked previously and it has been suggested to use ggplot, lattice etc. My question relates to adding the mean value onto boxplot according to a categorical variable. Here is my code and it does not work: STEP 1: I…
-1
votes
1 answer

ttest between two datasets (e.g. cases and controls) for multiple bins

I want to compare two datasets at different bins. My input data is something like this: dataIn <- read.table(text = "bin_slots val_cases val_controls A 0.075 0.05 A 0.252 0.276 A 0.338 0.41 A 0.911 0.983 A 0.912 0.809 A …
Meraj
-1
votes
1 answer

R: tapply(x,y,sum) returns NA instead of 0

I have a data set that contains occurrences of events over multiple years, regions, quarters, and types. Sample: REGION Prov Year Quarter Type Hit Miss xxx yy 2008 4 Snow 1 0 xxx yy 2009 2 Rain 0 1 I have variables…
-1
votes
1 answer

Is it necessary to use factor to INDEX argument for tapply in r?

x #X Income Commute Job.Growth Physicians #1 A 26000 49.2 10.8 1987 #2 B 29300 45.3 9.5 517 #3 C 24800 39.8 8.2 592 #4 D 27900 46.8 7.6 3310 #5 E 37500 39.9 12.2 …
student.J
-1
votes
1 answer

R: Extract non-NA elements from a matrix and return with row/column labels

I have a large matrix as a result of using tapply with an INDEX argument of two rows from a dataframe. Most of the matrix is empty (NA). Here is how I used tapply: latavgs <- tapply(geodata$latitude,geodata[5:6],FUN=mean) where latavgs is my…
-1
votes
1 answer

Tapply only producing missing values

I'm trying to generate estimates of the percent of Catholics within a given municipality in a country and I'm using multilevel regression and post-stratification of survey data. The approach fits a multilevel logit and generates predicted…
-2
votes
1 answer

Using R to calculate the mean of the part of a vector

My R vector looks like this: vector <- c(3, 2, 1, 4, 6, 2, 7) I want to use the function tapply() to calculate the mean of the first 4 numbers from the vector. How do I do it? What did I do? tapply(vector(1,4), mean) but it seems like it does not…
floss
-2
votes
1 answer

Use function in loop over set of files in directory

I'm trying to do some data analysis as follows: I have about 100 subjects, each of whom has a file containing 40,000 lines of numbers. I also have an index file with 40,000 corresponding lines containing group numbers. I am trying to get the means…
Jake