Questions tagged [tapply]

tapply is a function in the R programming language for applying a function to subsets of a vector. The vector is broken into subsets, potentially of different lengths (a ragged array), based on the values of one or more other vectors. Each grouping vector is either already a factor or is coerced to one by as.factor. The function is applied to each of these subsets. tapply then returns either an array or a list, depending on the output of the function: an array when the function yields a single value per subset (with the default simplify = TRUE), and a list-mode array otherwise.
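As a minimal sketch of the behaviour described above, the following uses made-up data; the names values, groups, group_means and group_ranges are illustrative assumptions and come from none of the questions listed here. It shows the two return shapes: one value per subset gives a named array, several values per subset give a list-mode array.

```r
# Hypothetical data: a numeric vector and a grouping vector of equal length.
values <- c(10, 20, 30, 40, 50)
groups <- c("a", "b", "a", "b", "b")  # a character vector; tapply coerces it to a factor

# mean() returns one value per subset, so the result is a named 1-d array.
group_means <- tapply(values, groups, mean)
print(group_means)

# range() returns two values per subset, so the result is a list-mode array
# with one element per group.
group_ranges <- tapply(values, groups, range)
print(group_ranges[["b"]])
```

The subsets here have different lengths (two values for "a", three for "b"), which is the ragged-array case tapply is meant for.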

354 questions
2
votes
2 answers

Convert a list from tapply(.) to data.frame in R

I have the following code t <- tapply(z[,3],z[,1],summary) # > t # $AUS # Min. 1st Qu. Median Mean 3rd Qu. Max. # -0.92420 -0.57920 0.08132 -0.13320 0.35940 0.39650 # # $NZ # Min. 1st Qu. Median Mean 3rd Qu. …
darkage
  • 857
  • 3
  • 12
  • 22
2
votes
1 answer

tapply a row based function to subsets

I'm new to the tapply function and didn't succeed in computing a time difference within each requested subset. I have an input dataframe containing observation dates (column DATE) for some RN. In my script, I'm subsetting this dataframe in…
user2542995
  • 241
  • 2
  • 4
  • 11
2
votes
3 answers

Calculations within subsets of dataframe [R]

Facing difficulties with subset calculations. I am able to get overall stats like average purchase by customer (factor) using ave, tapply, ddply but I am not able to calculate visit by visit stats for each customer. Some simplified data below to…
5tanczak
  • 161
  • 1
  • 2
  • 8
2
votes
2 answers

alternative to a for loop for replacing a subset of elements in a matrix with elements in a vector in R

I'm using a for loop to replace a subset of elements of myarray using mycons vector. The subset in each column would be from mydates until the end. Is there an alternative to the for loop? mydates <-…
nopeva
  • 1,583
  • 5
  • 22
  • 38
2
votes
1 answer

sum by group in a data.frame

I'm trying to get the sum of a numerical variable per categorical variable (in a data frame). I've tried using tapply, but it doesn't take a whole data.frame. Here is a working example with some data that looks like this: > set.seed(667) > df…
Eric Fail
  • 8,191
  • 8
  • 72
  • 128
2
votes
1 answer

R - "linearizing" the results of tapply (to one single vector, unpacked by column)

In a dataframe I have a vector with some values, and vectors of categories that each value belongs to. I want to apply a function to the values, that operates "by category", so I use tapply. For example, in my case I want to rescale the values…
amit
  • 3,332
  • 6
  • 24
  • 32
2
votes
2 answers

Performance of reshaping table

How can I go from a table like this:

ID  Day  car_id  value
1   1    1       0
1   1    2       4
1   2    1       1
1   3    2       0
2   1    3       0
2   2    3       2
2   3    3       0
...

To one like this? I have tried using…
RodrigoReis
  • 111
  • 4
1
vote
3 answers

tapply like issue, but require dataframe output - R

This is my first post, so hopefully I explain what I need to do properly. I am still quite new to R and I may have read posts that answer this, but I just can't for the life of me understand what they mean. So apologies in advance if this has…
HeidelbergSlide
  • 293
  • 3
  • 13
1
vote
2 answers

Changing arguments in tapply?

I have several groups, let's say A, B, C, and I want to cut another variable based on these groups, i.e. each group has specific breaks for the same variable. If I had to calculate the group means, I'd use tapply like this:…
Matt Bannert
  • 27,631
  • 38
  • 141
  • 207
1
vote
1 answer

Error in running a manhattan plot using qqman package?

I'm trying to create a manhattan plot in linux. This is my first time doing so using qqman and I am stuck on this error. Here is my R code: library(data.table) library(qqman) data =…
Johnny
  • 59
  • 5
1
vote
3 answers

How to tapply in dplyr and create a new column

I'm stuck with dplyr (again!) and trying to solve my problem without dying in the attempt. The first lines of my df look like this: df <- structure(list(fecha = c(1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990,…
Juan Carlos
  • 173
  • 13
1
vote
2 answers

Empty factor level with tapply in after_stat causes hodgepodge

I would like to draw a plot with percentage labels per x-axis group. This works fine without empty groups: # library library(ggplot2) library(reshape2) # example data from reshape2 str(tips) #> 'data.frame': 244 obs. of 7 variables: #> $…
captcoma
  • 1,768
  • 13
  • 29
1
vote
2 answers

How to use an apply() or equivalent function to perform math operations on current and adjacent data frame rows?

I am performing simple column-wise math operations on data frame rows that also involve accessing adjacent, previous data frame rows. Although the below code works, it's cumbersome (at least with respect to my liberal use of cbind() and subset()…
1
vote
1 answer

Binding a list of summary data to a data.frame creates an unknown column in R

I have a large df (+100k rows, see snapshot of data below) that I'm trying to summarize (min, mean, median, max, etc.) a variable (salinity) in a table by group (species) using tapply, but if I use the whole dataset (which contains a few NA's, but…
Nate
  • 411
  • 2
  • 10
1
vote
1 answer

How to add a title to a histogram graph?

I have a data set named ais. This data set is in the package sn in R. I used the following code to read this data set: library(sn) data(ais) attach(ais) This data shows information (such as the gender, sport, height, weight, etc.) of 202…
Rojer
  • 335
  • 2
  • 9