Questions tagged [r-faq]

The r-faq tag groups a limited number of questions that discuss problems which come up regularly under the R tag. It is not the official R FAQ for Stack Overflow, but it serves as a useful source of information on common problems.

The "real" original R-FAQ can be found at https://cran.r-project.org/doc/FAQ/R-FAQ.html and is typically the first Google hit for "r-faq".

251 questions
96 votes, 5 answers

What are the differences between R's new native pipe `|>` and the magrittr pipe `%>%`?

In R 4.1 a native pipe operator was introduced that is "more streamlined" than previous implementations. I already noticed one difference between the native |> and the magrittr pipe %>%, namely 2 %>% sqrt works but 2 |> sqrt doesn't and has to be…
sieste
  • 8,296
  • 3
  • 33
  • 48
96 votes, 11 answers

How to plot all the columns of a data frame in R

The data frame has n columns and I would like to get n plots, one plot for each column. I'm a newbie and I am not fluent in R, anyway I found two solutions. The first one works but it does not print the column name (and I need them!): data <-…
Alessandro Jacopson
  • 18,047
  • 15
  • 98
  • 153
96 votes, 12 answers

Counting unique / distinct values by group in a data frame

Let's say I have the following data frame: > myvec name order_no 1 Amy 12 2 Jack 14 3 Jack 16 4 Dave 11 5 Amy 12 6 Jack 16 7 Tom 19 8 Larry 22 9 Tom 19 10 Dave …
Mehper C. Palavuzlar
  • 10,089
  • 23
  • 56
  • 69
96 votes, 10 answers

Calculate the mean by group

I have a large data frame that looks similar to this: df <- data.frame(dive = factor(sample(c("dive1","dive2"), 10, replace=TRUE)), speed = runif(10) ) > df dive speed 1 dive1 0.80668490 2 dive1…
Jojo
  • 4,951
  • 7
  • 23
  • 27
95 votes, 12 answers

Read an Excel file directly from a R script

How can I read an Excel file directly into R? Or should I first export the data to a text- or CSV file and import that file into R?
waanders
  • 8,907
  • 22
  • 70
  • 102
95 votes, 7 answers

Understanding the order() function

I'm trying to understand how the order() function works. I was under the impression that it returned a permutation of indices, which when sorted, would sort the original vector. For instance, > a <- c(45,50,10,96) > order(a) [1] 3 1 2 4 I would…
jeffshantz
  • 983
  • 1
  • 8
  • 6
95 votes, 9 answers

Pasting two vectors with combinations of all vectors' elements

I have two vectors: vars <- c("SR", "PL") vis <- c(1,2,3) Based on these vectors I would like to create the following vector: "SR.1" "SR.2" "SR.3" "PL.1" "PL.2" "PL.3" With paste I have the following result: paste(vars, vis, sep=".") [1]…
DSSS
  • 1,923
  • 4
  • 16
  • 15
94 votes, 7 answers

How to plot a function curve in R

What are the alternatives for drawing a simple curve for a function like eq = function(x){x*x} in R? It sounds such an obvious question, but I could only find these related questions on stackoverflow, but they are all more specific Plot line…
sjdh
  • 3,907
  • 8
  • 25
  • 32
94 votes, 1 answer

Re-ordering factor levels in data frame

I have a data.frame as shown below: task measure right m1 left m2 up m3 down m4 front m5 back m6 . . . The task column takes only six different values, which are treated as factors, and are ordered by R as: "back", "down",…
siva82kb
  • 1,910
  • 4
  • 19
  • 28
94 votes, 17 answers

Generate a dummy-variable

I have had trouble generating the following dummy-variables in R: I'm analyzing yearly time series data (time period 1948-2009). I have two questions: How do I generate a dummy variable for observation #10, i.e. for year 1957 (value = 1 at 1957 and…
Pantera
  • 1,051
  • 1
  • 8
  • 6
91 votes, 10 answers

Using ggplot2, can I insert a break in the axis?

I want to make a bar plot where one of the values is much bigger than all other values. Is there a way of having a discontinuous y-axis? My data is as follows: df <- data.frame(a = c(1,2,3,500), b = c('a1', 'a2','a3', 'a4')) p <- ggplot(data = df,…
djq
  • 14,810
  • 45
  • 122
  • 157
88 votes, 4 answers

R: what are Slots?

Does anyone know what a slot is in R? I did not find the explanation of its meaning. I get a recursive definition: "Slot function returns or set information about the individual slots of an objects" Help would be appreciated, Thanks - Alley
user573347
  • 975
  • 1
  • 7
  • 11
87 votes, 3 answers

How to subtract/add days from/to a date?

I'm trying to build folders to store data pulls. I want to label the folders with the day of that data in the pull. Ex. I pull 5 days ago data from mysql i want to name the folder the date from 5 days ago. MySQL can easily handle date arithmetic.…
Dan
  • 6,008
  • 7
  • 40
  • 41
87 votes, 3 answers

Why is `vapply` safer than `sapply`?

The documentation says vapply is similar to sapply, but has a pre-specified type of return value, so it can be safer [...] to use. Could you please elaborate as to why it is generally safer, maybe providing examples? P.S.: I know the answer and I…
flodel
  • 87,577
  • 21
  • 185
  • 223
86 votes, 3 answers

Subset / filter rows in a data frame based on a condition in a column

Given a data frame "foo", how can I select only those rows from "foo" where e.g. foo$location = "there"? foo = data.frame(location = c("here", "there", "here", "there", "where"), x = 1:5, y = 6:10) foo # location x y # 1 here 1 6 # 2 …
wishihadabettername
  • 14,231
  • 21
  • 68
  • 85