Questions tagged [r]

R is a free, open-source programming language & software environment for statistical computing, bioinformatics, visualization & general computing. Please use minimal reproducible examples others can run using copy & paste. Show desired output entirely. Use dput() for data & specify all non-base packages with library(). Don't embed pictures for data or code; use indented code blocks instead. For statistics questions, use https://stats.stackexchange.com.

R Programming Language

R is a free, open-source programming language and software environment for statistical computing, bioinformatics, information graphics, and general computing. It is a multi-paradigm, dynamically typed language. R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. R was created by Ross Ihaka and Robert Gentleman and is now developed by the R Core Team. The R environment is easily extended through a packaging system on CRAN, the Comprehensive R Archive Network.

Scope of questions

This tag should be used for programming-related questions about R. Including a minimal reproducible example in your question will increase your chances of getting a timely, useful answer. Questions should not use the rstudio tag unless they relate specifically to the RStudio interface and not just the R language.
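As a rough illustration of what a minimal reproducible example can look like, the sketch below builds a small, made-up data frame and shares it with dput(). The object names df and df_copy and the column names are hypothetical placeholders, not part of the tag guidance.

```r
# List every non-base package with library(); this sketch needs none.

# Hypothetical example data; in a real question this would be your own object
df <- data.frame(id = 1:3, score = c(2.5, NA, 4.1))

# dput() prints R code that recreates the object; paste that output into the question
dput(df)

# A reader can rebuild the data by assigning the pasted output to a name
df_copy <- structure(list(id = 1:3, score = c(2.5, NA, 4.1)),
                     class = "data.frame", row.names = c(NA, -3L))

# Check that the rebuilt object matches the original
all.equal(df, df_copy)
```

Keeping the data this small makes it easier for others to copy, paste, and run the example.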

If your question is more focused on statistics or data science, use Cross Validated or Data Science, respectively. Bioinformatics-specific questions may be better received on Bioconductor Support or Biostars. General questions about R (such as requests for off-site resources or discussion questions) are unsuitable for Stack Overflow and may be appropriate for one of the general, or special-interest, R mailing lists.

Please do not cross-post across multiple venues. Do research (read tag wikis, look at existing questions, or search online) to determine the most appropriate venue so that you have a better chance of receiving solutions to your question. Your question may be automatically migrated to a more appropriate Stack Exchange site. If you receive no response to your questions after a few days, or if your question is put on hold for being off-topic, it is then OK to post to another venue, giving a link to your Stack Overflow question - but don't cross-post just because your question is down-voted or put on hold for being unclear. Instead, work on improving your question.

Stack Overflow resources

Official CRAN Documentation

Other CRAN resources

Free Resources

Interactive R learning

  • Coursera - Learn how to use R for effective data analysis
  • DataCamp - Many interactive R and data science courses
  • Dataquest - Interactive R courses for data science
  • edX - Basic Statistics and R (basic course, not just for life sciences)
  • edX - Introduction to R Programming
  • R-exercises - 1000+ R exercises and solutions
  • RPubs - Easy web publishing from R
  • Swirl - R-package to learn R interactively

Free books on R:

Programming Chrestomathy (problems written in many languages)

Other free resource materials

IDEs and editors for R

Web application framework for R

  • Shiny - Turn your analyses into interactive web applications. No HTML, CSS, or JavaScript knowledge required.
  • FastRWeb - Fast Interactive Web Framework for Data Mining Using R

Graphical User Interfaces (GUI) in R

Code style guides

Other Resources

Recommended additional R resources include:

Alternative R engines

All alternative R engines aim to improve R's performance and memory management.

Downstream distributions with complete compatibility

Forks of R with near 100% code compatibility

  • pqR by Radford Neal (C-based).
  • Rho by Karl Millar, based upon CXXR by Andrew Runnalls (C++-based). The development on Rho has been suspended indefinitely.

Rewrites with high code compatibility

  • Renjin by BeDataDriven (Java-based).
  • TERR by Tibco (C++-based).

Experimental and early-stage rewrites

  • Riposte by Justin Talbot (C++-based).
  • FastR by Jan Vitek and Tomas Kalibera (Java-based).

Unrelated tags

Because R's name is so short, questions sometimes get tagged with the r tag when a different topic is meant. Here is a list of tags that mistagged R questions might be re-tagged to:

  • for questions related to the file R.java on
  • "A command line tool for running JavaScript scripts that use the Asynchronous Module Definition API (AMD) for declaring and using JavaScript modules and regular JavaScript script files. It is part of the RequireJS project, and works with the RequireJS implementation of AMD." (from the wiki summary)
  • for questions related to RStudio use the rstudio tag. Don't use this tag just because you are working with RStudio.
496613 questions
215 votes, 11 answers

Error: could not find function ... in R

This is meant to be a FAQ question, so please be as complete as possible. The answer is a community answer, so feel free to edit if you think something is missing. This question was discussed and approved on meta. I am using R and tried…
Joris Meys • 106,551 • 31 • 221 • 263
214 votes, 9 answers

Difference between R MarkDown and R NoteBook

I am trying to understand at a high level what the differences are between R Markdown and R NoteBook. I know they are interrelated but I would like to figure out how they are related. My understanding is this: I know R Notebooks are really R Markdown…
PagMax • 8,088 • 8 • 25 • 40
214 votes, 5 answers

How to convert a table to a data frame

I have a table in R that has str() of this: table [1:3, 1:4] 0.166 0.319 0.457 0.261 0.248 ... - attr(*, "dimnames")=List of 2 ..$ x: chr [1:3] "Metro >=1 million" "Metro <1 million" "Non-Metro Counties" ..$ y: chr [1:4] "q1" "q2" "q3"…
Victor Van Hee • 9,679 • 7 • 33 • 41
212 votes, 12 answers

Access lapply index names inside FUN

Is there a way to get the list index name in my lapply() function? n = names(mylist) lapply(mylist, function(list.elem) { cat("What is the name of this list element?\n") }) I asked before if it's possible to preserve the index names in the lapply()…
Robert Kubrick • 8,413 • 13 • 59 • 91
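One possible way to approach the question above, sketched in base R and not taken from the posted answers: since lapply() passes only the element values to FUN, you can iterate over the names instead, or hand both values and names to Map(). The list contents and the helper names res_by_name and res_by_map are made up for illustration.

```r
# Hypothetical named list standing in for `mylist` from the question
mylist <- list(a = 1, b = 2)

# Option 1: iterate over the names and look each element up inside FUN
res_by_name <- lapply(names(mylist), function(nm) {
  paste(nm, mylist[[nm]])
})

# Option 2: Map() walks over values and names in parallel
res_by_map <- Map(function(elem, nm) paste(nm, elem), mylist, names(mylist))
```

Map() keeps the names of its first list argument on the result, whereas the lapply() call over names(mylist) returns an unnamed list.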
212 votes, 2 answers

What do hjust and vjust do when making a plot using ggplot?

Every time I make a plot using ggplot, I spend a little while trying different values for hjust and vjust in a line like + opts(axis.text.x = theme_text(hjust = 0.5)) to get the axis labels to line up where the axis labels almost touch the axis,…
William Gunn • 2,925 • 8 • 26 • 22
212 votes, 5 answers

How to use R's ellipsis feature when writing your own function?

The R language has a nifty feature for defining functions that can take a variable number of arguments. For example, the function data.frame takes any number of arguments, and each argument becomes the data for a column in the resulting data table.…
Ryan C. Thompson • 40,856 • 28 • 97 • 159
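A minimal sketch of the ellipsis feature the question describes, assuming the usual base R idiom of capturing the arguments with list(...). The function name describe_args is hypothetical and exists only for this illustration.

```r
# Hypothetical helper: reports the class of each argument passed through `...`
describe_args <- function(...) {
  args <- list(...)  # capture all arguments, named or unnamed, as a list
  vapply(args, function(a) class(a)[1], character(1))
}

# Any number of arguments can be supplied
classes <- describe_args(x = 1, y = "a", TRUE)
```

The `...` can also be forwarded unchanged to another function, for example by calling data.frame(...) inside your own wrapper.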
210 votes, 10 answers

Speed up the loop operation in R

I have a big performance problem in R. I wrote a function that iterates over a data.frame object. It simply adds a new column to a data.frame and accumulates something. (simple operation). The data.frame has roughly 850K rows. My PC is still working…
Kay • 2,109 • 3 • 13 • 3
210 votes, 7 answers

Reasons for using the set.seed function

Many times I have seen the set.seed function in R, before starting the program. I know it's basically used for the random number generation. Is there any specific need to set this?
Vignesh • 2,247 • 2 • 14 • 12
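To illustrate why people call set.seed(), here is a small sketch (the seed value 42 and the variable names are arbitrary choices for this example): resetting the seed to the same value makes the random number generator replay the same sequence, which is what makes a simulation or a sampled example reproducible for other readers.

```r
# Fix the generator state, then draw three uniform random numbers
set.seed(42)
first_draw <- runif(3)

# Resetting to the same seed replays the same sequence
set.seed(42)
second_draw <- runif(3)

# The two draws match exactly
identical(first_draw, second_draw)
```

Without the second set.seed() call, the next runif(3) would continue the sequence and generally return different numbers.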
209 votes, 3 answers

How to find common elements from multiple vectors?

Can anyone tell me how to find the common elements from multiple vectors? a <- c(1,3,5,7,9) b <- c(3,6,8,9,10) c <- c(2,3,4,5,7,9) I want to get the common elements from the above vectors (ex: 3 and 9)
Chares • 2,753 • 3 • 17 • 7
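One way this could be done in base R, shown as a sketch rather than as the accepted answer: intersect() takes two vectors, so Reduce() can apply it pairwise across a list of any length. The vectors are the ones from the question; the result name common is an assumption.

```r
# Vectors from the question
a <- c(1, 3, 5, 7, 9)
b <- c(3, 6, 8, 9, 10)
c <- c(2, 3, 4, 5, 7, 9)

# Fold intersect() over the list: intersect(intersect(a, b), c)
common <- Reduce(intersect, list(a, b, c))
common
```

For the vectors above this leaves 3 and 9, matching the result the question asks for.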
209 votes, 17 answers

Determine the number of NA values in a column

I want to count the number of NA values in a data frame column. Say my data frame is called df, and the name of the column I am considering is col. The way I have come up with is following: sapply(df$col, function(x) sum(length(which(is.na(x))))) …
user3274289 • 2,426 • 3 • 16 • 14
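A shorter alternative to the sapply() call in the question, sketched with made-up data: is.na() is already vectorised and returns a logical vector, and sum() counts its TRUE values. The data frame contents and the names n_missing and per_column are hypothetical.

```r
# Hypothetical data frame with a column named `col`, as in the question
df <- data.frame(col = c(1, NA, 3, NA))

# TRUE counts as 1 when summed, so this is the number of NA values in the column
n_missing <- sum(is.na(df$col))

# The same idea for every column at once
per_column <- colSums(is.na(df))
```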
209 votes, 6 answers

Order data frame rows according to vector with specific order

Is there an easier way to ensure that a data frame's rows are ordered according to a "target" vector as the one I implemented in the short example below? df <- data.frame(name = letters[1:4], value = c(rep(TRUE, 2), rep(FALSE, 2))) df # name…
Rappster • 12,762 • 7 • 71 • 120
208 votes, 8 answers

Installing older version of R package

I am trying to use Rpy2 and ggplot2 but I get an error. After some searching for the error online, I found that the error occurs because there are changes in the ggplot2 package that are not yet reflected in Rpy2 (for example, see this post (Edit:…
hirolau • 13,451 • 8 • 35 • 47
206 votes, 9 answers

Show percent % instead of counts in charts of categorical variables

I'm plotting a categorical variable, and instead of showing the counts for each category value, I'm looking for a way to get ggplot to display the percentage of values in that category. Of course, it is possible to create another variable with the…
wishihadabettername • 14,231 • 21 • 68 • 85
205 votes, 21 answers

Replacing NAs with latest non-NA value

In a data.frame (or data.table), I would like to "fill forward" NAs with the closest previous non-NA value. A simple example, using vectors (instead of a data.frame) is the following: > y <- c(NA, 2, 2, NA, NA, 3, NA, 4, NA, NA) I would like a…
Ryogi • 5,497 • 5 • 26 • 46
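A base R sketch of the "fill forward" idea for a plain vector, offered as one possible approach and not as the answer given on the question. The helper name fill_forward is hypothetical; packaged alternatives such as zoo::na.locf() and tidyr::fill() also exist.

```r
# Vector from the question
y <- c(NA, 2, 2, NA, NA, 3, NA, 4, NA, NA)

# Hypothetical helper: carry the latest non-NA value forward
fill_forward <- function(x) {
  # For each position, the index of the most recent non-NA value (0 if none yet)
  idx <- cummax(seq_along(x) * (!is.na(x)))
  # Prepend NA so that index 0 maps to NA after shifting by one
  c(NA, x)[idx + 1]
}

filled <- fill_forward(y)
```

A leading NA has no earlier value to copy, so it stays NA in the result.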
205 votes, 5 answers

How to combine multiple conditions to subset a data-frame using "OR"?

I have a data.frame in R. I want to try two different conditions on two different columns, but I want these conditions to be inclusive. Therefore, I would like to use "OR" to combine the conditions. I have used the following syntax before with lot…
Sam • 7,922 • 16 • 47 • 62