Questions tagged [r]

R is a free, open-source programming language & software environment for statistical computing, bioinformatics, visualization & general computing. Please use minimal reproducible examples others can run using copy & paste. Show desired output entirely. Use dput() for data & specify all non-base packages with library(). Don't embed pictures for data or code; use indented code blocks instead. For statistics questions, use https://stats.stackexchange.com.

R Programming Language

R is a free, open-source programming language and software environment for statistical computing, bioinformatics, information graphics, and general computing. It is a dynamically typed, multi-paradigm language. R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. R was created by Ross Ihaka and Robert Gentleman and is now developed by the R Core Team. The R environment is easily extended through a packaging system on CRAN, the Comprehensive R Archive Network.
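As a rough illustration of the lexical scoping mentioned above, the sketch below builds two closures. The names `make_counter`, `counter_a`, and `counter_b` are hypothetical and exist only for this example.

```r
# Hypothetical helper: returns a function that remembers its own `count`.
make_counter <- function() {
  count <- 0
  function() {
    # `<<-` assigns to `count` in the enclosing (defining) environment,
    # not in the environment of whoever calls the function
    count <<- count + 1
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()  # 1
counter_a()  # 2
counter_b()  # 1, because each closure has its own `count`
```

Each call to `make_counter()` creates a fresh environment, so the two counters do not share state.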

Scope of questions

This tag should be used for programming-related questions about R. Including a minimal reproducible example in your question will increase your chances of getting a timely, useful answer. Questions should not use the rstudio tag unless they relate specifically to the RStudio interface and not just the R language.
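A minimal reproducible example might look roughly like the sketch below. The data frame `df` and its columns are made up for illustration; substitute a small slice of your own data.

```r
# Load every non-base package the example needs (none are needed here).
# library(ggplot2)

# A small, made-up data frame standing in for your real data
df <- data.frame(id = 1:3, score = c(2.5, NA, 4.1))

# dput() prints R code that recreates the object;
# paste that output into the question instead of a screenshot
dput(df)

# The code that shows the problem, kept as short as possible
mean(df$score)                # NA, because of the missing value

# State the desired output explicitly
mean(df$score, na.rm = TRUE)  # 3.3
```

Anyone can paste this into a fresh R session and see the same behavior, which is the point of the exercise.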

If your question is more focused on statistics or data science, use Cross Validated or Data Science, respectively. Bioinformatics-specific questions may be better received on Bioconductor Support or Biostars. General questions about R (such as requests for off-site resources or discussion questions) are unsuitable for Stack Overflow and may be appropriate for one of the general, or special-interest, R mailing lists.

Please do not cross-post across multiple venues. Do research (read tag wikis, look at existing questions, or search online) to determine the most appropriate venue so that you have a better chance of receiving solutions to your question. Your question may be automatically migrated to a more appropriate Stack Exchange site. If you receive no response to your questions after a few days, or if your question is put on hold for being off-topic, it is then OK to post to another venue, giving a link to your Stack Overflow question - but don't cross-post just because your question is down-voted or put on hold for being unclear. Instead, work on improving your question.

Stack Overflow resources

Official CRAN Documentation

Other CRAN resources

Free Resources

Interactive R learning

  • Coursera - Learn how to use R for effective data analysis
  • DataCamp - Many interactive R and data science courses
  • Dataquest - Interactive R courses for data science
  • edX - Basic Statistics and R (basic course, not just for life sciences)
  • edX - Introduction to R Programming
  • R-exercises - 1000+ R exercises and solutions
  • RPubs - Easy web publishing from R
  • Swirl - R-package to learn R interactively

Free books on R:

Programming Chrestomathy (problems written in many languages)

Other free resource materials

IDEs and editors for R

Web application framework for R

  • Shiny - Turn your analyses into interactive web applications. No HTML, CSS, or JavaScript knowledge required.
  • FastRWeb - Fast Interactive Web Framework for Data Mining Using R

Graphical User Interfaces (GUI) in R

Code style guides

Other Resources

Recommended additional R resources include:

Alternative R engines

All alternative R engines aim to improve on R's performance and memory management.

Downstream distributions with complete compatibility

Forks of R with near 100% code compatibility

  • pqR by Radford Neal (C-based).
  • Rho by Karl Millar, based upon CXXR by Andrew Runnalls (C++-based). Development of Rho has been suspended indefinitely.

Rewrites with high code compatibility

  • Renjin by BeDataDriven (Java-based).
  • TERR by Tibco (C++-based).

Experimental and early-stage rewrites

  • Riposte by Justin Talbot (C++-based).
  • FastR by Jan Vitek and Tomas Kalibera (Java-based).

Unrelated tags

Because R's name is so short, questions sometimes get tagged with the r tag when a different topic is meant. Here is a list of tags that mistagged R questions might be re-tagged to:

  • r.java-file: for questions related to the file R.java on Android.
  • r.js: "A command line tool for running JavaScript scripts that use the Asynchronous Module Definition API (AMD) for declaring and using JavaScript modules and regular JavaScript script files. It is part of the RequireJS project, and works with the RequireJS implementation of AMD." (from the wiki summary)
  • rstudio: for questions related to RStudio. Don't use the rstudio tag just because you are working with RStudio.

496613 questions

242 votes, 5 answers

Change size of axes title and labels in ggplot2

I have a really simple question, which I am struggling to find the answer to. I hoped someone here might be able to help me. An example dataframe is presented below: a <- c(1:10) b <- c(10:1) df <- data.frame(a,b) library(ggplot2) g =…
KT_1
  • 8,194
  • 15
  • 56
  • 68
242 votes, 11 answers

Numbering rows within groups in a data frame

Working with a data frame similar to this: set.seed(100) df <- data.frame(cat = c(rep("aaa", 5), rep("bbb", 5), rep("ccc", 5)), val = runif(15)) df <- df[order(df$cat, df$val), ] df cat val 1 aaa 0.05638315 2 aaa…
eli-k
  • 10,898
  • 11
  • 40
  • 44
241 votes, 10 answers

Relative frequencies / proportions with dplyr

Suppose I want to calculate the proportion of different values within each group. For example, using the mtcars data, how do I calculate the relative frequency of number of gears by am (automatic/manual) in one go with…
jenswirf
  • 7,087
  • 11
  • 45
  • 65
240 votes, 5 answers

Why use purrr::map instead of lapply?

Is there any reason why I should use map(, function(x) ) instead of lapply(, function(x) ) the output should be the same and the benchmarks I made seem to show that lapply is slightly faster…
Tim
  • 7,075
  • 6
  • 29
  • 58
239 votes, 12 answers

Selecting only numeric columns from a data frame

Suppose, you have a data.frame like this: x <- data.frame(v1=1:20,v2=1:20,v3=1:20,v4=letters[1:20]) How would you select only those columns in x that are numeric?
Brandon Bertelsen
  • 43,807
  • 34
  • 160
  • 255
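
One way to approach the numeric-columns question above, sketched in base R (an illustration, not taken from the answers to that question; `is_num` and `x_num` are names chosen for this example):

```r
x <- data.frame(v1 = 1:20, v2 = 1:20, v3 = 1:20, v4 = letters[1:20])

# sapply() applies is.numeric() to each column and returns a logical vector
is_num <- sapply(x, is.numeric)

# Keep only the columns flagged TRUE; drop = FALSE keeps a data frame
# even if a single column survives
x_num <- x[, is_num, drop = FALSE]

names(x_num)  # "v1" "v2" "v3"
```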
238 votes, 8 answers

How do you delete a column by name in data.table?

To get rid of a column named "foo" in a data.frame, I can do: df <- df[-grep('foo', colnames(df))] However, once df is converted to a data.table object, there is no way to just remove a column. Example: df <- data.frame(id = 1:100, foo =…
Maiasaura
  • 32,226
  • 27
  • 104
  • 108
238 votes, 4 answers

R - Markdown avoiding package loading messages

I have been using Knitr via R-Studio, and think it is pretty neat. I have a minor issue though. When I source a file in an R-Chunk, the knitr output includes external comments as follows: + FALSE Loading required package: ggplot2 + FALSE Loading…
Roark
  • 2,575
  • 2
  • 14
  • 8
236 votes, 3 answers

Label points in geom_point

The data I'm playing with comes from the internet source listed below nba <- read.csv("http://datasets.flowingdata.com/ppg2008.csv", sep=",") What I want to do, is create a 2D points graph comparing two metrics from this table, with each player…
Green Demon
  • 4,078
  • 6
  • 24
  • 32
236 votes, 3 answers

Use of ~ (tilde) in R programming Language

I saw in a tutorial about regression modeling the following command: myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width What exactly does this command do, and what is the role of ~ (tilde) in the command?
Ankita
  • 2,798
  • 4
  • 18
  • 25
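
As a hedged illustration of the tilde question above: the tilde creates a formula object, which modelling functions later interpret against a data frame. The `lm()` call on the built-in `iris` data is an assumption added for this sketch and is not part of the original question.

```r
# The tilde builds a formula object; nothing is evaluated yet
myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width

class(myFormula)     # "formula"
all.vars(myFormula)  # the response first, then the predictors

# Modelling functions read "left side modelled by right side".
# A two-variable formula on the built-in iris data, for illustration:
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
coef(fit)            # intercept and slope
```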
233 votes, 8 answers

Remove NA values from a vector

I have a huge vector which has a couple of NA values, and I'm trying to find the max value in that vector (the vector is all numbers), but I can't do this because of the NA values. How can I remove the NA values so that I can compute the max?
CodeGuy
  • 28,427
  • 76
  • 200
  • 317
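
Two common base R ways to handle the NA question above, shown on a made-up vector `v` (a sketch, not taken from the answers to that question):

```r
v <- c(3, NA, 7, 1, NA)   # made-up example vector

max(v)                # NA, because missing values propagate
max(v, na.rm = TRUE)  # 7

# Or drop the NA values first and work with the cleaned vector
v_clean <- v[!is.na(v)]
max(v_clean)          # 7
```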
232 votes, 7 answers

Installing R on Mac - Warning messages: Setting LC_CTYPE failed, using "C"

I would like to install R on my laptop (Mac OS X version 10.7.3). I downloaded the latest version, double-clicked it, and it installed. When I start R I get the following error. I searched the internet but could not solve the problem. Any help…
user1267127
232 votes, 12 answers

pull out p-values and r-squared from a linear regression

How do you pull out the p-value (for the significance of the coefficient of the single explanatory variable being non-zero) and R-squared value from a simple linear regression model? For example... x = cumsum(c(0, runif(100, -1, +1))) y =…
grautur
  • 29,955
  • 34
  • 93
  • 128
232 votes, 9 answers

Explicitly calling return in a function or not

A while back I got rebuked by Simon Urbanek from the R core team (I believe) for recommending that a user explicitly call return at the end of a function (his comment was deleted though): foo = function() { return(value) } instead he…
Paul Hiemstra
  • 59,984
  • 12
  • 142
  • 149
230 votes, 10 answers

Load multiple packages at once

How can I load a bunch of packages at once without retyping the require command over and over? I've tried three approaches all of which crash and burn. Basically, I want to supply a vector of package names to a function that will load…
Tyler Rinker
  • 108,132
  • 65
  • 322
  • 519
228 votes, 7 answers

How to prevent ifelse() from turning Date objects into numeric objects

I am using the function ifelse() to manipulate a date vector. I expected the result to be of class Date, and was surprised to get a numeric vector instead. Here is an example: dates <- as.Date(c('2011-01-01', '2011-01-02', '2011-01-03',…
Zach
  • 29,791
  • 35
  • 142
  • 201
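
The behavior described in the ifelse() question above can be reproduced with the short sketch below. The `cutoff` value and the subtraction are hypothetical, since the original example is truncated; the indexed assignment is one possible workaround, not necessarily the one given in the answers.

```r
dates  <- as.Date(c("2011-01-01", "2011-01-02", "2011-01-03"))
cutoff <- as.Date("2011-01-02")   # hypothetical condition for this sketch

# ifelse() builds its result from the logical test vector,
# so the Date class of `yes` and `no` is lost
res <- ifelse(dates == cutoff, dates - 1, dates)
class(res)    # "numeric"

# Workaround: copy the vector and assign into the matching positions,
# which keeps the Date class
fixed <- dates
fixed[dates == cutoff] <- dates[dates == cutoff] - 1
class(fixed)  # "Date"
```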