Questions tagged [r]

R is a free, open-source programming language & software environment for statistical computing, bioinformatics, visualization & general computing. Please use minimal reproducible examples others can run using copy & paste. Show desired output entirely. Use dput() for data & specify all non-base packages with library(). Don't embed pictures of data or code; use indented code blocks instead. For statistics questions, use https://stats.stackexchange.com.

R Programming Language

R is a free, open-source programming language and software environment for statistical computing, bioinformatics, information graphics, and general computing. It is a multi-paradigm language and dynamically typed. R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. R was created by Ross Ihaka and Robert Gentleman and is now developed by the R Core Team. The R environment is easily extended through a packaging system on CRAN, the Comprehensive R Archive Network.
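
As a rough illustration of the lexical scoping mentioned above, the sketch below builds a closure that remembers a variable from the environment in which it was defined. The names `make_counter`, `counter_a`, and `counter_b` are invented for this example and are not part of R.

```r
# make_counter() returns a closure: the inner function keeps access to
# `count` in the environment where it was defined (lexical scoping).
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # updates `count` in the enclosing environment
    count
  }
}

counter_a <- make_counter()
counter_b <- make_counter()

counter_a()  # 1
counter_a()  # 2
counter_b()  # 1: counter_b has its own, independent `count`
```

Each call to `make_counter()` creates a fresh environment, which is why the two counters do not interfere with each other.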

Scope of questions

This tag should be used for programming-related questions about R. Including a minimal reproducible example in your question will increase your chances of getting a timely, useful answer. Questions should not use the rstudio tag unless they relate specifically to the RStudio interface and not just the R language.
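
A minimal sketch of sharing data for a reproducible example with dput() might look like the following. The data frame `df` and its columns are made up for illustration; in a real question you would call dput() on a small slice of your own data.

```r
# A small, made-up data frame standing in for the asker's real data
df <- data.frame(
  id    = 1:3,
  group = c("a", "b", "a"),
  value = c(1.5, 2.0, 3.25)
)

# dput() prints R code that recreates the object;
# paste that output into the question instead of a screenshot
dput(df)

# Readers can turn the printed code back into an object
# and check that it matches the original
restored <- eval(parse(text = paste(capture.output(dput(df)), collapse = "\n")))
identical(df, restored)
```

For large data sets, something like `dput(head(df, 20))` usually keeps the question short while still letting others run the code.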

If your question is more focused on statistics or data science, use Cross Validated or Data Science, respectively. Bioinformatics-specific questions may be better received on Bioconductor Support or Biostars. General questions about R (such as requests for off-site resources or discussion questions) are unsuitable for Stack Overflow and may be appropriate for one of the general, or special-interest, R mailing lists.

Please do not cross-post across multiple venues. Do research (read tag wikis, look at existing questions, or search online) to determine the most appropriate venue so that you have a better chance of receiving solutions to your question. Your question may be automatically migrated to a more appropriate Stack Exchange site. If you receive no response to your questions after a few days, or if your question is put on hold for being off-topic, it is then OK to post to another venue, giving a link to your Stack Overflow question - but don't cross-post just because your question is down-voted or put on hold for being unclear. Instead, work on improving your question.

Stack Overflow resources

Official CRAN Documentation

Other CRAN resources

Free Resources

Interactive R learning

  • Coursera - Learn how to use R for effective data analysis
  • DataCamp - Many interactive R and data science courses
  • Dataquest - Interactive R courses for data science
  • edX - Basic Statistics and R (basic course, not just for life sciences)
  • edX - Introduction to R Programming
  • R-exercises - 1000+ R exercises and solutions
  • RPubs - Easy web publishing from R
  • Swirl - R-package to learn R interactively

Free books on R:

Programming Chrestomathy (problems written in many languages)

Other free resource materials

IDEs and editors for R

Web application framework for R

  • Shiny - Turn your analyses into interactive web applications. No HTML, CSS, or JavaScript knowledge required.
  • FastRWeb - Fast Interactive Web Framework for Data Mining Using R

Graphical User Interfaces (GUI) in R

Code style guides

Other Resources

Recommended additional R resources include:

Alternative R engines

All alternative R engines aim to improve R's performance and memory management.

Downstream distributions with complete compatibility

Forks of R with near 100% code compatibility

  • pqR by Radford Neal (C-based).
  • Rho by Karl Millar, based upon CXXR by Andrew Runnalls (C++-based). Development of Rho has been suspended indefinitely.

Rewrites with high code compatibility

  • Renjin by BeDataDriven (Java-based).
  • TERR by Tibco (C++-based).

Experimental and early-stage rewrites

  • Riposte by Justin Talbot (C++-based).
  • FastR by Jan Vitek and Tomas Kalibera (Java-based).

Unrelated tags

Due to R's simple name, questions sometimes get tagged with the r tag when a different topic is meant. Here is a list of tags that mistagged R questions might be re-tagged to:

  • for questions related to the file R.java on
  • "A command line tool for running JavaScript scripts that use the Asynchronous Module Definition API (AMD) for declaring and using JavaScript modules and regular JavaScript script files. It is part of the RequireJS project, and works with the RequireJS implementation of AMD." (from the wiki summary)
  • for questions related to RStudio use the rstudio tag. Don't use this tag just because you are working with RStudio.
496613 questions
56 votes, 1 answer

How to control number of minor grid lines in ggplot2?

By default, it seems that ggplot2 uses a minor grid that is just half of the major grid. Is there any way to break this up? For example, I have a plot where the x-axis is years, and the major breaks are (1850, 1900, 1950, 2000). This means the…
naught101
56 votes, 3 answers

Number formatting axis labels in ggplot2?

I'm plotting a fairly simple chart using ggplot2 0.9.1.

    x <- rnorm(100, mean=100, sd = 1) * 1000000
    y <- rnorm(100, mean=100, sd = 1) * 1000000
    df <- data.frame(x,y)
    p.new <- ggplot(df,aes(x,y)) + geom_point()
    print(p.new)

Which works, but…
mediaczar
55 votes, 9 answers

ggplot2 pdf import in Adobe Illustrator missing font AdobePiStd

I created several simple ggplot2 plots and saved them to PDF files using the following commands:

    p <- ggplot(plotobject, aes(x=Pos, y=Pval),res=300)
    ggsave(plot=p,height=6,width=6,dpi=200, filename="~/example.pdf")

If I now open this example.pdf in…
Sander
55 votes, 5 answers

ggplot2: histogram with normal curve

I've been trying to superimpose a normal curve over my histogram with ggplot2. My formula:

    data <- read.csv (path...)
    ggplot(data, aes(V2)) + geom_histogram(alpha=0.3, fill='white', colour='black', binwidth=.04)

I tried several things: +…
Bloomy
55 votes, 5 answers

Merge or combine by rownames

In the example below I have two datasets (Z and A). I want to merge or combine these sets by the ILMN numbers. If there is no match, fill in NA. z <- matrix(c(0,0,1,1,0,0,1,1,0,0,0,0,1,0,1,1,0,1,1,1,1,0,0,0,"RND1","WDR",…
Lisann
55 votes, 5 answers

Euclidean distance of two vectors

How do I find the Euclidean distance of two vectors:

    x1 <- rnorm(30)
    x2 <- rnorm(30)
Jana
55 votes, 7 answers

ld: warning: text-based stub file are out of sync. Falling back to library file for linking

When I am trying to sourceCpp, it gives a warning: ld: warning: text-based stub file /System/Library/Frameworks//CoreFoundation.framework/CoreFoundation.tbd and library file …
MOOn
55 votes, 6 answers

Convert all columns to characters in a data.frame

Consider a data.frame with a mix of data types. For a weird purpose, a user needs to convert all columns to characters. How is it best done? A tidyverse attempt at solution is this: map(mtcars,as.character) %>% map_df(as.list) %>%…
userJT
55 votes, 4 answers

Convert date-time string to class Date

I have a data frame with a character column of date-times. When I use as.Date, most of my strings are parsed correctly, except for a few instances. The example below will hopefully show you what is going on. # my attempt to parse the string to Date…
Btibert3
55 votes, 6 answers

Difference between subset and filter from dplyr

It seems to me that subset and filter (from dplyr) give the same result. But my question is: is there at some point a potential difference, e.g. in speed or the data sizes each can handle? Are there occasions that it is better to use one or the…
Ruthger Righart
55 votes, 2 answers

Adding a column to a dataframe in R

I have the following dataframe (df):

        start     end
    1   14379   32094
    2  151884  174367
    3  438422  449382
    4  618123  621256
    5  698271  714321
    6  973394  975857
    7  980508  982372
    8  994539  994661
    9 1055151 1058824
    . . . . . …
David B
55 votes, 2 answers

Aggregate multiple columns at once

I have a data frame like so: x <- id1 id2 val1 val2 val3 val4 1 a x 1 9 2 a x 2 4 3 a y 3 5 4 a y 4 9 5 b x 1 7 6 b y 4 4 7 b x 3 9 8 b y 2 8 I wish to aggregate the…
Rookie
55 votes, 3 answers

Unpacking argument lists for ellipsis in R

I am confused by the use of the ellipsis (...) in some functions, i.e. how to pass an object containing the arguments as a single argument. In Python it is called "unpacking argument lists", e.g. >>> range(3, 6) # normal call with…
mhermans
55 votes, 1 answer

list.files() all files in directory and subdirectories

I'm trying to list all the files in a directory, including subdirectories, that end with _input.txt. - folder 1 - a_input.txt - folder 2 - b_input.txt If folder 1 were my working directory, I would like list.files(pattern = "\\_input.txt$")…
alki
55 votes, 2 answers

What is the difference between cat and print?

cat and print both seem to offer a "print" functionality in R.

    x <- 'Hello world!\n'
    cat(x)
    # Hello world!
    print(x)
    # [1] "Hello world!\n"

My impression is that cat most resembles the typical "print" function. When do I use cat, and when do I use…
Simon Kuang