Sometimes I get accustomed to a particular R package's design and want to search CRAN for all packages by that author (let's use Hadley Wickham for instance). How can I do such a search (I'd like to use R but this doesn't have to be the mode of search)?

Ben Bolker
Tyler Rinker
  • I think this post http://stackoverflow.com/questions/8722233/available-packages-by-publication-date has the basic ingredients you need ... – Ben Bolker Apr 10 '12 at 01:50
  • I posted a similar question a few days ago (http://stackoverflow.com/questions/10032079/crantastic-packages-sorted-by-number-of-users) but it was quickly closed for not being a real programming question. I hope you'll be luckier than me (admittedly, your wording is much better than mine!). If you are interested in an R solution, I have posted an article with code for scraping (some of) crantastic's data into a data.frame at http://r-de-jeu.blogspot.com/2012/04/50-most-used-r-packages.html. – flodel Apr 10 '12 at 01:57
  • I removed the answer posted in the question, and added it to the answer provided by @DWin. Please don't answer your own question inside the question - this gets too confusing. If the posted answer doesn't quite get there, post and accept your own answer. – Andrie Jun 16 '12 at 05:35

4 Answers

Crantastic can search by author. You can do quite a bit more with crantastic but the functionality you're looking for is already provided there.

Dason
  • That works, but I can't seem to search by author directly; instead I have to scan the list of packages until I find one by the author I want, and then I can click on their name. If I'm using it wrong, let me know. For example, if I search for [dbConnect](http://crantastic.org/packages/dbConnect) I can find that author and click his name, but I can't seem to type "Kurkiewicz" (dbConnect's author) into the search bar and return his packages. If this is the best approach it'll do, but it seems like there's got to be a better way, or maybe I'm doing it wrong. – Tyler Rinker Apr 10 '12 at 01:47
  • I guess I didn't actually try the search bar, which doesn't appear to search through the package maintainers. On that page I just did a simple Ctrl-F and searched the page that way. – Dason Apr 10 '12 at 02:03
  • Didn't know about Ctrl+F. That works. First response and probably the quickest thus far gets the check. – Tyler Rinker Apr 10 '12 at 02:08

Not exactly by author but perhaps access by maintainer would also be useful?

http://cran.r-project.org/web/checks/check_summary_by_maintainer.html#summary_by_maintainer

EDIT by Tyler Rinker

DWin's suggestion can be brought to fruition with these lines of code:

search.lib <- function(term, column = 1) {
    # Read the first table on CRAN's check summary page (needs the XML package)
    URL <- "http://cran.r-project.org/web/checks/check_summary_by_maintainer.html#summary_by_maintainer"
    dat <- XML::readHTMLTable(doc = URL, which = 1, header = TRUE, as.is = FALSE)
    names(dat) <- trimws(names(dat))
    # Maintainer is blank on continuation rows; fill it down from the row above
    dat$Maintainer[dat$Maintainer == ""] <- NA
    dat$Maintainer <- zoo::na.locf(dat$Maintainer)
    # Fuzzy-match the term against a column given by position or by (fuzzy) name
    if (is.numeric(column)) {
        dat[agrep(term, dat[, column]), 1:3]
    } else {
        dat[agrep(term, dat[, agrep(column, colnames(dat))]), 1:3]
    }
}

search.lib("hadley")
search.lib("bolker")
search.lib("brewer", 2)
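
To show what the fill-in step inside search.lib is doing, here is a rough base-R sketch on made-up data. The helper name `fill_down` and the toy rows are assumptions for illustration only; the real function uses zoo::na.locf, which behaves similarly but not identically (for example, it drops leading missing values by default).

```r
# Hypothetical helper: carry the last non-blank value forward, roughly
# what zoo::na.locf() does for the Maintainer column in search.lib.
fill_down <- function(x) {
  x <- as.character(x)
  x[!is.na(x) & x == ""] <- NA          # treat blank cells as missing
  idx <- cumsum(!is.na(x))              # index of the last non-missing value seen
  c(NA_character_, x[!is.na(x)])[idx + 1]
}

# Invented rows shaped like the check-summary table: the maintainer
# appears only on the first row of each group.
toy <- data.frame(
  Maintainer = c("A. Author", "", "", "B. Writer", ""),
  Package    = c("pkg1", "pkg2", "pkg3", "pkg4", "pkg5"),
  stringsAsFactors = FALSE
)

toy$Maintainer <- fill_down(toy$Maintainer)

# After filling down, every package by the maintainer is matched,
# not just the first one in the group.
toy[grepl("Author", toy$Maintainer), "Package"]
```

Without the fill-down, only pkg1 would match "Author", which is the symptom described in the comments below.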
Gregor Thomas
IRTFM
  • DWin I posted an edit (a solution) to my question using your suggestion +1 – Tyler Rinker Apr 10 '12 at 03:32
  • Due to blank rows, only the alphabetically first package by each author was being returned---maybe due to formatting updates on the site? Edited to fill in missing values and return all results. – Gregor Thomas Dec 19 '17 at 18:15

Adapted from "available.packages by publication date":

## restrict to first 100 packages (by alphabetical order)
pkgs <- unname(available.packages()[, 1])[1:100]
desc_urls <- paste0(options("repos")$repos,"/web/packages/", pkgs, 
    "/DESCRIPTION")
desc <- lapply(desc_urls, function(x) read.dcf(url(x)))
authors <- sapply(desc, function(x) x[, "Author"])

Since I'm a narcissist (and Hadley Wickham has no packages in the first 100 [this was true in 2012 but cannot possibly be true now, in 2018!]):

pkgs[grep("Bolker",authors)]
# [1] "ape"

The main problem with this solution is that doing it for real (rather than just for the first 100 packages) means hitting CRAN 3000+ times for the package information ...

edit: a better solution, based on Jeroen Ooms's solution in the same place:

recent.packages.rds <- function(){
    mytemp <- tempfile()
    # mode = "wb" keeps the binary .rds file intact (text mode can corrupt it on Windows)
    download.file(paste0(options("repos")$repos,"/web/packages/packages.rds"),
                  mytemp, mode = "wb")
    mydata <- as.data.frame(readRDS(mytemp), row.names=NA)
    mydata$Published <- as.Date(mydata[["Published"]])
    mydata
}

mydata <- recent.packages.rds()
unname(as.character(mydata$Package[grep("Wickham",mydata$Author)]))
# [1] "classifly"    "clusterfly"   "devtools"     "evaluate"     "fda"         
# [6] "geozoo"       "ggmap"        "ggplot2"      "helpr"        "hints"       
# [11] "HistData"     "hof"          "itertools"    "lubridate"    "meifly"      
# [16] "memoise"      "munsell"      "mutatr"       "normwhn.test" "plotrix"     
# [21] "plumbr"       "plyr"         "productplots" "profr"        "Rd2roxygen"  
# [26] "reshape"      "reshape2"     "rggobi"       "roxygen"      "roxygen2"    
# [31] "scales"       "sinartra"     "stringr"      "testthat"     "tourr"       
# [36] "tourrGui"  
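
As a minimal sketch of the same filtering step, the grep above can be wrapped in a small helper that also checks the Maintainer field and ignores case. The function name `find_packages_by_person` and the toy rows are hypothetical, not part of the original answer. The commented lines at the end assume a reasonably recent R in which `tools::CRAN_package_db()` is available to fetch the CRAN metadata as a data frame.

```r
# Hypothetical helper (not from the answers above): return the names of
# packages whose metadata matches `pattern` in any of the given fields.
# `db` is any data frame with a Package column plus the named fields.
find_packages_by_person <- function(db, pattern,
                                    fields = c("Author", "Maintainer")) {
  fields <- intersect(fields, names(db))
  hits <- rep(FALSE, nrow(db))
  for (f in fields) {
    # grepl() returns FALSE for NA entries, so missing fields never match
    hits <- hits | grepl(pattern, db[[f]], ignore.case = TRUE)
  }
  sort(unique(as.character(db$Package[hits])))
}

# Toy metadata standing in for the real CRAN table (invented rows)
toy_db <- data.frame(
  Package    = c("pkgA", "pkgB", "pkgC"),
  Author     = c("Jane Doe", "John Smith, Jane Doe", NA),
  Maintainer = c("Jane Doe <jane@example.org>",
                 "John Smith <john@example.org>",
                 "Someone Else <se@example.org>"),
  stringsAsFactors = FALSE
)

find_packages_by_person(toy_db, "doe")
find_packages_by_person(toy_db, "smith", fields = "Maintainer")

# Against the live CRAN metadata (needs network access):
# db <- tools::CRAN_package_db()
# find_packages_by_person(db, "Wickham")
```

Searching both fields picks up packages a person maintains but did not originally write, which is the distinction raised in the maintainer-based answer.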
MichaelChirico
Ben Bolker
  • The code should go to the `fortunes` package. Q: what do you get if you grep Bolker in R package authors? A: An ape. – Yihui Xie Apr 10 '12 at 02:04
  • Thanks Ben. Definitely an approach but as you point out takes a considerable amount of time. Dason's approach is likely the most efficient. Thanks for the R solution :) – Tyler Rinker Apr 10 '12 at 02:09
  • @Ben your method looks interesting but I can't get it to work. I'm using Windows 7 with the latest version of R (2.15 Easter ...). I get an error that says `Error in readRDS(mytemp) : error reading from connection`. If you want more about the error let me know, but the problem may be with `download.file` on the Windows machine, though I've used this before. – Tyler Rinker Apr 10 '12 at 03:31
  • Don't know, sorry ... worked for me (Ubuntu 10.04, r-devel). Does setting `options(repos=...)` explicitly first help? – Ben Bolker Apr 10 '12 at 12:45

Bolker's solution above is quite quick and still works, but since 2018 there's a package called pkgsearch that outputs more complete information. Here's a demo, continuing the trend of shameless self-promotion:

r$> pkgsearch::advanced_search(Author = "Waldir", size = 100)                                                                                                                               
- "advanced search" --------------------------------------------------------------------- 11 packages in 0.001 seconds -
  #     package           version by                     @ title                                                                          
  1 100 matlab2r          1.0.0   Waldir Leoncio        1M Translation Layer from MATLAB to R                                             
  2 100 simExam           1.0.0   Waldir Leoncio        3y Generate Simulated Data for IRT-Enabled Exams                                  
  3  83 citation          0.6.2   Jan Philipp Dietrich  1M Software Citation Tools                                                        
  4  83 LOGAN             1.0.0   Denise Reis Costa     3y Log File Analysis in International Large-Scale Assessments                     
  5  82 TruncExpFam       1.0.0   Waldir Leoncio        7d Truncated Exponential Family                                                   
  6  61 contingencytables 1.0.0   Waldir Leoncio        1M Statistical Analysis of Contingency Tables                                     
  7  60 DIscBIO           1.2.0   Waldir Leoncio       10M A User-Friendly Pipeline for Biomarker Discovery in Single-Cell Transcriptomics
  8  51 BayesSUR          2.0.1   Zhi Zhao              3M Bayesian Seemingly Unrelated Regression                                        
  9  44 lsasim            2.1.2   Waldir Leoncio        4M Functions to Facilitate the Simulation of Large Scale Assessment Data          
 10  39 BayesMallows      1.1.0   Oystein Sorensen      3M Bayesian Preference Learning with the Mallows Rank Model                       
 11  11 xaringan          0.22    Yihui Xie             8M Presentation Ninja   

Notice I had to increase size from the default of 10; otherwise I wouldn't get all the packages.

For comparison, here is the output of the approach from the aforementioned answer:

r$> unname(as.character(mydata$Package[grep("Waldir",mydata$Author)]))                        
 [1] "BayesMallows"      "BayesSUR"          "citation"          "contingencytables" "DIscBIO"           "LOGAN"             "lsasim"            "matlab2r"          "simExam"          
[10] "TruncExpFam"       "xaringan"
Waldir Leoncio