17

Has anyone been able to import a SharePoint list in R as a dataframe?

I have two separate data sources, one from a SharePoint list and the other from a DB that I wish to run an analysis on. I am able to connect to the DB without any problem but can't seem to find anything to connect to a SharePoint list.

The SharePoint server is 2007

zx8754
  • 52,746
  • 12
  • 114
  • 209
John Smith
  • 2,448
  • 7
  • 54
  • 78
  • would it be possible to provide an example? it is also possible to import your list into something else (e.g. xls, txt ...) then import it to R –  Mar 01 '15 at 13:03
  • Hi @Memo, the list itself is basically just an online excel sheet that users can update themselves. Its basically a feedback form for other parts of the business to update simultaneously based on operations that occur on the ground. I know its possible to directly link to the sheet using Ms access or with SQL server with a bit of difficulty but I was hoping there was a package that allowed you to do it similiarly to python where it treats the sharepoint list as just another table – John Smith Mar 01 '15 at 18:52

4 Answers4

17

I've been working on reading SharePoint 2010 lists using R for a little while now. Basically, I use the SharePoint web service to return the results from the list, then use xmlToDataFrame to convert to a dataframe.

URL <- "http://yoursharepointserver/_vti_bin/ListData.svc/yourlist"    
data = xmlParse(readLines(URL))

## get the individual list items    
items = getNodeSet(data, "//m:properties")

## convert to a data frame
df = xmlToDataFrame(items, stringsAsFactors = FALSE)

Since I'm using the web service I can filter the list before I return the results, which is really helpful in overcoming the limitations of the SharePoint web service. The following link is quite helpful... http://www.dotnetmafia.com/blogs/dotnettipoftheday/archive/2010/01/21/introduction-to-querying-lists-with-rest-and-listdata-svc-in-sharepoint-2010.aspx

Lee Mendoza
  • 186
  • 1
  • 2
3

Lee Mendoza's answer may well work if ListData.svc is running and/or you have administrative access to the SharePoint server.

If both of those aren't true: the following might work. At least it does for me on SharePoint 2010. If there's a better way of doing it when ListData.svc isn't present, I'd love to hear it.

 library(RCurl)
 library(XML)
 library(data.table)
 URL <- "http://<site>/_vti_bin/owssvr.dll?Cmd=Display&Query=*&XMLDATA=TRUE&List={GUID_OF_LIST}"
 rawData <- getURL(URL, userpwd = "username:password")
 # in real life  prompt for user credentials, don't put in script
 xmlData <- xmlParse (rawData, options=HUGE, useInternalNodes=TRUE)
 dataList <- xmlToList(xmlRoot(xmlData)[["data"]])
 # check the system return, on my SP2010 server the data block is 
 # named rs:data so this works
 dataMatrix <- do.call(rbind,dataList)
 finalDataTable <- data.table(dataMatrix)
David Wagle
  • 141
  • 5
  • 8
    I can't get this part `getURL(URL, userpwd = "username:password")`... no matter what I put in, I get a 403 forbidden. I've tried putting in my credentials a couple of different ways... any suggestions? When you put in ``, how much info should be included after xxxx.sharepoint.com? – Amit Kohli Sep 22 '15 at 11:04
1

The answer above works for lists which are <= 1000 rows only. Using "$Top" and "$Skip" in the URL, you can use the function below which iterates multiple times and imports all the data from the list regardless of the size. (This may not be the most clean way to write it, but it works!)

sp_import <- function(ListName) {

        urlstring <- "http://yoursharepointserver/_vti_bin/ListData.svc/yourlist" 
        data <- xmlParse(readLines(paste(urlstring, ListName, sep = ""), warn = FALSE))
        items <- getNodeSet(data, "//m:properties")
        df <- xmlToDataFrame(items, stringsAsFactors = FALSE)
        iterate <- nrow(df)
        skip <- 1

        while (nrow(df) == 1000 * skip) {
            data <- xmlParse(readLines(paste(urlstring, ListName, "?$top=1000&$skip=", iterate, sep = ""), warn = FALSE))
            items <- getNodeSet(data, "//m:properties")
            df <- rbind(df, xmlToDataFrame(items, stringsAsFactors = FALSE))
            iterate <- nrow(df)
            skip <- skip + 1
        }
        return(df)
}
SharpSharpLes
  • 302
  • 4
  • 20
1

The above approaches did not work for me. However, the approach below works for me. First, I went on a SharePoint list and I clicked on "extract as excel file" which generated a ".iqy" file. An example of the content of the ".iqy" file is in the text variable below. I added some X where I needed to hide the information. The approach is simple. Basically, you first create a temporary ".iqy" file. After, you open the ".iqy" file with Excel which automatically extracts the information of the SharePoint list and saves the information in the Excel Sheet. Afterwards, you simply need to extraction the information from the Excel file.

library(RDCOMClient)
library(openxlsx)

xlApp <- COMCreate("Excel.Application")
xlApp[["DisplayAlerts"]] <- FALSE
xlApp[["Visible"]] <- TRUE

text <- c("WEB", "1",
          "https://xxx.sharepoint.com/sites/xxx/_vti_bin/owssvr.dll?XMLDATA=1&List=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&View=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&RowLimit=0&RootFolder=",
          "", "Selection=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "EditWebPage=", "Formatting=None", "PreFormattedTextToColumns=True", "ConsecutiveDelimitersAsOne=True",
          "SingleBlockTextImport=False", "DisableDateRecognition=False", "DisableRedirections=False",
          "SharePointApplication=https://xxxx.sharepoint.com/sites/xxxx/_vti_bin",
          "SharePointListView=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx", "SharePointListName=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
          "RootFolder=")

temp_IQY_File <- tempfile(fileext = ".iqy")  
fileConn <- file(temp_IQY_File)
writeLines(text, fileConn)
close(fileConn)

temp_Excel_File <- gsub(pattern = ".iqy", replacement = ".xlsx", x = temp_IQY_File)

xlWbk <- xlApp$Workbooks()$Open(temp_IQY_File)
xlWbk$SaveAs(temp_Excel_File)

df <- openxlsx::read.xlsx(temp_Excel_File)
Emmanuel Hamel
  • 1,769
  • 7
  • 19