How can I rename column headers that were auto-named "X", "X.1", "X.2", and so on, so that each one takes the header of the column that follows it?

code:

library(pdftools)
library(data.table)
library(tabulizer)
pdf_file <- "new.pdf"

out2 <- extract_tables(pdf_file, pages = 89, output = "data.frame")
out2 <- as.data.table(out2)
colnames(out2)

Actual output:

"G" "X" "Day.7" "X.1"   "Day.8" "X.2"   "Day.9" "X.3"  

Expected output:

"G" "Day.7" "Day.7" "Day.8" "Day.8" "Day.9" "Day.9"
kumar

1 Answer

It is difficult to know without access to the PDF you are using, but it may be because creating a data frame in which two columns share the same name is not the default behaviour. If you are certain what the columns are, you could simply change the names manually.

    # Make fake data that matches what the pdf extract_tables function might be doing
    originalnames <- c("G", "X", "Day.7", "X.1", "Day.8", "X.2", "Day.9", "X.3")
    mm <- matrix(data=NA, nrow=1, ncol=8)
    colnames(mm) <- originalnames
    
    # Create a data frame with required names
    df <- data.frame(mm)
    requirednames <- c("G", "X", "Day.7", "Day.7", "Day.8", "Day.8", "Day.9", "Day.9")
    names(df) <- requirednames
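
If typing the names by hand is not practical, the renaming rule described in the question (each "X"-style header takes the name of the column to its right) could be applied programmatically. The following is only a sketch: `rename_from_next` is a hypothetical helper, not part of tabulizer or data.table, and it assumes that the auto-generated headers always match the pattern `X` or `X.<number>`. A trailing auto-named column such as "X.3" has no column after it, so this sketch leaves it unchanged.

```r
# Hypothetical helper: rename auto-generated "X", "X.1", ... headers
# using the header of the column immediately to their right.
rename_from_next <- function(nms) {
  # Assumption: auto-generated names look like "X" or "X.<digits>"
  is_auto <- grepl("^X(\\.[0-9]+)?$", nms)
  # Walk right to left so that a run of auto-named columns
  # all pick up the first real name to their right
  for (i in rev(seq_along(nms))) {
    if (is_auto[i] && i < length(nms) && !is_auto[i + 1]) {
      nms[i] <- nms[i + 1]
      is_auto[i] <- FALSE
    }
  }
  nms
}

# Same fake headers as in the example above
originalnames <- c("G", "X", "Day.7", "X.1", "Day.8", "X.2", "Day.9", "X.3")
newnames <- rename_from_next(originalnames)
print(newnames)
```

For a base data frame the result can be assigned with `names(df) <- rename_from_next(names(df))`. For a data.table, `setnames(out2, rename_from_next(names(out2)))` should do the same by reference.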
Andrew Chisholm