0

I have a .txt file in following way

Date/Time  Temp [C] Pressure [P]
2006-01-01T00:00:00  25  1018
2006-01-01T00:01:00  25  1018
.
.

While reading it in R, the header and data appears as follows:

Date.Time  Temp..C. Pressur..P.    [I dont know why [] have converted to.. It would be help if somebody can explain me this too]
2006-01-01T00:00:00  25  1018
2006-01-01T00:01:00  25  1018
.
.

As data in Date.Time column is stored in one column seperated by T, I want to divide this column into two columns Date and Time. To achieve this, I tried following code in R:

library(stringr)

df[c('Date', 'Time')] <- str_split_fixed(df$Date.Time, 'T', 2)

After running the code, I am able to obtain Date and Column as spearate columns but at the end of the table and the original Date.Time column is intact (shown below). Whereas I want Date.Time column to be removed and seperate Date and Time columns in the start of the table. How can I do it? Could someone please help. (I can fix it manually too but because I have thousands of .txt file, I am looking for a code solution.)

Date.Time  Temp..C. Pressur..P. Date Time
2006-01-01T00:00:00  25  1018  2006-01-01 00:00:00
2006-01-01T00:01:00  25  1018  2006-01-01 00:01:00
.
.
  • 3
    You can use `library(dplyr);library(tidyr); df %>% separate(Date.Time, into = c("Date", "Time"), sep = "T")` which automatically removes the original column as `remove = TRUE` by default – akrun May 03 '23 at 19:47
  • 1
    "Standard" column names contain only letters, numbers, `.` and `_`. Many of R's functions for importing columns replace non-standard parts of column names with `.`. If you're using `read.table()` or similar to import the data, you can use the argument `check.names = FALSE` to not check the column names and keep them as-is. – Gregor Thomas May 03 '23 at 19:47

1 Answers1

0

akrun's comment-answer works, however, separate has been superceded by the separate_wider_* and separate_longer_* functions, so it remains to be seen for how long that is the case.

Below are the ways you would do it with the new functions, alongside akrun's solution

library(tidyverse)

df <- structure(list(`Date/Time` = c("2006-01-01T00:00:00", "2006-01-01T00:01:00"
), `Temp [C]` = c(25L, 25L), `Pressure [P]` = c(1018L, 1018L)), class = "data.frame", row.names = c(NA, 
-2L))

# all of the below are equivalent:
# 1. using a delimeter (the 'T') to split the columns
df <- separate_wider_delim(df, "Date/Time", delim = "T", names = c("Date", "Time"))

# 2. using the number of characters to split the columns
df <- separate_wider_position(df, "Date/Time", widths = c("Date" = 10, 1, "Time" = 8))

# 3. using a regex to split the columns
df <- separate_wider_regex(df, "Date/Time", patterns = c("Date" = "^.{10}","T","Time" = ".{8}$"))

# 4. akrun's answer
df <- separate(df, "Date/Time", into = c("Date", "Time"), sep = "T") # there is an option to try to conver the resulting columns to their type `convert = TRUE`, but it didn't work in this case

Then, if you wanted to convert the Date and Time columns into Date and Time formats respectively, you can run the following:

mutate(df, Date = ymd(Date), Time = hms(Time)) # an alternative to ymd is as.Date

Output:

# A tibble: 2 × 4
  Date       Time     `Temp [C]` `Pressure [P]`
  <date>     <Period>      <int>          <int>
1 2006-01-01 0S               25           1018
2 2006-01-01 1M 0S            25           1018
Mark
  • 7,785
  • 2
  • 14
  • 34