
I want to read a data file using:

ds <- read.table(file="/data/ken/tmp/tt", header=FALSE,
                  sep="\t", quote="\"", dec=".",
                  fill=TRUE, comment.char="",
                  stringsAsFactors=FALSE,
                  colClasses=rep("character", 6))

The file tt looks as follows, with \t as the delimiter:

20130129074502\thttp://xxx.com.cn/notebook/asus/526600_detail.html\t\t5025\t526600\t255dkmi

but it doesn't work:

Warning message:
In read.table(file = fcon, header = F, sep = "\t", quote = "\"",  :
  cols = 1 != length(data) = 6
Roland
catchyoulater

1 Answer


I think you are overcomplicating things. Try this:

read.table(text=tt)
            V1                                                 V2   V3     V4      V5
1 2.013013e+13 http://xxx.com.cn/notebook/asus/526600_detail.html 5025 526600 255dkmi

where tt is:

tt ='20130129074502\thttp://xxx.com.cn/notebook/asus/526600_detail.html\t\t5025\t526600\t255dkmi'
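Note that the call above relies on the default sep, which splits on any run of whitespace, so the two adjacent tabs collapse into one separator and only five columns come back. Here is a minimal sketch, assuming the data contain real tab characters, that passes sep="\t" so the empty third field is kept. The object name ds is taken from the question; nothing here is specific to the real file.

```r
# Same sample line as above; "\t" inside an R string literal is a real tab.
tt <- "20130129074502\thttp://xxx.com.cn/notebook/asus/526600_detail.html\t\t5025\t526600\t255dkmi"

# An explicit tab separator makes read.table treat "\t\t" as an empty field
# instead of a single run of whitespace.
ds <- read.table(text = tt, header = FALSE, sep = "\t", quote = "\"",
                 comment.char = "", stringsAsFactors = FALSE,
                 colClasses = "character")

ncol(ds)   # 6
ds$V3      # "" (the empty field between the two tabs)
```

With colClasses = "character" the first field also stays as the string "20130129074502" rather than being converted to 2.013013e+13.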

EDIT

You can read the file line by line and split each line using strsplit:

sapply(readLines(file.name, n=-1),
                 function(x) strsplit(gsub('[\t|\t\\]','@',x),'@'))
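The same line-by-line idea can probably be written more directly by splitting on the tab itself with fixed = TRUE. This is only a sketch under assumptions: it assumes the file uses real tab characters and that every line has the same number of fields, and the temporary file below is a hypothetical stand-in for the real file.name.

```r
# Hypothetical stand-in for the real file: write the sample line to a temp file.
file.name <- tempfile()
writeLines("20130129074502\thttp://xxx.com.cn/notebook/asus/526600_detail.html\t\t5025\t526600\t255dkmi",
           file.name)

lines <- readLines(file.name)

# fixed = TRUE treats "\t" as a literal tab rather than a regular expression.
# Interior empty fields are kept as ""; strsplit drops a trailing empty field.
fields <- strsplit(lines, "\t", fixed = TRUE)

lengths(fields)   # 6 for the sample line

# Bind the per-line character vectors into a data frame (one row per line).
ds <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
```

If some lines end in an empty field, their vectors will be one element short, so it is worth checking lengths(fields) before calling rbind.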
agstudy
  • That is not correct: the raw data has 6 columns (separated by 5 '\t'), but you only get 5 columns. – catchyoulater Feb 18 '13 at 08:21
  • Where are the limits of this column? Can you show me? Maybe I need to split column V2... – agstudy Feb 18 '13 at 08:33
  • There is an empty string between two '\t', as in "...detail.html\t\t5025" in the raw data. It seems that read.table ignores sep when an empty string sits between two '\t'. – catchyoulater Feb 18 '13 at 09:43
  • @catchyoulater I think you need to use `readLines` with some regular expression... – agstudy Feb 18 '13 at 10:09