import - Reading multiple space-delimited text files from a folder in R


I have 100 space-delimited text files in a folder. Each text file has a paragraph of text in it. I wish to extract the data into a data frame where column 1 is the file ID and column 2 is the corresponding text paragraph.

This is what I have tried so far, but it failed to extract the text paragraph in the desired format.

lf <- list.files(path = "", pattern = "'*.txt", full.names = TRUE, recursive = TRUE, include.dirs = TRUE)
data <- lapply(lf, read.table, sep="", header=FALSE)

A sample text file looks like this:

"yeah, , and repeated phone calls call in on continuously ask if there's promotional deal going on dvr's because i've had problems hopper , delays , today. bill or exchanging hopper enjoys better dvr's."

The output I'm getting is a list:

[[1]]
     v1  v2  v3       v4    v5    v6 v7 v8   v9 v10 v11       v12 v13          v14 v15 v16     v17
1 yeah, , and repeated phone calls  call  in  on   continuously ask  if there's
  v18         v19  v20   v21 v22   v23     v24  v25 v26  v27      v28  v29 v30    v31 v32 v33
1   promotional deal going  on dvr's because i've had problems hopper ,
     v34 v35    v36 v37 v38     v39  v40 v41        v42    v43    v44    v45 v46    v47 v48 v49
1 delays , today.   bill  or exchanging hopper enjoys better dvr's.

I want it in a data frame format like this:

file id         text
file1.txt       yeah, , and repeated phone calls...

Any pointers on what I'm missing?

Thanks in advance.

Try this (you do not want spaces as delimiters, since there are many of them in the paragraphs):

dat <- setNames(lapply(lf, read.table, sep="|", header=FALSE), lf)

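Choose a separator that you suspect does not occur in the text. I'm afraid sep="" is a bad choice because read.table interprets it, by default, as "whitespace", so every word ends up in its own column. The "title" of each entry in the list should then be the file name.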
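If the goal is the two-column data frame from the question, one possible route is to skip read.table altogether and read each file as raw lines. This is a different technique from the setNames() call above, shown only as a minimal sketch: the temporary folder, the demo file contents, and the column names file_id and text are all made up for illustration.

```r
# Demo setup: two small files in a temporary folder (hypothetical names and contents).
demo_dir <- file.path(tempdir(), "txt_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("yeah, and repeated phone calls", file.path(demo_dir, "file1.txt"))
writeLines(c("first line of a paragraph", "second line"), file.path(demo_dir, "file2.txt"))

# List the .txt files; "\\.txt$" matches names ending in ".txt".
lf <- list.files(path = demo_dir, pattern = "\\.txt$", full.names = TRUE)

# Read each file as raw lines and collapse them into one string per file,
# so spaces inside the paragraph are never treated as delimiters.
texts <- vapply(
  lf,
  function(f) paste(readLines(f, warn = FALSE), collapse = " "),
  character(1)
)

# One row per file: file name in column 1, paragraph in column 2.
result <- data.frame(
  file_id = basename(lf),
  text = unname(texts),
  stringsAsFactors = FALSE
)
print(result)
```

Because readLines() does no field splitting, there is no separator to choose, and a paragraph that spans several lines still ends up in a single cell.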

