| From | Sent On | Attachments |
|---|---|---|
| Claudia liliana Ballesteros Mejia | Oct 16, 2009 8:28 am | |
| Corrado | Oct 16, 2009 8:34 am | |
| Claudia liliana Ballesteros Mejia | Oct 20, 2009 12:59 am | |
| Subject: | Re: [R-sig-eco] reading large files in R | |
|---|---|---|
| From: | Claudia liliana Ballesteros Mejia (lail...@yahoo.com) | |
| Date: | Oct 20, 2009 12:59:08 am | |
| List: | org.r-project.r-sig-ecology | |
Dear List,
Thanks a lot, I was able to solve my problem. I hadn't changed the type of file
it was reading, which is why it gave a strange error.
Cheers, and thanks again
Liliana.
________________________________
From: Steve Friedman <frie...@gmail.com>
Sent: Sat, October 17, 2009 12:25:44 AM
Subject: Re: [R-sig-eco] reading large files in R
Claudia,
The file you have described is probably too large for R to handle. What OS are
you working with? It matters. Also, do you need the whole file, or can you read
in a portion of the file and process the data in logical geographical "zones"?
If you have a file spdiez.txt, this should work
spdiez <- read.table("spdiez.txt", header = TRUE, sep = ",") # provided that
# you want the header and the columns are in fact comma-separated.
Steve
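
Steve's suggestion of reading the file in portions can be sketched roughly as follows. This is only an illustration, not code from the thread: `process_in_chunks` and its arguments are hypothetical names, and it assumes the first line is an unquoted header with one name per column and that every chunk can be processed independently.

```r
# Hypothetical helper: read a delimited text file chunk by chunk and apply
# `fun` to each chunk, so the full table is never held in memory at once.
process_in_chunks <- function(path, chunk_rows, sep = ",", fun) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header line once to recover the column names
  # (assumes unquoted names, one per column).
  col_names <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]
  results <- list()
  repeat {
    chunk <- tryCatch(
      read.table(con, header = FALSE, sep = sep, nrows = chunk_rows,
                 col.names = col_names),
      # read.table() signals an error once no lines are left; note that
      # this handler also hides other read errors, so it is only a sketch.
      error = function(e) NULL
    )
    if (is.null(chunk) || nrow(chunk) == 0) break
    results[[length(results) + 1]] <- fun(chunk)
    if (nrow(chunk) < chunk_rows) break  # a short chunk means end of file
  }
  results
}
```

A call such as `process_in_chunks("spdiez.txt", 100000, ",", colMeans)` would then return one summary per chunk; the chunk size and the summary function here are placeholders to adapt.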
Dear list,
I'm working with modeling spatial distributions of some species of butterflies
and I want to work with the BIOMOD package. But I have a very large file (1.25
GB) with 5925284 rows and 28 columns. When I try to load it with read.table it
says:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
cannot allocate buffer in 'readTableHead'.
so I tried to use the code from "Using R to process large data files",
published in @CSC
(http://www.csc.fi/sivut/atcsc/arkisto/atcsc3_2007/ohjelmistot_html/R_and_large_data/)
but I can't get it right. Here is my code.
"spdiez.txt" is my file, and they suggest creating a matrix and dropping the
column and row names.
length(scan("spdiez.txt", nlines=1, sep="\t", what="character"))
m <- matrix(nrow=5925283, ncol=27)
filecon <- file("spdiez.txt", open="r")
pos <- seek(filecon, rw="r")
for (i in 1:5925283) {
  if (i %% 100 == 0) {
    print(i)
  }
  tt <- readLines(filecon, n=1)
  tt2 <- na.omit(as.numeric(unlist(strsplit(tt, "\t"))))
  if (i != 1) {
    m[(i-1), ] <- t(tt2)
  }
  pos <- seek(filecon, rw="r")
}
but after this, it throws this error
Error in m[(i - 1), ] <- t(tt2) : replacement has length zero
In addition: Warning messages:
1: closing unused connection 3 (spdiez.txt)
2: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) : NAs introduced by coercion
3: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) : NAs introduced by coercion
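
One plausible reading of this error, consistent with the two coercion warnings, is that the lines are not actually tab-separated: splitting on "\t" then leaves each line as a single string, `as.numeric()` turns it into NA, and `na.omit()` drops it, leaving nothing to assign. The sketch below reproduces that with a made-up line and adds a hypothetical helper, `guess_fields`, that counts fields in the first line for a few candidate separators; the sample data and the helper are assumptions for illustration only.

```r
# A made-up line that is really comma-separated, split on tabs as in the loop.
tt <- "12.5,3.1,0,1"
tt2 <- suppressWarnings(na.omit(as.numeric(unlist(strsplit(tt, "\t")))))
length(tt2)     # 0: the whole line became one NA, which na.omit() dropped

# Splitting on the separator the line actually uses recovers the values.
tt2_ok <- as.numeric(unlist(strsplit(tt, ",")))
length(tt2_ok)  # 4

# Hypothetical check: number of fields in the first line of a file for each
# candidate separator. A count of 1 suggests that separator is not in use.
guess_fields <- function(path, seps = c("\t", ",", ";", " ")) {
  first_line <- readLines(path, n = 1)
  sapply(seps, function(s) length(strsplit(first_line, s, fixed = TRUE)[[1]]))
}
```

Running `guess_fields("spdiez.txt")` before the loop would show which separator yields the expected number of columns.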
I would appreciate any help or idea that I can use to solve my problem.
Kind regards, and thanks in advance for any suggestion.
Liliana.
--------------------------------------
Liliana Ballesteros Mejia
PhD. Student
Institute of Biogeography
University of Basel
St. Johanns Vorstadt 10
CH 4056 Basel
Tel: +41-612670803
Switzerland
_______________________________________________ R-sig-ecology mailing list
R-si...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology





