Subject: Re: [R-sig-eco] reading large files in R
From: Claudia Liliana Ballesteros Mejia
Date: Oct 20, 2009 12:59:08 am

Dear List,

Thanks a lot. I was able to solve my problem: I hadn't changed the type of file
it was reading, which is why it gave a strange error.
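
One plausible reading of this, offered only as an assumption since the thread
does not show the file itself: the data were comma-separated while the loop
quoted below split each line on tabs. The sketch uses a made-up data line to
show how that mismatch leaves nothing numeric to assign.

```r
# Assumption: the line is comma-separated but gets split on tabs.
line <- "sp1,0.5,1.2,3.4"                  # made-up example line

# No tab in the line, so strsplit() returns the whole line as one field.
fields <- strsplit(line, "\t")[[1]]
vals <- suppressWarnings(na.omit(as.numeric(fields)))
length(vals)                               # 0: the single field became NA and was dropped

# Assigning the empty result into a matrix row fails.
m <- matrix(nrow = 1, ncol = 3)
msg <- tryCatch({ m[1, ] <- t(vals); "ok" },
                error = function(e) conditionMessage(e))
msg                                        # the error message, captured as text

# Splitting on the separator the file really uses keeps the numeric fields;
# only the non-numeric label "sp1" becomes NA and is dropped.
vals_ok <- suppressWarnings(na.omit(as.numeric(strsplit(line, ",")[[1]])))
length(vals_ok)                            # 3
```

Checking the separator on the first line or two of the file before starting a
long loop is a cheap way to catch this early.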

Cheers, and thanks again


________________________________
From: Steve Friedman
Sent: Sat, October 17, 2009 12:25:44 AM
Subject: Re: [R-sig-eco] reading large files in R


The file you have described is probably too large for R to handle. What OS are
you working with? It matters. Also, do you need the whole file, or can you read
in a portion of the file and process the data in logical geographical "zones"?
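
As a rough back-of-the-envelope check on the size concern, the dimensions
quoted in the original question (5925284 rows, 28 columns) can be turned into a
memory estimate. This assumes every value is stored as an 8-byte double, which
is an assumption about the data, not something stated in the thread.

```r
rows <- 5925284              # from the original question
cols <- 28                   # from the original question
bytes_per_value <- 8         # assumption: all columns are doubles

bytes <- rows * cols * bytes_per_value
gib <- bytes / 2^30          # size of the data alone, in GiB
round(gib, 2)
```

The data alone come to a bit over 1.2 GiB under that assumption, and reading
typically needs working memory on top of that, so the OS and the address space
available to R do matter.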

If you have a file spdiez.txt, this should work:

spdiez <- read.table("spdiez.txt", header = TRUE, sep = ",")

provided that you want the header and the columns are in fact comma separated.
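
Reading the file in portions, as suggested above, could look roughly like the
following. This is only a sketch: process_in_chunks is a hypothetical helper,
the chunk size is arbitrary, and it assumes an unquoted header line with one
name per column. The demo uses a small temporary file, not spdiez.txt.

```r
# Hypothetical helper: apply `fun` to successive chunks of a delimited file.
process_in_chunks <- function(path, sep = ",", chunk_rows = 1000, fun = nrow) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # Read the header line once to get the column names.
  header <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]
  results <- list()
  repeat {
    # read.table() continues from the current position of an open connection.
    chunk <- tryCatch(
      read.table(con, header = FALSE, sep = sep, nrows = chunk_rows,
                 col.names = header),
      error = function(e) NULL  # treated as end of input; also hides other errors
    )
    if (is.null(chunk)) break
    results[[length(results) + 1]] <- fun(chunk)
    if (nrow(chunk) < chunk_rows) break  # short chunk: nothing left to read
  }
  results
}

# Demo on a small made-up file with 25 rows.
tmp <- tempfile(fileext = ".txt")
write.table(data.frame(x = 1:25, y = (1:25) / 2), tmp,
            sep = ",", row.names = FALSE, quote = FALSE)
res <- process_in_chunks(tmp, sep = ",", chunk_rows = 10)
unlist(res)
```

Each chunk is an ordinary data frame, so per-zone summaries can be computed and
kept while the raw rows are discarded.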


Dear list,

I'm working with modeling spatial distributions of some species of butterflies
and I want to work with the BIOMOD package. But I have a very large file (1.25
GB) with 5925284 rows and 28 columns. When I try to load it with read.table it
fails with:

Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  cannot allocate buffer in 'readTableHead'

So I tried to use the code from "Using R to process large data files",
published in @CSC, but I can't get it right. Here is my code.

"spdiez.txt" is my file, and they suggest creating a matrix, dropping the
column and row names.

length(scan("spdiez.txt", nlines=1, sep="\t", what="character"))

m <- matrix(nrow=5925283, ncol=27)
filecon <- file("spdiez.txt", open="r")
pos <- seek(filecon, rw="r")

for(i in 1:5925283) {
  if (i %% 100 == 0) {
  }
  tt <- readLines(filecon, n=1)
  tt2 <- na.omit(as.numeric(unlist(strsplit(tt, "\t"))))
  if(i!=1) {
    m[(i-1),] <- t(tt2)
  }
  pos <- seek(filecon, rw="r")
}

but after this, it throws this error

Error in m[(i - 1), ] <- t(tt2) : replacement has length zero
In addition: Warning messages:
1: closing unused connection 3 (spdiez.txt)
2: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) : NAs introduced by coercion
3: In na.omit(as.numeric(unlist(strsplit(tt, "\t")))) : NAs introduced by coercion

I would appreciate any help or idea that I can use to solve my problem.

Kind regards, and thanks in advance for any suggestion.


--------------------------------------
Liliana Ballesteros Mejia
PhD. Student
Institute of Biogeography
University of Basel
St. Johanns Vorstadt 10
CH 4056 Basel
Tel: +41-612670803
Switzerland
