RE: Question about streaming to memorymapped files


I don’t have any offhand, no, but I would imagine that direct file writes will at some point need to make a system call, which is expensive (fwrite might buffer before eventually making the system call; it looks like FileOutputStream uses the raw system write for every write call).
The current MMap io interface isn’t usable as a streaming output, unfortunately, though I suppose I could just implement my own.
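
For reference, a growable memory-mapped writer along the lines discussed in this thread might look like the sketch below. It is only an illustration under stated assumptions: GrowableMmapWriter and its methods are hypothetical names, not part of the Arrow API; it is Linux-only because mremap is a Linux extension; and it grows the file with ftruncate rather than the seek/write mentioned further down the thread.

```cpp
// Linux-only sketch: mremap() is a Linux extension, not POSIX.
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper class (not an Arrow type): an append-only writer
// backed by a shared file mapping that doubles its capacity on demand.
class GrowableMmapWriter {
 public:
  explicit GrowableMmapWriter(const std::string& path,
                              std::size_t initial_capacity = 4096)
      : capacity_(initial_capacity == 0 ? 4096 : initial_capacity) {
    fd_ = ::open(path.c_str(), O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd_ < 0) throw std::runtime_error("open failed");
    // The file must be at least as large as the mapping we write through.
    if (::ftruncate(fd_, static_cast<off_t>(capacity_)) != 0) {
      ::close(fd_);
      throw std::runtime_error("ftruncate failed");
    }
    void* p = ::mmap(nullptr, capacity_, PROT_READ | PROT_WRITE,
                     MAP_SHARED, fd_, 0);
    if (p == MAP_FAILED) {
      ::close(fd_);
      throw std::runtime_error("mmap failed");
    }
    data_ = static_cast<std::uint8_t*>(p);
  }

  GrowableMmapWriter(const GrowableMmapWriter&) = delete;
  GrowableMmapWriter& operator=(const GrowableMmapWriter&) = delete;
  ~GrowableMmapWriter() { Close(); }

  // Appends n bytes, growing the file and the mapping first if needed.
  void Write(const void* src, std::size_t n) {
    assert(data_ != nullptr && "Write called after Close");
    if (n == 0) return;
    if (size_ + n > capacity_) Grow(size_ + n);
    std::memcpy(data_ + size_, src, n);
    size_ += n;
  }

  // Unmaps, trims the file to the bytes actually written, and closes it.
  void Close() noexcept {
    if (data_ != nullptr) {
      ::munmap(data_, capacity_);
      data_ = nullptr;
    }
    if (fd_ >= 0) {
      if (::ftruncate(fd_, static_cast<off_t>(size_)) != 0) {
        // Best effort: the file keeps its padded length.
      }
      ::close(fd_);
      fd_ = -1;
    }
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }

 private:
  void Grow(std::size_t needed) {
    std::size_t new_capacity = capacity_;
    while (new_capacity < needed) new_capacity *= 2;
    // Extend the file before the mapping: touching mapped pages that lie
    // beyond the end of the file raises SIGBUS.
    if (::ftruncate(fd_, static_cast<off_t>(new_capacity)) != 0)
      throw std::runtime_error("ftruncate failed");
    // MREMAP_MAYMOVE lets the kernel relocate the mapping, so any raw
    // pointers into the old region must be treated as invalid.
    void* p = ::mremap(data_, capacity_, new_capacity, MREMAP_MAYMOVE);
    if (p == MAP_FAILED) throw std::runtime_error("mremap failed");
    data_ = static_cast<std::uint8_t*>(p);
    capacity_ = new_capacity;
  }

  int fd_ = -1;
  std::uint8_t* data_ = nullptr;
  std::size_t size_ = 0;
  std::size_t capacity_ = 0;
};
```

A caller would construct the writer with a path, call Write for each chunk, and call Close (or let the destructor run) so that the file is trimmed to the written length. Doubling the capacity keeps the number of ftruncate/mremap calls logarithmic in the final size, which matters if the point is to avoid per-write system calls.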

-----Original Message-----
From: Antoine Pitrou [mailto:solipsis@xxxxxxxxxx] 
Sent: Wednesday, May 09, 2018 11:11 AM
To: dev@xxxxxxxxxxxxxxxx
Subject: Re: Question about streaming to memorymapped files


Do you know of any benchmark numbers / performance studies about this?
While it's true that a memory-mapped file avoids explicit system calls,
I've heard file I/O is quite well optimized, at least on Linux,
nowadays.

Regards

Antoine.


On Wed, 9 May 2018 14:47:53 +0000
"Ambalu, Robert" <Robert.Ambalu@xxxxxxxxxxx> wrote:
> Antoine, thanks for the quick reply.
> You can actually grow memory-mapped files with an mremap call (and I think a seek/write on the file); I do this in my applications and it works fine.
> I want the efficiency of writing via memory maps, so I would prefer to avoid FileOutputStream.
> 
> -----Original Message-----
> From: Antoine Pitrou [mailto:antoine@xxxxxxxxxx] 
> Sent: Wednesday, May 09, 2018 10:37 AM
> To: dev@xxxxxxxxxxxxxxxx
> Subject: Re: Question about streaming to memorymapped files
> 
> 
> Hi,
> 
> If you don't know the output size upfront then you should probably use a
> FileOutputStream instead.  By definition, memory mapped files must have
> a fixed size (since they are mapped to a fixed area in virtual memory).
> 
> Regards
> 
> Antoine.
> 
> 
> Le 09/05/2018 à 16:31, Ambalu, Robert a écrit :
> > Hey, I'm looking into streaming table updates into a memory-mapped file (C++).
> > I think I have everything I need (MemoryMappedFile output streamer, RecordBatchStreamWriter), but I don't understand how to properly create the memmap file.  It looks like it requires you to preset a size for the file when you create it, but since I'll be streaming I don't actually know how big a file I'm going to need...
> > Am I missing some other API point here?  Is there any reason why the size is required up front and the memmap doesn't auto-grow as needed?
> > 
> > Thanks in advance
> > - Rob