An Efficient Memory Stream, September 29, 2009
Some things that I do as a programmer have stuck with me despite all the changes
in programming over the years. Take memory allocation. Back in 1987, even though only
one program ran at a time on a PC, you had to make everything fit in 512K. Now you have
almost unlimited memory available to you as a programmer; you can't even take a photo
smaller than 512K anymore. But when I work with memory, I can't help but think about the
blocks of bytes moving around, and how I should operate on them in the most efficient way.
Twenty years ago, you had no choice: your program crashed when you ate up all the memory.
But now you can be completely ignorant of memory allocation as a programmer and, except
in special circumstances, get away with it.
A recent project I worked on is a good example. I am calling a web service that fetches a zip file,
then extracting the zip to a directory. The web service pulls the zip from a web site, so
it retrieves it in blocks of byte[]. The library I am using to extract files from the zip (DotNetZip)
expects a single byte[], or a stream of some sort. But here I am with a byte[][].
The easy way to bridge the gap is to retrieve all the blocks, allocate one big buffer to
hold them, and copy the blocks into it (sketched below). The 1987 programmer in me just couldn't stomach that. So
how do I get a byte[][] to the library? I could write the blocks to a file and load them back in, now that
I know the file size, but that seems a waste of time. Plus, I don't want my web service to have to be
able to write to disk.
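For reference, the allocate-and-copy approach I wanted to avoid looks roughly like this (FetchZipBlocks is a made-up stand-in for the web service call):

byte[][] a_blocks = FetchZipBlocks();
int a_total = 0;
foreach (byte[] a_bytes in a_blocks)
    a_total += a_bytes.Length;

// One big allocation, plus a second copy of every byte already in memory
byte[] a_whole = new byte[a_total];
int a_offset = 0;
foreach (byte[] a_bytes in a_blocks)
{
    Buffer.BlockCopy(a_bytes, 0, a_whole, a_offset, a_bytes.Length);
    a_offset += a_bytes.Length;
}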
The solution I used was to create a memory stream that could work with a
byte[][] instead of the standard byte[]. This way my bytes stay put, never having to be
reallocated. I started by copying the definition of MemoryStream, but replaced the
constructor to take an array of byte[].
using System;
using System.IO;

public class MemoryStreamArray : MemoryStream
{
    private long m_position = 0;      // absolute position within the combined blocks
    private int m_bufferNumber = 0;   // index of the block containing m_position
    private byte[][] m_buffer;        // the blocks handed to us by the web service
    private int m_capacity;           // total length of all the blocks
    private bool m_isDisposed = false;

    public MemoryStreamArray(byte[][] buffer)
    {
        if (buffer == null)
            throw new ArgumentNullException("buffer", "buffer cannot be null");
        m_buffer = buffer;
        m_capacity = 0;
        foreach (byte[] a_bytes in m_buffer)
            m_capacity += a_bytes.Length;
    }
The key to letting the Zip library operate on the MemoryStreamArray is to properly
read and seek within the blocks:
public override int Read(byte[] buffer, int offset, int count)
{
    if (buffer == null)
        throw new ArgumentNullException("buffer");
    if (offset < 0)
        throw new ArgumentOutOfRangeException("offset");
    if (count < 0)
        throw new ArgumentOutOfRangeException("count");
    if (buffer.Length - offset < count)
        throw new ArgumentException("more bytes are being asked to be copied than space available in buffer from the offset");

    int a_bufferPosition = offset;
    int a_count = 0;

    // Absolute offset at which the current block begins
    long a_thisBufferStart = 0;
    for (int a_ix = 0; a_ix < m_bufferNumber; a_ix++)
        a_thisBufferStart += m_buffer[a_ix].Length;

    while (m_position < m_capacity && a_count < count)
    {
        // Move to the next block once the position passes the end of this one
        if (m_position - a_thisBufferStart >= m_buffer[m_bufferNumber].Length)
        {
            if (m_bufferNumber + 1 >= m_buffer.Length)
                break;
            a_thisBufferStart += m_buffer[m_bufferNumber].Length;
            m_bufferNumber++;
        }
        buffer[a_bufferPosition] = m_buffer[m_bufferNumber][m_position - a_thisBufferStart];
        a_bufferPosition++;
        a_count++;
        m_position++;
    }
    return a_count;
}
public override long Seek(long offset, SeekOrigin loc)
{
    if (offset > Int32.MaxValue)
        throw new ArgumentOutOfRangeException("offset", "offset is greater than System.Int32.MaxValue");
    if (m_isDisposed)
        throw new ObjectDisposedException("The current stream instance is closed");

    long a_position = m_position;
    switch (loc)
    {
        case SeekOrigin.Begin: a_position = offset; break;
        case SeekOrigin.Current: a_position += offset; break;
        // End measures backward from the total capacity of the blocks
        case SeekOrigin.End: a_position = m_capacity + offset; break;
        default: throw new ArgumentException("invalid System.IO.SeekOrigin", "loc");
    }
    if (a_position < 0)
        throw new IOException("Seeking is attempted before the beginning of the stream");
    return (Position = a_position);
}
The Position property keeps the current buffer number in sync with the position:
public override long Position
{
    get { return m_position; }
    set
    {
        m_position = value;
        // Walk forward to find the block that contains the new position
        long a_count = 0;
        m_bufferNumber = 0;
        while (m_bufferNumber < m_buffer.Length - 1 &&
               a_count + m_buffer[m_bufferNumber].Length <= m_position)
            a_count += m_buffer[m_bufferNumber++].Length;
    }
}
The rest of the functions are pretty obvious. Since I only want to read the
blocks, I can throw exceptions from functions like Write() and GetBuffer().
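For example, something along these lines would do (the exact exception messages are up to you):

public override bool CanRead { get { return true; } }
public override bool CanSeek { get { return true; } }
public override bool CanWrite { get { return false; } }

// Total length is the combined length of all the blocks
public override long Length { get { return m_capacity; } }

// The stream is a read-only view of the blocks, so writing is refused
public override void Write(byte[] buffer, int offset, int count)
{
    throw new NotSupportedException("MemoryStreamArray is read-only");
}

// There is no single underlying byte[] to hand out
public override byte[] GetBuffer()
{
    throw new NotSupportedException("the data is held in multiple blocks");
}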
While this runs pretty fast and is mostly "reallocation" free, there is still
opportunity for enhancement. For example, the web service retrieves the
original zip file in chunks, but allocates the whole file before returning it as
a byte[][]. We could serve it back to the calling program in chunks as
well. But that introduces some interesting issues; considering the
maintenance, I chose to save that for another day.
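To see how the pieces fit together, here is roughly how the stream plugs into DotNetZip (again, FetchZipBlocks is a made-up stand-in for the web service call, and the extraction directory is just an example):

using Ionic.Zip;

byte[][] a_blocks = FetchZipBlocks();
using (MemoryStreamArray a_stream = new MemoryStreamArray(a_blocks))
using (ZipFile a_zip = ZipFile.Read(a_stream))
{
    // Extract every entry; the zip bytes stay in their original blocks the whole time
    foreach (ZipEntry a_entry in a_zip)
        a_entry.Extract(@"C:\Extracted");
}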