Website: http://tirania.org/blog
Original page: http://tirania.org/blog/archive/2008/Sep-24.html
C# Support for Tuples - Miguel de Icaza
But I think I know why it is not part of Stream: there are simply too many decisions to make if such a method must robustly handle all kinds of streams. What buffer size to use; what is a suitable time-out, particularly for network streams; should failed operations be retried, and how many times; etc.
Buffer size? A reasonable number, e.g. 4096 or 8192 bytes -- something close to filesystem block size without causing too much trouble for the GC.
Time outs? Ignore them. If timeouts are needed, then the calling code can set Stream.ReadTimeout and Stream.WriteTimeout appropriately; I don't see why a helper method should care.
Retries? Retries are evil: http://blogs.msdn.com/oldnewthing/archive/2005/....
A utility method doesn't need to be the be-all, end-all method to be useful. It just needs to provide a sane solution to a problem that would otherwise be hand-written every time.
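To make those choices concrete, here is a minimal sketch of such a helper. The class and method names are hypothetical (not an existing BCL or Mono API as of this thread), the 8192-byte default is just the figure suggested above, and time-outs and retries are deliberately left to the caller.

```csharp
using System;
using System.IO;

public static class StreamCopyExample
{
    // Hypothetical helper: copies everything left in source to destination
    // and returns the number of bytes copied. No time-outs, no retries.
    public static long CopyStream(Stream source, Stream destination, int bufferSize = 8192)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (destination == null) throw new ArgumentNullException(nameof(destination));
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Read may return fewer bytes than requested; only 0 means end of stream.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write exactly the bytes that were read, not the whole buffer.
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

The two classic bugs in hand-written versions of this loop are treating a short read as end of stream and writing buffer.Length bytes instead of the count actually read. Later versions of .NET (4.0 onwards) ship Stream.CopyTo, which covers the same ground.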
In my intuition, the need to 'copy' a stream almost always comes down to deficiencies in the design of the stream libraries or their consumers. Invariably it is more like 'relabeling' the stream just to suit the taste of a receiving party. On reflection, this might simply be a wrong intuition (I can think of simple counterexamples, like copying a socket to a file).
In all non-trivial cases, however, Copy is simply a misnomer, because a fair bit of transformation is usually involved (charsets, endianness, line ends, etc).
In my experience, C++ <iostream> is about the only library that 'gets it right' (.NET duplicates too many of the pitfalls from Java, though fortunately fewer of them). In C++, any compatible streams (i.e. those with well-known conversions or a 1:1 binary equivalence) can simply be copied by saying smart things like:
std::cout << fstream1.rdbuf() /* << std::eos */;
Essentially, it is the separation of buffer and stream that saves the day. (Don't try std::cout << fstream1; unless you are very interested in the (hex) address of the fstream1 instance.)
Stream is _only_ byte oriented. No encodings, no endianness, no line endings, just raw data. It is thus analogous to the C++ std::streambuf type, if even more primitive (there is no wstreambuf equivalent).
StreamReader and StreamWriter are responsible for text-oriented manipulation, such as encoding issues, end of line encodings, etc., which is what std::istream and std::ostream deal with in C++ (and more).
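As a small illustration of that split, the sketch below keeps the bytes in a Stream and only brings in an encoding when a StreamReader is layered on top. The class and method names are made up for the example.

```csharp
using System;
using System.IO;
using System.Text;

public static class ByteVersusTextExample
{
    // The Stream only knows about bytes; the StreamReader wrapped around it
    // is the layer that applies an encoding and turns bytes into characters.
    // Note that disposing the StreamReader also disposes the underlying stream.
    public static string ReadAllText(Stream bytes, Encoding encoding)
    {
        using (var reader = new StreamReader(bytes, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For instance, Encoding.UTF8.GetBytes("h\u00e9llo") yields 6 bytes, while ReadAllText(new MemoryStream(thoseBytes), Encoding.UTF8) returns a 5-character string: the byte count and the character count differ, and only the reader layer knows why.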
I removed it.
The problem is that a Stream-backed IEnumerable<T> IS NOT an IEnumerable<T>, as it tends to break things rather badly. For example, many LINQ methods (and rocks-playground IEnumerable<T> extension methods) assume that the sequence can be consumed repeatedly.
A stream-backed IEnumerable<byte> can only be iterated over *once*, reliably. Beyond that, you start getting into seeking issues, which _really_ start killing the idea (NetworkStreams don't seek), etc.
It's cute, but likely unworkable in practice (at least in my experience).
In spite of this, I'm still keeping some "once-only" iterator types (TextReaderRocks.Lines() and TextReaderRocks.Words()), as I think they're useful even if they can only be used once, but it's something you need to be careful about.
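To show what "once-only" means in practice, here is a rough sketch of a Lines() extension method. It is a hypothetical stand-in written for this illustration, not the actual Mono.Rocks code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TextReaderLinesExample
{
    // Hypothetical stand-in for a Lines() extension method.
    // Argument checking happens eagerly; reading happens lazily.
    public static IEnumerable<string> Lines(this TextReader reader)
    {
        if (reader == null) throw new ArgumentNullException(nameof(reader));
        return LinesIterator(reader);
    }

    private static IEnumerable<string> LinesIterator(TextReader reader)
    {
        string line;
        // ReadLine returns null once the reader is exhausted.
        while ((line = reader.ReadLine()) != null)
            yield return line;
    }
}
```

With var lines = new StringReader("a\nb\nc").Lines();, the first lines.Count() returns 3 and a second lines.Count() returns 0, because the first pass already drained the reader. That is exactly the behaviour that surprises code assuming a sequence can be enumerated repeatedly.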
I'll agree that quite a few methods are predicated on re-enumerability, but I don't think that breaks things, if you're of the mind that IDisposable isn't abused when you use it for scoping Transactions (for example).
The problem is that the Stream in question could be *anything*, e.g. a NetworkStream. By reading data from this stream (in order to make a copy) you will render the initial stream unusable because you've already read the data out, and you cannot rewind it. What you end up with is a single usable version of the stream instead of two.
If you want to be able to 'Copy' a stream, you have to do it manually. You need to read all the data from the initial stream as a byte[], then instantiate all the MemoryStreams you want from this byte[].
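A minimal sketch of that manual approach, assuming the whole stream fits comfortably in memory. The class and method names are invented for the example.

```csharp
using System;
using System.IO;

public static class StreamBufferingExample
{
    // Drains the source into a byte[]; the source is consumed afterwards.
    public static byte[] ReadAllBytes(Stream source)
    {
        using (var buffered = new MemoryStream())
        {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
                buffered.Write(chunk, 0, read);
            return buffered.ToArray();
        }
    }

    // Returns `count` independent, read-only streams over the same bytes.
    // Each copy has its own position, so readers do not disturb one another.
    public static MemoryStream[] Split(Stream source, int count)
    {
        byte[] bytes = ReadAllBytes(source);
        var copies = new MemoryStream[count];
        for (int i = 0; i < count; i++)
            copies[i] = new MemoryStream(bytes, writable: false);
        return copies;
    }
}
```

This sidesteps the synchronisation problem entirely because nothing is read lazily, at the cost of holding the full contents in memory before any copy can be used.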
Suppose you wanted to implement Stream.CopyStream, you could create a class like
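public class StreamSplitter : Stream
{
    // Outline only: overrides of Stream's abstract members (Read, Write, Seek, ...) are omitted.
    public StreamSplitter (Stream initialStream)
    {
        // Keep initialStream; its data is read into a shared byte[] on demand.
    }

    // Returns a MemoryStream-like view over the data buffered so far.
    public Stream GetCopy()
    {
        throw new NotImplementedException();
    }
}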
Internally the StreamSplitter will read from the initial stream and store the data in a byte[]. GetCopy will then return a MemoryStream type class based on this byte[]. This class would have the ability to tell the StreamSplitter to get more data from the initial stream and make it available to all the copies which have been created.
The problem is that any copy can initiate this request for more data at any time. This is a pain in the ass for thread synchronisation with regard to the initial stream. Then you have more threading problems when you try making this new data available to the existing copies. All I can say is that there's no way I'd ever implement something like that ;)
In that case, you would not use CopyStream, it is a perfectly fine restriction to say that the stream is consumed after CopyStream is finished.
There is nothing preventing a "TeeStream" (like the Unix tee command) from duplicating the contents as it goes, and the operation described above could be composed from it.
But many times I do not care about keeping a copy, and I do not care about chunked output, or whether it's a network stream or not; I just want to move the bytes from the source to the destination. And I have seen various broken versions of this loop, or versions that are too optimistic and have never been properly tested.
Something like:
public class TeeStream : Stream
{
    public TeeStream (IEnumerable<Stream> destinationStreams)
    {
    }
}
would be fairly easy to implement. Something like that should already be in the BCL I suppose. There's no real reason why not.
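For what it's worth, a write-only version could be sketched roughly as follows. This is one possible reading of the idea, not an existing BCL type: it does not own or dispose the destination streams, and it makes no attempt at thread safety.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch of a write-only stream that forwards every write to all destinations.
public class TeeStream : Stream
{
    private readonly Stream[] destinations;

    public TeeStream(IEnumerable<Stream> destinationStreams)
    {
        if (destinationStreams == null) throw new ArgumentNullException(nameof(destinationStreams));
        destinations = destinationStreams.ToArray();
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    // Every destination receives the same bytes, in the order given.
    public override void Write(byte[] buffer, int offset, int count)
    {
        foreach (Stream destination in destinations)
            destination.Write(buffer, offset, count);
    }

    public override void Flush()
    {
        foreach (Stream destination in destinations)
            destination.Flush();
    }

    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

Copying a source into a TeeStream built over a file and a MemoryStream would then give both "move the bytes" and "keep a copy" in a single pass. One open design question is what to do when a single destination throws; this sketch simply lets the exception propagate, which leaves the remaining destinations without that chunk.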
It just goes to show that if you ask a question like this you will get lots of people trying to look clever by thinking up reasons why it can't be done.
I've implemented this function multiple times. If the framework can have File.WriteAllLines then it sure as hell should have a CopyStream method. If people have esoteric situations where a simple implementation wouldn't work.... well then, don't use the simple implementation!
PS: TextReader should implement a 'Lines' enumerator.
stream2.Write(stream1), then that's trivial enough to implement. However if you want something more along the lines of giving you three identical copies of the same stream so you can independently read from the copies and progress the initial stream, then it's a lot harder.
byte[] bytes = GetData();
MemoryStream a = new MemoryStream(bytes);
MemoryStream b = new MemoryStream(bytes);
MemoryStream c = new MemoryStream(bytes);
If you want a CopyStream that essentially does the above except using a Stream as the base rather than a byte[] as the base, the task is very difficult.
http://gist.github.com/12956
In general, though, this is only useful for low latency and small streams. In environments where scalability is key, copying may be best performed using async I/O (BeginRead-BeginWrite).
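As a rough sketch of the asynchronous variant: the version below swaps the BeginRead/BeginWrite callback pattern for the Task-based ReadAsync/WriteAsync methods and async/await, which arrived in later versions of .NET than this thread assumes. The class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncStreamCopyExample
{
    // Same loop as a synchronous copy, but no thread is blocked while
    // a read or write is pending. Returns the number of bytes copied.
    public static async Task<long> CopyStreamAsync(Stream source, Stream destination, int bufferSize = 8192)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (destination == null) throw new ArgumentNullException(nameof(destination));

        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // A result of 0 from ReadAsync signals end of stream.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
            total += read;
        }
        return total;
    }
}
```

This version still alternates strictly between one read and one write; overlapping the next read with the current write would need a second buffer and more care.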
Are you aware of "Mono.Rocks"? It is a library of extension methods that some developers have been prototyping to add useful extension methods.
I checked the website and found that they've somehow managed to hack into my computer and steal my Int32.Times extension method, so I'll be contacting my lawyers forthwith.
Indeed it should, which is why Mono.Rocks has a TextReader.Lines() extension method in rocks-playground. :-)
But please name it appropriately: I think you want a "CopyData" method (you want to copy the data that is transported by the stream, not the stream itself, if I understand correctly).