Sunday, Feb 22, 2004, 9:51 PM
Adding Ref-Counting to Rotor
Well, I can't put it off any longer. I wanted to wait 'til Chris Tavares and I were able to figure out what the problem was, but after months of help from the Compuware guys using a complimentary hunk of their latest and greatest profiling software, we're no closer to finding out what causes the massive slow-down in compiling Rotor when we add our ref-counting implementation to it. The problem, of course, is that if we can't figure out what the problem is, we can't optimize our implementation to fix it.
The good news is that once we wait for the compilation to complete, the hybrid ref-counting (for deterministic resource management) + ref-tracking (for memory management) programming model is a thing of beauty. No longer do you have to write Dispose methods or even call Dispose methods, because our updated JITter takes care of it for you.
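For readers who haven't lived with ref-counting, here is a rough C++ analogy of the deterministic half of that model. It is not the Rotor implementation; `std::shared_ptr` stands in for the runtime's reference count, and `LoggedResource` and `refcount_demo` are made-up names for illustration. The point it sketches is that cleanup runs at the moment the last reference goes away, with no explicit Dispose call.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource that records its acquisition and release in a log.
struct LoggedResource {
    std::vector<std::string>& log;
    std::string name;

    LoggedResource(std::vector<std::string>& l, std::string n)
        : log(l), name(std::move(n)) {
        log.push_back("acquire " + name);
    }
    ~LoggedResource() { log.push_back("release " + name); }
};

// Returns the sequence of events so the ordering can be inspected.
std::vector<std::string> refcount_demo() {
    std::vector<std::string> log;
    {
        auto a = std::make_shared<LoggedResource>(log, "file");
        auto b = a;  // second reference to the same resource
        log.push_back("count " + std::to_string(a.use_count()));
        a.reset();   // one reference left, so the resource is still alive
        log.push_back("still alive");
    }                // b leaves scope: count hits zero, destructor runs here
    log.push_back("after scope");
    return log;
}
```

Calling `refcount_demo()` from your own `main` yields a log in which "release file" appears before "after scope", which is the deterministic behavior the hybrid model is after.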
However, because compiling Rotor makes such a nice managed-code test of our implementation and because it currently sucks butt perf-wise, we're stuck. If you're into the low-level guts and you've got the time (that's what Chris and I ran out of), we'd love your feedback on our perf problems. Who knows, if we can make it efficient, MS might stick it into the next version of the CLR... (but I don't work for that team, so don't hold me to that : ).
BTW, I'd like to really thank Chris Tavares for his implementation work. It was my idea and my funding, but he did all the work up to and including several drafts of the paper. All I did was complain. Thanks, Chris, and I hope you're enjoying your child, which is a far better way to spend your free time than implementing my cockamamie ideas in Rotor.
BTW2, I'd like to take this opportunity to publicly apologize to Brian Harry. It is definitely the case that ref-counting *can* be added to .NET but I have in no way shown that it *should* be done. Thanks for letting me try, though. : )
13 comments on this post
John:
This rant isn’t really addressed to anyone in particular, perhaps ‘programmers’ or ‘Microsoft’.
Reading between the lines: "Ref-counting in .NET isn't going to happen".
That means that we're stuck with IDisposable.
How about we start admitting that IDisposable is necessarily first-class because programmers frequently find themselves
in situations where they need precise control over object 'destruction' (in the general sense) and they need to
interact with the rest of the world outside of their own code and have a consistent understanding of the semantics
attributed to interfaces? If IDisposable is necessarily first class, then could we properly and formally define its
usage? Please? Pretty please, with sugar on top? I don't have a problem with 'contract' being part of the 'language',
(there is no use trying to pretend that it isn't, especially when you have language features calling on interfaces).
It would be nice if the BCL could lead from the front-lines by consistently applying the appropriate semantics. What is
this I hear [1] about IEnumerator using 'dispose' to 'finish'!! Surely you finish with an enumerator, not dispose of
it? If 'finishing' with an enumerator requires 'disposal', isn't that simply an implementation detail that should
depend on whether or not the enumerator relies on something outside the managed environment? I can define an interface
like IDisposable in one line of code, why be so stingy with them and try and make IDisposable do everything?!
Seriously, please, stop it.
I'd stop going on about this, but I'm losing so much sleep dwelling on it that I need this problem to be addressed for
the sake of my health and well-being. In addition I've got things that I can actually earn money worrying about, and
I'd rather worry about those.
From the C# spec [2]:
"A resource is a class or struct that implements System.IDisposable, which includes a single parameterless method named
Dispose. Code that is using a resource can call Dispose to indicate that the resource is no longer needed. If Dispose
is not called, then automatic disposal eventually occurs as a consequence of garbage collection."
Shortly after that it goes on:
"A using statement is translated into three parts: acquisition, usage, and disposal. Usage of the resource is
implicitly enclosed in a try statement that includes a finally clause. This finally clause disposes of the resource."
So we have to infer from the spec that 'disposal' is the process of calling ((IDisposable)o).Dispose() on some object
(or struct) o that implements the IDisposable interface. By this definition that is nearly the only restriction on a
'disposable' object: nothing limits what a disposable object is beyond implementing the
IDisposable.Dispose method, except that the spec goes on to assert that if IDisposable.Dispose() is not called to
accomplish 'disposal' then 'disposal' is accomplished automatically as a result of garbage collection. There is a
glaring problem here. In the first instance 'disposal' means nothing more than calling a method called 'Dispose' on the
IDisposable interface (which isn’t very helpful if you want to know when you should be disposing your objects); in the
second instance disposal will happen automatically when garbage collection occurs, meaning that IDisposable.Dispose
must be called automatically as a part of finalization, otherwise the term 'disposal' has differing and undefined
meanings when it is used in the same context to describe this process. Since IDisposable.Dispose() is not called
automatically as part of finalization, I can conclude that the term 'disposal' has differing meanings that haven't been defined in this context. Since I can't define disposal, to use the term is simply to talk nonsense. Nonsense isn't very helpful to me.
Certainly ((IDisposable)o).Dispose() cannot be called during the process of garbage collection, because the
IDisposable.Dispose() implementation is allowed to defer (or is it?) part of its disposal logic to another managed
interface that it references, and this is not allowed from the finalizer given the non-deterministic nature of GC.
Given the protected Dispose(bool) pattern, the finalizer can call Dispose(false), but this is not 'disposal' because
'disposal' is calling IDisposable.Dispose() right?
What about ControlCollection.Dispose()? Is that 'disposal'? It doesn't call IDisposable.Dispose on the Control
instances that it owns. Instead it calls Control.Dispose(). Control.Dispose() is not the same as IDisposable.Dispose()
given that I can re-implement one and not the other.
Surely disposal is more than simply an indication that a 'resource' (defined here as anything that implements
IDisposable) is no longer needed. No longer needed by who? Doesn't it also require that the caller assert ownership of
the object it is disposing? What of attempts to use objects that have been disposed? What does 'disposal' mean anyway
given that it is used in the same context to have different meanings? Isn't disposal just supposed to be a mechanism to
alert an external platform (outside the CLR) that 'stuff' (to avoid using the heavily overloaded term 'resource') that
is 'owned' by the external platform is no longer needed by the managed environment? Generally we don’t really care
about 'deterministic finalization' except for when we are interacting with stuff that is outside of the managed
environment right? So why then are we allowing the IDisposable interface to have any use in a purely managed context?
Why do I always feel so angry whenever this topic comes up? I'm going to stop typing now so I can search for something
that I can break. I really don't care what 'disposal' actually means; I just want to know what it actually means. An
interface is a contract, so what is my contract when I implement IDisposable?
Apart from sidestepping the inability to have multiple inheritance, isn’t the point of an interface that a ‘contract’ is
defined, so that when I call an interface I already know what is going to happen? Surely an interface is there so that
I can learn the semantics implied by the interface once and then confidently call (or not call) the interface knowing
exactly what is going to happen without having to look at the documentation for every single class that ever implements
that interface. I mean, IDisposable doesn’t just exist so that I can use the ‘using’ construct in C# right? Next
perhaps we’ll start overriding GetHashCode() to always return 42 and Equals() to always return false?
[1] http://download.microsoft.com/download/8/1/6/81682478-4018-48fe-9e5e-f87a44af3db9/SpecificationVer2.doc
[2] http://www.ecma-international.org/publications/standards/Ecma-334.htm
Monday, Feb 23, 2004, 3:16 AM
John:
Monday, Feb 23, 2004, 3:24 AM
Pierre Phaneuf:
How much of your code doesn't use resources?
Monday, Feb 23, 2004, 12:09 PM
Chris Sells:
Monday, Feb 23, 2004, 12:40 PM
Pierre Phaneuf:
http://xplc.sourceforge.net/
Monday, Feb 23, 2004, 2:44 PM
Chris Tavares:
In C++, failing to properly manage resources is a flat-out disaster. Leaks, crashes, fire raining from the sky, dogs and cats living together...
The worst thing about it is that everybody on the project has to do it absolutely correctly every single time. That's a lot of discipline to maintain.
The GC and finalizers act as a "safety net" to prevent at least some of these disasters. As a result, I don't need such overwhelming discipline constantly and from everybody on my team.
Granted, I still put using( ) statements in my code, but it's not quite so disastrous if I forget one, which is a nice change of pace. And it's a LOT better than writing the 342nd *$#*! wrapper class to do RAII.
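For anyone who hasn't written one of those wrapper classes, a minimal sketch of the RAII idiom follows. The `acquire_handle`/`release_handle` pair is a hypothetical stand-in for whatever C-style API owns the real resource, and `Handle`, `use_and_fail`, and `open_handles_after_failure` are names invented for this example. It shows the upside (the release happens even when an exception unwinds the stack) and hints at the cost (one such class per resource type).

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical C-style API standing in for any OS-level resource.
static int g_open_handles = 0;
int acquire_handle() { return ++g_open_handles; }
void release_handle(int /*handle*/) { --g_open_handles; }

// RAII wrapper: acquire in the constructor, release in the destructor.
class Handle {
public:
    Handle() : h_(acquire_handle()) {}
    ~Handle() { release_handle(h_); }

    // Non-copyable, so the handle is never released twice.
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    int get() const { return h_; }

private:
    int h_;
};

// Acquires a handle and then fails; the destructor still runs
// while the exception unwinds the stack.
void use_and_fail() {
    Handle h;
    throw std::runtime_error("something went wrong");
}

// Returns how many handles remain open after the failure above.
int open_handles_after_failure() {
    try {
        use_and_fail();
    } catch (const std::runtime_error&) {
        // Swallowed on purpose: only the cleanup is of interest here.
    }
    return g_open_handles;
}
```

The discipline problem is that this only works if every raw resource in the codebase gets such a wrapper and nobody ever bypasses it.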
Monday, Feb 23, 2004, 3:50 PM
Dilip:
Tuesday, Feb 24, 2004, 10:00 AM
Chris Sells:
Tuesday, Feb 24, 2004, 3:39 PM
Chris Tavares:
In .NET, they started with a GC, and backfitting refcounting isn't trivial. Everything, even stuff that you wouldn't think needs it, needs a refcount. Getting the refcount updated in the right spots was extremely difficult. And, quite honestly, parts of the Rotor code are just plain ugly, which didn't help much.
As far as the performance impact goes, when the CLR starts up THOUSANDS of objects are created. Standard IO streams. Exception objects. Strings by the hundreds. AppDomains. Type objects. All this stuff gets addref'ed and released constantly even before your code actually starts running. I'm sure some optimization could remove some of this, but I simply didn't have time to go down those routes.
I'd love for somebody else to pick up where I left off and tell me where I screwed up. Just be polite about it, please. :-)
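To give a feel for the addref/release traffic described above, here is a toy C++ sketch. It is not taken from the Rotor changes; `Stats`, `Object`, `Ref`, `touch`, and `count_traffic` are all hypothetical names, and a real runtime would count on the object header rather than through a smart-pointer class. It simply tallies how many count updates happen when one ref-counted object is passed by value to a function many times.

```cpp
#include <cassert>
#include <cstddef>

// Tally of reference-count operations performed.
struct Stats {
    std::size_t addrefs = 0;
    std::size_t releases = 0;
};

// Hypothetical ref-counted object with an intrusive count.
struct Object {
    int refs = 0;
    Stats* stats;
    explicit Object(Stats* s) : stats(s) {}
};

// Handle that adjusts the count on every copy and every destruction.
class Ref {
public:
    explicit Ref(Object* o) : o_(o) { add(); }
    Ref(const Ref& other) : o_(other.o_) { add(); }
    Ref& operator=(const Ref&) = delete;
    ~Ref() {
        ++o_->stats->releases;
        if (--o_->refs == 0) delete o_;  // last reference frees the object
    }

private:
    void add() {
        ++o_->refs;
        ++o_->stats->addrefs;
    }
    Object* o_;
};

// Passing by value copies the handle: one addref on entry, one release on exit.
void touch(Ref r) { (void)r; }

// Counts the traffic generated by `calls` trivial uses of a single object.
Stats count_traffic(int calls) {
    Stats stats;
    {
        Ref root(new Object(&stats));  // 1 addref
        for (int i = 0; i < calls; ++i) {
            touch(root);               // 1 addref + 1 release per call
        }
    }                                  // 1 release; the object is deleted here
    return stats;
}
```

With 1000 calls, the single object sees 1001 addrefs and 1001 releases even though nothing useful was done with it. Multiply that by the thousands of startup objects and the overhead adds up quickly, which is where optimizations that skip provably redundant count updates would come in.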
Wednesday, Feb 25, 2004, 8:50 PM
Chris Sells:
http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/4d0813ed26f44d70/c630c0830ab509a8#c630c0830ab509a8
Thursday, Jan 11, 2007, 2:46 PM