Adding Ref-Counting to Rotor

Here.

Well, I can't put it off any longer. I wanted to wait 'til Chris Tavares and I were able to figure out what the problem was, but after months of help from the Compuware guys using a complimentary hunk of their latest and greatest profiling software, we're no closer to finding out what causes the massive slow-down in compiling Rotor when we add our ref-counting implementation to it. The problem, of course, is that if we can't figure out what the problem is, we can't optimize our implementation to fix it.

The good news is that once we wait for the compilation to complete, the hybrid ref-counting (for deterministic resource management) + ref-tracking (for memory management) programming model is a thing of beauty. No longer do you have to write Dispose methods or even call Dispose methods, because our updated JITter takes care of it for you.
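To make the ref-counting half of that model concrete, here is a toy sketch in Python. It is not the Rotor implementation: the `Resource` and `Counted` classes and the hand-written `add_ref`/`release` calls are all hypothetical names invented for illustration, and in the real thing the JITter would emit those calls for you. The only point is that cleanup runs at the exact moment the last reference goes away.

```python
class Resource:
    """Hypothetical stand-in for something that needs deterministic cleanup."""

    def __init__(self, name, log):
        self.name = name
        self.log = log
        self.open = True

    def dispose(self):
        # Guarded so the cleanup body runs at most once.
        if self.open:
            self.open = False
            self.log.append(f"disposed {self.name}")


class Counted:
    """Explicit reference count wrapped around a Resource (illustrative only)."""

    def __init__(self, resource):
        self.resource = resource
        self.count = 1  # the creator holds the first reference

    def add_ref(self):
        self.count += 1
        return self

    def release(self):
        if self.count <= 0:
            raise RuntimeError("release() called more times than references exist")
        self.count -= 1
        if self.count == 0:
            # Last reference gone: clean up right now, not at some later GC.
            self.resource.dispose()


def demo():
    log = []
    handle = Counted(Resource("file.txt", log))
    alias = handle.add_ref()          # a second owner appears
    handle.release()                  # first owner is done; resource stays open
    still_open = handle.resource.open
    alias.release()                   # last owner is done; cleanup runs here
    return still_open, log
```

Calling `demo()` shows the resource staying open while one owner remains and being disposed on the final `release()`, with nobody writing a using block.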

However, because compiling Rotor makes such a nice managed test of our implementation and because it currently sucks butt perf-wise, we're stuck. If you're into the low-level guts and you've got the time (that's what Chris and I ran out of), we'd love your feedback on our perf problems. Who knows, if we can make it efficient, MS might stick it into the next version of the CLR... (but I don't work for that team, so don't hold me to that : ).

BTW, I'd like to really thank Chris Tavares for his implementation work. It was my idea and my funding, but he did all the work up to and including several drafts of the paper. All I did was complain. Thanks, Chris, and I hope you're enjoying your child, which is a far better way to spend your free time than implementing my cockamamie ideas in Rotor.

BTW2, I'd like to take this opportunity to publicly apologize to Brian Harry. It is definitely the case that ref-counting *can* be added to .NET but I have in no way shown that it *should* be done. Thanks for letting me try, though. : )



13 comments on this post

John:


I hope you don’t mind Chris, but I’m going to use your blog to vent.. :P

This rant isn’t really addressed to anyone in particular, perhaps ‘programmers’ or ‘Microsoft’.

Reading between the lines: "Ref-counting in .NET isn't going to happen".

That means that we're stuck with IDisposable.

How about we start admitting that IDisposable is necessarily first-class because programmers frequently find themselves in situations where they need precise control over object 'destruction' (in the general sense) and they need to interact with the rest of the world outside of their own code and have a consistent understanding of the semantics attributed to interfaces? If IDisposable is necessarily first class, then could we properly and formally define its usage? Please? Pretty please, with sugar on top? I don't have a problem with 'contract' being part of the 'language', (there is no use trying to pretend that it isn't, especially when you have language features calling on interfaces).

It would be nice if the BCL could lead from the front-lines by consistently applying the appropriate semantics. What is this I hear [1] about IEnumerator using 'dispose' to 'finish'!! Surely you finish with an enumerator, not dispose of it? If 'finishing' with an enumerator requires 'disposal', isn't that simply an implementation detail that should depend on whether or not the enumerator relies on something outside the managed environment? I can define an interface like IDisposable in one line of code, so why be so stingy with them and try to make IDisposable do everything?!

Seriously, please, stop it.

I'd stop going on about this, but I'm losing so much sleep dwelling on it that I need this problem to be addressed for the sake of my health and well-being. In addition I've got things that I can actually earn money worrying about, and I'd rather worry about those.

From the C# spec [2]:

"A resource is a class or struct that implements System.IDisposable, which includes a single parameterless method named Dispose. Code that is using a resource can call Dispose to indicate that the resource is no longer needed. If Dispose is not called, then automatic disposal eventually occurs as a consequence of garbage collection."

Shortly after that it goes on:

"A using statement is translated into three parts: acquisition, usage, and disposal. Usage of the resource is implicitly enclosed in a try statement that includes a finally clause. This finally clause disposes of the resource."

So we have to infer from the spec that 'disposal' is the process of calling ((IDisposable)o).Dispose() on some object (or struct) o that implements the IDisposable interface. From this definition that is nearly the only restriction on a 'disposable' object: there is nothing that limits what a disposable object is beyond implementing the IDisposable.Dispose method, except that the spec goes on to assert that if IDisposable.Dispose() is not called to accomplish 'disposal' then 'disposal' is accomplished automatically as a result of garbage collection. There is a glaring problem here. In the first instance 'disposal' means nothing more than calling a method called 'Dispose' on the IDisposable interface (which isn’t very helpful if you want to know when you should be disposing your objects). In the second instance disposal will happen automatically when garbage collection occurs, meaning that IDisposable.Dispose must be called automatically as a part of finalization; otherwise the term 'disposal' has differing and undefined meanings when it is used in the same context to describe this process. Since IDisposable.Dispose() is not called automatically as part of finalization, I can conclude that the term 'disposal' has differing meanings that haven't been defined in this context. Since I can't define disposal, to use the term is simply to talk nonsense. Nonsense isn't very helpful to me.

Certainly ((IDisposable)o).Dispose() can not be called during the process of garbage collection, because the IDisposable.Dispose() implementation is allowed to defer (or is it?) part of its disposal logic to another managed interface that it references, and this is not allowed from the finalizer given the non-deterministic nature of GC.

Given the protected Dispose(bool) pattern, the finalizer can call Dispose(false), but this is not 'disposal' because 'disposal' is calling IDisposable.Dispose() right?

What about ControlCollection.Dispose()? Is that 'disposal'? It doesn't call IDisposable.Dispose on the Control instances that it owns. Instead it calls Control.Dispose(). Control.Dispose() is not the same as IDisposable.Dispose() given that I can re-implement one and not the other.

Surely disposal is more than simply an indication that a 'resource' (defined here as anything that implements IDisposable) is no longer needed. No longer needed by who? Doesn't it also require that the caller assert ownership of the object it is disposing? What of attempts to use objects that have been disposed? What does 'disposal' mean anyway given that it is used in the same context to have different meanings? Isn't disposal just supposed to be a mechanism to alert an external platform (outside the CLR) that 'stuff' (to avoid using the heavily overloaded term 'resource') that is 'owned' by the external platform is no longer needed by the managed environment? Generally we don’t really care about 'deterministic finalization' except for when we are interacting with stuff that is outside of the managed environment right? So why then are we allowing the IDisposable interface to have any use in a purely managed context?

Why do I always feel so angry whenever this topic comes up? I'm going to stop typing now so I can search for something that I can break. I really don't care what 'disposal' actually means; I just want to know what it actually means. An interface is a contract, so what is my contract when I implement IDisposable?

Apart from sidestepping the inability to have multiple inheritance isn’t the point of an interface that a ‘contract’ is defined, so that when I call an interface I already know what is going to happen? Surely an interface is there so that I can learn the semantics implied by the interface once and then confidently call (or not call) the interface knowing exactly what is going to happen without having to look at the documentation for every single class that ever implements that interface. I mean, IDisposable doesn’t just exist so that I can use the ‘using’ construct in C# right? Next perhaps we’ll start overriding GetHashCode() to always return 42 and Equals() to always return false?

[1] http://download.microsoft.com/download/8/1/6/81682478-4018-48fe-9e5e-f87a44af3db9/SpecificationVer2.doc
[2] http://www.ecma-international.org/publications/standards/Ecma-334.htm

Monday, Feb 23, 2004, 3:16 AM


John:


Sorry for the bogus formatting, I wrote it in notepad and the linefeeds got inserted after I saved it.

Monday, Feb 23, 2004, 3:24 AM


Pierre Phaneuf:


If only someone could explain to me why having to call Dispose() is so much better than having to use "delete"... The great difficulty with memory management is figuring out when to free the memory, but now, we have to figure out when to "dispose" the "resources". Only in the simpler cases that do not actually use "resources" do you get the promised holy grail of GC and not having to think about it anymore.

How much of your code doesn't use resources?

Monday, Feb 23, 2004, 12:09 PM


Chris Sells:


And therein lies the need for an efficient ref-counting implementation for .NET, Pierre!

Monday, Feb 23, 2004, 12:40 PM


Pierre Phaneuf:


Oh, I know. ;-)

http://xplc.sourceforge.net/

Monday, Feb 23, 2004, 2:44 PM


Chris Tavares:


Having (finally!) gotten a C# day job, I'm really liking the GC even for resource management.

In C++, failing to properly manage resources is a flat-out disaster. Leaks, crashes, fire raining from the sky, dogs and cats living together...

The worst thing about it is that everybody on the project has to do it absolutely correctly every single time. That's a lot of discipline to maintain.

The GC and finalizers act as a "safety net" to prevent at least some of these disasters. As a result, I don't need such overwhelming discipline constantly and from everybody on my team.

Granted, I still put using( ) statements in my code, but it's not quite so disastrous if I forget one, which is a nice change of pace. And it's a LOT better than writing the 342nd *$#*! wrapper class to do RAII.

Monday, Feb 23, 2004, 3:50 PM


Dilip:


But tell me something, Python (as my friends tell me) has successfully employed GC + ref-counting all these years. Exactly what is in .NET that makes this job difficult?

Tuesday, Feb 24, 2004, 10:00 AM


Chris Sells:


I don't know, Dilip, but if you care to dig into our source, we'd happily accept your feedback.

Tuesday, Feb 24, 2004, 3:39 PM


Chris Tavares:


Python started with refcounting, and only added the GC in version 2.0, which I think was a few years ago. The GC in Python is ONLY used to collect cycles. And, if you follow the Python mailing lists, you'll see that they're still chasing down corner cases in the GC.
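For anyone who wants to see that split for themselves, here is a small sketch using only the standard `gc` and `weakref` modules. It assumes CPython (other Python implementations manage memory differently), and the `Node` class and the two function names are made up for the demonstration.

```python
import gc
import weakref


class Node:
    """Plain object that can point at another Node."""

    def __init__(self):
        self.other = None


def freed_by_refcount():
    node = Node()
    probe = weakref.ref(node)   # observes the object without owning it
    del node                    # last reference gone: CPython frees it at once
    return probe() is None


def freed_only_by_gc():
    was_enabled = gc.isenabled()
    gc.disable()                # keep the collector from running on its own
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a         # a cycle: neither count can reach zero
        probe = weakref.ref(a)
        del a, b
        alive_after_del = probe() is not None
        gc.collect()                    # the cycle collector frees the pair
        alive_after_collect = probe() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()                 # restore the caller's collector setting
```

The first function reports that the object is gone as soon as its last reference is deleted. The second reports that the cycle is still alive after `del` and only disappears once `gc.collect()` runs.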

In .NET, they started with a GC, and backfitting refcounting isn't trivial. Everything, even stuff that you wouldn't think needs it, needs a refcount. Getting the refcount updated in the right spots was extremely difficult. And, quite honestly, parts of the Rotor code are just plain ugly, which didn't help much.

As far as the performance impact goes, when the CLR starts up THOUSANDS of objects are created. Standard IO streams. Exception objects. Strings by the hundreds. AppDomains. Type objects. All this stuff gets addref'ed and released constantly even before your code actually starts running. I'm sure some optimization could remove some of this, but I simply didn't have time to go down those routes.

I'd love for somebody else to pick up where I left off and tell me where I screwed up. Just be polite about it, please. :-)

Wednesday, Feb 25, 2004, 8:50 PM


Chris Sells:


Interesting reference from the C++ standards committee on whether to add GC to C++ entitled "Reconciling C++ RAII with GC":

http://groups.google.com/group/comp.lang.c++.moderated/browse_frm/thread/4d0813ed26f44d70/c630c0830ab509a8#c630c0830ab509a8

Thursday, Jan 11, 2007, 2:46 PM






