More a question than a comment, and apologies if off topic...
I'm using SWIG to create CPython bindings for a C++ project. I know that SWIG can also create bindings for C#. Can IronPython use these SWIG-created bindings in some way, or would I need to use something like Ironclad?
baoilleach |
08/05/19 - 4:52 pm | #
|
|
If you can create C# bindings to a project then you can use the C# bindings (in the form of the compiled assemblies) from IronPython, yes. The interface will be different though, as the C# bindings will expect C# types (statically enforced) instead of IronPython types.
This is a better short-term bet for using a project from IronPython though! (And possibly a better long-term bet: Ironclad is really targeting the use of CPython extensions from IronPython. Where you have the choice of generating C# bindings, that may be a better option.)
Michael Foord |
08/05/19 - 5:09 pm | #
|
|
This is a very interesting read, as it's a complicated but important part of the solution. However, it is a bit hard to follow because of the confusing terminology. A simple diagram would be extremely useful to clarify things.
The classic problem with mixing two heaps managed by different garbage collectors is dealing with cycles that span the two heaps. In this case, it would be if the unmanaged object pointed to a managed object which pointed back to the unmanaged object. I am not sure you support this scenario currently. But if you did, would your current object lifetime management solution still work?
Btw, IronPython objects can be accessed from unmanaged code as IDispatch. See http://blogs.msdn.com/shrib/arch...g-ireflect.aspx for details. This could possibly be useful to you to allow extension types to hold a reference to IronPython objects. Since CPython has good support for IDispatch via win32com, it might be possible to build on top of it to enable the CPython objects to talk to IronPython objects. Just FYI. I have not thought about whether this actually works or not.
Shri Borde |
08/05/19 - 9:33 pm | #
|
|
Interesting. I agree a good diagram could help a lot here.
I'm not sure about cycles. I don't *think* it is a problem as unmanaged objects can't (currently) hold references to arbitrary managed objects.
We also may not need to give unmanaged code direct access to IronPython objects (except for objects like lists, where direct memory access is allowed and we have to convert them as they cross the boundary). All access should be through the Python API, which we have implemented to work directly with IronPython objects.
Michael Foord |
08/05/19 - 10:05 pm | #
|
|
Hmmm... I've been thinking (a bit) about this.
I don't think we can have cycles between managed and unmanaged 'objects' because we don't really have any unmanaged objects.
What we do is create managed objects with corresponding unmanaged resources associated with them. With our reference counting the unmanaged resources can keep their corresponding managed object alive - but the unmanaged resource isn't an object and can't own references to other managed objects (this is my understanding).
*However*, the unmanaged resources can own references to other unmanaged resources (via the reference count mapper) which will keep their corresponding managed objects alive.
In this way we can have cycles that exist on the unmanaged side keeping each other alive.
CPython has a cycle detector to handle this, and to detect such cycles we would need to reimplement (or possibly copy) the cycle detector...
Another alternative would be to have a way of explicitly disposing of "managed objects of unmanaged types" that you know you have finished with. We might need to walk object graphs though (where you want to dispose of large data structures), which may not be trivial. A problem for another day...
Michael Foord |
08/05/20 - 12:36 am | #
|
|
Consider the IronPython code below. register_sink is a method that stores the incoming argument and later, say, invokes some method on that object in response to some event.
# Some IronPython code
import CPythonExtension
unmanagedObj = CPythonExtension.ExtensionType()
managedObj = IronPythonType()
unmanagedObj.register_sink(managedObj)
In this case, the unmanaged object now has a reference to a managed object.
I am not saying that this is an important scenario. Just that in this case, you do run into the cycle. If this is a limitation that does not need to be supported, that sounds reasonable. However, without knowing if you intend this to be a limitation, it was hard to think through how the object lifetime management algorithm you described would work.
Shri Borde |
08/05/20 - 3:52 am | #
|
|
Hi Shri
I think I can reassure you on the above topic. Michael's article was only covering the strategy in the case of a managed-object-with-unmanaged-type, and didn't cover the (much simpler) details of how we deal with normal .NET objects when they need to be exposed to unmanaged code.
When you execute (from the example above):
unmanagedObj.register_sink(managedObj)
...the following things happen.
* Ironclad allocates an unmanaged PyObject -- 8 bytes -- and fills in a reference count (managedPtr->ob_refcnt = 1) and a type pointer (managedPtr->ob_type = whatever). For future reference, we'll call the newly allocated block managedPtr.
* It stores managedObj and managedPtr in a 2-way map for later retrieval -- crucially, it keeps a strong reference to managedObj, so managedObj will not be GCed under our feet.
* It calls a delegate which wraps the function pointer provided by ExtensionType.register_sink, passing unmanagedObj._instancePtr and managedPtr.
* Magic happens in the background -- the unmanaged code at the other end of the function pointer could be doing anything.
* Some value is returned from the delegate; it's not relevant, so I won't mention it again.
* Ironclad cleans up by calling the DecRef method with managedPtr. This function reads managedPtr->ob_refcnt; if the refcount is 1, it calls the PyObject_Free function, which both frees the memory and throws away the strong reference to managedObj.
* However, if unmanagedObj wanted to use managedPtr later, it would have passed it into the Py_INCREF macro, which would have incremented managedPtr->ob_refcnt. Hence, when it's passed to Ironclad's DecRef method, the refcount would be 2.
* In this case, the DecRef method just decrements managedPtr->ob_refcnt, and leaves managedObj and managedPtr in the map.
At this point, the unmanaged code is solely responsible for the destruction of managedObj -- assuming it's written properly (:p), it will at some point pass managedPtr into the Py_DECREF macro, which will decrement managedPtr->ob_refcnt, note that it is now 0, and call managedPtr->ob_type->tp_dealloc(managedPtr).
That tp_dealloc is, in fact, a function pointer obtained from a managed delegate: this will (eventually; some details skipped) call Ironclad's PyObject_Free function, which will remove managedObj and managedPtr from the map. Assuming that was the last strong reference to managedObj, it will shortly be garbage collected, and all will be right with the world.
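The sequence above can be modelled in a few lines of plain Python. This is only a toy sketch of the bookkeeping described, not Ironclad's actual code: FakePyObject, Mapper, call_extension and is_tracked are hypothetical names invented for illustration, and a Python object stands in for the unmanaged memory block.

```python
class FakePyObject:
    """Stand-in for the unmanaged block: just a reference count."""
    def __init__(self):
        self.ob_refcnt = 1


class Mapper:
    """Hypothetical 2-way map between managed objects and their blocks."""
    def __init__(self):
        self.ptr_to_obj = {}  # holds the strong reference to the managed object
        self.obj_to_ptr = {}  # keyed by id() of the managed object

    def store(self, managed_obj):
        ptr = FakePyObject()
        self.ptr_to_obj[ptr] = managed_obj
        self.obj_to_ptr[id(managed_obj)] = ptr
        return ptr

    def incref(self, ptr):
        # What Py_INCREF would do on the unmanaged side.
        ptr.ob_refcnt += 1

    def decref(self, ptr):
        # As described: a refcount of 1 means nobody else wants the block.
        if ptr.ob_refcnt == 1:
            self.free(ptr)
        else:
            ptr.ob_refcnt -= 1

    def free(self, ptr):
        # Dropping the map entries releases the strong reference.
        managed_obj = self.ptr_to_obj.pop(ptr)
        del self.obj_to_ptr[id(managed_obj)]

    def is_tracked(self, managed_obj):
        return id(managed_obj) in self.obj_to_ptr


def call_extension(mapper, managed_obj, keep_reference):
    """Model one call such as register_sink(managedObj)."""
    ptr = mapper.store(managed_obj)
    if keep_reference:
        mapper.incref(ptr)  # the extension decides to hold on to the pointer
    mapper.decref(ptr)      # cleanup after the call returns
    return ptr
```

If the extension does not keep the pointer, the mapping is gone as soon as the call returns. If it does, the entry stays until the extension's own final decrement.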
I hope that was helpful.
William
William Reade |
08/05/20 - 8:00 pm | #
|
|
I do understand now how "unmanagedObj.register_sink(managedObj)" would work. Thanks for the explanation.
So Michael's explanation was how unmanagedObj itself was tracked? Isn't that scheme similar, except that DecRef is replaced by a finalizer method?
It does sound as though you intend to support CPython and IronPython objects holding onto objects created by the other system. This could end up with cycles between the heaps of the two systems. If you had code like the snippet below, would all the memory associated with managedObj and unmanagedObj be collected at the end of it?
# Some IronPython code
import CPythonExtension
unmanagedObj = CPythonExtension.ExtensionType()
managedObj = IronPythonType()
# Create a cycle
unmanagedObj.register_sink(managedObj)
managedObj.register_sink(unmanagedObj)
# clear out any references
del unmanagedObj
del managedObj
import gc
gc.collect()
Shri Borde |
08/05/20 - 11:49 pm | #
|
|
I think the answer is that cycles are possible and we will have to work out a strategy to handle that.
I don't think you need to cross heaps for our system to have cycles - internal cycles between unmanaged objects are also possible. (And cycles between managed objects and unmanaged objects effectively become cycles between unmanaged objects in Ironclad.)
CPython has a cycle detector especially for this case and we may have to port it into Ironclad.
Michael Foord |
08/05/21 - 8:26 am | #
|
|
Hi Shri
Certainly, yes, the schemes are similar in many respects. Hopefully the details of the differences have been interesting all the same.
With regard to managed/unmanaged cycles: I'm pretty sure that both managedObj and unmanagedObj will be leaked in the case above (until Ironclad shuts down, anyway). Similarly, lists containing themselves can't be nicely cleaned up.
I'm sorry to say that we have no immediate plans to fix these issues. Possibly, at some point, we'll be able to press the CPython cycle detector into service, but IMO there are many more important features to implement before then.
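To illustrate why a cycle leaks under reference counting alone, here is a toy model in plain Python. It is a sketch with hand-rolled counters, not Ironclad or CPython internals; Node, add_ref and release are hypothetical names used only for this example.

```python
class Node:
    """Toy refcounted object; starts with one reference from its creator."""
    def __init__(self, name):
        self.name = name
        self.refcnt = 1
        self.refs = []
        self.freed = False


def add_ref(owner, target):
    # owner stores target and takes a reference to it.
    owner.refs.append(target)
    target.refcnt += 1


def release(node):
    # Drop one reference; free the node and its children at zero.
    node.refcnt -= 1
    if node.refcnt == 0:
        node.freed = True
        for child in node.refs:
            release(child)
        node.refs = []


# A cycle: each object holds a reference to the other.
a = Node("unmanagedObj")
b = Node("managedObj")
add_ref(a, b)
add_ref(b, a)
release(a)  # del unmanagedObj
release(b)  # del managedObj
leaked = [n.name for n in (a, b) if not n.freed]

# No cycle: c refers to d, but d does not refer back.
c = Node("c")
d = Node("d")
add_ref(c, d)
release(c)
release(d)
```

In the cyclic case both counts stop at 1 and never reach zero, so neither object is freed. In the acyclic case both are freed. A cycle detector exists precisely to find groups like the first one.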
William Reade |
08/05/21 - 11:12 am | #
|
|
The CLR will also leak if there are cycles between COM objects and managed objects for the very same reason. In practice, people do not run into this often.
Similarly, if there are cycles between JScript objects and the IE DOM, it leads to leaks. See http://javascript.crockford.com/...emory/leak.html and http://blogs.msdn.com/gpde/pages...k-detector.aspx.
I am not suggesting that this problem is a priority. In fact, I would not do anything until there was a known real-world scenario that was running into it. I was asking about cycles just to figure out what the expectation was so that I could digest the blog article. Thanks for the explanations!
Shri Borde |
08/05/21 - 5:41 pm | #
|
|