The Voidspace Techie Blog

I believe the CPython garbage collector cannot break cycles if "both" objects have a __del__ method. As long as one of the objects in the cycle has no __del__, it can be destroyed first.
(whoa, where'd my profile pic come from?)
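
For anyone who wants to poke at this, here is a minimal sketch that builds a two-object cycle where both objects define __del__ and then forces a collection. The Node class and the make_cycle helper are made up for illustration. The result depends on the interpreter: since CPython 3.4 (PEP 442) the collector can finalize and free such cycles, whereas older versions parked them in gc.garbage instead.

```python
import gc

finalized = []  # names of objects whose __del__ has run


class Node(object):
    """Hypothetical object with a finalizer, used only for this demo."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        finalized.append(self.name)


def make_cycle():
    # Both objects have __del__ and point at each other.
    a, b = Node('a'), Node('b')
    a.other, b.other = b, a
    # a and b go out of scope here, leaving an unreachable cycle.


make_cycle()
gc.collect()  # force a full collection

# Anything the collector could not free ends up in gc.garbage.
leftover = [obj for obj in gc.garbage if isinstance(obj, Node)]
print("finalized:", sorted(finalized))
print("left in gc.garbage:", len(leftover))
```

On a modern CPython both finalizers should run and nothing should be left in gc.garbage; on the older interpreters discussed here, expect the opposite.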


>>>import re
>>>s = 'testntest2ntest3rnnntest4rntest5nnnn'
>>>r= '(?>>re.sub(r,'rn',s)
'testrntest2rntest3rnrnrntest4rntest5rnrnrnrn'

Isn't that what you're aiming for?


Heh, your comments ate my code... here it is again:

>>>s = 'test\ntest2\ntest3\r\n\n\ntest4\r\ntest5\n\n\n\n'

>>>r= '(?>>>re.sub(r,'\r\n',s)

'test\r\ntest2\r\ntest3\r\n\r\n\r\ntest4\r\ntest5\r\n\r\n\r\n\r\n'


Damn! Let's try again:

r = "(?


I give up. My code is at http://orestis.gr/en/blog/2008/0...exp-lookbehind/
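
Since the comment form kept swallowing the pattern, here is a hedged reconstruction. Judging by the sample input and output posted above and the "lookbehind" in the linked URL, the regex was probably a negative lookbehind that matches any "\n" not already preceded by "\r". The exact pattern and the function name below are assumptions, not a copy of the original code.

```python
import re

# Assumed pattern: a line feed that is NOT preceded by a carriage return.
LONE_LF = re.compile(r'(?<!\r)\n')


def normalize_newlines(text):
    """Turn every bare LF into CRLF, leaving existing CRLF pairs alone."""
    return LONE_LF.sub('\r\n', text)


s = 'test\ntest2\ntest3\r\n\n\ntest4\r\ntest5\n\n\n\n'
result = normalize_newlines(s)
print(repr(result))
```

A single pass like this avoids building an intermediate string with the CRLF pairs collapsed first.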


I'm actually surprised this would be causing the issue unless the strings were quite large. But one simple thing you could try would be:

text = text.replace('\r\n', '\n')
text = text.replace('\n', '\r\n')

The .NET GC can (and will) run any time it wants. That includes the middle of your method while the above code is running, but most likely when one of the new strings is allocated. But in your code you're keeping both text and the other two strings alive as well (because text isn't available for reclamation until it's been reassigned, after all 3 strings are computed).

So if these were really big strings (and anything over roughly 85,000 bytes gets shoved into the Large Object Heap, which makes it problematic), getting one freed may be the difference between running and not.

Also if your GC heap isn't approaching the 1.3GB range (on 32-bit) then you probably want to look at fragmentation and pinning as potential things which could be making the string allocation a problem - of course I'm assuming here paging is always an option which prevents OOM.

Does Resolver have a 64-bit version?
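
The two-step replace suggested above can be sketched as a small Python function. This is only an illustration and the function name is hypothetical. Rebinding the same local name means each intermediate string becomes unreachable inside the function as soon as the next one exists, though when that memory is actually reclaimed is up to the runtime.

```python
def normalize_two_step(text):
    """Normalize line endings to CRLF using two plain replaces."""
    # Step 1: collapse existing CRLF pairs so every line ends in a bare LF.
    text = text.replace('\r\n', '\n')
    # Step 2: expand every LF back out to CRLF.
    return text.replace('\n', '\r\n')


sample = 'test\ntest2\ntest3\r\n\n\ntest4\r\ntest5\n\n\n\n'
normalized = normalize_two_step(sample)
print(repr(normalized))
```

Note that this still allocates two full-size copies for a big input, so it trades peak memory held at once for simplicity rather than removing the allocations.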


@orestis
The regex looks good, I've forwarded it to Christian. Your regex fu is better than mine!

@Dino
We don't specifically have a 64-bit version, although there is no reason why not! Thanks for the info on garbage collection. I think that *most* of the time the replace isn't even necessary - so our current solution solves the problem adequately. And yes - those strings can be *very* big...


Can't we assume that generated code doesn't need this replace operation, and perform the search/replace only on user code sections?


@jonathan

I *think* we managed to confirm that during normal operation it shouldn't be needed in the *generated* code chunks (once we corrected the license chunk). We still need to call it on the user code chunks though - and it is currently used in the 'Chunk' class which is used for all chunks.


Definitely sounds like we could get a performance win by moving it out of Chunk (which, for non-Resolverites, is our class to represent a foldable section of the code - either generated code or user code) and into something specific to user code.
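
A rough sketch of what that split might look like, purely as an illustration. Only the name Chunk comes from this thread; the prepare hook, the UserCodeChunk subclass and the regex are all assumptions, and the real Resolver classes will look different.

```python
import re

# Assumed pattern: a bare LF that is not already part of a CRLF pair.
_LONE_LF = re.compile(r'(?<!\r)\n')


class Chunk(object):
    """Hypothetical stand-in for a foldable section of code."""

    def __init__(self, text):
        self.text = self.prepare(text)

    def prepare(self, text):
        # Generated code is assumed to have correct line endings already,
        # so the base class does no search/replace at all.
        return text


class UserCodeChunk(Chunk):
    """Hypothetical subclass: only user code pays for normalization."""

    def prepare(self, text):
        # Cheap check first, so text with no LF at all skips the regex.
        if '\n' not in text:
            return text
        return _LONE_LF.sub('\r\n', text)


generated = Chunk('a = 1\r\nb = 2\r\n')
user = UserCodeChunk('x = 1\ny = 2\r\n')
print(repr(generated.text))
print(repr(user.text))
```

The point of the design is just that the (possibly very large) generated chunks never go through the replace, while user code chunks still do.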

