RAOperationQueue, an open-source replacement for NSOperationQueue
Posted By Mike Ash on December 1st, 2008
When we started on our first Leopard-only software, we were excited to use NSOperationQueue, as it promised to greatly ease the pain of multithreading. And it lived up to its promise well, right up to the point where we discovered that it was fatally broken.
(And I mean fatally broken. It spontaneously crashes. There’s no workaround and it’s unlikely that it will get fixed in 10.5.)
By that time we had already written a ton of code that revolved around NSOperationQueue. If it weren’t unusably broken, it would have been a great way to work. So we decided to simply write our own replacement instead. Since NSOperationQueue is a nice API, and since we all hate crashy Mac OS X software, we decided to release it to the world so that we can all benefit.
You can download RAOperationQueue here.
Note that this is not a 100% drop-in replacement for NSOperationQueue. Since it was written for our code, it only does what we needed it to do, so it has some limitations. Specifically, RAOperationQueue:
- Only supports one running operation at a time in each queue. In other words, equivalent to setMaxConcurrentOperationCount:1 no matter what.
- Only supports two priority levels.
- Has no support for inter-operation dependencies.
- Has no support for “concurrent” operations which don’t need to be run on a worker thread.
- Spawns one worker thread per queue instead of sharing a pool of worker threads.
- Perhaps some others.
However, it also has some significant benefits:
- 100% lockless and non-blocking in all circumstances, except that the worker thread will block when no operations are pending.
- Full control over worker threads, so you can potentially set custom priorities or other thread properties.
- Shouldn’t crash. If it does, you have the code and can fix the problem!
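To give a feel for what "lockless" means in the list above, here is a minimal sketch of the general technique in portable C11. This is not the RAOperationQueue source: the names node_t, atomic_list_t, list_push, list_steal and list_reverse are made up for illustration, and it uses <stdatomic.h> rather than the OSAtomic functions. Producers push onto a shared list with a compare-and-swap loop, and a single consumer detaches the whole list in one atomic exchange, then reverses it to get submission order.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical node type for illustration only. */
typedef struct node {
    struct node *next;
    void *payload;            /* e.g. a pointer to a pending operation */
} node_t;

typedef struct {
    _Atomic(node_t *) head;   /* most recently pushed node, or NULL */
} atomic_list_t;

static void list_init(atomic_list_t *list) {
    atomic_init(&list->head, (node_t *)NULL);
}

/* Producer side: push a node with a compare-and-swap loop.
 * Never blocks; it only retries if another thread changed head first. */
static void list_push(atomic_list_t *list, node_t *n) {
    assert(list != NULL && n != NULL);
    node_t *old = atomic_load_explicit(&list->head, memory_order_relaxed);
    do {
        n->next = old;        /* on CAS failure, old holds the fresh head */
    } while (!atomic_compare_exchange_weak_explicit(
                 &list->head, &old, n,
                 memory_order_release, memory_order_relaxed));
}

/* Consumer side: detach the entire list in one atomic exchange.
 * Because nodes are never popped one at a time, the classic ABA
 * problem of lock-free stacks does not come up here. */
static node_t *list_steal(atomic_list_t *list) {
    return atomic_exchange_explicit(&list->head, (node_t *)NULL,
                                    memory_order_acquire);
}

/* The stolen list is newest-first; reverse it to get FIFO order. */
static node_t *list_reverse(node_t *head) {
    node_t *prev = NULL;
    while (head != NULL) {
        node_t *next = head->next;
        head->next = prev;
        prev = head;
        head = next;
    }
    return prev;
}
```

In a design like this, the worker thread would call list_steal, reverse the result, run each payload in order, and only go to sleep when a steal comes back empty. That sleep is the one blocking point mentioned above.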
Whether you can live with the limitations is of course something only you can decide, but we’re making very good use of this code internally, and we hope that you can too.
Comments, questions, bug fixes, and other contributions are welcome by e-mail to mikeash at this domain.
RAOperationQueue is made available under a very liberal license which just requires a mention in your about box. We hope you’ll find it as useful as we have.
Update: As of Mac OS X 10.5.7, NSOperationQueue is fixed and safe to use again.
Steven Fisher says: December 2nd, 2008 at 2:27 pm
If NSOperationQueue crashes, isn’t it likely it’s executing something it shouldn’t and will be fixed as part of a security update?
Andy Mroczkowski says: December 2nd, 2008 at 2:32 pm
Thanks for contributing this to the community!
One question: what is the minimum OS requirement (10.4, 10.5)?
Tim Buchheim says: December 2nd, 2008 at 3:29 pm
Steve has a good point. Typically the fastest way to get a bug fixed by Apple is to look for some way to exploit the bug to create a security problem. This may or may not be possible in this case, but if someone can find some sort of exploit, then that will get Apple’s attention and will eventually lead to a fix.
Mike Ash says: December 2nd, 2008 at 3:52 pm
Steven: The crash is due to an uncaught exception which is thrown by an assertion failure when an NSOperation is executed twice. (They are designed to only execute once.) I really doubt that it’s a security problem but I certainly could be wrong.
Andy: We’ve only been using it on 10.5 but as far as I can see there’s nothing which would prevent it from working on 10.4. I think that the OSAtomicCompareAndSwapPtrBarrier function may not be available on 10.4, but that can easily be replaced with one of the other CAS functions which are.
Steven Fisher says: December 2nd, 2008 at 5:32 pm
Oh, I thought it was something exploitable. I guess I have a poor memory today. Thanks for straightening me out on that. :)
dave says: December 2nd, 2008 at 7:40 pm
Couldn’t you just create new NSOperations instead of reusing existing ones to avoid the crash?
Mike Ash says: December 2nd, 2008 at 10:11 pm
dave: That’s just the thing, we aren’t reusing existing NSOperations. You can’t do that. The system takes one-shot NSOperations and, every so often, tries to run one of them twice. Hilarity ensues.
Details here: http://www.mikeash.com/?page=pyblog/dont-use-nsoperationqueue.html
gparker says: December 2nd, 2008 at 10:48 pm
RAOperationQueue is not GC-safe. The problem is RAAtomicListNode, which is a malloc block that is used to store pointers to RAOperation objects. The garbage collector will delete RAOperations even if they’re still in the list.
Mike Ash says: December 2nd, 2008 at 11:59 pm
gparker: Thanks for pointing that out. If anyone has a nice patch for GC compatibility, feel free to send it in. It appears that it should be pretty straightforward to fix but I didn’t look all that closely.