The mystery of the crashing NSPredicate
By: Brian Webster
I recently fixed a crash I was seeing reported in one of my applications, and thought the steps I ended up going through to figure out what was going on might be worth sharing. This may not end up actually being useful to anyone, but hopefully it will at least serve as some programming nerd entertainment!
The crash
The crash happened while filtering a set using an NSPredicate. The top of the crashed thread’s stack trace looked like this:
Thread 4 Crashed:
0 libobjc.A.dylib 0x00007fff8501fe90 objc_msgSend + 16
1 Foundation 0x00007fff87af1701 -[NSPredicateOperator performOperationUsingObject:andObject:] + 109
2 Foundation 0x00007fff87af0fe8 -[NSComparisonPredicate evaluateWithObject:substitutionVariables:] + 290
3 Foundation 0x00007fff87af0a9a -[NSPredicate evaluateWithObject:] + 18
4 Foundation 0x00007fff87af0a1f _filterObjectsUsingPredicate + 332
5 Foundation 0x00007fff87b9a60a -[NSSet(NSPredicateSupport) filteredSetUsingPredicate:] + 337
OK, crashing in objc_msgSend usually means you’ve got a bad object pointer gumming up the works. My first guess was that there was some sort of memory management issue going on, where an object was being messaged after having been deallocated. Looking at the code surrounding the crash, though, I didn’t see anything obviously wrong that might cause that sort of problem. Next, I looked at the top of the crash report for details on what was actually causing the crash:
`Date/Time: 2014-11-03 09:26:05 +0000
OS Version: Mac OS X 10.7.5 (11G63b)
Report Version: 104
Exception Type: SIGSEGV
Exception Codes: SEGV_MAPERR at 0x10
Crashed Thread: 4
`
OK, so it seems to be trying to access the address 0x10. That is definitely not going to be a real pointer to an object - but how is that getting there? Fortunately, one of my users actually filled out that little box in the crash reporter that asks what you were doing when the app crashed! (PSA: always put something into that box if you possibly can - sometimes even the most innocuous-looking clue will get the developer looking in the right place.)
The culprit
That clue got me looking at a particular predicate in my code, the format string for which is "FUNCTION(self, 'isAlbumMemberWithMask:', $VALUE) == YES". This is a somewhat esoteric feature of NSPredicate that lets you define a custom function that is evaluated as part of the predicate. For the gory details on how this works, I refer you to Dave DeLong’s extensive explanation here.
The code for my -isAlbumMemberWithMask: method is below. Now that you’ve read that article and know how custom functions are supposed to be written, you’ll immediately see what I did wrong here:
- (BOOL)isAlbumMemberWithMask:(NSNumber*)inMask
{
    return (self.albumMembershipMask & [inMask unsignedIntegerValue]) != 0;
}
The problem here is that the return value from this method needs to be an object, but I’m returning a BOOL. That certainly explains why there’s a problem when trying to send it an Objective-C message! So, the fix here is to change this method to return an NSNumber instead of a BOOL. Easy peasy.
But wait a minute…
I had run this code through its paces before releasing it, and written several unit tests to exercise it, and everything worked fine. So… how on earth did this ever work in the first place?! I must know!
Looking through the crash reports again, I noticed that all the crashes I received were either on 10.7.x or 10.8.x. I had not received any crashes on 10.9 or later, and I had done most of my development/testing on 10.9, so it seems that somehow, my broken code worked on 10.9, but not on earlier versions of the OS. Huh?
Thinking it through, I realized my method would always return one of two values: NO or YES, or in numeric terms, 0 or 1. The 0 value would be treated as a nil object pointer, and sending a message to a nil object will always return 0/nil. NSPredicate would presumably send a -boolValue message to the returned object to determine whether things evaluated to true or false. An NSNumber wrapping NO would return 0, but a nil object would also return 0, which explains why the NO case works. But what about YES?
Enter tagged pointers
On newer architectures such as 64-bit Intel, Apple has implemented something called tagged pointers. For the gory details on how these work, I refer you to Mike Ash’s extensive explanation here. The basic idea is that due to the way memory is aligned on newer architectures, the bottom several bits of an object pointer are always going to be 0. Apple takes advantage of this for certain classes by storing the value for the object in the pointer rather than allocating separate memory to hold the value. Since in a normal object pointer, the bottom several bits will always be 0, a tagged pointer instead has its bottom bit set to 1. When the Objective-C runtime sees that bottom bit is not 0, it knows that it’s dealing with a tagged pointer and not a “real” object pointer, and can use a different code path to handle sending messages to the object.
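The low-bit check described above can be sketched in a few lines of plain C (which also compiles as Objective-C). This is only an illustration of the idea, not the runtime's actual code: the helper name is hypothetical, and it assumes the layout described here, with bit 0 as the tag bit. Other platforms and runtime versions may place the tag elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a runtime API.
   Assumption: bit 0 is the tag bit, as described for 64-bit Intel.
   Ordinary object pointers are aligned, so their low bits are 0. */
static bool is_tagged_pointer(uintptr_t value)
{
    return (value & 0x1u) != 0;
}
```

Under that assumption, a value of 1 counts as tagged, while 0 (nil) and any properly aligned address do not.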
In our case, if our method is returning a YES value, that will end up with a full value of 0x0000000000000001. If you try interpreting this value as an object, the runtime will think it is a tagged pointer, since the bottom bit is 1. OK then, so what kind of object is this exactly?
Well, the class of the tagged pointer object is determined by looking at the next 3 bits of the value, which allows for 8 different classes to be represented by tagged pointers. After some searching, I found the list of these classes defined in the Objective-C runtime, which is open source. The list looks like this:
{
    OBJC_TAG_NSAtom = 0,
    OBJC_TAG_1 = 1,
    OBJC_TAG_NSString = 2,
    OBJC_TAG_NSNumber = 3,
    OBJC_TAG_NSIndexPath = 4,
    OBJC_TAG_NSManagedObjectID = 5,
    OBJC_TAG_NSDate = 6,
    OBJC_TAG_7 = 7
};
For our object, those three bits are all going to be 0, meaning that this tagged pointer will be interpreted as an NSAtom, whatever the heck that is. It’s not a publicly defined class, to be sure, so I fired up LLDB in Xcode to see if I could track this thing down.
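To make the decoding concrete, here is a rough sketch in plain C of how the class slot falls out of the value 0x1. The helper names and the name table are hypothetical. The bit layout (tag bit in bit 0, class index in bits 1 through 3) is the one described above, and the slot names are copied from the enum. The real runtime does this differently and the layout is not guaranteed across OS versions or architectures.

```c
#include <stddef.h>
#include <stdint.h>

/* Slot names in index order, copied from the enum quoted above. */
static const char *const kTagNames[8] = {
    "NSAtom", "OBJC_TAG_1", "NSString", "NSNumber",
    "NSIndexPath", "NSManagedObjectID", "NSDate", "OBJC_TAG_7"
};

/* Hypothetical helper: returns the 3-bit class index held in bits 1-3,
   or -1 when bit 0 is clear (nil or an ordinary object pointer). */
static int tag_index(uintptr_t value)
{
    if ((value & 0x1u) == 0) {
        return -1;
    }
    return (int)((value >> 1) & 0x7u);
}

/* Hypothetical helper: maps a value to its slot name, or NULL if untagged. */
static const char *tag_class_name(uintptr_t value)
{
    int index = tag_index(value);
    return index < 0 ? NULL : kTagNames[index];
}
```

Feeding in 0x1 gives index 0, the NSAtom slot, while a value such as 0x7 (tag bit set, index bits 011) would land in the NSNumber slot.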
NSAtom
OK, let’s just try printing one of these things out and see what happens.
`(lldb) po (id)0x1