Cocoa data serialization benchmark: Archive framework vs custom serialization

TL;DR: The custom serialization proposed here is roughly 2x faster than the Cocoa Archiving Framework. Results are below.

Companion code for this article is on GitHub

Applying the findings of my last benchmark comparing Core Data and File System storage, I went with a similar solution on a recent project where I didn't need all the firepower in Core Data. The twist this time, however, was that instead of simply storing a key-value pair (NSString, NSDate), I needed to store instances of a class that had multiple fields, indexed by one of its fields — (NSString, SomeClass).

Objective: writing to/reading from disk, as fast as possible

This article compares the Cocoa Archiving Framework against a custom serialization method based on binary property lists. Both are used to serialize and deserialize an NSDictionary that contains multiple objects, indexed by one of their properties. I'll call this dictionary the object index.

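The Cocoa Archiving Framework is pretty straightforward: the object index is archived to a file with NSKeyedArchiver and read back with NSKeyedUnarchiver.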

The custom serialization method requires a bit more work, both to serialize...

  1. Each object in the object index must be converted to an NSDictionary instance;
  2. A copy of the object index must be created with the same keys but with the NSDictionary representations of the objects as values;
  3. The copy of the object index is then serialized to NSData using NSPropertyListSerialization;
  4. The NSData is written to disk.

[Figure: custom serialization flow]

... and to deserialize:

  1. NSData is read from disk;
  2. NSData is converted into an NSDictionary, using NSPropertyListSerialization;
  3. Each entry in this NSDictionary is converted back into an instance of the model;
  4. Create the object index with these deserialized instances.

[Figure: custom deserialization flow]
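
The two flows above can be sketched end to end with nothing but Foundation. This is a minimal sketch rather than the companion code: the function names (BBSerializeIndex, BBDeserializeIndex, BBRoundTrip) and the sample dictionary are made up for illustration, and it assumes the NSError-based methods of NSPropertyListSerialization.

```objc
#import <Foundation/Foundation.h>

// Serializes a plist-compatible dictionary to binary plist data.
// Returns nil (and fills *error) if the dictionary holds anything
// that is not a property list type.
static NSData *BBSerializeIndex(NSDictionary *index, NSError **error) {
  return [NSPropertyListSerialization
      dataWithPropertyList:index
                    format:NSPropertyListBinaryFormat_v1_0
                   options:0
                     error:error];
}

// Reads binary plist data back into an immutable dictionary.
// Returns nil if the data is not a plist or its root is not a dictionary.
static NSDictionary *BBDeserializeIndex(NSData *data, NSError **error) {
  id plist = [NSPropertyListSerialization
      propertyListWithData:data
                   options:NSPropertyListImmutable
                    format:NULL
                     error:error];
  return [plist isKindOfClass:[NSDictionary class]] ? plist : nil;
}

// Full round trip through a file: serialize, write, read, deserialize.
// The sample entry stands in for the dictionary form of one item.
static BOOL BBRoundTrip(NSString *path) {
  NSDictionary *index = @{
    @"item-1" : @{@"identifier" : @"item-1", @"views" : @42},
  };
  NSError *error = nil;

  NSData *data = BBSerializeIndex(index, &error);
  if (data == nil) return NO;
  if (![data writeToFile:path options:NSDataWritingAtomic error:&error]) {
    return NO;
  }

  NSData *read = [NSData dataWithContentsOfFile:path];
  if (read == nil) return NO;
  NSDictionary *restored = BBDeserializeIndex(read, &error);
  return [restored isEqualToDictionary:index];
}
```

The conversion between model objects and dictionaries (steps 1 and 2 when writing, steps 3 and 4 when reading) is left out here on purpose; only the plist and disk steps are shown.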

The model

Let's assume we want to store instances of the class BBItem:

@interface BBItem : NSObject

@property(strong, nonatomic) NSString *identifier;
@property(strong, nonatomic) NSDate *createdAt;
@property(strong, nonatomic) NSString *hash;
@property(strong, nonatomic) NSData *data;
@property(strong, nonatomic) NSString *optionalString;
@property(assign, nonatomic) NSUInteger views;
@property(assign, nonatomic) CGRect displayInRect;

@end


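It has pretty much everything you might find in a regular model class: object properties (strings, a date and binary data), an optional property, a scalar (NSUInteger) and a struct (CGRect).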

Note: For the sake of brevity, I decided not to include fields with custom classes. I address this point at the end of the article, explaining how it could be done.

The repository

The purpose of the repository is to act as a facade that stores an arbitrary number of BBItem instances and lets us query them by identifier. Hence, the interface of a repository class could be something like this:

@interface BBItemRepository : NSObject

- (NSUInteger)itemCount;
- (BBItem *)itemWithIdentifier:(NSString *)identifier;
- (void)addItem:(BBItem *)item;
- (void)removeItem:(BBItem *)item;
- (void)removeItemWithIdentifier:(NSString *)identifier;

@end


By looking at these operations, NSMutableDictionary immediately comes to mind as the perfect structure to hold and query this data under the hood. Also, since we want to persist data to disk, we need to add a couple of methods to load and flush data.

Here's how it would look after a slight upgrade to support these requirements:

@interface BBItemRepository : NSObject {
  __strong NSMutableDictionary *_entries;
}

// deserializes content from disk into memory
- (void)reload;
// flushes all data in memory to disk (but keeps data in memory)
- (BOOL)flush;

- (NSUInteger)itemCount;
- (BBItem *)itemWithIdentifier:(NSString *)identifier;
- (void)addItem:(BBItem *)item;
- (void)removeItem:(BBItem *)item;
- (void)removeItemWithIdentifier:(NSString *)identifier;

@end


Serialization and deserialization: how & when

To use this class, we must first call reload (a good place to do it is the app delegate's application:didFinishLaunchingWithOptions: method) and later call flush after performing some changes (good candidates are applicationWillTerminate: and applicationDidEnterBackground: on the app delegate).

To simplify things for this particular case I wrote a default repository implementation, BBItemRepository, with no-op flush and reload methods — an in-memory repository.

I then subclass this BBItemRepository with BBPlistItemRepository (custom serialization) and BBArchiveItemRepository (Cocoa Archive serialization).

Note: This article will not cover the implementation of the query methods of the superclass. You can take a quick peek at them in the companion code.
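
For readers without the companion code at hand, here is a rough idea of what such an in-memory superclass might look like. It is a hypothetical sketch, not the author's implementation: BBSketchItem and BBSketchRepository are stand-in names, and the item carries only the indexed field.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BBItem with only the indexed field.
@interface BBSketchItem : NSObject
@property(strong, nonatomic) NSString *identifier;
@end

@implementation BBSketchItem
@end

// In-memory repository: every query is a dictionary lookup.
@interface BBSketchRepository : NSObject {
  NSMutableDictionary *_entries;
}
- (void)reload;
- (BOOL)flush;
- (NSUInteger)itemCount;
- (BBSketchItem *)itemWithIdentifier:(NSString *)identifier;
- (void)addItem:(BBSketchItem *)item;
- (void)removeItemWithIdentifier:(NSString *)identifier;
@end

@implementation BBSketchRepository

- (void)reload {
  // Nothing to read from disk; just make sure the index exists.
  if (_entries == nil) _entries = [NSMutableDictionary dictionary];
}

- (BOOL)flush {
  return YES;  // nothing to persist in the memory-only variant
}

- (NSUInteger)itemCount {
  return [_entries count];
}

- (BBSketchItem *)itemWithIdentifier:(NSString *)identifier {
  return [_entries objectForKey:identifier];
}

- (void)addItem:(BBSketchItem *)item {
  if (item.identifier == nil) return;  // cannot index without a key
  [_entries setObject:item forKey:item.identifier];
}

- (void)removeItemWithIdentifier:(NSString *)identifier {
  if (identifier == nil) return;  // removeObjectForKey: rejects nil keys
  [_entries removeObjectForKey:identifier];
}

@end
```

Subclasses would then only override reload and flush, calling super first so the index is guaranteed to exist.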

NSKeyedArchiver/Unarchiver and NSCoding serialization

To use Cocoa's Archiving Framework, our class must implement the NSCoding protocol. This is a very straightforward process: we hand the current values of the properties to the encoder, or set the properties by reading fields from the decoder.

@implementation BBItem

- (void)encodeWithCoder:(NSCoder *)coder {
  // Object
  [coder encodeObject:_identifier forKey:@"identifier"];
  [coder encodeObject:_createdAt forKey:@"createdAt"];
  [coder encodeObject:_hash forKey:@"hash"];
  [coder encodeObject:_data forKey:@"data"];
  // Scalar
  [coder encodeInteger:_views forKey:@"views"];
  [coder encodeCGRect:_displayInRect forKey:@"displayInRect"];
  // Optional
  if (_optionalString != nil) {
    [coder encodeObject:_optionalString forKey:@"optionalString"];
  }
}

- (id)initWithCoder:(NSCoder *)decoder {
  self = [super init];
  if (self != nil) {
    // Object
    self.identifier = [decoder decodeObjectForKey:@"identifier"];
    self.createdAt = [decoder decodeObjectForKey:@"createdAt"];
    self.hash = [decoder decodeObjectForKey:@"hash"];
    self.data = [decoder decodeObjectForKey:@"data"];
    // Scalar
    self.views = [decoder decodeIntegerForKey:@"views"];
    self.displayInRect = [decoder decodeCGRectForKey:@"displayInRect"];
    // Optional
    if ([decoder containsValueForKey:@"optionalString"]) {
      self.optionalString = [decoder decodeObjectForKey:@"optionalString"];
    }
  }
  return self;
}

@end

While tedious to write initially (and maintain, if you tend to change your models very frequently), it's a conceptually simple task.

Assuming we have an ivar named _archiveFilePath which was initialized with the path where the archive should sit, reading and flushing these items requires two one-liners:

@implementation BBArchiveItemRepository

- (void)reload {
  // Ensure initialization of _entries (NSMutableDictionary)
  [super reload];

  NSMutableDictionary* entries = [NSKeyedUnarchiver
    unarchiveObjectWithFile:_archiveFilePath];
  if (entries == nil) return;

  // Entries are not nil, so assign to the ivar
  _entries = entries;
}

- (BOOL)flush {
  // This pretty much does nothing but it's always
  // nice to call the superclass's method...
  if (![super flush]) return NO;

  return [NSKeyedArchiver archiveRootObject:_entries
                                     toFile:_archiveFilePath];
}

@end

And that's all there is to it.
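
One caveat for readers on recent SDKs: the one-line, file-based archiver methods are deprecated there in favor of NSData-based methods that support secure coding. The following is a hedged sketch of what the same round trip might look like with those methods. BBSecureItem, BBArchiveIndex and BBUnarchiveIndex are hypothetical names, the model is trimmed to two fields, and none of this was part of the benchmark.

```objc
#import <Foundation/Foundation.h>

// Hypothetical, trimmed-down model adopting NSSecureCoding.
@interface BBSecureItem : NSObject <NSSecureCoding>
@property(strong, nonatomic) NSString *identifier;
@property(assign, nonatomic) NSUInteger views;
@end

@implementation BBSecureItem

+ (BOOL)supportsSecureCoding {
  return YES;
}

- (void)encodeWithCoder:(NSCoder *)coder {
  [coder encodeObject:_identifier forKey:@"identifier"];
  [coder encodeInteger:(NSInteger)_views forKey:@"views"];
}

- (instancetype)initWithCoder:(NSCoder *)decoder {
  self = [super init];
  if (self != nil) {
    // Secure decoding names the expected class up front.
    _identifier = [decoder decodeObjectOfClass:[NSString class]
                                        forKey:@"identifier"];
    _views = (NSUInteger)[decoder decodeIntegerForKey:@"views"];
  }
  return self;
}

@end

// Archives the index to a file; returns NO on any failure.
static BOOL BBArchiveIndex(NSDictionary *index, NSString *path) {
  NSError *error = nil;
  NSData *data = [NSKeyedArchiver archivedDataWithRootObject:index
                                       requiringSecureCoding:YES
                                                       error:&error];
  if (data == nil) return NO;
  return [data writeToFile:path options:NSDataWritingAtomic error:&error];
}

// Reads the index back, allowing only the classes we expect to find.
static NSDictionary *BBUnarchiveIndex(NSString *path) {
  NSData *data = [NSData dataWithContentsOfFile:path];
  if (data == nil) return nil;
  NSSet *allowed = [NSSet setWithObjects:[NSDictionary class],
                                         [NSString class],
                                         [BBSecureItem class], nil];
  NSError *error = nil;
  return [NSKeyedUnarchiver unarchivedObjectOfClasses:allowed
                                             fromData:data
                                                error:&error];
}
```

The shape is the same as before (one call to archive, one to unarchive), with the extra step of listing the classes that are allowed to come out of the archive.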

Binary Property List (plist) serialization

An NSDictionary serialized as a binary plist can only contain instances of NSData, NSString, NSArray, NSDictionary, NSDate and NSNumber, so converting a BBItem to an NSDictionary is a bit more cumbersome than using an NSCoder. By convention, our model gets two new methods:

@implementation BBItem

+ (BBItem *)itemFromDictionary:(NSDictionary *)dictionary {
  BBItem *model = [[BBItem alloc] init];
  // Object - straightforward conversions, retrieved from
  // the dictionary without any further changes required
  model.identifier = [dictionary objectForKey:@"identifier"];
  model.createdAt = [dictionary objectForKey:@"createdAt"];
  model.hash = [dictionary objectForKey:@"hash"];
  model.data = [dictionary objectForKey:@"data"];
  // Scalar, require conversion from the objects
  // stored in the NSDictionary
  NSNumber* viewsNumber = [dictionary objectForKey:@"views"];
  if (viewsNumber != nil) {
    model.views = [viewsNumber unsignedIntegerValue];
  }
  NSString *displayInRectStr = [dictionary objectForKey:@"displayInRect"];
  if (displayInRectStr != nil) {
    model.displayInRect = CGRectFromString(displayInRectStr);
  }
  // Optional
  model.optionalString = [dictionary objectForKey:@"optionalString"];

  // Optionally, we can validate the model here
  if ((model.identifier == nil) ||
      (model.createdAt == nil) ||
      (model.hash == nil) ||
      (model.data == nil)) {
      return nil;
  }

  return model;
}

- (NSDictionary *)convertToDictionary {
  // Note: the nil-terminated list stops at the first nil value,
  // so the required properties must be set before converting.
  NSMutableDictionary* dictionary =
    [NSMutableDictionary dictionaryWithObjectsAndKeys:
    // Object
    _identifier, @"identifier",
    _createdAt, @"createdAt",
    _hash, @"hash",
    _data, @"data",
    // Scalar
    [NSNumber numberWithUnsignedInteger:_views], @"views",
    NSStringFromCGRect(_displayInRect), @"displayInRect",
    nil];

  // Optional properties should be checked before
  // being committed to the NSDictionary
  if (_optionalString != nil) {
    [dictionary setValue:_optionalString forKey:@"optionalString"];
  }

  return dictionary;
}

@end

Using these methods, we convert each instance of BBItem in _entries into an NSDictionary representation. We then create another top-level index NSDictionary with the same keys as _entries, but with the NSDictionary representations of the BBItems as values. Conversely, when reading from disk, we must create new BBItem instances from the NSDictionary values in the deserialized binary plist file.
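
The reason struct values such as displayInRect go through a string is that anything outside the plist types makes the whole dictionary unserializable. A small sketch of how this can be checked before writing, using NSRange as a Foundation-only stand-in for CGRect; BBIsPlistSafe and BBPlistSafetyDemo are illustrative names, not part of the companion code.

```objc
#import <Foundation/Foundation.h>

// Returns YES if every object in the dictionary can be written
// to a binary property list.
static BOOL BBIsPlistSafe(NSDictionary *dictionary) {
  return [NSPropertyListSerialization
          propertyList:dictionary
      isValidForFormat:NSPropertyListBinaryFormat_v1_0];
}

// Logs the check for a struct stored as NSValue versus as a string.
static void BBPlistSafetyDemo(void) {
  NSRange range = NSMakeRange(2, 5);

  // An NSValue wrapping a struct is not a property list type...
  NSDictionary *raw = @{@"range" : [NSValue valueWithRange:range]};
  // ...but its string form is, and it can be parsed back later
  // with NSRangeFromString.
  NSDictionary *converted = @{@"range" : NSStringFromRange(range)};

  NSLog(@"raw is plist safe: %d, converted is plist safe: %d",
        BBIsPlistSafe(raw), BBIsPlistSafe(converted));
}
```

A check like this is mostly useful in debug builds or unit tests, since it walks the whole dictionary a second time.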

Again, assuming we already have an ivar _indexFilePath which has been initialized with the path where the binary plist file is located, the reload and flush implementations of the BBPlistItemRepository are:

@implementation BBPlistItemRepository

- (void)reload {
  [super reload];

  // Load the file as NSData
  NSData* dictionaryData = [NSData dataWithContentsOfFile:_indexFilePath];
  if (dictionaryData == nil) return;

  // Deserialize the contents of the file to an NSDictionary
  NSString* error = nil;
  NSDictionary* serializedEntries = [NSPropertyListSerialization
     propertyListFromData:dictionaryData
         mutabilityOption:NSPropertyListImmutable
     format:NULL errorDescription:&error];

  if (serializedEntries == nil || error != nil) return;

  // Convert each key-value pair (NSString, NSDictionary)
  // into our entries: (NSString, BBItem)
  [serializedEntries
   enumerateKeysAndObjectsUsingBlock:^(NSString *key,
                                       NSDictionary *serializedEntry,
                                       BOOL *stop) {
    BBItem* item = [BBItem itemFromDictionary:serializedEntry];
    // itemFromDictionary: returns nil for invalid entries,
    // and nil cannot be stored in a dictionary
    if (item != nil) {
      [_entries setObject:item forKey:key];
    }
  }];
}

- (BOOL)flush {
  if (![super flush]) return NO;

  // Convert each BBItem in the _entries dictionary to
  // its NSDictionary representation
  NSError* error = nil;
  NSMutableDictionary* serializedEntries =
    [NSMutableDictionary dictionaryWithCapacity:[_entries count]];
  [_entries enumerateKeysAndObjectsUsingBlock:^(NSString *key,
                                                BBItem *item,
                                                BOOL *stop) {
    NSDictionary* serializedEntry = [item convertToDictionary];
    [serializedEntries setObject:serializedEntry forKey:key];
  }];

  // Create NSData from the dictionary created above,
  // by serializing using binary property lists.
  NSData* dictionaryData = [NSPropertyListSerialization
                            dataWithPropertyList:serializedEntries
                            format:NSPropertyListBinaryFormat_v1_0
                            options:0 error:&error];
  if (dictionaryData == nil) return NO;

  if (![dictionaryData writeToFile:_indexFilePath
                       options:NSDataWritingAtomic error:&error]) {
    return NO;
  }

  return YES;
}

@end

Even though this code only needs to be written once, it's significantly more complex than BBArchiveItemRepository.

Time to figure out whether the extra complexity actually pays off.

Benchmark description

The benchmark is pretty simple; in a typical usage of this repository, all the records would be loaded into memory on boot and flushed back to disk when the app enters background or is about to terminate.

It thus consists of profiling the executions of both reload and flush with varying numbers of items; all other operations end up being lookups on an NSDictionary.

- (NSString *)testSpeed:(BBItemRepository *)repository
          withDummyData:(NSArray *)items {
  // Make sure we have no content
  [repository reset];

  // items is an NSArray* filled with dummy BBItem instances
  for (BBItem *item in items) {
    [repository addItem:item];
  }

  // Time executions of flush (write to disk) and reload (read from disk)
  uint64_t flushNanoseconds = [BBProfiler profileBlock:^() {
    [repository flush];
  }];

  uint64_t reloadNanoseconds = [BBProfiler profileBlock:^() {
    [repository reload];
  }];

  // Clean it up again
  [repository reset];

  return ...;
}
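
BBProfiler itself is not shown in this article. A helper of that kind could be as small as the sketch below, which times a single run of a block with mach_absolute_time. This is an assumption about how such a profiler might work, not the author's code; BBSketchProfileBlock and BBSketchMilliseconds are made-up names.

```objc
#import <Foundation/Foundation.h>
#import <mach/mach_time.h>

// Hypothetical stand-in for BBProfiler: runs a block once and
// returns the elapsed time in nanoseconds.
static uint64_t BBSketchProfileBlock(void (^block)(void)) {
  mach_timebase_info_data_t timebase;
  mach_timebase_info(&timebase);  // ratio to convert ticks to nanoseconds

  uint64_t start = mach_absolute_time();
  block();
  uint64_t elapsedTicks = mach_absolute_time() - start;

  return elapsedTicks * timebase.numer / timebase.denom;
}

// Converts nanoseconds to milliseconds, the unit used when
// reporting benchmark timings.
static double BBSketchMilliseconds(uint64_t nanoseconds) {
  return (double)nanoseconds / 1e6;
}
```

A single run is noisy, so in practice it helps to repeat each measurement a few times and compare medians or averages.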

Just like in the Core Data vs File System comparison, I begin by ensuring both repositories work exactly as expected. These assertions can be found in the method testRepositoryCorrectness: of the class BBBenchmarkViewController.


Results

What we've all been waiting for:

Custom serialization, 100 items:
Flush:      17.95ms
Reload:     13.99ms

Cocoa Archive Framework, 100 items:
Flush:      37.80ms
Reload:     17.99ms


Custom serialization, 1000 items:
Flush:      137.68ms
Reload:     112.71ms

Cocoa Archive Framework, 1000 items:
Flush:      319.13ms
Reload:     204.75ms


Custom serialization, 10000 items:
Flush:      1391.21ms
Reload:     1221.77ms

Cocoa Archive Framework, 10000 items:
Flush:      3479.27ms
Reload:     2139.62ms

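Over 2x faster when flushing to disk (between 2.1x and 2.5x across the three sizes). When loading from disk the gain is smaller: about 1.3x at 100 items and about 1.8x at 1000 and 10000 items.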


At the cost of a slightly more complex initial implementation, the custom serialization method proposed here does offer a significant speed boost when compared to Cocoa's Archiving Framework.

The main reason behind this is that the custom serialization method simply creates different representations of the items when serializing, whereas with the Cocoa Archiving Framework, hidden under those handy one-liners, there is a lot of object graph voodoo going on; this naturally slows the process down.

It's very important to mention that this repository is not meant to scale past a few thousand records. For very large numbers of objects, or very large objects, you should stick to Core Data.

Bonus round: using custom classes as properties on the model items

While the Cocoa Archiving Framework takes care of object graphs, the custom serialization method introduced in this article does not. If you plan on serializing complex graphs, it's probably better to use Cocoa's Archiving Framework.

The purpose of this custom serialization method is to be able to quickly serialize & deserialize either simple objects or simple trees of objects — quickly being the keyword.

With that said, you could easily add a custom class member to BBItem:

@interface BBItem : NSObject
@property(strong, nonatomic) BBSubItem *subItem;
@end

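You'd need to give BBSubItem its own pair of dictionary conversion methods, and call them from BBItem's convertToDictionary and itemFromDictionary: so that the sub-item is stored as a nested dictionary.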
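
A minimal sketch of one way to do this, assuming BBSubItem has a single title property (made up for this example) and using a reduced stand-in for BBItem called BBParentSketch. The sub-item is stored as a nested dictionary under its own key.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sub-item; the title property is invented for the example.
@interface BBSubItem : NSObject
@property(strong, nonatomic) NSString *title;
+ (BBSubItem *)subItemFromDictionary:(NSDictionary *)dictionary;
- (NSDictionary *)convertToDictionary;
@end

@implementation BBSubItem

+ (BBSubItem *)subItemFromDictionary:(NSDictionary *)dictionary {
  NSString *title = [dictionary objectForKey:@"title"];
  if (title == nil) return nil;  // also covers a nil dictionary
  BBSubItem *subItem = [[BBSubItem alloc] init];
  subItem.title = title;
  return subItem;
}

- (NSDictionary *)convertToDictionary {
  if (_title == nil) return @{};
  return @{@"title" : _title};
}

@end

// Stand-in for BBItem, reduced to the parts that involve the sub-item.
@interface BBParentSketch : NSObject
@property(strong, nonatomic) NSString *identifier;
@property(strong, nonatomic) BBSubItem *subItem;
+ (BBParentSketch *)itemFromDictionary:(NSDictionary *)dictionary;
- (NSDictionary *)convertToDictionary;
@end

@implementation BBParentSketch

+ (BBParentSketch *)itemFromDictionary:(NSDictionary *)dictionary {
  BBParentSketch *item = [[BBParentSketch alloc] init];
  item.identifier = [dictionary objectForKey:@"identifier"];
  // The nested dictionary becomes a nested object again.
  item.subItem = [BBSubItem
      subItemFromDictionary:[dictionary objectForKey:@"subItem"]];
  return item.identifier != nil ? item : nil;
}

- (NSDictionary *)convertToDictionary {
  NSMutableDictionary *dictionary = [NSMutableDictionary dictionary];
  if (_identifier != nil) {
    [dictionary setObject:_identifier forKey:@"identifier"];
  }
  if (_subItem != nil) {
    // Store the sub-item as a nested, plist-compatible dictionary.
    [dictionary setObject:[_subItem convertToDictionary]
                   forKey:@"subItem"];
  }
  return dictionary;
}

@end
```

This works for trees of objects. Shared or cyclic references would be duplicated or loop forever, which is where the Archiving Framework remains the better fit.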

Additional notes