The Core ML framework brought with it the ability to embed machine learning models into an iOS application through the use of .mlmodel files. These files are normally embedded by including them into an Xcode project, recompiling your application and deploying to a device.

This post looks at the case where a developer wishes to update or insert a model into an application without recompiling the entire application as described above. Several approaches are discussed, from Apple’s preferred method to less conventional tricks that trade loading speed for storage efficiency.

Link to the complete example:

Why would you do this?

In the case where an app is already deployed to the App Store, the process of pushing a traditional update and making the user download the update may be unsuitable for frequent updates to a model. It may also be desirable to allow the user to optionally download machine learning models, whether there are many models or only one.

It is also interesting to consider whether this approach could reduce the application’s size. Core ML models often occupy large amounts of storage, which in turn makes app updates larger. If the models could be compressed effectively, these issues would be reduced. This will be discussed in a later section.

An important consideration before using these methods is to ensure that the app’s functionality is preserved, whether it be functioning years down the road or when a user is not connected to a network. These aspects should be carefully considered when implementing these methods for production use.

But first, some information

First off, you will need Xcode beta 4 or greater as well as iOS 11 beta 4 or greater in order for this to work.

The common file format used to distribute pre-trained Core ML models is the .mlmodel format, a binary blob containing weights, a network description, and input and output names. This information alone is enough to set up and use a machine learning model on an iOS device (discounting any pre- and post-processing of input and output data, of course).

When working with Xcode, a .mlmodel file is (1) used to generate a Swift or Objective-C wrapper class for simple usage and (2) compiled down to a .mlmodelc filesystem structure [1]. The .mlmodelc structure is the essence of what we will be working with. It contains (at least in iOS 11 beta 4) the following files:
  • model.espresso.shape
  • model.espresso.weights
  • coremldata.bin
  • A directory named ‘model’ containing another file named coremldata.bin

The file labelled .weights is of interest in the case where the same model should be updated. Otherwise, the entire .mlmodelc may be replaced. As far as wrapper-code is concerned, only the input and output types must match.

[1] This was discovered by inspecting the .app directory generated by Xcode in DerivedData, which will contain the .mlmodelc structure. The Swift/Objective-C code can be found in the DerivedSources directory found deeper within DerivedData as an intermediate file.

The different methods

| Method | Dynamic parts | Size | Initial loading time |
| --- | --- | --- | --- |
| Compiled and bundled with the app | Nothing | Large | None |
| MLModel.compileModel() | Everything | Smaller than compiled [1] | Short |
| Manual compression, updating weights [2] | Weights | Smallest | Long |
| Manual compression, shipping a new model [2] | Weights, network | Smallest | Long |

[1] During some tests, the size of the readily compiled model was smaller than the .mlmodel file.
[2] The compression considered here is BZip2. The mentioned methods assume a similar ratio of compression, since BZip2 would be combined with TAR in order to allow directories to be compressed.

Table 1 outlines the methods to be discussed. In summary, the official method has the benefit of being optimized for the hardware, at the cost of storage space.

We will be using our toy example in order to exemplify how these different methods may be performed. The source code may be found at:

Compiling a model on-device

In iOS 11 beta 4, Apple quietly added a new feature to the MLModel class that allows .mlmodel files to be compiled on iOS devices. The following code snippet demonstrates how it works:

let compiledModelUrl: URL = try MLModel.compileModel(at: mlModelUrl)

The resulting URL will refer to a temporary .mlmodelc which may be loaded by a model wrapper class (assuming its input and output names and types match), as generated by Xcode at build time [1].

let loadedModel: MNISTGenerator = try MNISTGenerator(contentsOf: compiledModelUrl)

In addition to the above, the .mlmodel must also be brought to the device, either as a resource within the app or as an online resource (for the sake of simplicity, we will not be showing the latter).

This brings up another problem: if a .mlmodel is loaded as a resource rather than as a compiled asset, no wrapper class will be generated (unless another model is compiled by Xcode, that is). The solution can be found by investigating xcodebuild’s output when compiling such a model. At some point, the wrapper class is generated by a utility called ‘coremlc’, located in the Xcode tool path [2]. The usage flags for this program are not well documented, either by the program itself or by Apple, but it works as follows:

coremlc [command] [model.mlmodel] [output directory for artifacts]

The command argument may be used as follows:

coremlc generate model.mlmodel [output directory for source code] \
     --language [Swift|Objective-C] \
     [--swift-version [version, probably 4.0]]

coremlc compile model.mlmodel [output directory for .mlmodelc structure]

Using coremlc’s ‘generate’ command to output a Swift or Objective-C wrapper, we may avoid manually writing the boilerplate code required to use the MLModel class[3]. This operation can be done as such in case of our toy example:

coremlc generate mnist.mlmodel ./GAN/GAN/ --language Swift --swift-version 4.0

[1] Most examples by Apple show the use of a convenience initializer that takes no arguments. This initializer searches for a specific path within the application bundle, which may not be desirable in the case where the compiled model is located in a user’s document directory or elsewhere.
[2] The Xcode tool path being specified by ‘$(xcode-select -p)/usr/bin’.

[3] Although it is possible to dynamically resolve the input and output names at run-time, this would require a bit more effort in ensuring that shapes and types are the same.
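
As an illustration of footnote [3], here is a minimal sketch of how the input and output names of a compiled model might be inspected at run-time through MLModel’s model description. The function name describeModel is hypothetical and not part of the toy example; it assumes the URL points at a .mlmodelc structure.

```swift
import Foundation
import CoreML

/// Prints the input and output features declared by a compiled model.
/// `compiledModelUrl` is assumed to point at a .mlmodelc directory.
func describeModel(at compiledModelUrl: URL) throws {
    let model = try MLModel(contentsOf: compiledModelUrl)
    let description = model.modelDescription

    // Each entry maps a feature name to its description (type, constraints)
    for (name, feature) in description.inputDescriptionsByName {
        print("input:", name, "type:", feature.type.rawValue)
    }
    for (name, feature) in description.outputDescriptionsByName {
        print("output:", name, "type:", feature.type.rawValue)
    }
}
```

Comparing these names and types against what the application expects is one way to reject an incompatible model before it is used.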

Compiling a model offline on a server

The ‘coremlc’ utility from the last section also exposes the command ‘compile’, which is useful for creating a .mlmodelc structure without running the entirety of xcodebuild. As described earlier, a .mlmodelc is a directory structure, so it can be packed into a compressed archive if need be (which is convenient if it is to be downloaded by the app).

This approach is quite similar to the previous one: the application still needs code to load the model, but the step of compiling the model on the device is no longer needed. The difference lies in what can be done between producing the .mlmodelc and delivering it to the device. In our example we put the files into a TAR archive and applied BZip2 compression to see how it compares to the .mlmodel format. The results of this comparison can be seen later in this post.

Compiling a model offline on a server, sending only weights

This method overlaps heavily with the previous approach, except its aim would be to only update weights. In general, this approach can be used in the case where a fixed model resides on the App Store with a truncated (zero-sized) weights file.
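
As a rough sketch of how an app could detect that it was shipped with a truncated weights file, the size of model.espresso.weights can be checked before the model is loaded. The function name needsWeights is hypothetical; only the file name comes from the .mlmodelc listing above.

```swift
import Foundation

/// Returns true when the weights file inside the given .mlmodelc directory
/// is missing or zero-sized, meaning real weights still have to be fetched.
func needsWeights(modelDirectory: URL) -> Bool {
    let weightsUrl = modelDirectory.appendingPathComponent("model.espresso.weights")
    // A missing file makes resourceValues throw; treat that as size 0
    let size = (try? weightsUrl.resourceValues(forKeys: [.fileSizeKey]))?.fileSize ?? 0
    return size == 0
}
```

An app could call this at launch and only offer model-dependent features once the check returns false.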

The implementation

The examples will not mention the setup around the models and how they are used, such as pre- and postprocessing of data or displaying it to the user. The code is still found within the repository if this is interesting, but we will not explain it further.

Compiling models using MLModel

The static method MLModel.compileModel() takes a URL as an argument, the URL referring to a .mlmodel file on the device. This may be located in the application’s Documents directory or the application bundle.

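func compileModel(with newModelAddress: URL) {
    // Compile the .mlmodel on the device; the result is a temporary .mlmodelc
    if let compiledAddress = try? MLModel.compileModel(at: newModelAddress) {
        // Move the compiled model into place and reload the generator
        Tools.replaceFile(at: modelPath, withFileAt: compiledAddress)
        generator.setModel(with: modelPath)
    }
}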
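
Since the compiled model is written to a temporary location, it has to be copied somewhere permanent before it is relied upon. The helper Tools.replaceFile is not shown in this post; the following is only a sketch of what such a helper might look like, using plain FileManager calls. The name replaceItem is an assumption and the repository’s actual implementation may differ.

```swift
import Foundation

/// Replaces whatever is at `destination` with a copy of the item at `source`.
/// Works for single files and for directories such as a .mlmodelc structure.
func replaceItem(at destination: URL, withItemAt source: URL) throws {
    let fileManager = FileManager.default
    // Make sure the parent directory of the destination exists
    try fileManager.createDirectory(at: destination.deletingLastPathComponent(),
                                    withIntermediateDirectories: true)
    // Remove any previous version, since copyItem fails if the destination exists
    if fileManager.fileExists(atPath: destination.path) {
        try fileManager.removeItem(at: destination)
    }
    try fileManager.copyItem(at: source, to: destination)
}
```

Copying rather than moving leaves the source untouched, which keeps the sketch usable for resources inside the read-only application bundle as well.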

Loading models and decompressing on-device

This approach uses CocoaPods dependencies for decompressing BZip2 using system libraries and for accessing the contents of TAR files. The specific dependencies can be found in the repository’s Podfile.

The procedure extracts a .mlmodelc from a .tar.bz2 archive, moves it to the Documents directory and loads the model from that directory.

func decompressModel(with newModelAddress: URL) {
    print("Starting decompression using BZip2")
    do {
        // Read the .tar.bz2 archive and undo the BZip2 compression
        let data = try Data(contentsOf: newModelAddress)
        print("Got data", newModelAddress, data.count)
        let tarData = try BZipCompression.decompressedData(with: data)
        print("Unpacking to " + modelPath.absoluteString)

        // Unpack the TAR contents into the caches directory
        let tempPath = fileManager.urls(for: .cachesDirectory, in: .userDomainMask).first!
        try fileManager.createFilesAndDirectories(at: tempPath, withTarData: tarData, progress: { (progress) in
            // Progress reporting omitted
        })
        print("Finished decompressing")

        // Move the extracted .mlmodelc into place and load it
        Tools.replaceFile(at: modelPath, withFileAt: tempPath.appendingPathComponent("mnistNew.mlmodelc"))
        generator.setModel(with: modelPath)
    } catch {
        print("Failed to decompress model:", error)
    }
}

Replacing weights on an existing model

This approach can be combined with compression using BZip2, but for the intents and purposes of this example, only the replacement of the weights file is considered.

func truncateModel(with newWeightAddress: URL) {
    // Overwrite only the weights file inside the existing .mlmodelc, then reload
    Tools.replaceFile(at: modelPath.appendingPathComponent("model.espresso.weights"), withFileAt: newWeightAddress)
    generator.setModel(with: modelPath)
}

The application will proceed to copy an already generated (by Xcode) .mlmodelc structure to the Documents directory such that a complete, modified .mlmodelc can be loaded by the application’s generated model wrapper.

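guard let originalPath = Bundle.main.url(forResource: "mnistStock", withExtension: "mlmodelc") else {
    return nil
}

self.modelPath = documentUrl.appendingPathComponent("mnistStock.mlmodelc")

do {
    // Start from a fresh copy of the bundled .mlmodelc in the Documents directory
    Tools.deleteFile(atPath: modelPath)
    try FileManager.default.copyItem(at: originalPath, to: modelPath)
} catch let error {
    // Handling here is an assumption: report the error and fail initialization
    print("Could not copy the bundled model:", error)
    return nil
}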


The above methods all aim to allow more flexible configuration of models on the device. Looking at how they perform on a real device gives a better picture of performance, practicality and storage requirements.

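Looking at performance, a comparison between BZip2 compression and the .mlmodel format is appropriate. It should be noted that both formats apply some form of compression to the data. Table 2 shows the time spent loading from each of the two formats.

| Run | MLModel.compileModel() (ms) | BZip2 decompression (ms) |
| --- | --- | --- |
| 1 | 1767 | 7093 |
| 2 | 1991 | 6962 |
| 3 | 1729 | 6946 |
| 4 | 1927 | 6949 |
| Average | 1853.5 | 6987.5 |

Table 2: Loading time per run, in milliseconds.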

The above values were measured by taking Unix timestamps from Swift’s Date type before and after each operation and converting the difference to milliseconds.
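
The measurement technique can be sketched as follows. This is a minimal illustration and not the code used for the numbers in this post; the function name measureMilliseconds and the placeholder workload are assumptions.

```swift
import Foundation

/// Runs `work` once and returns the elapsed wall-clock time in milliseconds,
/// computed from two Unix timestamps taken with Date().
func measureMilliseconds(_ work: () throws -> Void) rethrows -> Double {
    let start = Date().timeIntervalSince1970
    try work()
    let end = Date().timeIntervalSince1970
    return (end - start) * 1000.0
}

// Example: time a placeholder workload that sleeps for 50 ms
let elapsed = measureMilliseconds {
    Thread.sleep(forTimeInterval: 0.05)
}
print("Elapsed: \(elapsed) ms")
```

In the application, the closure would wrap the call being timed, such as MLModel.compileModel() or the decompression routine. Date reflects wall-clock time, so the result can be skewed if the system clock is adjusted during the measurement.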

These tests were performed with the MNIST model bundled with the toy example. The BZip2 implementation used is the one included in the iOS SDK, while the Core ML results time the call to MLModel.compileModel(). BZip2 was slower in every run, and on average took roughly 3.8 times as long as Core ML for this model (6987.5 ms versus 1853.5 ms).

Another interesting metric is the storage size of each alternative, shown below:

| | InceptionV3 | GoogLeNetPlaces | ResNet50 | VGG16 | Our MNIST |
| --- | --- | --- | --- | --- | --- |
| Compiled size [1] | 95,847,541 | 24,892,518 | 102,217,866 | 553,642,814 | 27,455,465 |
| .mlmodel size [2] | 94,704,130 | 24,754,375 | 102,586,628 | 553,457,269 | 27,056,111 |
| BZip2 size [3] | 90,363,687 | 23,682,748 | 97,305,114 | 524,954,884 | 25,413,660 |

All sizes are in bytes
[1] Size of .mlmodelc directory, as compiled by Xcode
[2] Size of unprocessed .mlmodel file
[3] Size of .mlmodelc directory in TAR, compressed by BZip2
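
Since a .mlmodelc is a directory rather than a single file, its size is the sum of the files inside it. The post does not state how the sizes above were measured; the following sketch, with the hypothetical function name directorySize, shows one way such a number could be obtained on the device.

```swift
import Foundation

/// Sums the sizes, in bytes, of all regular files below `directory`.
func directorySize(at directory: URL) throws -> Int {
    let keys: Set<URLResourceKey> = [.isRegularFileKey, .fileSizeKey]
    guard let enumerator = FileManager.default.enumerator(
        at: directory, includingPropertiesForKeys: Array(keys)) else {
        return 0
    }
    var total = 0
    // The enumerator walks subdirectories (such as ‘model’) recursively
    for case let fileUrl as URL in enumerator {
        let values = try fileUrl.resourceValues(forKeys: keys)
        if values.isRegularFile == true {
            total += values.fileSize ?? 0
        }
    }
    return total
}
```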

For the models shown above, the .mlmodel size is almost the same as the compiled size, and BZip2 only reduces the compiled size by roughly 5 to 7 percent. This is not very surprising: floating point weights are not easily compressible, since they look largely random rather than containing the repeating sequences that compression algorithms rely on.


Overall, the effort required to load models dynamically is small compared to its usefulness. Of all the alternatives, compiling a model on the device is fast enough not to be a problem for most purposes, keeping in mind that it only has to be done once for any given model.

Dynamic loading makes it easy to deploy new models without recompiling and resubmitting the application to the App Store, and it also makes it possible to download and use models on demand.

Best regards,
Nils Barlaug, Jørgen Henrichsen, John Chen, Håvard Bjerke, Jonas Gedde-Dahl and Camilla Dahlstrøm