Tag Archives: iOS

Scaling backend infrastructure to handle millions of phones (Mobile Refresh 2018)

I recently had the pleasure of speaking at the 2018 Mobile Refresh conference held here in Wellington. My talk introduced how we run some parts of the Sailthru Mobile platform, along with recommendations and advice for anyone else building backends to support their mobile applications.

It’s more entry-level than some of my other infrastructure talks, as it’s aimed at people who are primarily mobile developers and may have limited infrastructure experience.

Firebase FCM upstream with Swift on iOS

I’ve been learning a bit of Swift lately in order to write an iOS app for my alarm system. I’m not very good at it yet, but figured I’d write some notes to help anyone else playing with the murky world of Firebase Cloud Messaging/FCM and iOS.

One of the key parts of the design is that I wanted the alarm app and the alarm server to communicate directly with each other without needing public-facing endpoints, rather than the conventional design where the app talks to the server via an HTTP API.

The intention of this design is that I can dump all the alarm software onto a small embedded computer and, as long as that computer has outbound internet access, it just works™️. No headaches about discovering the endpoint of the service, and much simpler security as there’s no public-facing web server.

Given I need to deliver push notifications to the app, I implemented Google Firebase Cloud Messaging (FCM) – formerly GCM – for push delivery to both iOS and Android apps.

Whilst FCM is commonly used for pushing to devices, it also supports pushing messages back upstream to the server from the device. In order to do this, the server must speak XMPP to FCM (it connects out to Google’s XMPP endpoint as a client) and the FCM SDK must be embedded into the app.

The server was reasonably straightforward: I’ve written a small Java daemon that uses a reference XMPP client implementation and wraps some additional logic to work with HowAlarming.

The client side was a bit trickier. Google has some docs covering how to implement upstream messaging in the iOS app, but I had a few issues to solve that weren’t clearly detailed there.

Handling failure of FCM upstream message delivery

Firstly, it’s important to have some logic in place to handle/report back if a message cannot be sent upstream, otherwise you have no way to tell if it’s worked. To do this in Swift, I added a notification observer for .MessagingSendError, which is posted by the FCM SDK if it’s unable to send upstream.

class AppDelegate: UIResponder, UIApplicationDelegate, MessagingDelegate {

 func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
   ...
   // Trigger if we fail to send a message upstream for any reason.
   NotificationCenter.default.addObserver(self, selector: #selector(onMessagingUpstreamFailure(_:)), name: .MessagingSendError, object: nil)
   ...
 }

 @objc
 func onMessagingUpstreamFailure(_ notification: Notification) {
   // FCM tends not to give us any kind of useful message here, but
   // at least we now know it failed for when we start debugging it.
   print("A failure occurred when attempting to send a message upstream via FCM")
 }
}

Unfortunately I’m yet to see a useful error code back from FCM in response to any failure to send a message upstream. I seem to just get back a 501 error for anything that has gone wrong, which isn’t overly helpful… especially since in web programming land, any 5xx series error implies it’s the remote server’s fault rather than the client’s.
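Since the failure notification carries so little obvious detail, one cheap debugging aid is to dump whatever userInfo is attached to it. The following is only a sketch: the notification name is a made-up stand-in for .MessagingSendError so it runs without Firebase, the payload keys are invented, and describeUserInfo is a hypothetical helper. I make no claim about which keys the FCM SDK actually attaches.

```swift
import Foundation

// Hypothetical stand-in for the SDK's .MessagingSendError, so this runs without Firebase.
let sendErrorName = Notification.Name("ExampleMessagingSendError")

/// Flattens a notification's userInfo into "key=value" strings, sorted for stable output.
func describeUserInfo(_ notification: Notification) -> [String] {
    guard let info = notification.userInfo else { return [] }
    return info
        .map { "\($0.key)=\($0.value)" }
        .sorted()
}

// Log every detail attached to the failure notification.
let observer = NotificationCenter.default.addObserver(
    forName: sendErrorName, object: nil, queue: nil
) { note in
    print("Upstream send failed:", describeUserInfo(note))
}

// Simulate a failure with an invented payload (these keys are assumptions).
NotificationCenter.default.post(
    name: sendErrorName, object: nil,
    userInfo: ["code": 501, "messageID": "abc"]
)
NotificationCenter.default.removeObserver(observer)
```

In the real app the same helper could be called from onMessagingUpstreamFailure(_:) to see whether the SDK passes anything more useful than the bare failure.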

 

Getting the GCM Sender ID

In order to send messages upstream, you need the GCM Sender ID. This is available in the GoogleService-Info.plist file that is included in the app build, but I couldn’t figure out a way to extract this easily from the FCM SDK. There is probably a better/nicer way of doing this, but the following hack works:

// Here we are extracting out the GCM SENDER ID from the Google
// plist file. There used to be an easy way to get this with GCM, but
// it's non-obvious with FCM so here's a hacky approach instead.
if let path = Bundle.main.path(forResource: "GoogleService-Info", ofType: "plist"),
   let dict = NSDictionary(contentsOfFile: path),
   let gcmSenderId = dict["GCM_SENDER_ID"] as? String {
  self.gcmSenderID = gcmSenderId // make available on AppDelegate to whole app
}

And yes, although we’re all about FCM now, this part hasn’t been rebranded from the old GCM product, so enjoy having yet another acronym in your app.
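To make the plist lookup easier to test outside the app bundle, the same idea can be pulled into a small function that works on raw plist data. This is a minimal sketch using only Foundation; senderID(fromPlist:) is a hypothetical helper name and the sample plist contents (including the ID value) are made up.

```swift
import Foundation

/// Returns the GCM_SENDER_ID from raw plist data, or nil if the data
/// cannot be parsed or the key is missing / not a string.
func senderID(fromPlist data: Data) -> String? {
    guard let root = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil),
          let dict = root as? [String: Any] else {
        return nil
    }
    return dict["GCM_SENDER_ID"] as? String
}

// Made-up stand-in for GoogleService-Info.plist (the value is invented).
let samplePlist = """
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>GCM_SENDER_ID</key>
    <string>123456789012</string>
</dict>
</plist>
"""

if let id = senderID(fromPlist: Data(samplePlist.utf8)) {
    print("Sender ID: \(id)")
}
```

In the app, the data would come from the bundled file rather than a string literal. It may also be worth checking whether the Firebase SDK exposes the value directly (I believe FirebaseOptions has a gcmSenderID property, reachable via FirebaseApp.app()?.options, but treat that as unverified here).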

 

Ensuring the FCM direct channel is established

Finally, the biggest cause of upstream message delivery failures for me was trying to send an upstream message before FCM had finished establishing the direct channel.

The SDK establishes the channel for you automatically whenever the app is loaded into the foreground, provided that you have shouldEstablishDirectChannel set to true. It can take up to several seconds after application launch to actually complete, which means that if you try to send upstream too early, the connection isn’t ready and your send fails with an obscure 501 error.

The best solution I found was to use an observer to listen to .MessagingConnectionStateChanged which is triggered whenever the FCM direct channel connects or disconnects. By listening to this notification, you know when FCM is ready and capable of delivering upstream messages.

An additional bonus of this observer is that by the time it indicates the FCM direct channel is established, the FCM token for the device is available for your app to use if needed.

So my approach is to:

  1. Set up FCM with shouldEstablishDirectChannel set to true (otherwise you won’t be going upstream at all!).
  2. Set up an observer on .MessagingConnectionStateChanged.
  3. When triggered, use Messaging.messaging().isDirectChannelEstablished to see if we have a connection ready for us to use.
  4. If so, pull the FCM token (device token) and the GCM Sender ID and retain in AppDelegate for other parts of the app to use at any point.
  5. Dispatch the message to upstream with whatever you want in messageData.

My implementation looks a bit like this:

class AppDelegate: UIResponder, UIApplicationDelegate, MessagingDelegate {

 func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
  ...
  // Configure FCM and other Firebase APIs with a single call.
  FirebaseApp.configure()

  // Setup FCM messaging
  Messaging.messaging().delegate = self
  Messaging.messaging().shouldEstablishDirectChannel = true

  // Trigger when FCM establishes its direct connection. We want to know this to avoid race conditions where we
  // try to post upstream messages before the direct connection is ready... which kind of sucks.
  NotificationCenter.default.addObserver(self, selector: #selector(onMessagingDirectChannelStateChanged(_:)), name: .MessagingConnectionStateChanged, object: nil)
  ...
 }

 @objc
 func onMessagingDirectChannelStateChanged(_ notification: Notification) {
  // This is our own function that listens for the direct connection to be established.
  print("Is FCM Direct Channel Established: \(Messaging.messaging().isDirectChannelEstablished)")

  if Messaging.messaging().isDirectChannelEstablished {
   // Set the FCM token. Given that a direct channel has been established, it kind of implies that this
   // must be available to us.
   if self.registrationToken == nil {
    if let fcmToken = Messaging.messaging().fcmToken {
     self.registrationToken = fcmToken
     print("Firebase registration token: \(fcmToken)")
    }
   }

   // Here we are extracting out the GCM SENDER ID from the Google PList file. There used to be an easy way
   // to get this with GCM, but it's non-obvious with FCM so we're just going to read the plist file.
   if let path = Bundle.main.path(forResource: "GoogleService-Info", ofType: "plist"),
      let dict = NSDictionary(contentsOfFile: path),
      let gcmSenderId = dict["GCM_SENDER_ID"] as? String {
    self.gcmSenderID = gcmSenderId
   }

   // Bail out rather than crash on a force-unwrap if either value is still missing.
   guard let registrationToken = self.registrationToken, let gcmSenderID = self.gcmSenderID else {
    print("FCM token or GCM Sender ID not available yet, not sending upstream")
    return
   }

   // Send an upstream message
   let messageId = ProcessInfo.processInfo.globallyUniqueString
   let messageData: [String: String] = [
    "registration_token": registrationToken, // In my use case, I want to know which device sent us the message
    "marco": "polo"
   ]
   let messageTo = gcmSenderID + "@gcm.googleapis.com"
   let ttl: Int64 = 0 // Seconds. 0 means "do immediately or throw away"

   print("Sending message to FCM server: \(messageTo)")

   Messaging.messaging().sendMessage(messageData, to: messageTo, withMessageID: messageId, timeToLive: ttl)
  }
 }

 ...
}
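The implementation above sends its message from inside the connection-state handler. If other parts of the app also want to send upstream, one option is to buffer payloads until the channel is reported ready. The sketch below illustrates that idea with plain Foundation; UpstreamQueue and its method names are hypothetical, and the send closure stands in for the real Messaging.messaging().sendMessage(...) call.

```swift
import Foundation

/// Buffers outgoing payloads until the channel is reported ready,
/// then flushes them in the order they were queued.
final class UpstreamQueue {
    private var pending: [[String: String]] = []
    private var isReady = false
    private let send: ([String: String]) -> Void

    /// `send` is whatever actually delivers a payload (a stand-in here).
    init(send: @escaping ([String: String]) -> Void) {
        self.send = send
    }

    /// Sends immediately when the channel is ready, otherwise holds the payload.
    func enqueue(_ payload: [String: String]) {
        if isReady {
            send(payload)
        } else {
            pending.append(payload)
        }
    }

    /// Call from the connection-state observer with the current channel state.
    func channelStateChanged(isEstablished: Bool) {
        isReady = isEstablished
        guard isEstablished else { return }
        let toSend = pending
        pending.removeAll()
        toSend.forEach(send)
    }
}

// Usage: record "sent" payloads in an array instead of calling FCM.
var sent: [[String: String]] = []
let queue = UpstreamQueue { sent.append($0) }
queue.enqueue(["marco": "polo"])                  // held, channel not ready yet
queue.channelStateChanged(isEstablished: true)    // flushes the held payload
queue.enqueue(["ping": "pong"])                   // channel ready, sent straight away
print("Payloads sent: \(sent.count)")
```

This is not thread-safe as written; it assumes everything is called from the main thread, which is a design assumption of the sketch rather than something the SDK guarantees.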

For a full FCM downstream and upstream implementation example, you can take a look at the HowAlarming iOS app source code on GitHub, and if you need a server reference, take a look at the HowAlarming GCM server in Java.

Learnings

It’s been an interesting exercise, but I wouldn’t particularly recommend this architecture for anyone building real world apps. The main headaches I ran into were:

  1. The FCM SDK just seems a bit buggy. I had a lot of trouble with the GCM SDK and the move to FCM did improve things a bit, but there are still a number of issues that occur from time to time. For example, occasionally an FCM Direct Channel isn’t established, for no clear reason, until the app is terminated and restarted.
  2. Needing to do things like making sure the FCM Direct Channel is ready before sending upstream messages should probably be handled transparently by the SDK rather than by the app developer.
  3. I have still yet to get background code execution on notifications working properly. I get the push notification without a problem, but seem to be unable to trigger my app to execute code even with content-available == 1. Maybe a bug in my code, or FCM might be complicating the mix in some way, vs using pure APNs. Probably my code.
  4. It’s tricky using FCM messages alone to populate the app data: messages occasionally arrive out of order, don’t arrive at all, or end up duplicated. This requires the app code to process, sort and re-populate the table view controller, which isn’t a lot of fun. I suspect it would be a lot easier to re-populate the view controller on load from an HTTP endpoint and use FCM messages only to trigger refreshes of the data if the user taps on a notification.
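To give a feel for the processing that the last point describes, here is a minimal sketch of merging newly arrived messages into an existing list, dropping duplicates and restoring order. It assumes each message carries a unique id and a server-side timestamp; the AlarmEvent type and its fields are hypothetical and not taken from the HowAlarming app.

```swift
import Foundation

/// Hypothetical event shape; the real payload fields will differ.
struct AlarmEvent: Equatable {
    let id: String       // unique per message (assumed to be set by the server)
    let timestamp: Int   // seconds since epoch (assumed to be set by the server)
    let text: String
}

/// Merges newly received events into the existing list, dropping duplicates
/// by id and returning the result sorted newest first.
func merge(_ existing: [AlarmEvent], with incoming: [AlarmEvent]) -> [AlarmEvent] {
    var seen = Set<String>()
    var result: [AlarmEvent] = []
    // Keep only the first occurrence of each id.
    for event in existing + incoming where seen.insert(event.id).inserted {
        result.append(event)
    }
    // Sort by timestamp, using the id as a tie-breaker so the order is stable.
    return result.sorted { ($0.timestamp, $0.id) > ($1.timestamp, $1.id) }
}

let current = [AlarmEvent(id: "a", timestamp: 100, text: "Armed")]
let arrived = [
    AlarmEvent(id: "c", timestamp: 300, text: "Motion detected"),
    AlarmEvent(id: "a", timestamp: 100, text: "Armed"),        // duplicate delivery
    AlarmEvent(id: "b", timestamp: 200, text: "Door opened"),  // arrived out of order
]
let merged = merge(current, with: arrived)
print(merged.map { $0.id })
```

Note that this only handles duplicates and ordering. Messages that never arrive can’t be recovered on the client, which is the main argument for re-populating from an HTTP endpoint instead.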

So my view for other projects in future would be to use FCM purely for server->app message delivery (ie: “tell the user there’s a reason to open the app”) and then rely entirely on a classic app client and HTTP API model for all further interactions back to the server.

Easy IKEv2 VPN for mobile devices (inc iOS)

I recently obtained an iPhone and needed to connect it to my VPN. However my existing VPN server was an OpenVPN installation which works lovely on traditional desktop operating systems and Android, but the iOS client is a bit more questionable having last been updated in September 2014 (pre iOS 9).

I decided to look into what the “proper” VPN option would be for iOS in order to get something that should be supported by the OS as smoothly as possible. Last time I looked this was full of wonderful horrors like PPTP (not actually encrypted!!) and L2TP/IPSec (configuration hell), so I had always avoided them like the plague.

However as of iOS 9+, Apple has implemented support for IKEv2 VPNs which offers an interesting new option. What particularly made this option attractive for me, is that I can support every device I have with the one VPN standard:

  • IKEv2 is built into iOS 9 and OS X El Capitan.
  • IKEv2 is built into Windows 10.
  • Works on Android with a third party client (hopeful for native integration soonish?).
  • Naturally works on GNU/Linux.

Whilst I love OpenVPN, being able to use the stock OS features instead of a third party client is always nice, particularly on mobile where power management and background tasks behaviour can be interesting.

IKEv2 on mobile also has some other nice features, such as MOBIKE, which makes it very seamless when switching between different networks (like the cellular to WiFi dance we do constantly with phones/tablets). This is something that OpenVPN can’t do: whilst it’s generally fast and reliable at establishing a connection, a change in the network means issuing a reconnect rather than just moving the current connection across.

Given that I run GNU/Linux servers, I went for one of the popular IPSec solutions available on most distributions – StrongSwan.

Unfortunately, whilst its technical capabilities are excellent, its documentation isn’t great. The best way to describe it is that every option is documented, but which options to use and why you’d want to use them? Not so much. The “left” vs “right” style documentation is also a right pain to work with; it’s not a configuration format that reads nicely and clearly.

Trying to find clear instructions and working examples of configurations for doing IKEv2 with iOS devices was also difficult and there’s some real traps for young players such as generating SHA1 certs instead of SHA2 when using the tools with defaults.

The other fun is that I also wanted my iOS device set up properly to:

  1. Use certificate based authentication, rather than PSK.
  2. Only connect to the VPN when outside of my house.
  3. Remain connected to the VPN even when moving between networks, etc.

I found the best way to make it work was to use Apple Configurator to generate a .mobileconfig file for my iOS devices that includes all my VPN settings and certificates in an easy-to-import package, but also (critically) allows me to define options that are not selectable to end users, such as on-demand VPN establishment.

After a few nights of messing around and cursing the fact that all the major OS vendors haven’t just implemented OpenVPN, I managed to get a working connection. To spare others the same pain, I considered writing a guide, but it’s actually a really complex setup, so instead I decided to write a Puppet module (clone from GitHub or install from the Puppet Forge) which does the following heavy lifting for you:

  • Installs StrongSwan (on a Debian/derived GNU/Linux system).
  • Configures StrongSwan for IKEv2 roadwarrior style VPNs.
  • Generates all the CA, cert and key files for the VPN server.
  • Generates each client’s certs for you.
  • Generates a .mobileconfig file for iOS devices so you can have a single import of all the configuration, certs and ondemand rules and don’t have to have a Mac to use Apple Configurator.

This means you can save yourself all the heavy lifting and set up a VPN with as little as the following Puppet code:

class { 'roadwarrior':
  manage_firewall_v4 => true,
  manage_firewall_v6 => true,
  vpn_name           => 'vpn.example.com',
  vpn_range_v4       => '10.10.10.0/24',
  vpn_route_v4       => '192.168.0.0/16',
}

roadwarrior::client { 'myiphone':
  ondemand_connect       => true,
  ondemand_ssid_excludes => ['wifihouse'],
}

roadwarrior::client { 'android': }

The above example sets up a routed VPN using 10.10.10.0/24 as the VPN client range and routes the 192.168.0.0/16 network behind the VPN server back through. (Note that I haven’t added masquerading options yet, so your gateway has to know to route the vpn_range back to the VPN server).

It then defines two clients, “myiphone” and “android”. In the .mobileconfig file generated for the “myiphone” client, it will specifically generate rules that cause the VPN to maintain a constant connection, except when connected to a WiFi network called “wifihouse”.
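For anyone curious what such on-demand rules look like inside a configuration profile, here is a rough sketch that builds them as a property list. The key names (OnDemandEnabled, OnDemandRules, Action, InterfaceTypeMatch, SSIDMatch) come from Apple’s VPN payload format as I understand it. The onDemandRules(excludingSSIDs:) helper is hypothetical, and the exact output of the Puppet module isn’t shown here and may well differ.

```swift
import Foundation

/// Builds on-demand rules: disconnect when on one of the listed WiFi
/// networks, connect everywhere else. Rules are evaluated in order.
func onDemandRules(excludingSSIDs ssids: [String]) -> [[String: Any]] {
    var rules: [[String: Any]] = []
    if !ssids.isEmpty {
        rules.append([
            "InterfaceTypeMatch": "WiFi",
            "SSIDMatch": ssids,
            "Action": "Disconnect",
        ])
    }
    // A rule with no match criteria acts as the catch-all.
    rules.append(["Action": "Connect"])
    return rules
}

// Fragment of the VPN settings that would sit inside the profile's VPN payload.
let vpnSettings: [String: Any] = [
    "OnDemandEnabled": 1,
    "OnDemandRules": onDemandRules(excludingSSIDs: ["wifihouse"]),
]

// Serialise to XML plist text to see roughly what ends up in the profile.
if let data = try? PropertyListSerialization.data(fromPropertyList: vpnSettings, format: .xml, options: 0),
   let xml = String(data: data, encoding: .utf8) {
    print(xml)
}
```

The ordering matters: the SSID rule has to come before the catch-all Connect rule, otherwise the VPN would stay up at home too.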

The certs and .mobileconfig files are helpfully placed in /etc/ipsec.d/dist/ for your rsync’ing pleasure, including a few different formats to help load onto fussy devices.

Hopefully this module is useful to some of you. If you’re new to Puppet but want to take advantage of it, you could always check out my introduction to Puppet with Pupistry guide.

If you’re not sure of my Puppet modules or prefer other config management systems (or *gasp* none at all!) the Puppet module should be fairly readable and easy enough to translate into your own commands to run.

There are a few things I still want to do. I haven’t yet done IPv6 configuration (which I’ll fix since I run a dual-stack network everywhere) and I intend to add a masquerade firewall feature for those struggling with routing properly between their VPN and LAN.

I’ve been using this configuration for a few weeks on a couple of iOS 9.3.1 devices and it’s been working beautifully, especially with the ondemand configuration, which I unfortunately haven’t been able to do on any other devices (like Android or OS X) yet. The power consumption overhead seems minimal, but of course your mileage may vary.

It would be good to test with Windows 10 and as many other devices as possible. To keep things simple, I don’t intend for this module to support non-roadwarrior type configs (eg site-to-site linking), but I’m happy to merge any PRs that make it easier to connect more mobile devices or branch routers back to a main VPN host. I’m also happy to merge PRs for more GNU/Linux distribution support. Currently only Debian/Ubuntu is supported, but it shouldn’t be hard to add others.

If you’re on Android, this VPN will work for you, but you may find the OpenVPN client better and more flexible since the Android client doesn’t have the same level of on-demand functionality that iOS has built in. You may also find OpenVPN a better option if you’re regularly using restrictive networks that only allow “HTTPS” out, since it can be run on TCP/443, whereas StrongSwan IKEv2 uses UDP ports 500 and 4500.