Tag Archives: google

Firebase FCM upstream with Swift on iOS

I’ve been learning a bit of Swift lately in order to write an iOS app for my alarm system. I’m not very good at it yet, but figured I’d write some notes to help anyone else playing with the murky world of Firebase Cloud Messaging/FCM and iOS.

One of the key parts of the design is that I wanted the alarm app and the alarm server to communicate directly with each other without needing public-facing endpoints, rather than the conventional design where the app interacts with the server via an HTTP API.

The intention of this design is that I can dump all the alarm software onto a small embedded computer and, as long as that computer has outbound internet access, it just works™️. No headaches about discovering the endpoint of the service and much simpler security, as there’s no public-facing web server.

Given I need to deliver push notifications to the app, I implemented Google Firebase Cloud Messaging (FCM) – formerly GCM – for push delivery to both iOS and Android apps.

Whilst FCM is commonly used for pushing to devices, it also supports pushing messages back upstream from the device to the server. In order to do this, the server must connect to FCM as an XMPP client and the FCM SDK must be embedded in the app.

The server side was reasonably straightforward – I’ve written a small Java daemon that uses a reference XMPP client implementation and wraps some additional logic to work with HowAlarming.

The client side was a bit more tricky. Google has some docs covering how to implement upstream messaging in the iOS app, but I had a few issues to solve that weren’t clearly detailed there.

 

Handling failure of FCM upstream message delivery

Firstly, it’s important to have some logic in place to handle/report back if a message cannot be sent upstream – otherwise you have no way to tell if it’s worked. To do this in Swift, I added a notification observer for .MessagingSendError, which is thrown by the FCM SDK if it’s unable to send upstream.

class AppDelegate: UIResponder, UIApplicationDelegate, MessagingDelegate {

  func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
    ...
    // Trigger if we fail to send a message upstream for any reason.
    NotificationCenter.default.addObserver(self, selector: #selector(onMessagingUpstreamFailure(_:)), name: .MessagingSendError, object: nil)
    ...
  }

  @objc
  func onMessagingUpstreamFailure(_ notification: Notification) {
    // FCM tends not to give us any kind of useful message here, but
    // at least we now know it failed for when we start debugging it.
    print("A failure occurred when attempting to send a message upstream via FCM")
  }
}

Unfortunately I’ve yet to see a useful error code back from FCM in response to any failure to send a message upstream – it seems to just return a 501 error for anything that has gone wrong, which isn’t overly helpful… especially since in web programming land, any 5xx series error implies it’s the remote server’s fault rather than the client’s.

 

Getting the GCM Sender ID

In order to send messages upstream, you need the GCM Sender ID. This is available in the GoogleService-Info.plist file that is included in the app build, but I couldn’t figure out a way to extract it easily from the FCM SDK. There is probably a better/nicer way of doing this, but the following hack works:

// Here we are extracting out the GCM SENDER ID from the Google
// plist file. There used to be an easy way to get this with GCM, but
// it's non-obvious with FCM so here's a hacky approach instead.
if let path = Bundle.main.path(forResource: "GoogleService-Info", ofType: "plist") {
  let dictRoot = NSDictionary(contentsOfFile: path)
  if let dict = dictRoot {
    if let gcmSenderId = dict["GCM_SENDER_ID"] as? String {
      self.gcmSenderId = gcmSenderId // make available on AppDelegate to whole app
    }
  }
}
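
For what it’s worth, the FCM SDK may expose this more cleanly via the options of the already-configured Firebase app – I haven’t verified this across SDK versions, so treat the following as an untested sketch rather than the official approach:

// Untested alternative (verify against your Firebase SDK version): read the
// sender ID from the options of the configured FirebaseApp instead of
// parsing GoogleService-Info.plist by hand.
if let senderId = FirebaseApp.app()?.options.gcmSenderID {
  self.gcmSenderId = senderId // same AppDelegate property as in the hack above
}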

And yes, although we’re all about FCM now, this part hasn’t been rebranded from the old GCM product, so enjoy having yet another acronym in your app.

 

Ensuring the FCM direct channel is established

Finally, the biggest cause of upstream message delivery failing for me was that I was often trying to send an upstream message before FCM had finished establishing the direct channel.

The SDK establishes this channel automatically whenever the app comes into the foreground, provided that you have shouldEstablishDirectChannel set to true. It can take up to several seconds after application launch to actually complete – which means if you try to send upstream too early, the connection isn’t ready and your send fails with an obscure 501 error.

The best solution I found was to use an observer listening for .MessagingConnectionStateChanged, which is triggered whenever the FCM direct channel connects or disconnects. By listening for this notification, you know when FCM is ready and capable of delivering upstream messages.

An additional bonus of this observer is that by the time it indicates the FCM direct channel is established, the FCM token for the device is also available to your app if needed.

So my approach is to:

  1. Set up FCM with shouldEstablishDirectChannel set to true (otherwise you won’t be going upstream at all!).
  2. Set up an observer on .MessagingConnectionStateChanged.
  3. When triggered, use Messaging.messaging().isDirectChannelEstablished to see if we have a connection ready to use.
  4. If so, pull the FCM token (device token) and the GCM Sender ID and retain them in AppDelegate for other parts of the app to use at any point.
  5. Dispatch the message upstream with whatever you want in messageData.

My implementation looks a bit like this:

class AppDelegate: UIResponder, UIApplicationDelegate, MessagingDelegate {

  func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool {
    ...
    // Configure FCM and other Firebase APIs with a single call.
    FirebaseApp.configure()

    // Set up FCM messaging
    Messaging.messaging().delegate = self
    Messaging.messaging().shouldEstablishDirectChannel = true

    // Trigger when FCM establishes its direct connection. We want to know this to avoid race conditions where
    // we try to post upstream messages before the direct connection is ready... which kind of sucks.
    NotificationCenter.default.addObserver(self, selector: #selector(onMessagingDirectChannelStateChanged(_:)), name: .MessagingConnectionStateChanged, object: nil)
    ...
  }

  @objc
  func onMessagingDirectChannelStateChanged(_ notification: Notification) {
    // This is our own function listening for the direct connection to be established.
    print("Is FCM Direct Channel Established: \(Messaging.messaging().isDirectChannelEstablished)")

    if Messaging.messaging().isDirectChannelEstablished {
      // Set the FCM token. Given that a direct channel has been established, it kind of implies that this
      // must be available to us..
      if self.registrationToken == nil {
        if let fcmToken = Messaging.messaging().fcmToken {
          self.registrationToken = fcmToken
          print("Firebase registration token: \(fcmToken)")
        }
      }

      // Here we are extracting out the GCM SENDER ID from the Google plist file. There used to be an easy way
      // to get this with GCM, but it's non-obvious with FCM so we're just going to read the plist file.
      if let path = Bundle.main.path(forResource: "GoogleService-Info", ofType: "plist") {
        if let dict = NSDictionary(contentsOfFile: path) {
          if let gcmSenderId = dict["GCM_SENDER_ID"] as? String {
            self.gcmSenderID = gcmSenderId
          }
        }
      }

      // Send an upstream message
      let messageId = ProcessInfo().globallyUniqueString
      let messageData: [String: String] = [
        "registration_token": self.registrationToken!, // In my use case, I want to know which device sent us the message
        "marco": "polo"
      ]
      let messageTo: String = self.gcmSenderID! + "@gcm.googleapis.com"
      let ttl: Int64 = 0 // Seconds. 0 means "do immediately or throw away"

      print("Sending message to FCM server: \(messageTo)")

      Messaging.messaging().sendMessage(messageData, to: messageTo, withMessageID: messageId, timeToLive: ttl)
    }
  }

  ...
}

For a full FCM downstream and upstream implementation example, you can take a look at the HowAlarming iOS app source code on GitHub, and if you need a server reference, take a look at the HowAlarming GCM server in Java.

 

Learnings

It’s been an interesting exercise – I wouldn’t particularly recommend this architecture for anyone building real-world apps. The main headaches I ran into were:

  1. The FCM SDK just seems a bit buggy. I had a lot of trouble with the GCM SDK and the move to FCM did improve things a bit, but there are still a number of issues that occur from time to time. For example: occasionally an FCM Direct Channel isn’t established, for no clear reason, until the app is terminated and restarted.
  2. Needing to do things like making sure FCM Direct Channel is ready before sending upstream messages should probably be handled transparently by the SDK rather than by the app developer.
  3. I have still yet to get background code execution on notifications working properly. I get the push notification without a problem, but seem to be unable to trigger my app to execute code even with content-available == 1. Maybe a bug in my code, or FCM might be complicating the mix in some way vs using pure APNS. Probably my code. (See the sketch after this list for the delegate method involved.)
  4. It’s tricky using FCM messages alone to populate the app data – I occasionally hit issues such as messages arriving out of order, not arriving at all, or ending up with duplicates. This requires the app code to process, sort and re-populate the table view controller, which isn’t a lot of fun. I suspect it would be a lot easier to simply re-populate the view controller on load from an HTTP endpoint and use FCM messages to trigger refreshes of the data if the user taps on a notification.
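
On point 3, for what it’s worth, background delivery of content-available pushes normally needs the “Remote notifications” background mode enabled and the background-fetch variant of the remote notification delegate method implemented. A minimal sketch of what I mean (this is the standard UIKit pattern, not a confirmed fix for my particular issue):

// Minimal sketch: this UIApplicationDelegate method is what iOS calls for
// pushes carrying content-available == 1 while the app is backgrounded,
// assuming the "Remote notifications" background mode is enabled.
func application(_ application: UIApplication,
                 didReceiveRemoteNotification userInfo: [AnyHashable: Any],
                 fetchCompletionHandler completionHandler: @escaping (UIBackgroundFetchResult) -> Void) {
  print("Received remote notification: \(userInfo)")

  // Do any quick background work here, then tell iOS how the fetch went.
  completionHandler(.newData)
}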

So my view for other projects in future would be to use FCM purely for server->app message delivery (ie: “tell the user there’s a reason to open the app”) and then rely entirely on a classic app client and HTTP API model for all further interactions back to the server.

Your cloud pricing isn’t webscale

Thankfully in 2015 most (but not all) proprietary software providers have moved away from the archaic ideology of software being licensed by the CPU core – a concept that reflected the value and importance of systems back when you were buying physical hardware, but which has been rendered completely meaningless by cloud and virtualisation providers.

In its place came the subscription model, popularised by Software-as-a-Service (or “cloud”) products. The benefits are attractive – regular income via customer renewal payments, flexibility for customers wanting to change the level of product or number of systems covered, and no CAPEX headaches in acquiring new products to use.

Clients win, vendors win, everyone is happy!

Or maybe not.

 

Whilst the horrible price-by-CPU model has died off, a new model has emerged – price by server. This model assumes that the more servers a customer has, the bigger they are and the more we should charge them.

The model makes some sense in a traditional virtualised environment (think VMWare) where boxes are sliced up and a client runs only as many as they need. You might only have a total of two servers for your enterprise application – primary and DR – each spec’ed appropriately to handle the max volume of user requests.

But the model fails horribly when clients start proper cloud adoption. Suddenly that one big server gets sliced up into 10 small servers which come and go by the hour as they’re needed to meet demand.

DevOps techniques such as configuration management suddenly turn the effort of running dozens of servers into roughly the same as running a single machine – there’s no longer any reason to constrain yourself to one box.

It gets worse if the client decides to adopt microservices, where each application gets split off into its own server (or container, aka Docker/CoreOS). And it’s going to get very weird when we start using compute-less computing more with services like Lambda and Hoist, because who knows how many server licenses you need to run an application that doesn’t even run on a server that you control?

 

Really the per-server model for pricing is as bad as the per-core model, because it no longer bears any relation to the size of an organisation, the amount they’re using a product and, most importantly, the value they’re obtaining from the product.

So what’s the alternative? SaaS products tend to charge per-user, but the model doesn’t always work well for infrastructure tools. You could be running monitoring for a large company with 1,000 servers but only have 3 user accounts for a small sysadmin team, which doesn’t really work for the vendor.

Some products can charge based on volume or API calls, but even this is risky. A heavily micro-serviced architecture would result in a large number of HTTP calls between applications, so you can hardly say an app with 10,000 req/min is getting 4x the value compared to a client with a 2,500 req/min application – it could be all internal API calls.

 

To give an example of how painful the current world of subscription licensing is with modern computing, let’s conduct a thought exercise and have a look at the current pricing model of some popular platforms.

Let’s go with creating a startup. I’m going to run a small SaaS app in my spare time, so I need a bit of compute, but also need business-level tools for monitoring and debugging so I can ensure quality as my startup grows and get notified if something breaks.

First up I need compute. Modern cloud compute providers *understand* subscription pricing. Their models are brilliantly engineered to offer a price point for everyone. Whether you want a random jump box for $2/month or a $2000/month massive high compute monster to crunch your big-data-peak-hipster-NoSQL dataset, they can deliver the product at the price point you want.

Let’s grab a basic Digital Ocean box. Well actually let’s grab 2, since we’re trying to make a redundant SaaS product. But we’re a cheap-as-chips startup, so let’s grab 2x $5/mo boxes.


Ok, so far we’ve spent $10/month for our two servers. And whilst Digital Ocean is pretty awesome, our code is going to be pretty crap since we used a bunch of high/drunk (maybe both?) interns to write our PHP code. So we should get a real-time application monitoring product, like Newrelic APM.


Woot! Newrelic have a free tier, which is great news for our SaaS application – but actually it’s not really that useful, as it can’t do much tracing and only keeps 24 hours of history. Certainly not enough to debug anything more serious than my WordPress blog.

I’ll need the pro account to get anything useful, so let’s add a whopping $149/mo – but actually make that $298/mo since we have two servers. Great value really. :-/

 

Next we probably need some kind of paging for oncall when our app blows up horribly at 4am, like it will undoubtedly do. PagerDuty is one of the popular market leaders currently with a good reputation, so let’s roll with them.


Hmm, I guess that $9/mo isn’t too bad, although it’s essentially what I’m paying ($10/mo) for the compute itself. Except that it’s kinda useless, since it covers the USA and their friendly neighbour only and excludes us down under. So let’s go with the $29/mo plan to get something that actually works. $29/mo is a bit much for a $10/mo compute box really, but hey, it looks great next to Newrelic’s pricing…

 

Remembering that my SaaS app is going to be buggier than Windows Vista, I should probably get some error handling set up. That $298/mo Newrelic APM doesn’t include any kind of good error handler, so we should also go get another market leader, Raygun, for our error reporting and tracking.


For a small company this isn’t bad value really given you get 5 different apps and any number of muppets working with you can get onboard. But it’s still looking ridiculous compared to my $10/mo compute cost.

So what’s the total damage:

Compute: $10/month
Monitoring: $371/month

Ouch! Now maybe as a startup, I’ll churn up that extra money as an investment into getting a good quality product, but it’s a far cry from the days when someone could launch a new product on a shoestring budget in their spare time from their uni laptop.

 

Let’s look at the same thing from the perspective of a large enterprise. I’ve got a mission critical business application and it requires a 20 core machine with 64GB of RAM. And of course I need two of them for when Java inevitably runs out of heap because the business let muppets feed garbage from their IDE directly into the JVM and expected some kind of software to actually appear as a result.

That compute is going to cost me $640/mo per machine – so $1280/mo total. And all the other bits, Newrelic, Raygun, PagerDuty? Still that same $371/mo!

Compute: $1280/month
Monitoring: $371/month

It’s not hard to imagine that the large enterprise is getting much more value out of those services than the small startup and can clearly afford to pay for that in relation to the compute they’re consuming. But the pricing model doesn’t make that distinction.

 

So given that we now know that per-core pricing is terrible, per-server pricing is terrible and (at least for infrastructure tools) per-user pricing is terrible, what’s the solution?

“Cloud Spend Licensing” [1]

[1] A term I’ve just made up, but sounds like something Gartner spits out.

With Cloud Spend Licensing, the amount charged reflects the amount you spend on compute – this is a much more accurate indicator of the size of an organisation and value being derived from a product than cores or servers or users.

But how does a vendor know what this spend is? This problem is solving itself thanks to compute consumers starting to cluster around a few major public cloud players, the top three being Amazon (AWS), Microsoft (Azure) and Google (Compute Engine).

It would not be technically complicated to implement support for these major providers (and maybe a smattering of smaller ones like Heroku, Digital Ocean and Linode) to use their APIs to suck down service consumption/usage data and figure out a client’s compute spend in the past month.

For customers who can’t (still on VMWare?) or don’t want to provide this data, there can always be a fallback to a more traditional pricing model, whether it be cores, servers or some other negotiation (“enterprise deal”).

 

 

How would this look?

In our above example, against our enterprise compute bill ($1280/mo) the equivalent amount spent on the monitoring products was 23% for Newrelic, 3% for Raygun and 2.2% for PagerDuty (a total of 28.2%). Let’s assume for the sake of demonstration that this pricing is reasonable for the value of the products gained (glares at Newrelic).

When applied to our $10/month SaaS startup, the bill for these products would be an additional $2.82/month. This may seem so cheap that there will be an incentive to set a minimum price, but it’s vital to avoid doing so:

  • $2.82/mo means anyone starting up a new service uses your product. Because why not, it’s pocket change. That uni student working on the next big thing will use you. The receptionist writing her next mobile app success in her spare time will use you. An engineer in a massive enterprise will use you to quickly POC a future product on their personal credit card.
  • $2.82/mo might only just cover the cost of the service, but you’re not making any profit if they couldn’t afford to use it in the first place. The next best thing to profit is market share – provided that market share has a conversion path to profit in future (something some startups seem to forget, eh Twitter?).
  • $2.82/mo means IT pros use your product on their home servers for fun and then take their learning back to the enterprise. Every one of the providers above should have a ~ $10/year offering for IT pros to use and get hooked on their product, but they don’t. Newrelic is the closest with their free tier. No prizes if you guess which product I use on my personal servers. Definitely no prizes if you guess which product I can sell the benefits of the most to management at work.

 

But what about real earnings?

As our startup grows, it doesn’t matter whether we add more servers or upsize the existing ones to bigger specs – the amount we pay for the related support applications stays proportionate.

It also caters for the emerging trend of running systems for limited hours or using spot prices – clients and vendors don’t have to worry about figuring out how that fits into the pricing model; instead the scale of your compute consumption sets the price you pay for the supporting services.

Suddenly that $2.82/mo becomes $56.40/mo when the startup starts getting successful and starts running a few computers with actual specs. One day it becomes $371 when they’re running $1280/mo of compute like the big enterprise. And it goes up from there.

 

I’m not a business analyst and “Cloud Spend Licensing” may not be the best solution, but goddamn there has to be a more sensible approach than believing someone will spend $371/mo for their $10/mo compute setup. And I’d like to get to that future sooner rather than later please, because there’s a lot of cool stuff out there that I’d like to experiment with more in my own time – and that’s good for both myself and vendors.

 

Other thoughts:

  • “I don’t want vendors to see all my compute spend details” – This would be easily solved by cloud providers exposing the right kind of APIs for this purpose, eg “grant vendor XYZ the ability to see my total compute cost per month, but no details on what it is”.
  • “I’ll split my compute into lots of accounts and only pay for services where I need it to keep my costs low” – No different to the current situation, where users selectively install agents on specific systems.
  • “This one client with an ultra efficient, high profit, low compute app will take advantage of us.” – No different to the per-server/per-core model then, other than the minimum spend. Your client probably deserves the cheaper price as a reward for not writing the usual terrible inefficient code people churn out.
  • “This doesn’t work for my app” – This model is very specific to applications that support infrastructure; I don’t expect to see it suddenly being used for end user products/services.

SMStoXMPP

Having moved to AU means that I now have two cell phones – one with my AU SIM card and another with my NZ SIM card which I keep around in order to receive the odd message from friends/contacts back home and far too many calls from telemarketers.

I didn’t want to have to carry around a second mobile, and the cost of having a phone on roaming in AU makes it undesirably expensive to keep in touch with anyone via SMS, so I went looking for a solution that would let me get the SMS messages from that phone to my laptop and main phone in a more accessible form.

I considered purchasing an SMS gateway device, but they tend to be quite expensive and I’d still have to get some system in place for getting messages from the device to me in an accessible form.

Instead I realised that I could use one of the many older Android cellphones that I have lying around as a gateway device with the right software. The ability to run software makes them completely flexible and with WiFi and 3G data options, it would be entirely possible to leave one in NZ and take advantage of the cheaper connectivity costs to send SMS back to people from within the country.

I was able to use an off-the-shelf application “SMS Gateway” to turn the phone into an SMS gateway, with the option of sending/receiving SMS messages via HTTP or SMTP/POP3.

However emails aren’t the best way to send and reply to SMS messages, particularly if your mail client decides to dump in a whole bunch of MIME data. I decided on a more refined approach and ended up writing a program called “SMStoXMPP”.

Like the name suggests, SMStoXMPP is a lightweight PHP-based bi-directional SMS to XMPP (Jabber) gateway application which receives messages from an SMS gateway device/application and relays them to the target user via XMPP instant messages. The user can then reply via XMPP and have the message delivered via the gateway to the original sender.

For me this solves a major issue and means I can leave my NZ cell phone at my flat or even potentially back in NZ and get SMS on my laptop or phone via XMPP no matter where I am or what SIM card I’m on.

(Diagram: SMStoXMPP layout)

To make conversations even easier, SMStoXMPP does lookups of the phone numbers against any CardDAV address book (such as Google Contacts) and displays your chosen name for the contact. It also provides search functions to make it easy to find someone to chat to.


Chatting with various contacts via SMStoXMPP with Pidgin as a client.

I’ve released version 1.0.0 today, along with documentation for installing, configuring gateways and documentation on how to write your own gateways if you wish to add support for other applications.

Generally it’s pretty stable and works well – there are a few enhancements I want to make to the code and a few bits that are a bit messy, but the major requirements of not leaking memory and being reliably able to send and receive messages have been met. :-)

Whilst I’ve only written support for the one Android phone-based gateway, I’m working on getting a USB GSM modem to work, which would also be a good solution for anyone with a home server.

It would also be trivial to write in support for one of the many online HTTP SMS gateways that exist if you wanted a way to send messages to people and didn’t care about using your existing phone number.

 

Don’t abandon XMPP, your loyal communications friend

Whilst email has always enjoyed a very decentralised approach where users can be expected to be on all manner of different systems and providers, Instant Messaging has not generally enjoyed the same level of success and freedom.

Historically many of my friends used proprietary networks such as MSN Messenger, Yahoo Messenger and Skype. These networks were never particularly good IM networks, rather what made those networks popular at the time was the massive size of their user bases forcing more and more people to join in order to chat with their friends.

This quickly led to a situation where users would have several different chat clients installed, each with their own unique user interfaces and functionality, in order to communicate with one another.

Being an open standards and open source fan, this has never sat comfortably with me – thankfully in the last 5-10yrs, a new open standard called XMPP (also known as Jabber) has risen up and seen widespread adoption.


XMPP brought the same federated decentralised nature that we are used to in email to instant messaging, making it possible for users on different networks to communicate, including users running their own private servers.

Just like with email, discovery of servers is done entirely via DNS and there is no one centralised company, organisation or person with control over the system – each user’s server is able to run independently and talk directly to the destination user’s server.
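
As an illustration (hypothetical domain and hostnames, not anyone’s real records), that discovery boils down to a couple of standard SRV records in the domain’s DNS zone – one for client connections and one for server-to-server federation:

; Illustrative XMPP SRV records (RFC 6120) for a made-up domain, pointing
; client (port 5222) and server-to-server (port 5269) traffic at xmpp.example.com.
_xmpp-client._tcp.example.com. 3600 IN SRV 0 5 5222 xmpp.example.com.
_xmpp-server._tcp.example.com. 3600 IN SRV 0 5 5269 xmpp.example.com.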

With XMPP, the need to run multiple different chat programs or connect to multiple providers was also eliminated. For the first time I was able to chat using my own XMPP server (ejabberd) to friends using their own servers, as well as friends who just wanted something “easy” using hosted services like Google Talk which support(ed) XMPP, all from a single client.

Since Google added XMPP to Google Talk, the XMPP user base has become even larger, thanks to the strong popularity of Gmail creating so many Google Talk users at the same time. With so many of my friends using it, it has been easy to add them to my contacts and interact with them on their preferred platform, without violating my freedom or losing control over my server ecosystem.

Sadly this is going to change. Having gained enough critical mass, Google is now deciding to violate their “Don’t be evil” company motto and is moving to lock users into their own proprietary ecosystem, replacing their well established Google Talk product with a new “Hangouts” product which drops XMPP support.

There’s a very good blog write-up here on what Google has done, how it’s going to negatively impact users and why Google’s technical reasons are poor excuses, which I would encourage everyone to read.

The scariest issue is the fact that a user upgrading to Hangouts gets silently disconnected from being able to communicate with their non-Google XMPP-using friends. If you use Google Talk currently and upgrade to Hangouts, you WILL lose the ability to chat with XMPP users, who will just appear as offline and unreachable.

It’s sad that Google has taken this step and I hope long term that they decide as a company that turning away from free protocols was a mistake and make a step back in the right direction.

Meanwhile, there are a few key bits to be aware of:

  1. My recommendation currently is do not upgrade to Hangouts under any circumstance – you may be surprised to find who drops off your chat list, particularly if you have a geeky set of friends on their own domains and servers.
  2. Whilst you can hang onto Google Talk for now, I suspect long term Google will force everyone onto Hangouts. I recommend considering new options long term for when that occurs.
  3. It’s really easy to get started with setting up an XMPP server. Take a look at the powerful ejabberd or something more lightweight like Prosody. Or you could use a hosted service such as jabber.org for a free XMPP account run by a third party.
  4. You can use a range of IM clients for XMPP accounts, consider looking at Pidgin (GNU/Linux & Windows), Adium (MacOS) and Xabber (Android/Linux).
  5. If you don’t already, it’s a very good idea to have your email and IM behind your own domain like “jethrocarr.com”. You can point it at a provider like Google, or your own server and it gives you full control over your online identity for as long as you wish to have it.

I won’t be going down the path of using Hangouts, so if you upgrade, sorry but I won’t chat to you. Please get an XMPP account from one of the many providers online, or set up your own server – something that is generally a worthwhile exercise and learning experience.

If someone brings out a reliable gateway for XMPP to Hangouts, I may install it, but there’s no guarantee that this will be possible – people have been hoping for a gateway for Skype for years without much luck, so it’s not a safe assumption to have.

Be wary of some providers (Facebook and Outlook.com) which claim XMPP support, but really only support XMPP to chat to *their* users and lack XMPP federation with external servers and networks, which defeats the whole point of a decentralised open network.

If you have an XMPP account and wish to chat with me, add me using my contact details here. Note that I tend to only accept XMPP connections from people I know – if you’re unknown to me but want to get in touch, email is best at first.

Google Play Region Locking

Sadly Android is giving me further reason for dissatisfaction, with the discovery that Google Play refuses to correctly detect the country I’m currently in and is blocking content as a result.

Being citizens of New Zealand and living in Australia, both Lisa and I have a mixture of applications for services in both countries – a good example being banking apps, where we want to be able to easily manage our financial affairs in both countries.

The problem is that all my phones are stuck as being in New Zealand, whilst Lisa’s phone has decided it’s stuck in AU. This hasn’t impacted me much yet, since the Commbank application isn’t region locked and, being treated by Google as “inside” NZ, I can install the ANZ Go Money application from within AU.

However because Lisa’s phone is the other way around and stuck in AU, she’s unable to install either the ANZ Go Money or the Vodafone NZ My Account applications, thanks to the developers of both applications region locking them to NZ only.

What makes the issue even more bizarre is that both Lisa’s and my Google Accounts were originally created in New Zealand, yet are behaving differently.


Thou shalt not pass!

There are a few particularly silly mistakes here. Firstly, ANZ shouldn’t be assuming their customers are always in NZ – us Kiwis do like to get around the world and tend to want to keep managing our financial affairs from other countries. The app is clearly one that should be set to a global region.

I suspect they made this mistake to try and avoid confusing AU and NZ ANZ customers with different apps, but surely they could just make it a bit clearer in the header of the app name…

 

Secondly, Google… I’m going to ignore the fact that they even offer region locking for applications (urgghhg DRM evil) and instead focus on the fact that the masters of information can’t even properly detect what fucking country a user is in.

I mean it must be so hard to do, when there are at least 4 ways of doing it automatically.

  1. Use the GPS in every single Android phone made.
  2. Use the SIM card to determine the country/region.
  3. Do a GeoIP lookup of the user’s IP address.
  4. Use the validated credit card on the user’s wallet account to determine where the user is located. (Banks tend to be pretty good at validating people’s locations and identity!)

Sure, none of these are guaranteed to be perfect, but using at least 3 of the above and selecting the majority answer would lead to a pretty accurate system. As it is, I have several phones currently powered up around my apartment in AU, none of which are validating as being in AU at all:


Hey guys, I’m pretty sure my OPTUS CONNECTED PHONE is currently located in AU.

Instead there is an option buried somewhere in Google’s maze of applications and settings interfaces which must be setting the basis for deciding the country our phones are in.

But which one?

  • The internet suggests that Google Wallet is the deciding factor for the user’s country. But which option? Home address is set to NZ, phone number is set to NZ – I’ve even added a validated NZ credit card for good measure!
  • Google Mail keeps a phone number, timezone and country setting.. all of which are set to NZ.
  • Google+ keeps its own profile information, all of which is set to NZ.
  • I’m sure there’s probably even more!

To make things even more annoying, I can’t even tell whether making a change helps or not. How do I tell if the phone is *reading* the new settings? I’ve done basic stuff like re-syncing Google, but have also gone and cleared the Google Play service cache and saved settings on the phone, to no avail.

It’s a bit depressing when even Apple, the poster child for lock-down and DRM, makes region changing as simple as signing into iTunes with a new country account and installing the application that way.

Restricting and limiting content by geographic regions makes no sense in a modern global world and it’s just sad to see Google continuing to push this outdated model instead of making a stand and adopting 21st century approaches.

If anyone knows how to fix Google to see the correct country it would be appreciated… if you know how to get Google to fix this bullshit and let us install the same applications globally, even better.

Google Convoy

Spotted this when doing Street View around Pyrmont, Sydney – there are at least two Google cars, since one has captured this picture of the other, and it looks like there might even be a third…


The watchers are watching the watchers.

Unsure why there would be a convoy of cars – on the few occasions I’ve seen Google imaging cars in the past, they’ve always been standalone. Having said that, Pyrmont contains Google’s Australian headquarters, so maybe there are a few special projects on the go.

Fixing Blogging

I’m finding an increasing number of friends and people using services like Tumblr or Google Plus as blogging services, or at least as a place to make posts that are more detailed and in-depth than typical micro-blogging (aka Twitter/Facebook).

The problem with both these services, is that they deny interaction from external users who aren’t registered with their service.

With traditional blogging platforms such as WordPress, Blogger, or other custom developed blogs, any visitor to the blog could read it and post comments – the interfaces would vary, the ease of posting would vary and the method of validation would vary, but 99% of the time you could still post comments and engage with the author.

This has not been the case with social networks to date – platforms like Twitter or Facebook require a user to be logged in in order to communicate with others. However this tends to work OK, since they’re mostly used for person-to-person messages and broadcasting, rather than detailed posts you would send to users outside of those networks (after all, 140 char tweets aren’t exactly where you’ll debate things of key meaning).

The real issue starts with half-blog, half-microblog services such as Tumblr and Google Plus, which users have started to use for anything from cat pictures to detailed Linux kernel posts, turning these tools into de-facto blogging platforms, but without the freedom for outsiders to post comments and engage in conversation.

 

Tumblr is one of the worst networks, as it’s very much designed as a glorified replacement for chain email forwards – you post some text or some pictures and all your friends “reblog” your page if they like it and users all pat themselves on their back at how witty and original they all are.

But to make a comment, one must reblog the post, add a comment and have it end up in the pages-long list of reblog and like statements at the bottom of the post. And if the original poster wants to respond to that comment, they have to reblog it in turn. :-/

Yo dawg, we heard you like to reblog your reblogging.

The issue is that more people are starting to use it for more than just funny cat pictures and treat it as a replacement for blogging, which makes for a terrible time engaging with anyone. I have friends who use the service to post updates about their lives, but I can’t engage back – it makes me feel like some kind of outcast stalker peering through the windows at them.

And even if I was on Tumblr, I’d actually want to be able to comment on things without reblogging them – nobody else cares if Jane had a baby, but I’d like to say “Congrats Jane, you look a lot less fat now the fork()ed process is out” to let my friend know I care.

Considering most Tumblr users are going to use Facebook or Twitter as well, they might as well use the image and short statement posting features of those networks and instead use an actual blog for actual content. Really the fault is due to PEBKAC – users using a bad service in the wrong way.

 

Google Plus is a bit better than Tumblr, in that it actually has expected functionality like posts you can comment on, however it lacks the ability for outsiders to post comments and engage with the author – Google has been pretty persistent with trying to get people to sign up for an account, so it’s to be expected somewhat.

I’ve seen a lot of uptake of Google Plus by developers and geeks, seemingly because they don’t want the commitment of actually using a blog for detailed posts, but want somewhere to post lengthy bits of text.

Linus Torvalds is one particular user whom I might want to follow on Google Plus, but there’s not even RSS if you want to get updates on new posts! (To get RSS, you’d have to use external third-party services.)

Tumblr at least has RSS so I can still use it in my reader like everything else, even if I can’t reply to the author….

Follow Linus! Teenage fanboy Jethro squeee!

And of course with no ability for outsiders to post comments, I can’t post comments to Linus requesting his hand in marriage after he merges a kernel bug fix for my laptop. :-(

 

So with all these issues, why are users adopting these services? After all, there are thousands of free blogging services, several of them well known and very good – all better technical options.

I think it’s a combination of issues:

  • Users got overwhelmed by RSS – we followed everything we loved, then got scared by the 10,000 unread posts in our readers – and resolved it by simply not opening the reader for fear of the queue waiting for us. The social media style approaches used by Google+, Tumblr and of course Twitter and Facebook focus less on following every single post by users and more on what’s happening here and now – users don’t feel bad if they miss reading 1,000 posts overnight, they just go on to the next.
  • Users love copying. The MPAA & RIAA love this fact about humans, we love to copy and share stuff with others. Blogging culture tends to frown on this, but Tumblr’s reblogging style of use makes it more acceptable and maintains a credit trail.
  • Less commitment – if I started posting pictures of funny cats or one-paragraph posts on this blog, it wouldn’t be doing it justice or up to the level of quality readers expect. However on social network based services this is OK; there’s no expectation of a certain level of presentation and effort in a post. A funny cat picture followed by a post raging about why GNU Hurd will always be better than BSD is acceptable – on a blog, you’d drop the funny cat and be expected to write a well detailed post explaining your reasoning. Another way to put it is that it’s “more casual” than conventional blogs.
  • Easier interactions with your readers (at least with Google+) – there are no standards in blogging for handling notifications to users about changes to your blog or replies to comments. Even WordPress, one of the most popular platforms, doesn’t provide native email notifications for comments.
  • I noticed a major improvement in the level of interaction between myself and my readers after adding the Subscribe to Comments Reloaded plugin to this site, which uses email notifications to tell users about replies to their comments on my blog posts. And considering how slack many people are with checking their email, I do wonder how much better it would be if I added support for notifications of new posts and comment replies via Twitter or Facebook.
  • Conventional blogs tend to take a bit more effort to post comments on – some go overboard with captcha input fields that take 10 attempts or painful comment validation. I’ve tried to keep mine simple with basic fields, dealing with spam using Akismet rather than captcha (which has worked very well for me).

In my opinion the biggest issue is the communication, notification and interaction problem noted above. I don’t believe we can fix the cultural side of users, such as the crap they post or the inability to actually make the effort to read their RSS, but we can go some way towards improving the technology to reduce or eliminate some of the pain points and encourage use of these services.

There have been some attempts to address these issues already:

  • Linkback techniques such as Pingback address the issue of finding out who’s linking to your blog (although I turned this off as I found it really spammy and I get that information out of awstats anyway).
  • RSS handles getting updates of new posts on a polling basis and smarter RSS readers offer better filtering/grouping/etc.
  • Email notifications for blog comments and updates.

But it’s not good enough yet – what I’d actually like to see would be:

  • Improvement of linkback techniques to spam pages less, potentially with the addition of some AI logic to determine whether the linkback was just “check out this cool post!” or some actual useful content that readers of your post would like to read (such as a rebuttal).
  • Smarter RSS readers that act more like social network feeds, to give users who want more of a “live stream” feel what they want.
  • Live commenting technology – not all users have push email, so email notifications kind of suck for many users. A better solution would be to use the existing XMPP standard to send notifications to the user’s XMPP server (anyone using Gmail already has an XMPP service with them and numerous geeks run their own – like me ;-), so the user gets a chat message pop up. If the message format was standardized, it would be possible to have the IM client recognize it was a blog comment reply and to hand off to the installed RSS reader to handle for better UX – or fall back to posting text with a link to the reply for support with any XMPP standard client.
  • (I did see that there is an outdated plugin for XMPP on WordPress, as well as some commercial live-commenting packages that hook into social networks, but I really want a proper open source solution that does everything in one plugin, so there’s a more seamless UX – rather than having 20 checkboxes for which method the user would like notifications via.)
  • Whilst mentioning XMPP, we could even consider replacing RSS with XMPP based push notifications – blog servers sending out a push message when they get an update, rather than readers polling services. Advantage is near-instant update of new posts and potentially less server load of not having thousands of wasted polls when there hasn’t been any update to fetch.
  • Comment reply via notification support. If you send someone an XMPP IM, email, tweet, virtual sheep or whatever to alert to a comment or blog post, they should be able to reply via that native medium and have the blog server interpret, validate and integrate that reply into the page.

My hope is that with these upgrades, blogging platforms will extend themselves to be better placed for holding up against social networking sites, making it easier to have detailed conversations and long running threads with readers and authors.

Moving to a new generation communication platform built around the existing blogging platforms would be as much of an improvement for real-time social responsiveness as shifting from email to Twitter and hopefully, the uptake in real-time communications will bring more users back to decentralised, open and varied platforms.

I’m tempted to give this a go by building a WordPress plugin to provide unified notifications using XMPP / Email / Social Media, but it’ll depend on time (lol who has that??) and I haven’t done much with WordPress’s codebase before. If you know of something existing, I would certainly be interested to read about it and I’ll be taking a look at options to build upon.

Google Search & Control

I’ve been using Google for search for years, however it’s the first time I’ve ever come across a DMCA takedown notice included in the results.

Possibly not helped by the fact that Google is so good at finding what I want, that I don’t tend to scroll down more than the first few entries 99% of the time, so it’s easy to miss things at the bottom of the page.

Lawyers, fuck yeah!

Turns out that Google has been doing this since around 2002 and there’s a form process you can follow with Google to file a notice to request a search result removal.

Sadly I suspect that we are going to see more and more situations like this as governments introduce tighter internet censorship laws and key internet gatekeepers like Google are going to follow along with whatever they get told to do.

Whilst people working at Google may truly subscribe to the “Don’t be evil” slogan, the fundamental fact is that Google is a US-based company that is legally required to do what’s best for its shareholders – and the best thing for the shareholders is not to fight the government over legislation, but to implement whatever is required and keep selling advertising.

In response to concerns about Google over privacy, I’ve seen a number of people shift to new options, such as the increasingly popular and open-source friendly Duck Duck Go search engine, or even Microsoft’s Bing, which isn’t too bad at getting decent results and has a UI looking much more like early Google.

However these alternatives all suffer from the same fundamental problem – they’re centralised gatekeepers who can be censored or controlled – and then there’s the fact that a centralised entity can track so much about your online browsing. Replacing Google with another company will just leave us in the same position in 10 years’ time.

Lately I’ve been seeking to remove all the centralised providers from my online life, moving to self-run and federated services – basic stuff like running my own email and instant messaging (XMPP), but also more complex “cloud” services being delivered by federated or self-run servers for tasks such as browser syncing, avatar handling, contacts sync, avoiding URL shorteners and quitting or replacing social networks.

The next big one on the list is finding an open source and federated search solution – I’m currently running tests with a search engine called YaCy, a peer-to-peer decentralised search engine that is made up of thousands of independent servers sharing information between themselves.

To use YaCy, you download and run your own server, set its search indexing behaviour and let it run and share results with other servers (it’s also possible to run it in a disconnected mode for indexing your internal private networks).

The YaCy homepage has an excellent write-up of their philosophy and the design fundamentals of the application.

It’s still a bit rough and I think the search results could be better – but this is something that having more nodes will certainly help with, and the idea is promising. I’m planning to set up a public instance on my server in the near future for adding all my sites to the index and providing a good test of its feasibility.

Mozilla Collusion

This week Mozilla released an add-on called Collusion, an experimental extension which shows and graphs how you are being tracked online.

It’s pretty common knowledge how much you get tracked online these days – if you just watch your status bar when loading many popular sites you’ll always see a few brief hits to services such as Google Analytics – but there’s also a lot of tracking done by social networking services and advertisers.

The results are pretty amazing. I took these after turning it on for about one day of browsing; every day I check in, the graph is even bigger and more amazing.

The web actually starting to look like a web....

As expected, Google is one of the largest trackers around, this will be thanks to the popularity of their Google Analytics service, not to mention all the advertising technology they’ve acquired and built over the years including their acquisition of DoubleClick.

I for one, welcome our new Google overlords and would like to remind them that as a trusted internet celebrity I can be useful for rounding up other sites to work in their code mines.

But even more interesting are the results for social networks. I ran this test whilst logged out of my Twitter account, logged out of LinkedIn, and I don’t even have Facebook:

Mark Zuckerberg knows what porn you look at.

Combine 69+ tweets a day & this information and I think Twitter would have a massive trove of data about me on their servers.

LinkedIn isn't quite as linked as Facebook or Twitter, but probably has a similar ratio if you consider the userbase size differences.

When you look at this information, you can see why Google+ makes sense for the company to invest in. Google has all the data about your browsing history, but the social networks are one up – they have all your browsing information with the addition of all your daily posts, musings, etc.

With this data advertising can get very, very targeted and it makes sense for Google to want to get in on this to maintain the edge in their business.

It’s yet another reason I’m happy to be off Twitter now – so much less information about me that can be used by advertisers. It’s not that I’m necessarily against targeted advertising, I’d rather see ads for computer parts than for baby clothes, but I’m not that much of a fan of my privacy being so exposed and organisations like Google having a full list of everything I do and visit and being able to profile me so easily.

What will be interesting will be testing how well the tracking holds up once IPv6 becomes popular. On one hand, IPv6 can expose users more if they’re connecting with a MAC-based address, but on the other hand, it could provide more privacy if IPv6 address randomisation is used when assigning systems IP addresses.

Why I hate URL shorteners

I’ve used Awstats for years as my website statistics/reporting program of choice – it’s trivial to set up, reliable, works with Apache log files and requires no modification to the website or use of remote tools (unlike Google Analytics).

One of the handy features is the “Links from an external page” display, which is a great way of finding out where sudden bursts of hits are coming from, such as news posts mentioning your website or other bloggers linking back.

Sadly over the past couple of years it’s getting less useful thanks to the horrible wonder that is URL shortening.

URL shorteners have always been a controversial service – whilst they can be a useful way of making some of the internet’s more horrible website URLs usable, they cause a number of long term issues:

  • Centralisation – The internet works best when decentralised, but URL shortening makes a large number of links dependent on a few particular organisations who may or may not be around in the future. There have already been a number of link shortening companies who have closed down, killing large numbers of links, and there will undoubtedly be more in the future.
  • Link Hiding – Short URLs are a great way to send someone a link and have them open it without realising what content they’re actually about to open. It could be as innocent as a prank for a friend or as bad as malicious malware or scamming websites.
  • Performance – It takes an extra DNS query (or several) to lookup the short URL servers before the actual destination can be looked up. This sounds like a minor issue, but it can add up when on high latency connections (eg mobile) or when connecting to international content on NZ’s wonderful internet and can add up to a number of seconds sometimes.
  • Privacy – a third party can collate large amounts of information about an individual’s browsing history if they have a popular enough URL shortening service.

Of course URL shortening isn’t entirely evil, there’s a few valid use cases where they are acceptable or at least forgivable:

  • Printed materials with URLs on them for manual entry. Nobody likes typing more than they need to, that’s understandable.
  • Quickly sending temporary links to people via IM or email where the full URL breaks due to the client application’s inability to parse the URL correctly.

Anything other than the above is inexcusable, computers are great at hiding the complexities of large bits of information, there’s no need for your blog, social network or application to use short URLs where there is no human entry factor involved.

Twitter is particularly guilty of abusing short URLs – part of this was originally historic, but when Twitter had the opportunity to fix it, they chose to instead contribute further towards the problem.

Back in the early days of Twitter, there was no native URL handling, so in order to fit links into the maximum tweet size of 140 chars, users would use a URL shortener such as the classic tinyurl.com or more recent arrivals such as bit.ly to keep URL lengths as small as possible.

Twitter later decided to implement their own URL shortening service called t.co and now enforce the re-writing of all URLs posted via Twitter to use t.co links, in a semi-transparent fashion where some/all of the original URL will be shown in the tweet, but the actual hyperlink will always go through t.co.

This change offered some advantages to users, in that they are no longer dependent on external providers closing down and breaking all their links, and it has some security benefits in that Twitter maintains lists of bad URLs (URLs they consider to serve malware or other unwanted content) to help stop the spread of dodgy content.

But it also gave Twitter the ability to track click data to figure out which links users were clicking on – I imagine this information would be highly valuable to advertisers. (Google do a very similar thing with their results pages, where all clicks are first directed through a Google server to track which results users select, before the user is delivered to the requested page.)

The now mandatory use of URL shorteners on Twitter has led to a situation where it’s no longer easy to track which tweets, or even which tweeters, are leading to the source of your hits.

Even more confusingly, the handling of referrer URLs is inconsistent depending on the browser/client following the link. The vast majority will log the short URL version as the referrer, but some are smart enough to report the URL from *before* the redirect took place.

RFC 2616 doesn’t touch on how shortened URLs should be handled when referring, and leaves the issue of how 301 redirects should have their referrers handled up to the implementer’s decision. And there are valid arguments for using either the original page or the short URL as the referrer.

For example, for this tweet I have about 9 visits via http://twitter.com/jethrocarr/status/170112859685126145 and 29 visits via http://t.co/0RJteq3r, which throws out hit-count based ordering of the results:

Got to love Twitter & shortened URLs - most of these relate to tweets, but to which tweets? No easy way to track back.

A much better solution would have been for Twitter to display shortened versions of URLs in the tweet text to meet the 140 char limit, but have the actual link href feature the full URL – for example, a tweet could show “jethrocarr.com/i-like-…..” as the link text to fit within 140 chars, but the actual href record would be the full “jethrocarr.com/i-like-cake” URL.

Whilst tweets are known for being 140 chars, there’s actually far more information than that stored about each tweet: location co-ordinates, full URL information, date, time and more, so there is no excuse for Twitter not to be able to retain that URL data – of course, that information has value for them for advertising and tracking purposes, so I wouldn’t expect it to ever go away.

(As a side note, there’s an excellent write-up on ReadWriteWeb about the structure of a tweet and its associated information.)

 

Overall, shortened URLs are just a pain to deal with and it would be far better if people avoided them as much as possible – essentially, if you’re using a short URL and it’s not because a user will be manually typing out content, then you’re doing it wrong.

Also keep in mind that many sites have their own shortish URL variations. For example, this article can be accessed via both date/name and ID number:

https://www.jethrocarr.com/2012/02/26/why-i-hate-url-shorteners
https://www.jethrocarr.com/?p=1453

Many people also run their own private shorteners – quite common with popular sites such as news websites wanting to retain control of the link process – which is a much better idea if you plan to have lots of short URLs for your website for a valid reason.