Editor’s note: This is a guest post by Martin Böhringer, Co-Founder and CEO of Hojoki. -- Steve Bazyl
Hojoki integrate productivity cloud apps into one newsfeed and enables sharing and discussions on top of the feed. We’ve integrated 17 apps now and counting, so it’s safe to say that we’re API addicts. Now it's Time to share with you what we learned about the Google Apps APIs!
Our initial reason for building Hojoki was because of the fragmentation we experience in all of our cloud apps. And all those emails. Still, there was this feeling of “I don’t know what’s going on” in our distributed teamwork. So we decided to build something like a Google+ where streams get automatically filled by activities in the apps you use.
This leads to a comprehensive stream of everything that’s going on in your team combined with comments and microblogging. You can organize your stream into workspaces, which are basically places for discussions and collaboration with your team.
To build this, we first need some kind of information on recent events. As we wanted to be able to aggregate similar activities and to provide a search, as well as splitting up the stream in workspaces, we also had to be able to sort events as unique objects like files, calendar entries and contacts.
Further, it’s crucial to not only know what has changed, but who did it. So providing unique identities is important for building federated feeds.
Google’s APIs share some basic architecture and structure, described in their Google Data Protocol. Based on that, application-specific APIs provide access to the application’s data. What we use is the following:
The basic call for Google Contacts for example looks like this:
https://www.google.com/m8/feeds/contacts/default/full
This responds with a complete list of your contacts. Once we have this list all we have to do is to ask for the delta to our existing knowledge. For such use cases, Google’s APIs support query parameters as well as sorting parameters. So we can set “orderby” to “lastmodified” as well as “updated-min” to the timestamp of our last call. This way we are able to keep the traffic low and get quick results by only asking for things we might have missed.
If you want to develop using those APIs you should definitely have a look at the SDKs for them. We used the Google Docs SDK for an early prototype and loved it. Today, Hojoki uses its own generic connection handler for all our integrated systems so we don’t leverage the SDKs anymore.
If you’re into API development, you’ve probably already realized that our information needs don’t fit into many of the APIs out there. Most of the APIs are object centric. They can tell you what objects are included in a certain folder, but they can’t tell you which object in this folder has been changed recently. They just aren’t built with newsfeeds in mind.
Google Apps APIs support most of our information needs. Complete support of OAuth and very responsive APIs definitely make our lives easier.
However, the APIs are not built with Hojoki-like newsfeeds in mind. For example, ETags may change even if nothing happened to an object because of asynchronous processing on Google’s side (see Google’s comment on this). For us this means that, once we detect an altered ETag, in some cases we still have to check based on our existing data if there really have been relevant activities. Furthermore, we often have trouble with missing actors in our activities. For example, up to now we know when somebody changed a calendar event, but there is no way to find out who this was.
Another issue is the classification of updates. Google’s APIs tell us that something changed with an object. But to build a nice newsfeed you also want to know what exactly has been changed. So you’re looking for a verb like created, updated, shared, commented, moved or deleted. While Hojoki calls itself an aggregator for activities, technically we’re primarily an activity detector.
You can think of Hojoki as a multi-layer platform. First of all, we try to get a complete overview on your meta-data of the connected app. In Google Docs, this means files and collections, and we retrieve URI and name as well as some additional information (not the content itself). This information fills a graph-based data storage (we use RDF, read more about it here).
At the moment, we subscribe to events in the integrated apps. If detected, they create a changeset for the existing data graph. This changeset is an activity for our newsfeed and related to the object representation. This allows us to provide a very flexible aggregation and filtering on the client side. See the following screenshot. You can filter the stream for a certain collection (“Analytics”) or only for the file history or for the Hojoki workspace where this file is added (“Hojoki Marketing”).
What’s really important in terms of such heavy API processing is to use asynchronous calls. We use the great Open Source project async-http-client for this task.
When I wrote that “we subscribe to events” this is a very nice euphemism for “we’re polling every 30s to see if something changed”. This is not really optimal and we’d love to change it. If Google Apps APIs would support a feed of user events modelled in a common standard like ActivityStrea.ms, combined with reliable ETags and maybe even a push API (e.g. Webhooks) this would also make life easier for lots of developers syncing their local files with Google and help to reduce traffic on both sides.