What’s the difference between reality and theory? In theory, there is no difference. But reality often imposes unanticipated constraints on developers. These may come in the form of bandwidth restrictions, memory limits, timeouts, or other requirements of the systems that interact with your application.
My team recently built an application that helps us analyze the scheduling and usage of conference rooms at Google. We use the new Calendar API v3 on Google App Engine to read the rooms’ schedules, which we combine with actual occupancy data to calculate utilization and other metrics.
As you might imagine, Google has a lot of conference rooms (I believe the last official count was “more than twelve”), and many of them seem to be booked fairly solid. That means we need to read a lot of data from Calendar. So much, in fact, that our queries time out if we try to read an entire calendar at once. But the API team anticipated “Google scale” use and designed a mechanism that allows us to retrieve data in batches.
The idea is simple. When you create a request, you specify the page size: the maximum number of results you’d like Calendar to return in one batch. Calendar returns the data you requested, along with an opaque page token, which you can think of as a bookmark. To retrieve the next batch of data, you read the next page token from the response and include it in your next request. The page token marks where the previous batch ended, so Calendar can pick up from there each time. You repeat this process until a response comes back with no next page token, which means you’ve exhausted all the results.
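To make the token handshake concrete, here is a minimal, self-contained sketch that does not touch the Calendar API at all. PagedSource, Page, list() and fetchAll() are hypothetical names invented for this illustration, and the token is simply an offset encoded as a string. A real service treats its token as opaque, so clients should never parse it the way this toy server does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for a paginated service (not the Calendar API).
final class PagedSource {
    // One batch of results plus the bookmark for the next batch (null when done).
    static final class Page {
        final List<String> items;
        final String nextPageToken;

        Page(List<String> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    private final List<String> data;

    PagedSource(List<String> data) {
        this.data = data;
    }

    // "Server" side: return at most pageSize items, starting where the token points.
    // A null token means "start from the beginning". Assumes pageSize > 0.
    Page list(int pageSize, String pageToken) {
        int start = (pageToken == null) ? 0 : Integer.parseInt(pageToken);
        int end = Math.min(start + pageSize, data.size());
        String next = (end < data.size()) ? Integer.toString(end) : null;
        return new Page(new ArrayList<>(data.subList(start, end)), next);
    }

    // "Client" side: keep asking for pages until no next token comes back.
    static List<String> fetchAll(PagedSource source, int pageSize) {
        List<String> all = new ArrayList<>();
        String token = null;
        do {
            Page page = source.list(pageSize, token);
            all.addAll(page.items);
            token = page.nextPageToken;
        } while (token != null);
        return all;
    }
}
```

With five items and a page size of two, the client makes three requests and the last response carries a null token, which ends the loop.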
Here’s how we did this in Java:
public void getRoomEvents(String roomEmail) throws IOException {
  // Create a request to list this room’s events (see code, below)
  Calendar.Events.List listRequest = getListRequest(roomEmail);
  do {
    // Retrieve one page of events
    Events events = executeListRequest(listRequest);
    List<Event> eventList = events.getItems();
    // Process each event
    for (Event event : eventList) {
      processEvent(event);
    }
    // Update the page token
    listRequest.setPageToken(events.getNextPageToken());
    // Stop when all results have been retrieved
  } while (listRequest.getPageToken() != null);
}

// Create a request to list the events for a room
private Calendar.Events.List getListRequest(String roomEmail) throws IOException {
  return calendarClient.events().list(roomEmail)
      .setMaxResults(1000) // Limit each response to 1000 events
      .setPageToken(null)  // Start with the first page of results
      // Return an individual event for each instance occurrence of a
      // recurring event
      .setSingleEvents(true);
}
We call getRoomEvents() for each room, using the room’s email address to identify it to Calendar. (You can retrieve events from your own calendar by substituting your own email address.) Then getListRequest() creates a request that we will send to Calendar. The request asks for a list of up to 1000 events from the room’s calendar.
The remainder of getRoomEvents() is a loop that executes the request, processes the results, and updates the page token in preparation for the next request. The loop continues, retrieving and processing each subsequent page of results, until the entire list has been returned. The call to getNextPageToken() indicates the end of the results by returning a null value.
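The listing calls executeListRequest() without showing it. One plausible shape for such a helper, and this is purely an assumption rather than the code the team used, is a thin wrapper that retries a single page fetch when it fails with an IOException. PageRetry and withRetries() are hypothetical names, and the doubling delay is just one common choice of backoff.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper: retry one page fetch a bounded number of times.
final class PageRetry {
    // Runs the call and returns its result. On IOException it waits, doubles
    // the delay, and tries again, up to maxAttempts attempts in total.
    // Any other exception type is propagated immediately.
    static <T> T withRetries(Callable<T> call, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                if (attempt >= maxAttempts) {
                    throw e; // Out of attempts: surface the last failure
                }
                Thread.sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

Under that assumption, the loop body could fetch a page with something like withRetries(() -> listRequest.execute(), 3, 500). Because the page token is only advanced after a page has been fetched successfully, a retry asks for the same page again.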
By paginating our requests we avoid timeouts and reduce memory requirements. As an added benefit, each request completes fairly quickly, which means it’s also quick to retry if an error should occur. And finally, a multithreaded application may be able to process one or more pages of results while it retrieves the next, speeding execution. These advantages have led developers at Google to adopt pagination as a best practice. Look for it in our APIs when you need to exchange large amounts of data, and consider adding it to your own services.
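The multithreading point can be sketched as follows: while the caller's thread processes the current page, a background thread is already fetching the next one. This is an illustrative sketch that assumes Java 8 or later. PrefetchingPager, Page and forEachItem() are hypothetical names, and fetchPage stands in for whatever call maps a page token to one page of results.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical helper: overlap processing of one page with retrieval of the next.
final class PrefetchingPager {
    // One batch of results plus the token for the next batch (null when done).
    static final class Page<T> {
        final List<T> items;
        final String nextPageToken;

        Page(List<T> items, String nextPageToken) {
            this.items = items;
            this.nextPageToken = nextPageToken;
        }
    }

    // fetchPage maps a page token (null for the first page) to a page of results.
    // Items are processed on the calling thread, in order.
    static <T> void forEachItem(Function<String, Page<T>> fetchPage, Consumer<T> processItem) {
        ExecutorService fetcher = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<Page<T>> pending =
                    CompletableFuture.supplyAsync(() -> fetchPage.apply(null), fetcher);
            while (pending != null) {
                // Wait for the page that is currently in flight
                Page<T> page = pending.join();
                final String next = page.nextPageToken;
                // Start fetching the following page before processing this one
                if (next == null) {
                    pending = null;
                } else {
                    pending = CompletableFuture.supplyAsync(() -> fetchPage.apply(next), fetcher);
                }
                for (T item : page.items) {
                    processItem.accept(item);
                }
            }
        } finally {
            fetcher.shutdown();
        }
    }
}
```

Only one request is ever in flight, so memory use stays bounded at roughly two pages. Whether the overlap pays off depends on how processing time compares with the time it takes to retrieve a page.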
If you have questions about our services or APIs, or if you want to see what other developers are doing with Google Calendar, check the discussions and documentation in the Google Apps Calendar API forum.