Lessons Learned from Deploying the Systems Described in "Decisions, Decisions..."

I reviewed ODK, CommCare, Formhub, Ona.io and Enketo in my Decisions, Decisions… post, which covered data collection from a smartphone. I thought it would be a good idea to share some of my experiences and challenges when working with these systems, which rely on mobile data (3G) to transmit information. Note that there are workarounds, but they’re generally more costly than submitting over the mobile data connection, or they do not focus on long-term enterprise deployments. See the Humanitarian Data Toolkit for an awesome example of a 3G-less data collection pilot.

My team and I deployed Formhub for a pilot and CommCareHQ.org in a production environment here in Nepal. Over the past 6 months I have realized that I made a number of untested assumptions about operationalizing these systems over a 3G data connection.

Here are the lessons up front. Keep reading for the story behind each.

  1. Host the form submission server as close to the end user as possible. (Note that you have to consider this when choosing a SaaS provider as well.)
  2. During your device research, identify the lowest resolution possible on the device before purchasing. This can be done by taking a photo in the store and looking through the file system to determine the size.
  3. Disable all applications that the user will not use. Updates to these applications may cause a bottleneck in your ability to update applications or use an incredible amount of data that destroys your ability to submit forms.

The problem isn’t the technology. The problem lies in my assumptions about the data backbone that the technology relies on. I assumed that the 3G network map available on the network provider websites was accurate in all locations. I performed numerous evaluations of network providers based on their 3G network availability, including calling individuals in our remote offices and testing their internet access over a 3G connection. I set up demo forms and tested them in the field. I studied the best phone applications to install for remotely locating missing phones, ensuring security and locking down non-work-related services like YouTube, Facebook and ErosNow. Each test covered an individual component of the mobile ecosystem that I had to develop for deployment, but none of them covered the whole system working together. We deployed in early September and have been riddled with connectivity issues since.

##The Situation in the Field##
Our most remote workers are set up in small offices without electricity, a phone line or water. They go to work every day and perform their duties during the hours when there is sunlight, and they go home for other needs like lunch and recharging batteries. Their village has an internet cafe with only a couple of computers and intermittent internet access. WiFi isn’t available anywhere.

I analyzed the solar irradiance in the areas where we wanted to deploy phones and purchased portable solar chargers to run the smartphones. Our recurring data and voice plan aimed to improve the phone, data and recharging capacity in these areas. The system was complete. We brought everyone in for a multi-day training session and they were off with increased capacity.

##Data Issues##
XForms are the core of the systems that I reviewed in Decisions, Decisions… Each of these platforms distributes an XML questionnaire from the server to the device, where it is rendered according to the XForms standard. The user completes the form and only the data, with attachments, is sent back to the server. The server aggregates the information, allowing you to report on and export the data.

Lesson 1: Host the form submission server as close to the end user as possible. (Note that you have to consider this when choosing a SaaS provider as well.)
Background
Our form submission server was originally hosted on Amazon Web Services, which is known to have a robust Content Delivery Network (CDN). We didn’t realize that the CDN provides no benefit for form submissions, so our forms were travelling from the field in Nepal, down wires to India, under the Pacific Ocean to Seattle and across the US to the AWS East region cloud environment.

Normally, this is fine in any data connection environment. The phone sends small chunks of data to the server, the server validates each chunk, and a verification message is returned for each one. At the end, the server sends a final verification message stating that all data was received, and the phone marks the form as sent. This back and forth isn’t a major burden for the phone or the server and it’s definitely the right way to do it. In my experience, submitting a complete form generally requires a near-constant data connection. Minor interruptions shouldn’t be a problem, but major interruptions cause the submission to fail and trigger a resubmission attempt.
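
To see why distance to the server matters for this kind of exchange, here is a rough back-of-the-envelope model in Python. It is only a sketch of the chunk-and-acknowledge pattern described above, not how ODK or CommCare actually implement submission. The function name, the chunk size, the throughput and the round-trip times are all assumptions I made up for illustration.

```python
def estimate_upload_seconds(payload_bytes, chunk_bytes, rtt_seconds, throughput_bytes_per_s):
    """Rough time to send a payload in chunks, waiting for an ack after each chunk."""
    chunks = -(-payload_bytes // chunk_bytes)  # ceiling division
    transfer_time = payload_bytes / throughput_bytes_per_s
    # One acknowledgement per chunk, plus the final "all data received" message.
    ack_wait_time = (chunks + 1) * rtt_seconds
    return transfer_time + ack_wait_time


# Assumed numbers: a 2 MB form, 16 KB chunks, 50 KB/s of usable 3G throughput.
FORM_BYTES = 2_000_000
CHUNK_BYTES = 16_000
THROUGHPUT = 50_000

nearby = estimate_upload_seconds(FORM_BYTES, CHUNK_BYTES, 0.1, THROUGHPUT)   # assumed 100 ms round trip
distant = estimate_upload_seconds(FORM_BYTES, CHUNK_BYTES, 0.4, THROUGHPUT)  # assumed 400 ms round trip

print(f"nearby server:  {nearby:.1f} s")   # nearby server:  52.6 s
print(f"distant server: {distant:.1f} s")  # distant server: 90.4 s
```

With these made-up numbers, the extra round-trip time nearly doubles how long the phone needs an uninterrupted connection, which is exactly the window in which a flaky 3G link can drop the submission.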

The problem occurs when you try to send larger files, which take more time and require a more efficient network. Approximately half of our devices could submit data to the server without an issue. The other half had intermittent success. We did some tracing from these phones and realized that the issue had to do with the DNS registration among some of the routers that were connecting our phones to the server.

Note that there are many companies involved in getting a packet across a network and each company manages its own infrastructure. For example, we rely on the mobile service provider to get the packet from the phone to the internet service provider. Then they contract with the company that runs the fiber optic lines out of the country, which delivers the packet to India. From there, a number of Indian companies take each packet and deliver it as they see fit across their networks. This goes on and on. We ran some traceroutes and saw that the information travelled out of Nepal appropriately, but got lost when it got to India. Half the time the packet took the right route, and the other half it took the wrong route and got stuck at a server that couldn’t resolve the address. We had a ghost in the system and our organization’s data collection became an international issue.

Additionally, these systems queue the submissions. If a submission gets stuck, subsequent smaller submissions also won’t transmit. You have to work around this by dumping the forms to a computer and submitting them another way. CommCareHQ has this capability over both a wireless and a USB connection. However, the end user needs access to a computer, and training on what to do, in case this happens.
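
The queueing behaviour is easy to picture with a small Python sketch. This is my own simplified illustration of a first-in, first-out submission queue, not code from any of the platforms. The form names, the sizes and the `flaky_link` rule are hypothetical.

```python
from collections import deque


def drain_queue(queue, try_send):
    """Send forms in FIFO order, stopping at the first failure."""
    sent = []
    while queue:
        form = queue[0]
        if not try_send(form):
            break  # the stuck form stays at the front and blocks everything behind it
        sent.append(queue.popleft())
    return sent


# Hypothetical pending forms as (name, size in bytes).
pending = deque([
    ("visit-1", 200_000),
    ("visit-2-with-photos", 3_000_000),
    ("visit-3", 150_000),
])


def flaky_link(form):
    # Assumption for illustration: only payloads under 1 MB make it through.
    _name, size = form
    return size < 1_000_000


sent = drain_queue(pending, flaky_link)
print([name for name, _ in sent])     # ['visit-1']
print([name for name, _ in pending])  # ['visit-2-with-photos', 'visit-3']
```

The small `visit-3` form would have gone through on its own, but it never gets a chance because the large form ahead of it keeps failing.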

Lesson 2: During your device research, identify the lowest resolution possible on the device before purchasing. This can be done by taking a photo in the store and looking through the file system to determine the size.
Background
Our pilot contained single photo submissions, which performed fine. We had to expand the use case to include two photos, which roughly doubled the size of our form. We also piloted the system using the Samsung Young Duos 2, whose minimum camera resolution produced a file of about 100 KB per image. We upgraded the phones to the Samsung Galaxy S Duos 2 when we deployed the production system because we needed a flash to deal with photos taken in low light. This phone had a better camera, but at its minimum camera resolution each image was about 1 MB. We ultimately created a system that was regularly transmitting XForms that were 1.3 MB to 3 MB in total size, depending on the number of photo attachments.
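
If you can copy a few test photos off a candidate phone, a short Python script can do the file size check for you. This is a minimal sketch. The function names, the folder name in the usage note and the 200 KB budget are placeholders I chose, not part of any of the tools mentioned here.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def photo_sizes_kb(folder):
    """Return {file name: size in KB} for image files directly inside folder."""
    sizes = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            sizes[path.name] = path.stat().st_size / 1000
    return sizes


def over_budget(sizes_kb, budget_kb):
    """Return the names of photos larger than the per-photo budget."""
    return [name for name, kb in sizes_kb.items() if kb > budget_kb]
```

For example, `over_budget(photo_sizes_kb("test_photos"), 200)` would list every photo in a hypothetical `test_photos` folder that is bigger than 200 KB. Take the test photos at the lowest resolution the camera app offers, since that is the floor you will be stuck with in the field.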

Lesson 3: Disable all applications that the user will not use. Updates to these applications may cause a bottleneck in your ability to update applications or use an incredible amount of data that destroys your ability to submit forms.
Background
The Google Play Store automatically downloads updates to active applications. The YouTube application is updated frequently and requires huge downloads, sometimes more than 10MB. This is an incredible burden on devices because the Play Store queues application downloads, blocking other important updates until YouTube has been updated. To make matters worse, you can’t delete the YouTube application on Android devices because it’s often embedded in the operating system. The only way to ensure YouTube isn’t updated is to disable it in the phone’s application settings.

Additionally, each application has its own data settings. We installed antivirus software for our production deployment. Two weeks after training, all mobile data plans had been completely used. We tracked it back to the antivirus software, which downloaded 100MB+ virus definition files so it could keep up to date. Keeping definitions current is fine in itself, but we didn’t realize that the default setting was to download virus definitions over 3G connectivity. Every application on the phone has mobile data settings like this. It’s not enough to install and configure the applications. We have to make sure we understand every minute detail of their data behavior so that this doesn’t happen.
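
A quick data budget makes the impact concrete. The Python sketch below is only an illustration. The 500 MB plan size and the weekly update frequency are assumptions of mine, while the 100 MB definition files, the 10 MB YouTube update and the 3 MB form size are the rough figures mentioned in this post.

```python
def forms_remaining(plan_mb, background_mb, form_mb):
    """How many forms of form_mb fit in the plan after background downloads."""
    left_mb = plan_mb - sum(background_mb.values())
    return max(0, int(left_mb // form_mb))


PLAN_MB = 500  # assumed monthly data plan, not our actual plan

background = {
    "antivirus definitions": 4 * 100,  # assumed: one 100 MB download per week
    "YouTube update": 10,
}

print(forms_remaining(PLAN_MB, {}, 3))          # 166
print(forms_remaining(PLAN_MB, background, 3))  # 30
```

Under these assumptions, background downloads that nobody asked for eat more than 80% of the forms the plan could otherwise carry.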

Given these lessons, we have concluded that we should consider other products (e.g. BRCK, SMS/MMS, and IVR) and activities that can perform as well or better when operationalizing mobile deployments. I’ll have to write about our process in another post.


Contact me if you'd like to talk about this post.

 