How to Migrate from Google App Engine Python 2 to Python 3 with Minimal Risk

Learn our strategy for migrating 75,000 lines of code over 5 months with no downtime

Cameron Henneke
Feb 28, 2023

For a large portion of last year I was fully immersed in a software migration I hope is required only once a decade...or less. After successfully emerging from the challenge, and allowing a few months to replenish my joie de vivre, I’m excited to share our journey to Python 3, including how the strategy we used minimized risk and preserved equanimity throughout the process.

Background

GQueues was started in 2008, when Python 2.5 was the only runtime available for Google App Engine, which had launched in beta earlier that year. So the entire backend of GQueues was written in Python 2. For years we blissfully ignored migrating to Python 3 (with all its breaking changes) because App Engine didn’t even offer it as an option until 2018, when Google launched its second-generation runtimes. The next couple of years flew by as we prioritized building highly requested features for our users. Before we knew it, the official Python 2 sunset date (January 1, 2020) had come and gone, and it was time to accept reality and face the daunting migration that lay ahead.

One of the initial selling points of App Engine was the large number of services built into the platform – it provided everything needed to build a modern web app: a web framework, Datastore API, Search API, Memcache API, Task Queue API, Deferred API, Mail API – and GQueues used them all! In the second generation runtimes, the built-in services were removed from App Engine to increase app portability. So before we could migrate to Python 3, we first needed to migrate all of the built-in services to their corresponding unbundled Google Cloud replacements!

Preparation

In 2020 we moved from the Search API to Elasticsearch on Google Kubernetes Engine, which was a 3-month project itself, with a strategy complex enough to warrant a blog post of its own. In 2021 another six weeks were spent migrating from the Memcache API to Google Cloud Memorystore for Redis. If you're looking to migrate legacy services to the latest runtimes, Google’s Wesley Chun has created a number of excellent codelabs and videos that walk you through most of the details.

At this point in our journey Google announced support for legacy bundled services on second generation runtimes through language-idiomatic libraries. THANK GOODNESS! We could now start the actual Python 3 migration without having to move the remaining services!

The conventional approach

App Engine documentation suggests the following migration strategy to move an app to Python 3: 

  • Upgrade code to work with both Python 2 and 3 using libraries such as six, python-future, or python-modernize.
  • Replace bundled services or modify code to use the legacy bundled service libraries.
  • Replace the built-in App Engine web framework with Flask, Django, or something similar.
  • Update configuration files for Python 3.
  • Test, deploy, and you’re done.

While this process may work for apps with small code bases or occasional users, it seemed way too risky for GQueues. Working months on updating code and then flipping a switch at the end felt irresponsible, and downright dangerous. Spending a significant amount of time upfront without knowing whether the migration would work is not a wise use of company resources. And the danger of updating the entire app all at once, potentially breaking everything for all users, was too much stress and anxiety to bear!

Our strategy

After much thought and consideration we settled on the following migration strategy:

  1. Get a bare bones Python 3 App Engine web service up and running alongside our full Python 2 app (in the same Google Cloud Project).
  2. Set up a Google Cloud Load Balancer so we could direct traffic to either Python 2 or Python 3 servers based on the URL.
  3. Migrate a single feature (usually 1-4 code files) over to Python 3 and test locally.
  4. Deploy to Python 3 beta servers and test again.
  5. Deploy to Python 3 production servers and test again.
  6. Update the Load Balancer to direct live traffic to Python 3 servers for the feature’s relevant URLs, and do final testing.
  7. Repeat steps 3-6 until the entire app is running on Python 3!

[Figure: Migration Architecture]
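
To make the URL-based routing idea concrete, here is a minimal Python model of it. This is an illustrative sketch, not the Google Cloud Load Balancer configuration itself: the `Router` class and the backend names are hypothetical, and the two paths are borrowed from the nginx example later in this post.

```python
# Hypothetical model of URL-based routing between two backends.
# A real load balancer does this with URL map rules; this only shows the logic.

PY2_BACKEND = "python2-service"  # assumed name, for illustration only
PY3_BACKEND = "python3-service"  # assumed name, for illustration only


class Router:
    """Sends migrated paths to Python 3 and everything else to Python 2."""

    def __init__(self):
        self.migrated_paths = set()

    def migrate(self, path):
        # Step 6 of the strategy: point live traffic for this URL at Python 3.
        self.migrated_paths.add(path)

    def roll_back(self, path):
        # If a problem surfaces, send the URL back to Python 2.
        self.migrated_paths.discard(path)

    def backend_for(self, path):
        return PY3_BACKEND if path in self.migrated_paths else PY2_BACKEND


router = Router()
router.migrate("/newItem")
```

The default route always points at Python 2, so a feature only moves when its URL is added explicitly, and moving it back is a one-line change.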

Benefits

This approach provided a number of advantages for us:

  1. It significantly reduced risk in all aspects of the migration project. We learned what was necessary to get a Python 3 version of the app running right away. By migrating only a few files at a time and deploying them to production as we went, we limited the scope of any problems, and could quickly identify issues and resolve them. There was never a risk of downtime for the entire app.
  2. We eliminated the anxiety associated with a huge “switch over” event where we would cross our fingers and hope that everything we had done over the previous several months worked.
  3. Our migrated codebase could be written in pure Python 3 instead of littered with calls to compatibility libraries.
  4. We were able to learn and test on less integral components first (like our internal customer support tools) and then move to more critical features as we gained more migration experience and confidence.
  5. We could test each migrated component in the actual Python 3 production environment before moving live traffic over. And if problems did surface for live traffic, we could quickly switch the feature back to Python 2 while we worked on a fix.
  6. We could easily see progress, giving us a feeling of momentum as we knocked off features. (This turned out to be very important on such a long and tedious project.)
  7. By migrating only a few files at a time, other development work could happen simultaneously without the burden of a huge merge or having to implement changes in both Python 2 and Python 3.

The details

With a solid strategy in place, we could now start figuring out the implementation details to actually carry out the migration.

Local development

Since our approach centered around running Python 2 and 3 versions of the app in production simultaneously, we first had to figure out how to set this up for local development. We configured dev_appserver.py to support both runtimes and start services for our Python 2 app.yaml and Python 3 app.yaml files.

dev_appserver.py --port=8085 --admin_port=8889 \
  --enable_console=yes --max_module_instances=1 \
  --runtime_python_path="python27=/usr/bin/python2.7,python3=/usr/bin/python3" \
  --enable_sendmail=yes --show_mail_body=True \
  --require_indexes=yes --application=gqueues \
  --storage_path=/Users/hennekec/projects/gqueues/appengine_storage \
  --clear_search_indexes=True --log_level=debug --support_datastore_emulator=false \
  app.yaml ../app/default/app.yaml

# app.yaml was in our Python 2 codebase
# ../app/default/app.yaml was in our Python 3 codebase

In our local environment we set up nginx as a reverse proxy to fill the same role as the Google Cloud Load Balancer. Then, instead of accessing dev_appserver.py services directly like before, requests went through nginx, which would forward to the Python 2 service or Python 3 service based on the URL. As we migrated features we updated nginx.conf to route traffic to the appropriate instance.

[Figure: Development Architecture]

# nginx.conf
...
http {
    ...  
    server {
        listen       9000;
        server_name  localhost;
        client_max_body_size 50m;
        
        # Proxy for Python 2 dev_appserver
        location / {
            proxy_pass http://localhost:8085;
        }
        # Proxy for Python 3 dev_appserver
        # URLs that have been migrated
        location = /newItem {
            proxy_pass http://localhost:8086;
        }
        location = /smartQueues {
            proxy_pass http://localhost:8086;
        }
        ...
    }
    ...
}

At the time we discovered one annoyance running dev_appserver.py for Python 3: a new virtualenv is created and all dependencies are installed from scratch every time the server is started. This mimics what happens in production, but on a local machine it means initializing the server takes over a minute – painful! Fortunately, we found a patch that forces dev_appserver.py to use an existing virtualenv, which brought initialization down to 5 seconds. And then we only had to update the virtualenv when dependencies were changed in requirements.txt.

Since then, Google has added a "--python_virtualenv_path" flag to dev_appserver.py so you can use a persistent virtualenv now with no patch required!

Migration reference

Of course, before we could start porting code we had to learn what’s actually different in Python 3. There are tons of resources on the web covering the changes, like the cheat sheet from Python-Future, and notes from Guido himself. As we migrated the first files (our internal support tools), we began building our own reference sheet on what we needed to look for in our Python 2 code, and how it should be changed for Python 3. This included Python 3 language changes, as well as those required for switching to the Django web framework. 
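
As a taste of what such a reference covers, the snippet below shows a few well-known Python 3 behaviors that commonly trip up Python 2 code. It is a generic illustration, not an excerpt from our reference sheet, and the variable names are made up for the example.

```python
# A few Python 3 behaviors that differ from Python 2.

# 1. "/" is true division; "//" gives the old integer behavior.
half = 7 / 2
floor_half = 7 // 2

# 2. dict.keys(), .values() and .items() return views, not lists,
#    so wrap them in list() before indexing or mutating the dict in a loop.
counts = {"inbox": 3, "done": 5}
keys = list(counts.keys())

# 3. Text (str) and binary data (bytes) are distinct types;
#    convert between them explicitly with encode() and decode().
raw = "café".encode("utf-8")
text = raw.decode("utf-8")

# 4. map() and filter() return lazy iterators instead of lists.
doubled = list(map(lambda n: n * 2, [1, 2, 3]))

# 5. Names such as iteritems(), xrange(), unicode() and basestring are gone;
#    use items(), range() and str instead.
total = sum(value for _, value in counts.items())
```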

[Figure: Reference Sheet]

After migrating the first handful of files we had most of the necessary changes documented in our reference. As with any migration project, accuracy and thoroughness are critical. So just like pilots who always follow their pre-flight checklist, we used our reference sheet as a checklist, reviewing each Python 2 file for every item in the sheet, to make sure nothing was missed.
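
Part of a checklist like this can be backed by a simple script. The sketch below is an assumption about how one might do that, not a tool we used: the `CHECKLIST` entries and the `review` function are hypothetical, and regular expressions will only catch the most mechanical items, so a manual read of each file is still needed.

```python
import re

# Hypothetical checklist: (label, pattern) pairs for common Python 2 idioms.
CHECKLIST = [
    ("print statement", re.compile(r"^\s*print\s+[^(\s]")),
    ("iteritems", re.compile(r"\.iter(items|keys|values)\(")),
    ("xrange", re.compile(r"\bxrange\(")),
    ("unicode/basestring", re.compile(r"\b(unicode\(|basestring\b)")),
    ("except comma", re.compile(r"^\s*except\s+[\w.]+\s*,\s*\w+\s*:")),
]


def review(source):
    """Return (line number, label) for every checklist item found in source."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in CHECKLIST:
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


sample = "for k, v in d.iteritems():\n    print k\nexcept ValueError, e:\n"
findings = review(sample)
```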

Limited refactoring

Reading through all the backend code, some of which was written over ten years ago, it was tempting to refactor during the migration process. This would of course lengthen an already very long project, so we decided to refrain from refactoring unless absolutely necessary. 

We did reorganize files when it made sense, and split code when required so there was only one class per file (which was definitely not always the case in earlier code). Many classes also had staticmethods, which we had to split out into their own utility modules to eliminate dependencies that would have made migrating individual features nearly impossible.
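
The static method split looked roughly like the following. The class and function names here are hypothetical stand-ins, not code from GQueues, and both versions are shown in one block only for comparison.

```python
# Before: the helper lives on the class, so any caller must import the class
# (and everything the class depends on) just to clean up a title.
class Task:
    def __init__(self, title):
        self.title = Task.normalize_title(title)

    @staticmethod
    def normalize_title(title):
        return " ".join(title.split())


# After: the helper is a plain function that would live in its own utility
# module (for example a hypothetical task_utils.py), so a migrated feature
# can import it without pulling in the not-yet-migrated class.
def normalize_title(title):
    """Collapse runs of whitespace in a title into single spaces."""
    return " ".join(title.split())
```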

Web framework

We chose Django as our new web framework because it's fairly similar to the webapp2 framework built for App Engine’s Python 2 runtime. So we didn't have to update any of our HTML templates, and most of the required updates were minor syntax changes. And we were able to simplify some other parts of the migration by writing custom middleware to wrap the requests.

For instance, webapp2 conveniently allowed cookies to be set on the request object, which we used throughout the Python 2 codebase. Django only supports setting cookies on the response object, but creating the following middleware provided the missing functionality, so no other cookie-related changes were needed.

import datetime
import time

from types import MethodType

from django.utils import timezone
from django.http.cookie import SimpleCookie
from django.http import HttpRequest
from django.utils.http import http_date

class RequestCookieMiddleware(object):
    """Allows setting and deleting of cookies from requests.
    
    >>> request.set_cookie('name', 'value')
    
    The set_cookie and delete_cookie are exactly the same as the ones built
    into the Django HttpResponse class. 
    https://github.com/django/django/blob/master/django/http/response.py
    """
    
    def __init__(self, get_response=None):
        self.get_response = get_response

    def __call__(self, request):
        request._resp_cookies = SimpleCookie()
        request.set_cookie = MethodType(_set_cookie, request)
        request.delete_cookie = MethodType(_delete_cookie, request)

        response = self.get_response(request)

        if hasattr(request, '_resp_cookies') and request._resp_cookies:
            response.cookies.update(request._resp_cookies)

        return response
        
...
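
The key trick in this middleware is `types.MethodType`, which binds a plain function to one specific object so it can be called like a method. The standalone sketch below shows only that mechanism: `FakeRequest` is a hypothetical stand-in for Django's request object, and this `_set_cookie` is a deliberately simplified placeholder, not the helper elided above.

```python
from http.cookies import SimpleCookie
from types import MethodType


class FakeRequest:
    """Hypothetical stand-in for django.http.HttpRequest."""


def _set_cookie(self, key, value=""):
    # Simplified placeholder: only records the cookie name and value.
    self._resp_cookies[key] = value


request = FakeRequest()
request._resp_cookies = SimpleCookie()

# Bind the function to this one request instance, as the middleware does.
request.set_cookie = MethodType(_set_cookie, request)

# Existing handler code can now keep calling set_cookie on the request.
request.set_cookie("name", "value")
```

Because the binding happens per request, nothing is patched onto the request class itself, and the collected cookies can be copied onto the response once the view returns.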

Another custom middleware allowed us to easily route logs to Google Cloud Logging in production and to the console when running locally.

import logging
import os
import google.cloud.logging

class CloudLoggingMiddleware(object):
    """Inits Google Cloud logging if running in production.

    Uses standard Python logging when running locally.
    """

    def __init__(self, get_response=None):
        self.get_response = get_response

        # only log to Google Cloud Logging when on servers
        if os.getenv('GAE_ENV', '').startswith('standard'):
            logging_client = google.cloud.logging.Client()
            logging_client.setup_logging(log_level=logging.INFO)
        else:
            logging.basicConfig(level=logging.DEBUG)

    def __call__(self, request):
        return self.get_response(request)

# Django settings.py
...
MIDDLEWARE = [
    'gqueues.middleware.cloudlogging.CloudLoggingMiddleware',
    'gqueues.middleware.cookies.RequestCookieMiddleware',
    'gqueues.middleware.customexceptions.CustomExceptionsMiddleware', # must be last in the list
]
...

Memorystore

As noted earlier, we had already moved from App Engine’s memcache to Memorystore for Redis, but we still needed to make a few Python 3 related changes to the SerializedRedis client wrapper we had written. Most importantly, we needed to ensure that values written to Redis by Python 2 code could be retrieved by Python 3 code, and vice versa, since both versions would be running and interacting with each other.

Since Python 3 handles strings and unicode differently than Python 2, we needed a way to identify the type of the value in Redis before decoding it. It turns out Google already solved this problem in their update to the legacy memcache bundled service, so we mimicked this approach of setting integer bit flags for all values going into Redis and decoding based on these flags.
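
Here is a minimal sketch of the flag idea, written for Python 3 only. The flag values and the `encode`/`decode` function names are assumptions made for illustration; they are not the constants used by Google's library or by our SerializedRedis wrapper, and how the flag is stored next to the payload in Redis is left out.

```python
# Hypothetical flag values recording the original type of each stored value.
FLAG_BYTES = 0
FLAG_TEXT = 1
FLAG_INT = 2


def encode(value):
    """Return (payload bytes, flag) for a value about to be written."""
    if isinstance(value, bytes):
        return value, FLAG_BYTES
    if isinstance(value, str):
        return value.encode("utf-8"), FLAG_TEXT
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool):
        return str(value).encode("ascii"), FLAG_INT
    raise TypeError("unsupported type: %s" % type(value).__name__)


def decode(payload, flag):
    """Rebuild the original value from its payload bytes and flag."""
    if flag == FLAG_BYTES:
        return payload
    if flag == FLAG_TEXT:
        return payload.decode("utf-8")
    if flag == FLAG_INT:
        return int(payload)
    raise ValueError("unknown flag: %r" % flag)


payload, flag = encode("café")
restored = decode(payload, flag)
```

Since the reader decides how to decode from the flag rather than by guessing from the bytes, it does not matter which Python version wrote the value.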

Likewise, we also had to ensure that all pickled objects stored in Redis used protocol version 2, since that’s the highest version that’s compatible with both Python 2 and 3.
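
In Python 3 this means passing the protocol explicitly, because the default protocol is higher than 2 and Python 2 cannot read it. The helper name `dumps_compat` below is hypothetical; `pickle.dumps` and its `protocol` argument are standard library.

```python
import pickle

# Highest pickle protocol that both Python 2 and Python 3 can read.
PICKLE_PROTOCOL = 2


def dumps_compat(obj):
    """Pickle obj so that either Python version can load it."""
    return pickle.dumps(obj, protocol=PICKLE_PROTOCOL)


data = dumps_compat({"queue": "Inbox", "count": 3})
restored = pickle.loads(data)
```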

Production releases

Since we were migrating only one or two features at a time, Python 3 code was released to production every few days. We alerted our Customer Care team as changes went out so they could be on the lookout for any support issues that might be related to a recent update. We also tracked the progress of files migrated in GQueues itself and celebrated milestones in team meetings. 🥳

[Figure: Migration Progress]

Results

This approach worked amazingly well, and as the sole engineer on the project, I was able to migrate 266 files containing 75,000 lines of code over 5 months with 51 code releases to production. Porting code is not a difficult task per se, but this was definitely a grueling project with the sheer volume of changes and testing required. Whenever my mental energy waned, I took a break. If I pushed on, the chances of something being overlooked greatly increased with such a tedious undertaking. As these types of projects go, it was successful precisely because our users didn’t even know that it happened. It’s not quite the same joy as launching a highly requested feature to cheers from users, but the relief I felt with GQueues off an unsupported version of Python was enormous!

Migration Summary
1 software engineer
266 files
75,153 lines of code
5 months
51 code releases
0 service disruptions

Without a big “switch-over” event, finishing the migration was surprisingly anti-climactic. And that was just fine with me. On the last day, there was very little difference between 98% of the app being served by Python 3 servers, and 100% after the final release. The strategy worked exactly as I hoped.

It turns out the most thrilling moment of the migration was release 26, when more of GQueues was running on Python 3 than Python 2. That’s when everything began to feel real. A close second was the pure joy I felt when I finally got to shut down the Python 2 servers and delete all the old code from the repository.  

If you still have a Python 3 migration on the horizon, I hope this helps in figuring out an approach that will work best for your app. You have my empathy, and heartfelt encouragement: You can do this! For everyone else, we can wake up every day grateful that a Python 3 migration isn’t looming over our heads. 😅

About the author
Cameron Henneke
Founder

I love building products! And Python. And dark chocolate. When I'm not leading the team at GQueues, I can be found running ultras on the trails of the Rocky Mountains.
