Amund Tveit

This blog post shows a basic example of a Serverless Thrift API with Python for AWS Lambda and AWS API Gateway.

1. Serverless Computing for Thrift APIs?

Serverless computing - also called Cloud Functions - is an interesting type of cloud service due to its simplicity. An interpretation of serverless computing is that you (with relatively low effort):

  1. Deploy only the function needing to do the work
  2. Only pay per request to the function
    1. With the notable exception of other cloud resources used, e.g. storage
  3. Get some security setup automation/support (e.g. SSL and API keys)
  4. Get support for request throttling (e.g. QPS) and quotas (e.g. per month)
  5. Get (reasonably) low latency - when the serverless function is kept warm
  6. Get support for easily setting up caching
  7. Get support for setting up a custom domain name
  8. Lower direct (cloud costs) and indirect (management) costs?


These are characteristics that in my mind make Serverless computing an interesting infrastructure to develop and deploy Thrift APIs (or other types of APIs) for.
Perhaps over time even Serverless will be preferred over (more complex) container (Kubernetes/Docker) or virtual machine based (IaaS) or PaaS solutions for APIs?

2. Example Cloud Vendors providing Serverless Services

  1. AWS Lambda in combination with AWS API Gateway
  2. Google Cloud Functions
  3. IBM Bluemix OpenWhisk
  4. Microsoft Azure Functions

Since Python is a key language in my team, I chose AWS Serverless for this initial test.

3. Thrift (over HTTPS) on AWS Lambda and API Gateway with Python

This section shows the (classic) Apache Thrift tutorial Calculator API running on AWS Lambda and API Gateway. The service requires two Thrift files:

  1. tutorial.thrift
  2. shared.thrift

3.1 Development Environment and Tools

The tool used for deployment in this blog post is Zappa. I recommend using Zappa together with Docker for Python 3.6 as described in this blog post, with a slight change to the Dockerfile if you want to build and compile the Apache Thrift Python library yourself; the altered Dockerfile is shown below. There have been no official releases of Apache Thrift since 0.10.0 on January 6th, 2017, and there have been important improvements to its Python support since then, in particular the fix for supporting recursive Thrift structs in Python.

a. Dockerfile - creates a zappashell (same as the Lambda runtime) and builds Thrift

# build this with command:
#   docker build -t myzappa .
FROM lambci/lambda:build-python3.6
WORKDIR /var/task
# Fancy prompt to remind you are in zappashell
RUN echo 'export PS1="\[\e[36m\]zappashell>\[\e[m\] "' >> /root/.bashrc
# Build Apache thrift Python library
RUN yum clean all && \
    yum -y install emacs boost* gcc
RUN git clone https://github.com/apache/thrift.git && \
     cd thrift && \
     ./bootstrap.sh && \
     ./configure && \
     make && make install && \
     cd lib/py && python setup.py install && \
     python setup.py sdist # Builds a thriftSomeVersion.tar.gz
CMD ["bash"]

After building this Dockerfile (see the command at the top of the file), add zappashell to your .bash_profile like this (source: the above-mentioned blog post):

alias zappashell='docker run -ti -e AWS_PROFILE=zappa -v $(pwd):/var/task -v ~/.aws/:/root/.aws  --rm myzappa'
alias zappashell >> ~/.bash_profile

You can then start your serverless deployment environment with the command zappashell, run inside a new empty directory on your host platform (e.g. a Mac). The session looks something like this:

username@MyMac$ mkdir my_thrift_app
username@MyMac$ cd my_thrift_app
username@MyMac$ zappashell
[zappashell> pwd
/var/task
[zappashell> ls 
[zappashell>

Install virtualenv and create and activate an environment (assuming you installed Thrift as shown in the Dockerfile above):

[zappashell> pip install virtualenv
[zappashell> virtualenv serverlessdeployenv
[zappashell> source serverlessdeployenv/bin/activate
(serverlessdeployenv)[zappashell>

Use thrift to generate Python code for tutorial.thrift and shared.thrift (copy both files into the working directory first):

(serverlessdeployenv)[zappashell> thrift --gen py tutorial.thrift
(serverlessdeployenv)[zappashell> thrift --gen py shared.thrift
(serverlessdeployenv)[zappashell> ls gen-py
__init__.py  shared  tutorial

Convert the gen-py package into a Python library (for convenient packaging) with a setup.py file as below (change the version as needed):

setup.py

from setuptools import setup, find_packages
# https://stackoverflow.com/questions/10924885/is-it-possible-to-include-subdirectories-using-dist-utils-setup-py-as-part-of
setup(
    name='genpy',
    version='1.1',
    packages=find_packages(),
    license='Creative Commons Attribution-Noncommercial-Share Alike license',
    long_description=open('README.txt').read(),
)

(serverlessdeployenv)[zappashell> mv gen-py genpy
(serverlessdeployenv)[zappashell> cd genpy
(serverlessdeployenv)[zappashell> touch README.txt # or create one
(serverlessdeployenv)[zappashell> # create setup.py as shown above
(serverlessdeployenv)[zappashell> python setup.py sdist
(serverlessdeployenv)[zappashell> ls dist
genpy-1.1.tar.gz
(serverlessdeployenv)[zappashell> cp dist/genpy-1.1.tar.gz ..

Copy the built Thrift library (Thrift itself, not the tutorial code; this is the thriftSomeVersion.tar.gz generated by python setup.py sdist in the Dockerfile) to the same directory and add it to requirements.txt.

requirements.txt should look something like this:

zappa
flask
thriftSomeVersion.tar.gz
genpy-1.1.tar.gz

Run pip install -r requirements.txt

Create app.py containing the code for the Calculator Thrift service:

from flask import Flask
from flask import make_response, request
from tutorial import Calculator
from thrift.protocol import TBinaryProtocol
from thrift.transport import TTransport

app = Flask(__name__)


class CalculatorHandler(object):
    """Implements the subset of the tutorial Calculator service used here."""

    def __init__(self):
        self.log = {}

    def ping(self):
        print("ping()")

    def add(self, n1, n2):
        print("add({}, {})".format(n1, n2))
        return n1 + n2


@app.route('/thr', methods=['POST'])
def thr():
    # get the thrift package from HTTP POST body
    body = request.get_data()
    content_length = int(request.headers["Content-Length"])
    # thrift setup: read the request from memory, write the reply to memory
    itrans = TTransport.TMemoryBuffer(body)
    itrans = TTransport.TBufferedTransport(itrans, content_length)
    otrans = TTransport.TMemoryBuffer()
    handler = CalculatorHandler()
    processor = Calculator.Processor(handler)
    inputProtocolFactory = TBinaryProtocol.TBinaryProtocolFactory()
    outputProtocolFactory = inputProtocolFactory
    iprot = inputProtocolFactory.getProtocol(itrans)
    oprot = outputProtocolFactory.getProtocol(otrans)
    # dispatch the call to CalculatorHandler and serialize the result
    processor.process(iprot, oprot)
    response = make_response(otrans.getvalue())
    response.headers['Content-Type'] = 'application/x-thrift'
    return response, 200


if __name__ == '__main__':
    app.run()
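
To make it clearer what the handler above receives in the POST body, here is a minimal standard-library sketch of how a call such as add(11, 77) might be laid out in Thrift's strict binary protocol. It assumes the argument field ids 1 and 2 from tutorial.thrift and a sequence id of 0. The helper name encode_add_call is hypothetical, and real clients should use the generated Calculator.Client rather than hand-built bytes.

```python
import struct

VERSION_1 = 0x80010000  # strict binary protocol version marker
CALL = 1                # Thrift message type for a request
T_I32 = 8               # Thrift type id for a 32-bit integer
T_STOP = 0              # marks the end of a struct


def encode_add_call(num1, num2, seqid=0):
    """Hand-encode add(num1, num2) as a strict binary message (illustration only)."""
    name = b"add"
    # Message header: version|type, method name length, method name, sequence id.
    header = struct.pack("!I", VERSION_1 | CALL)
    header += struct.pack("!i", len(name)) + name
    header += struct.pack("!i", seqid)
    # Argument struct: each field is (type: 1 byte, field id: i16, value: i32).
    args = struct.pack("!bhi", T_I32, 1, num1)
    args += struct.pack("!bhi", T_I32, 2, num2)
    args += struct.pack("!b", T_STOP)
    return header + args


payload = encode_add_call(11, 77)
print(len(payload), payload.hex())
```

The processor in app.py reads exactly this kind of header to decide which handler method to call, then writes a reply message of the same general shape into the output buffer.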

Create an .aws directory with the following files:

credentials

[default]
aws_access_key_id=SOMETHINGSECRET
aws_secret_access_key=SOMETHINGELSESECRET

config

[default]
# or the region of your choice
region=eu-west-1
output=text
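
Because a misspelled key name in the credentials file only surfaces later as an authentication failure, a quick check can help. The following is a minimal sketch that parses credentials-style text with the standard library; the function name missing_credential_keys is hypothetical, and the sample values are placeholders.

```python
import configparser

REQUIRED_KEYS = ("aws_access_key_id", "aws_secret_access_key")


def missing_credential_keys(text, profile="default"):
    """Return the required key names that are absent from the given profile."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    if not parser.has_section(profile):
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if not parser.has_option(profile, key)]


sample = """
[default]
aws_access_key_id=SOMETHINGSECRET
aws_secret_access_key=SOMETHINGELSESECRET
"""
print(missing_credential_keys(sample))
```

An empty list means both keys are present; any name in the list points to the line to fix.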

Run zappa init and answer the questions; it should look something like the image below:

You should now be able to deploy the API with:

zappa deploy beta
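
For reference, zappa init writes its answers to a zappa_settings.json file next to app.py. The sketch below builds a settings dictionary of roughly that shape in Python and prints it as JSON. The bucket name is a hypothetical placeholder, the project name task is assumed from the /var/task working directory, the profile name is assumed to match the [default] profile above, and the exact keys may differ between Zappa versions.

```python
import json

# Assumed shape of the settings for the "beta" stage; values are illustrative.
settings = {
    "beta": {
        "app_function": "app.app",            # module.attribute of the Flask app
        "aws_region": "eu-west-1",
        "profile_name": "default",
        "project_name": "task",
        "runtime": "python3.6",
        "s3_bucket": "zappa-example-bucket",  # hypothetical bucket name
    }
}

zappa_settings_json = json.dumps(settings, indent=4)
print(zappa_settings_json)
```

The top-level key is the stage name, which is why the deploy command takes beta as its argument.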

You can test the deployed API with the following client. Remember to change the https address to the address that the deploy gave you.

from tutorial import Calculator
from thrift.protocol import TBinaryProtocol
from thrift.transport import THttpClient
from thrift.transport import TTransport


def main():
    transport = THttpClient.THttpClient('https://SOMETHING.amazonaws.com/beta/thr')
    #transport = THttpClient.THttpClient('http://127.0.0.1:5000/thr')

    # NEW - set API key
    transport.setCustomHeaders({"x-api-key":"SomeGeneratedKey"})

    # Buffering is critical. Raw sockets are very slow
    transport = TTransport.TBufferedTransport(transport)

    # Wrap in a protocol
    protocol = TBinaryProtocol.TBinaryProtocol(transport)
    client = Calculator.Client(protocol)
    transport.open()

    client.ping()
    result = client.add(11, 77)
    print("11+77 = ", result)

    transport.close()


if __name__ == "__main__":
    main()

But wait, something is missing: this API is reachable by anyone. Let us add an API key (and update the client with the x-api-key). This can be done through the AWS Console (and perhaps with Zappa itself through automation soon?) with the following steps:

Go to the Amazon API Gateway Console and click on the generated API (perhaps named task-beta due to the Docker file path and the selected stage during zappa init).

Create a Usage Plan and associate it with the API (e.g. task-beta), then create an API Key (on the left side menu) and attach the API Key to the Usage Plan.

Run zappa update beta and update the transport.setCustomHeaders call with your x-api-key in the Python client above to get authentication and throttling in place.
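
Under the hood the Thrift HTTP client sends an ordinary HTTPS POST, so the API key is just another request header. A minimal standard-library sketch of such a request, built but not sent, is shown here. The URL and key are the same placeholders as in the client above, the helper name build_thrift_request is hypothetical, and the body is an arbitrary stand-in for a serialized Thrift message.

```python
import urllib.request


def build_thrift_request(url, body, api_key):
    """Build (but do not send) a POST carrying a serialized Thrift message."""
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/x-thrift",
            "x-api-key": api_key,
        },
        method="POST",
    )


request = build_thrift_request(
    "https://SOMETHING.amazonaws.com/beta/thr",
    b"\x00",  # stand-in bytes, not a valid Thrift message
    "SomeGeneratedKey",
)
print(request.get_method(), sorted(request.header_items()))
```

Passing the request to urllib.request.urlopen would send it; in practice the generated Thrift client is the more convenient route, since it also decodes the reply.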

4. Conclusion

This post has shown an example of getting a Thrift API running on Serverless in a way that can relatively easily be automated; once the API has been created, it takes very little effort to update it (e.g. through continuous deployment).

A final note on round-trip time: based on a few rough tests, the round-trip time for calls to the API is around 300-400 milliseconds (with the test client based in Trondheim, Norway and accessing API Gateway and AWS Lambda in Germany), which is quite good. I believe that with an AWS Route53 routing policy one could have automatic selection of the closest AWS API Gateway/Lambda to get the lowest latency (note that one of the selections in zappa init was to deploy globally, but the default was one availability zone).
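
A rough measurement like the one above can be reproduced by timing repeated calls from the client. The following is a minimal sketch; the helper name measure_roundtrip_ms is hypothetical, and the stand-in workload would be replaced by something like lambda: client.add(11, 77) against the deployed API.

```python
import statistics
import time


def measure_roundtrip_ms(call, repeats=10):
    """Time `call` several times and return (median, maximum) in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in workload; swap in a real API call to measure the deployed service.
median_ms, max_ms = measure_roundtrip_ms(lambda: sum(range(1000)))
print("median: {:.3f} ms, max: {:.3f} ms".format(median_ms, max_ms))
```

The first call after a period of inactivity may be noticeably slower while the function warms up, so the median is usually more informative than a single sample.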

Personally, I believe that Serverless computing has a strong future ahead with respect to API development, and I look forward to the new features that cloud vendors' software engineers and product managers will add. My wish list is:

  1. Strong Python support
  2. Built-in Thrift support and service discovery, as well as other RPC systems, e.g.
  3. Improved software tooling for automation

Best regards,

Amund Tveit

VP Data, Zedge