Connecting Snowflake to your Python Lambda made easy

What are we doing today ?

Our objective is to write a JSON data into our snowflake table using a API .
We are going to achieve this by creating a API endpoint in AWS API GATEWAY , which will receive the data and pass it on to the Lambda which in turn will connect to Snowflake and insert the data into that table.

What is snowflake ?

Simply put snowflake is a database in the cloud with amazing API’s that allow you to seamless integrate with it , it offers data storage and analysis, data warehousing and several other purposes.

What is AWS Lambda ?

As you here I will assume you are familiar with the cloud world but in laymen terms , Lambda is compute mechanism , which an be orchestrated to run your code at invocation, the best however is you never have to worry about the server to run it in .

Connecting AWS Lambda with Snowflake using Python

AWS Lambda allows you to run code without provisioning or managing servers, and Snowflake is a powerful data warehouse. Together, they can create a robust solution for data processing and storage. In this blog, we’ll walk through creating a Python Lambda function to connect to Snowflake and insert data.

Prerequisites

  1. AWS Account
  2. Snowflake Account
  3. Python 3.9 installed locally
  4. AWS CLI configured locally
  5. Necessary permissions to create and deploy Lambda functions

Step 1: Setting Up Snowflake

First, ensure you have a Snowflake account and a database set up. You’ll need:

  • Snowflake account URL
  • Username and password
  • Database, schema, and table

For this example, we’ll use a table named example_table with the following structure:

CREATE TABLE example_table (
    id INTEGER AUTOINCREMENT,
    data VARIANT
);

Step 2: Writing the Lambda Function

Create a Python script named lambda_function.py. This script will contain the code to connect to Snowflake and insert data.

we are going to trigger the lambda via a Api Gateway hence at the start of the lambda we are tapping on to a event which contains the data that we are going to write.

import json
import snowflake.connector
import os

def lambda_handler(event, context):
    # Decode base64-encoded data from API Gateway
    body = event.get("body")
    data = json.loads(body)

    # Snowflake connection parameters
    conn = snowflake.connector.connect(
        user=os.getenv("SNOWFLAKE_USER"),
        password=os.getenv("SNOWFLAKE_PASSWORD"),
        account=os.getenv("SNOWFLAKE_ACCOUNT"),
        warehouse=os.getenv("SNOWFLAKE_WAREHOUSE"),
        database=os.getenv("SNOWFLAKE_DATABASE"),
        schema=os.getenv("SNOWFLAKE_SCHEMA")
    )

    cursor = conn.cursor()
    
    try:
        # Insert data into Snowflake
        sql = "INSERT INTO example_table (data) VALUES (%s)"
        cursor.execute(sql, [json.dumps(data)])

        # Commit the transaction
        conn.commit()
        
        return {
            'statusCode': 200,
            'body': json.dumps('Data inserted successfully')
        }
    except Exception as e:
        return {
            'statusCode': 500,
            'body': json.dumps(f'Error inserting data: {str(e)}')
        }
    finally:
        cursor.close()
        conn.close()

Step 3: Packaging the Lambda Function

Lambda requires all dependencies to be packaged in a ZIP file. We’ll use a virtual environment to manage dependencies and package the function.

  1. Create a virtual environment python3 -m venv venv
  2. Activate the virtual environment:
    • On macOS/Linux: source venv/bin/activate
    • On Windows: .\venv\Scripts\activate
  3. Install the Snowflake connector:
    pip install snowflake-connector-python
  4. Deactivate the virtual environment deactivate
  5. Package the Lambda function
    cd venv/lib/python3.9/site-packages zip -r9 ../../../../lambda_function.zip . cd ../../../../ zip -g lambda_function.zip lambda_function.py

Step 4: Creating the Lambda Function

  1. Create an IAM role for the Lambda function with the necessary permissions.
  2. Create the Lambda function using the AWS CLI
    aws lambda create-function \ –function-name InsertDataToSnowflake \ –zip-file fileb://lambda_function.zip \ –handler lambda_function.lambda_handler \ –runtime python3.9 \ –role arn:aws:iam::YOUR_ACCOUNT_ID:role/YOUR_LAMBDA_ROLE
  3. Set the environment variables for the Lambda function
    aws lambda update-function-configuration \ –function-name InsertDataToSnowflake \ –environment “Variables={SNOWFLAKE_USER=your_user,SNOWFLAKE_PASSWORD=your_password,SNOWFLAKE_ACCOUNT=your_account,SNOWFLAKE_WAREHOUSE=your_warehouse,SNOWFLAKE_DATABASE=your_database,SNOWFLAKE_SCHEMA=your_schema}”

Step 5: Testing the Lambda Function

  1. Create an API Gateway to trigger the Lambda function.
  2. Configure the API Gateway to accept JSON input and pass it to the Lambda function.
  3. Test the setup by sending a POST request to the API Gateway endpoint with JSON data:
    { "key": "value" }

The Lambda function should insert the JSON data into the Snowflake example_table.

Conclusion

We have created an AWS Lambda function that connects to Snowflake and inserts data. This setup is scalable and serverless, making it an efficient way to handle data processing and storage.

Leave a Reply

Your email address will not be published. Required fields are marked *