Introduction

This example demonstrates advanced AI data processing with Upstash Workflow. The workflow downloads a large dataset, processes it in chunks using OpenAI’s GPT-4 model, aggregates the results, and generates a report.

Use Case

Our workflow will:

  1. Receive a request to process a dataset
  2. Download the dataset from a remote source
  3. Process the data in chunks using OpenAI
  4. Aggregate results
  5. Generate and send a final report

Code Example

Code Breakdown

1. Preparing our data

We start by retrieving the dataset URL and then downloading the dataset:

Note that we use context.call for the download, a way to make HTTP requests that run for much longer than your serverless execution limit would normally allow.
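As a rough illustration of this step, the sketch below looks up the dataset URL in one workflow step and downloads it in a second step. It is not the SDK itself: `StubContext`, `CallResult`, `get_dataset_url`, and the example.com URL are hypothetical stand-ins so the snippet runs on its own, and the real `context.run` and `context.call` signatures may differ.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Dict, List


@dataclass
class CallResult:
    """Hypothetical response shape: an HTTP status and a parsed body."""
    status: int
    body: Any


class StubContext:
    """Hypothetical stand-in for the workflow context (not the real SDK class)."""

    def __init__(self, fake_responses: Dict[str, Any]) -> None:
        self.fake_responses = fake_responses
        self.steps: List[str] = []  # records the order in which steps ran

    async def run(self, step_name: str, fn: Callable[[], Awaitable[Any]]) -> Any:
        # Runs a piece of business logic as a named step.
        self.steps.append(step_name)
        return await fn()

    async def call(self, step_name: str, *, url: str, method: str = "GET") -> CallResult:
        # Stands in for the long-running HTTP request; returns canned data.
        self.steps.append(step_name)
        return CallResult(status=200, body=self.fake_responses[url])


async def get_dataset_url(dataset_id: str) -> str:
    # Hypothetical lookup; a real workflow might query a database here.
    return f"https://example.com/datasets/{dataset_id}.json"


async def prepare_data(context: StubContext, dataset_id: str) -> Any:
    async def _get_url() -> str:
        return await get_dataset_url(dataset_id)

    # Step 1: resolve where the dataset lives.
    dataset_url = await context.run("get-dataset-url", _get_url)
    # Step 2: download it through context.call rather than a direct request.
    response = await context.call("download-dataset", url=dataset_url, method="GET")
    return response.body
```

Splitting the lookup and the download into two named steps means each result is produced by its own step, which is the pattern the rest of the workflow builds on.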

2. Processing our data

We split the dataset into chunks and process each one using OpenAI’s GPT-4 model:
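The chunking logic can be sketched as follows. This is a minimal illustration, not the original code: `split_into_chunks`, `process_all`, and `fake_model` are hypothetical names, and `fake_model` replaces the OpenAI request with a local function so the example runs without network access.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(items: Sequence[T], chunk_size: int) -> List[Sequence[T]]:
    """Split items into consecutive chunks of at most chunk_size elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_all(
    chunks: List[Sequence[T]],
    process_chunk: Callable[[int, Sequence[T]], str],
) -> List[str]:
    """Process each chunk in order and collect one result per chunk."""
    return [process_chunk(index, chunk) for index, chunk in enumerate(chunks)]


def fake_model(index: int, chunk: Sequence[T]) -> str:
    # Stand-in for the model call; the workflow would send the chunk to OpenAI.
    return f"chunk {index}: {len(chunk)} records"
```

In the actual workflow, each chunk would be sent to the model as its own step, so a slow response for one chunk does not hold up the function invocation handling the others.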

3. Aggregating our data

After processing our data in smaller chunks to avoid any function timeouts, we aggregate results every 10 chunks:
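One way to express "aggregate every 10 chunks" is shown below. The function name `aggregate_in_batches` and the `aggregate` callback are hypothetical, and flushing a final partial batch (when the chunk count is not a multiple of 10) is an assumption about the intended behavior.

```python
from typing import Callable, List, TypeVar

R = TypeVar("R")
A = TypeVar("A")


def aggregate_in_batches(
    results: List[R],
    aggregate: Callable[[List[R]], A],
    batch_size: int = 10,
) -> List[A]:
    """Aggregate results in groups of batch_size, flushing any remainder at the end."""
    aggregated: List[A] = []
    pending: List[R] = []
    for index, result in enumerate(results):
        pending.append(result)
        is_last = index == len(results) - 1
        # Flush when the batch is full, or when no more results will arrive.
        if len(pending) == batch_size or is_last:
            aggregated.append(aggregate(pending))
            pending = []
    return aggregated
```

Aggregating periodically keeps the amount of intermediate data held at any one time bounded by the batch size instead of growing with the whole dataset.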

4. Sending a report

Finally, we generate a report based on the aggregated results and send it to the user:
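A minimal sketch of this last step is shown below. The report format, the function names `generate_report` and `send_report`, and the `deliver` callback are all hypothetical; the original workflow may build and deliver its report differently (for example, by email or a webhook).

```python
from typing import Callable, Dict, List


def generate_report(aggregated: List[str]) -> Dict[str, object]:
    """Combine the aggregated results into a single report structure."""
    return {
        "sections": len(aggregated),
        "summary": "\n".join(aggregated),
    }


def send_report(
    report: Dict[str, object],
    user_id: str,
    deliver: Callable[[str, Dict[str, object]], None],
) -> None:
    """Hand the report to a delivery function supplied by the caller."""
    # The delivery mechanism is injected so it can be swapped or stubbed in tests.
    deliver(user_id, report)
```

In the workflow, generating and sending would each typically run as their own step, so a delivery failure can be retried without regenerating the report.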

Key Features

  1. Non-blocking HTTP Calls: We use context.call for API requests so they don’t consume the endpoint’s execution time (great for optimizing serverless cost).

  2. Long-running tasks: The dataset download can take up to 2 hours, though in practice it is limited by the function’s memory.