AI-Driven Data Unification: Weather and Tracking Data in BigQuery
Learn how to establish a production-ready pipeline to ingest diverse data sources into BigQuery.
We will demonstrate this using a Function Calling architecture where:
- A user's natural language query is processed by an AI Agent.
- The AI standardizes the location details (country, city, lat/lon) necessary to call a weather API.
- The resulting structured data (weather metrics and original location) is streamed and persisted in BigQuery for tracking and analysis.
Data source and base model: the Open-Meteo API provides the weather data, and Mistral serves as the base model.
We are implementing a simple AI Function Calling pipeline:
- The user initiates a query (e.g., asking for weather).
- The AI Model determines intent, calls the internal weather tool, and provides structured geographic parameters (lat/lon).
- The backend calls the Open-Meteo API to fetch the relevant temperature and wind speed.
- The output is serialized and streamed to BigQuery for permanent tracking and analysis.
- The system sends the final result back to the user.

Step 1: BigQuery table setup
We will capture attributes using the following schema model:
{
  "id": "d1a9b60d-3cee-4be7-b9fa-be7814f41eb3",
  "created_at": "2025-10-22T12:07:54",
  "country": "GB",
  "city": "LONDON",
  "original_question": "what is the weather in london?",
  "question_lang": "EN",
  "temperature": "13.6°C",
  "wind_speed": "5.0km/h"
}
In BigQuery Studio, within your Google Cloud account, create a new table named capture_get_weather_queries. Ensure all fields from the JSON model above are included in the table schema.
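If you prefer to script the table creation rather than click through the Studio UI, a minimal sketch using the google-cloud-bigquery Python client could look like this (the project and dataset IDs are placeholders, and the REQUIRED modes are assumptions; temperature and wind_speed are kept as STRING because the pipeline stores them together with their units):

from google.cloud import bigquery

# Placeholder IDs: replace with your own project and dataset
client = bigquery.Client(project="YOUR_PROJECT_ID")

schema = [
    bigquery.SchemaField("id", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("created_at", "DATETIME", mode="REQUIRED"),
    bigquery.SchemaField("country", "STRING"),
    bigquery.SchemaField("city", "STRING"),
    bigquery.SchemaField("original_question", "STRING"),
    bigquery.SchemaField("question_lang", "STRING"),
    # STRING because the pipeline writes values with units, e.g. "13.6°C"
    bigquery.SchemaField("temperature", "STRING"),
    bigquery.SchemaField("wind_speed", "STRING"),
]

table = bigquery.Table("YOUR_PROJECT_ID.YOUR_DATASET_ID.capture_get_weather_queries", schema=schema)
client.create_table(table)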
Your schema should look like this:

Step 2: Zparse Workflow
In Zparse, create a new Workflow: AI Weather
2.1 Endpoint
To set up your data intake, follow these steps:
- Add a Web server > JSON endpoint node. This creates an accessible API where you can send data in JSON format.
- Extract the query: link a JSON > Extract from JSON function to parse the incoming request and safely retrieve the user's query as a simple text string.
The resulting API will accept requests formatted like this:
{
  "query": "What's the weather like in Bordeaux?"
}
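Once the endpoint is live, you can exercise it from any HTTP client. For instance, a minimal Python sketch (the URL below is a placeholder for the endpoint Zparse generates for your workflow):

import requests

# Placeholder URL: substitute the endpoint generated by your Zparse workflow
response = requests.post(
    "https://your-zparse-host/api/ai-weather",
    json={"query": "What's the weather like in Bordeaux?"},
    timeout=30,
)
print(response.status_code, response.text)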
2.2 AI Agent
a) Virtual Agent Setup
Now, let's integrate the intelligence layer by adding the AI > Virtual Agent node. In Zparse, this Agent serves as the bridge that connects the AI provider (here, Mistral) to the specific tools you want it to execute.
Agent Description (Context is Key!)
Under the Agent's configuration, define its purpose using the description field. While a simple description works for basic tasks, a more detailed prompt significantly improves the AI's reliability in complex scenarios.
For now, you can use the concise description:
"Use the best tool at disposal to answer the user query."
Crucial Context: The Agent's description provides the core context for the AI model's reasoning. For complex use cases, taking the time to write a detailed, well-defined context is essential for high accuracy and effective tool calling.
b) Tool Setup: Geographic Data Extractor
Next, let's define the first external function the AI Agent can call by adding the AI > Virtual Tool node.
This initial tool, GetWeather, will be responsible for processing the user's natural language query and extracting the geographic parameters needed to make the external weather API call.
Tool Configuration
Under the tool's settings, you must define the output schema—the exact information the AI must retrieve from the user's text before proceeding.
Add five required properties to the tool's schema, corresponding to the parameters you need for data retrieval:
- country (String): country matching current requested city in ISO format in uppercase
- city (String): city detected in the query in uppercase
- latitude (String): latitude of the city
- longitude (String): longitude of the city
- lang (String): language of the original question in ISO code format
By marking these fields as required, you instruct the AI (Mistral) that it must successfully extract all five pieces of information from the user's question before it can complete the tool call and send the data to the next step.
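Conceptually, this Virtual Tool maps to a function definition in Mistral's tool-calling format. Here is a sketch of what such a definition looks like (the exact payload Zparse builds and sends to Mistral is an assumption):

# Sketch of the GetWeather tool in Mistral's function-calling format
# (the exact payload Zparse builds is an assumption)
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "GetWeather",
        "description": "Extract the geographic parameters needed to call a weather API.",
        "parameters": {
            "type": "object",
            "properties": {
                "country": {"type": "string", "description": "Country of the requested city, ISO format, uppercase"},
                "city": {"type": "string", "description": "City detected in the query, uppercase"},
                "latitude": {"type": "string", "description": "Latitude of the city"},
                "longitude": {"type": "string", "description": "Longitude of the city"},
                "lang": {"type": "string", "description": "Language of the original question, ISO code"},
            },
            "required": ["country", "city", "latitude", "longitude", "lang"],
        },
    },
}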

c) Fallback tool
To handle scenarios where the AI fails to find a suitable function, implement a fallback tool.
Add a new AI > Virtual Tool named NoToolMatched. This tool should include one required parameter:
- reason (String): Use this to capture and return the AI's explanation for why it couldn't use any of the available tools. This is crucial for debugging and understanding the agent's limitations.
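In the same assumed format, the fallback is a single-parameter function definition:

# Sketch of the NoToolMatched fallback in the same assumed format
no_tool_matched = {
    "type": "function",
    "function": {
        "name": "NoToolMatched",
        "description": "Fallback when no available tool fits the user query.",
        "parameters": {
            "type": "object",
            "properties": {
                "reason": {
                    "type": "string",
                    "description": "The AI's explanation for why no tool could be used",
                },
            },
            "required": ["reason"],
        },
    },
}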
2.3 Query Open-Meteo
Next, we will use the Web Request > Request GET function to fetch the required weather data.
This function will make a call to the Open-Meteo API (https://api.open-meteo.com) to retrieve the wind speed and temperature, utilizing the geographic parameters (latitude/longitude) extracted by the AI in the previous step.
URL: https://api.open-meteo.com/v1/forecast
Query parameters:
- current → temperature_2m,wind_speed_10m (not dynamic)
- latitude → dynamic
- longitude → dynamic
Response type: JSON
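You can reproduce the same call outside the workflow with a few lines of Python (the Bordeaux coordinates are illustrative; in the workflow they are supplied dynamically by the AI tool):

import requests

params = {
    "latitude": "44.84",   # illustrative: Bordeaux; dynamic in the workflow
    "longitude": "-0.58",
    "current": "temperature_2m,wind_speed_10m",  # not dynamic
}
response = requests.get("https://api.open-meteo.com/v1/forecast", params=params, timeout=10)
data = response.json()
print(data["current"]["temperature_2m"], data["current"]["wind_speed_10m"])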
Example response from the Open-Meteo API:
{
  "current": {
    "interval": 900,
    "temperature_2m": 14.1,
    "time": "2025-10-23T09:00",
    "wind_speed_10m": 12.6
  },
  "current_units": {
    "interval": "seconds",
    "temperature_2m": "°C",
    "time": "iso8601",
    "wind_speed_10m": "km/h"
  },
  "elevation": 37,
  "generationtime_ms": 198.6376047134399,
  "latitude": 52.52,
  "longitude": 13.4,
  "timezone": "GMT",
  "timezone_abbreviation": "GMT",
  "utc_offset_seconds": 0
}
2.4 Mapping: Meteo Response to BigQuery Schema
Now that we have successfully retrieved the raw weather data from Open-Meteo, the next critical step is to process and transform this response into the exact schema required for our BigQuery table (capture_get_weather_queries).
Create a new data mapping named Weather API Collector.
- Source: JSON Elements with path $.
- Extraction: add JSON > Extract values from path with both current and current_units extracted as JSON, then add another extraction layer on those to get temperature_2m and wind_speed_10m.
- Formatting: add two Text > Template functions to combine each value with its unit (e.g., 13.5°C from 13.5 and °C) for both temperature and wind speed.
- id: add a Text > Random UUID to generate the unique identifier.
- created_at: add a Module > DateTime now in UTC with the format %Y-%m-%dT%H:%M:%S (accepted by BigQuery's DATETIME type).
- Injected parameters: add four Module > Injected value entries:
  - country (String): the country detected by the AI tool
  - city (String): the city detected by the AI tool
  - original_question (String): the original question from the user
  - question_lang (String): the language detected by the AI tool
- Destination: a JSONL Writer with a Data Model containing all fields matching our BigQuery data model, properly linked.
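For reference, the transformation this mapping performs is equivalent to the following Python sketch (the function name and signature are illustrative, not part of Zparse):

import uuid
from datetime import datetime, timezone

def build_row(meteo: dict, country: str, city: str,
              original_question: str, question_lang: str) -> dict:
    """Mirror of the Zparse mapping: combine value + unit, add a UUID and a UTC timestamp."""
    current, units = meteo["current"], meteo["current_units"]
    return {
        "id": str(uuid.uuid4()),
        "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S"),
        "country": country,
        "city": city,
        "original_question": original_question,
        "question_lang": question_lang,
        # e.g. "13.5" + "°C" -> "13.5°C"
        "temperature": f'{current["temperature_2m"]}{units["temperature_2m"]}',
        "wind_speed": f'{current["wind_speed_10m"]}{units["wind_speed_10m"]}',
    }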

View of the Mapping once completed

With the mapping defined, let's finalize the pipeline connections in the workflow:
Route the API response directly into the input of the Weather API Collector. Simultaneously, ensure all injected parameters (such as city, country, and the original query) are correctly linked to the mapping's corresponding input slots.

2.5 Store data into BigQuery
To persist the final, formatted data, introduce the Google > BigQuery node.
After configuring your Google Cloud application (including the required Project ID and Dataset ID):
- Set the operation endpoint to Streaming Insert.
- Specify the destination table as capture_get_weather_queries.
Now, connect the structured output from your data mapping directly into the BigQuery node to complete the ingestion pipeline.
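The node performs the equivalent of a BigQuery streaming insert. A minimal Python sketch of the same operation (project and dataset IDs are placeholders; the sample row reuses the schema model from Step 1):

from google.cloud import bigquery

client = bigquery.Client()  # assumes default Google Cloud credentials are configured
table_id = "YOUR_PROJECT_ID.YOUR_DATASET_ID.capture_get_weather_queries"

# Sample row matching the schema model from Step 1
row = {
    "id": "d1a9b60d-3cee-4be7-b9fa-be7814f41eb3",
    "created_at": "2025-10-22T12:07:54",
    "country": "GB",
    "city": "LONDON",
    "original_question": "what is the weather in london?",
    "question_lang": "EN",
    "temperature": "13.6°C",
    "wind_speed": "5.0km/h",
}

errors = client.insert_rows_json(table_id, [row])  # streaming insert
if errors:
    raise RuntimeError(f"BigQuery streaming insert failed: {errors}")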
2.6 Return response
To close the loop and send the final data back to the user's browser, add a Web server > File Response node. Connect the structured output from your Mapping into this final node.
✅ Workflow Complete! You have successfully built an AI-powered data ingestion and retrieval API.

Step 3: First run!
Now that the setup is complete, you can test your pipeline.
Navigate to the upper-left corner of the workflow interface and click Run. Enter your first query in the input field that appears:
{
  "query": "what is the weather in Bordeaux?"
}
As confirmed by the execution logs, the AI successfully routed the query to our designated tool:

Feel free to experiment with the workflow! Try different questions, languages, and locations to fully test the Agent's routing and extraction capabilities.
Step 4: Validate data reception in BigQuery
Confirm Data Insertion in BigQuery:
Execute a SELECT query in BigQuery to verify that the data has been inserted correctly into the table.
SELECT * FROM `{PROJECT_ID}.{DATASET_ID}.capture_get_weather_queries` ORDER BY created_at DESC LIMIT 1000

Based on the current results, we should refine the weather tool's output schema to ensure the city name is returned in plain English (or a standardized format).
Step 5: Adding another tool
Now, let's add a second tool to enable the Agent to retrieve historical user data. This new tool will allow us to query the last 10 queries made by users for a specific country from BigQuery.
1. Configure the Virtual Tool
- Add a new AI > Virtual Tool named LastQueriesByCountry.
- Description: set this to "Get the last 10 queries made by users for a specific country."
- Parameter: add one required parameter:
  - Name: country (String)
  - Description: "The ISO 3166-1 alpha-2 code of the country (e.g., 'FR', 'US') to retrieve the last 10 user queries from." (Using a standard like an ISO code improves AI accuracy.)
2. Connect to the Agent
- Link the LastQueriesByCountry tool to the AI > Virtual Agent node defined in the previous steps.
3. Set Up the BigQuery Query Module
- Add a Google > BigQuery node.
- Endpoint: set the endpoint to Query Select.
- Query: use the following SQL to get the last requests by country:
SELECT *
FROM `{PROJECT_ID}.{DATASET_ID}.capture_get_weather_queries`
WHERE country = @country
ORDER BY created_at DESC
LIMIT 10
- Query parameter: define one required input parameter on the BigQuery node:
  - Parameter name: country (this will receive the value extracted by the AI tool).
  - Named parameter in SQL: use @country in your SQL query to filter the results (e.g., WHERE country = @country); an equivalent parameterized call in Python is sketched after this list.
4. Complete the Flow
- Add a Web server > File Response node to return the data to the user.
- Connect the output of the BigQuery node (the query results) directly to the input of the File Response node.
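As referenced above, the Query Select endpoint with a named parameter behaves like this Python sketch (project and dataset IDs are placeholders; "JP" stands in for the country value the AI tool extracts):

from google.cloud import bigquery

client = bigquery.Client()  # assumes default Google Cloud credentials are configured

sql = """
SELECT *
FROM `YOUR_PROJECT_ID.YOUR_DATASET_ID.capture_get_weather_queries`
WHERE country = @country
ORDER BY created_at DESC
LIMIT 10
"""

# Bind the @country named parameter; "JP" stands in for the AI-extracted value
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("country", "STRING", "JP")]
)

for row in client.query(sql, job_config=job_config).result():
    print(dict(row))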
✅ Setup complete! The Agent can now intelligently decide between fetching weather and retrieving historical query data.

Now execute your workflow with a query like:
{ "query": "last queries for japan" }
You should get a response like:
{"city":"KYOTO","country":"JP","created_at":"2025-10-23T12:10:12","id":"32b13273-c4a2-47db-8f43-41e67f5aacbb","original_question":"京都の気温はどれくらいですか?","question_lang":"JA","temperature":"15.2°C","wind_speed":"3.6km/h"}
{"city":"TOKYO","country":"JP","created_at":"2025-10-23T12:09:54","id":"d96fd8c8-4940-48b7-b051-6c167958060b","original_question":"東京の気温は何度ですか?","question_lang":"JA","temperature":"13.7°C","wind_speed":"4.2km/h"}
{"city":"TOKYO","country":"JP","created_at":"2025-10-23T12:09:27","id":"854c6278-4c04-408a-89d0-3838d5034347","original_question":"東京の気温は何度ですか?","question_lang":"JA","temperature":"13.7°C","wind_speed":"4.2km/h"}
Overview: Final workflow
The image below illustrates the complete, end-to-end workflow:
