import xarray as xr
import pandas as pd
import requests
import io
import pyproj
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
from sklearn.metrics import r2_score
import cartopy.crs as ccrs # cartopy: geospatial data visualization
import cartopy.feature as cfeatureMatching Satellite and Buoy Data
In this exercise, you will combine satellite and buoy data by extracting satellite measurements around specific points defined by buoy locations and dates. The focus of this exercise is on matching two data sources from different projections.
Similar tutorials for mid to lower latitudes can be found at https://github.com/coastwatch-training/CoastWatch-Tutorials.
This exercise demonstrates the following techniques:
- Using ERDDAP to retrieve buoy data in CSV format and satellite data in netCDF format
- Importing and manipulating data with the pandas and xarray libraries
- Resampling data to lower-resolution time steps
- Converting latitude and longitude coordinates to the polar stereographic projection
Data used in this exercise
Ice Surface Temperature, NOAA-20 VIIRS, Near Real-Time, Polar Stereographic (North), 4-day
This dataset provides VIIRS sea ice surface temperature for the Arctic at a 750m resolution, collected by the NOAA-20 satellite. It includes near-real-time daily data and 4-day composites for the past three weeks. For this exercise, we will use 4-day composites data.
International Arctic Buoy Programme (IABP) Buoy Data, Daily
This data set is from the US International Arctic Buoy Programme and includes meteorological and oceanographic data from buoys. Dataset is updated daily and includes multiple variables. For this exercise, we will extract surface temperature data.
While both instruments measure surface temperatures, they do so at different locations and with different sensors.
Satellite Ice Surface Temperature (IST) is measured by the Visible Infrared Imaging Radiometer Suite (VIIRS) and captures the temperature of the surface layer of ice.
Buoy Surface Temperature (Ts) is measured from the bottom of the buoy hull. If the buoy is floating, the reported temperature is of the sea surface. If the buoy is frozen into the ice or sitting on top of it, the reported temperature is of the ground or ice. The freezing temperature of seawater is about -1.8°C, so temperature readings below this indicate ground or ice temperatures.
More details can be found in the metadata section of the data products (click on the data links above).
Load packages
Load buoy data (IABP) from PolarWatch ERDDAP data server
- Construct ERDDAP URL to query the IABP buoy for: buoy_id, longitude, latitude, time, and surface temperature that filters for data where surface_temp exists and for the date range between 2023-08-01 and 2023-09-30.
- Download the data into a Pandas dataframe.
# Construct ERDDAP URL
buoy_url = ''.join(['https://polarwatch.noaa.gov/erddap/tabledap/iabpv2_buoys.csv?',
'buoy_id,longitude,latitude,time,surface_temp',
'&has_surface_temp="yes"&time>=2023-08-01&time<=2023-09-30'
])
# Make a request to the ERDDAP server
req = requests.get(buoy_url).content
# The response from the web service is read as a CSV file into a pandas DataFrame.
df = pd.read_csv(io.StringIO(req.decode('utf-8')), skiprows=[1], parse_dates=['time'])
df.head(3)
| buoy_id | longitude | latitude | time | surface_temp | |
|---|---|---|---|---|---|
| 0 | 300234066034140 | -28.5226 | 55.0168 | 2023-08-01 00:00:00+00:00 | 13.5 |
| 1 | 300234066034140 | -28.5226 | 55.0168 | 2023-08-01 01:00:02+00:00 | 13.4 |
| 2 | 300234066034140 | -28.5226 | 55.0168 | 2023-08-01 01:59:57+00:00 | 13.4 |
Select one buoy and process data
Select one buoy (buoy id = “300534062897730”). The buoy records measurements at intervals of minutes, resulting in a high-resolution dataset. , * Downsample the buoy data to align it with the daily resolution of the satellite dataset.
* The time data is recorded in the UTC time zone. Pandas operations often encounter issues with time zones, so remove the time zone information for easier processing.
# Select one buoy (buoy id = "300534062897730")
buoy_df = df.loc[df["buoy_id"]== 300534062897730]
print('# of timesteps before =', buoy_df.shape[0] )
# The resampling will put time as the df index
buoy_df_resampled = buoy_df.resample('D', on='time').mean()
print('# of timesteps after =', buoy_df_resampled.shape[0] )
# Remove the timezone (UTC, GMT).
buoy_df_resampled = buoy_df_resampled.tz_localize(None)
# Rename the variable
buoy_df_resampled.rename(columns={"surface_temp": "temp_buoy"},
errors="raise",
inplace=True)
buoy_df_resampled.head(2)# of timesteps before = 8523
# of timesteps after = 60
| buoy_id | longitude | latitude | temp_buoy | |
|---|---|---|---|---|
| time | ||||
| 2023-08-01 | 3.005341e+14 | -143.782008 | 86.385202 | 2.193916 |
| 2023-08-02 | 3.005341e+14 | -143.104535 | 86.333932 | 1.520208 |
Transform buoy coordinates to polar projection
The buoy locations are provided in latitude and longitude coordinates. The satellite data is in polar stereographic projection, which provided location in columns and rows with units of meters. * Convert the buoy locations from latitude and longitude to the corresponding columns and rows values in the polar projection.
# Define the projection using the PROJ4 string format
proj4text = "+proj=stere +lat_0=90 +lat_ts=70 +lon_0=-45 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
# Initialize the projection object using pyproj with the given PROJ4 text
proj = pyproj.Proj(proj4text)
# Transform latitude and longitude to x, y coordinates
# Longitude and latitude are passed as arrays,
# pyproj returns the corresponding x (cols) and y (rows) values
buoy_df_resampled['cols'], buoy_df_resampled['rows'] = proj(buoy_df_resampled['longitude'].values,
buoy_df_resampled['latitude'].values)
# Verify that the 'cols' and 'rows' columns were added to dataframe
buoy_df_resampled.head(2)| buoy_id | longitude | latitude | temp_buoy | cols | rows | |
|---|---|---|---|---|---|---|
| time | ||||||
| 2023-08-01 | 3.005341e+14 | -143.782008 | 86.385202 | 2.193916 | -387113.857367 | 59803.924517 |
| 2023-08-02 | 3.005341e+14 | -143.104535 | 86.333932 | 1.520208 | -393297.704017 | 56006.315907 |
# Select the first buoy location to pull corresponding satellite data
buoy_cols = buoy_df_resampled['cols'].iloc[0]
buoy_rows = buoy_df_resampled['rows'].iloc[0]Load satellite data from PolarWatch
# Construct ERDDAP data request
gridded_url = "https://polarwatch.noaa.gov/erddap/griddap/noaacwVIIRSn20icesrftempNP06Daily4Day"
# Open and load data into xarray dataset
srftemp_ds = xr.open_dataset(gridded_url)
# The altitude dimension has a size of 1, remove it to reduce the dimensionality
srftemp_ds = srftemp_ds.squeeze()Select satellite data to match buoy location and dates
Subset the satellite data using the buoy locations and dates.
buoy_cols = buoy_df_resampled['cols'].values
buoy_rows = buoy_df_resampled['rows'].values
buoy_times = buoy_df_resampled.index.values
sat_temps =[]
for ct, buoy_col in enumerate(buoy_cols):
sat_temp = srftemp_ds['IceSrfTemp'].sel(
rows=buoy_rows[ct],
cols=buoy_col,
time=buoy_times[ct],
method='nearest'
)
sat_temps.append(sat_temp.values.item())Merge satellite ice temperature data with buoy
Merge the datasets by index (date). Not all buoy dates have corresponding satellite data. Unmatched dates will be filled with NaN values.
merged_df = buoy_df_resampled
merged_df['temp_sat'] = np.array(sat_temps) - 273.15
merged_df.head(3)| buoy_id | longitude | latitude | temp_buoy | cols | rows | temp_sat | |
|---|---|---|---|---|---|---|---|
| time | |||||||
| 2023-08-01 | 3.005341e+14 | -143.782008 | 86.385202 | 2.193916 | -387113.857367 | 59803.924517 | -0.413336 |
| 2023-08-02 | 3.005341e+14 | -143.104535 | 86.333932 | 1.520208 | -393297.704017 | 56006.315907 | NaN |
| 2023-08-03 | 3.005341e+14 | -142.195432 | 86.271987 | 0.803264 | -400800.933610 | 50600.465317 | NaN |
Visualize matched dataSets
Visualize the matched buoy and satellite datasets to assess the data alignment.
plt.figure(figsize = (10, 5))
# Plot the SeaWiFS data
plt.plot_date(merged_df.index, merged_df.temp_buoy,
'o', markersize=3,
label='Buoy Surface Temperature', c='red',
linestyle='-', linewidth=1)
# Add MODIS data
plt.plot_date(merged_df.index, merged_df.temp_sat,
's', markersize=3,
label='VIIRS Sea Ice Surface Temperature', c='blue',
linestyle='-', linewidth=0)
#plt.ylim([0, 3])
plt.ylabel('Temperature (degrees C)')
plt.xticks(rotation=45)
plt.legend()