CryoTEMPO-EOLIS Point product - quality and coverage

This example investigates the coverage and quality of the CryoTEMPO-EOLIS Point Product, using Specklia to retrieve the data.

For details on environment and Specklia client setup, please refer to Environment Setup.

Explore dataset source files

The CryoTEMPO-EOLIS Point Product documentation tells us that the product is generated at a monthly temporal resolution, with one product file per 100x100km tile.

Let’s run a small query and use the returned source files to determine the bounding polygon of an example tile.

[2]:
svalbard_polygon = shapely.Polygon(
    [[3.9, 79.88], [34.64, 80.97], [36.86, 77.03], [14.57, 75.51], [3.9, 79.88]])

dataset_name = 'CryoTEMPO-EOLIS Point Product'
available_datasets = client.list_datasets()
point_product_dataset = available_datasets[
    available_datasets['dataset_name'] == dataset_name].iloc[0]

point_product_data, sources = client.query_dataset(
    dataset_id=point_product_dataset['dataset_id'],
    epsg4326_polygon=svalbard_polygon,
    min_datetime=datetime.datetime(2023, 1, 1),
    max_datetime=datetime.datetime(2023, 1, 2),
    columns_to_return=['timestamp', 'elevation', 'uncertainty'])

print(
    f'Query complete, {len(point_product_data)} points returned, drawn from {len(sources)} original sources.')
Query complete, 98044 points returned, drawn from 4 original sources.
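
Each entry in sources carries metadata describing the original product file it was drawn from. Before settling on a tile, it can be useful to inspect this metadata; the short sketch below only prints the geospatial fields that we rely on in the next step.

for i, source in enumerate(sources):
    info = source['source_information']
    # Print the projection and bounding box of each contributing product file.
    print(i, info['geospatial_projection'],
          info['geospatial_x_min'], info['geospatial_x_max'],
          info['geospatial_y_min'], info['geospatial_y_max'])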

We pick one of the returned sources and retrieve its geospatial bounding box.

[3]:
source_info = sources[1]['source_information']

tile_x_min = source_info['geospatial_x_min']
tile_x_max = source_info['geospatial_x_max']
tile_y_min = source_info['geospatial_y_min']
tile_y_max = source_info['geospatial_y_max']

tile_polygon_points = [(tile_x_min, tile_y_min), (tile_x_min, tile_y_max),
                       (tile_x_max, tile_y_max), (tile_x_max, tile_y_min)]

Specklia queries require a polygon in the EPSG:4326 projection, so let’s transform the bounding box we have just retrieved and plot it.

[4]:

source_projection = pyproj.CRS.from_proj4(source_info['geospatial_projection'])
desired_projection = pyproj.CRS.from_epsg('4326')
transformer = pyproj.Transformer.from_proj(
    source_projection, desired_projection, always_xy=True)

tile_polygon = shapely.Polygon(
    [transformer.transform(p[0], p[1]) for p in tile_polygon_points])
tile_spatial_coverage = gpd.GeoDataFrame(geometry=[tile_polygon], crs=4326)

ax = tile_spatial_coverage.to_crs(epsg=3413).plot(
    figsize=(5, 5), alpha=0.5, color=data_coverage_color,
    edgecolor=data_coverage_color)
ax.set_xlabel('x [m]')
ax.set_ylabel('y [m]')
ax.set_title("Example 100x100km product tile")
ctx.add_basemap(ax, source=ctx.providers.GeoportailFrance.orthos, crs=3413, zoom=8)
../_images/example_notebooks_cryotempo_point_product_9_0.png

Query dataset

Now that we have the bounding box of our 100x100km tile, let’s use it for a query.

According to the definition of the point product, querying the dataset over a single tile for a three-month time window should return data from three original sources, one per month. Let’s check.

[5]:
point_product_data, sources = client.query_dataset(
    dataset_id=point_product_dataset['dataset_id'],
    epsg4326_polygon=tile_polygon,
    min_datetime=datetime.datetime(2023, 1, 1),
    max_datetime=datetime.datetime(2023, 3, 31),
    columns_to_return=['timestamp', 'elevation', 'uncertainty', 'x', 'y'])

print(
    f'Query complete, {len(point_product_data)} points returned, drawn from {len(sources)} original sources.')
Query complete, 1335027 points returned, drawn from 3 original sources.
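
Since the product is generated monthly, those three sources should correspond to January, February and March 2023. The sketch below is one way to sanity-check this from the returned points; it assumes the timestamp column holds POSIX seconds, so adjust the conversion if it is already a datetime.

import pandas as pd

# Count returned points per calendar month (assumes POSIX-second timestamps).
months = pd.to_datetime(point_product_data['timestamp'], unit='s').dt.to_period('M')
print(months.value_counts().sort_index())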

Plot coverage

Now, let’s plot the data coverage.

[6]:
from mpl_toolkits.axes_grid1 import make_axes_locatable

fig, axs = plt.subplots(1, 2, figsize=(20, 10), sharex=True)
point_product_data.dropna(inplace=True)
point_product_data_3413 = point_product_data.to_crs(epsg=3413)
divider = make_axes_locatable(axs[0])
cax = divider.append_axes("right", size="5%", pad=0.1)
point_product_data_3413.plot(
    ax=axs[0], column='elevation', s=1, cmap='viridis', legend=True,
    legend_kwds={
        "cax": cax,
        "label": "Elevation [m]"
    })

axs[0].set_xlabel('x [m]')
axs[0].set_ylabel('y [m]')

divider = make_axes_locatable(axs[1])
cax = divider.append_axes("right", size="5%", pad=0.1)
point_product_data_3413.plot(
    ax=axs[1], column='uncertainty', s=1, cmap='viridis', legend=True,
    legend_kwds={
        "cax": cax,
        "label": "Uncertainty [m]"
    })
axs[1].set_xlabel('x [m]')

plt.ticklabel_format(axis='x', style='sci', scilimits=(0, 0))
t = plt.suptitle(
    'Coverage of CryoTEMPO-EOLIS point product over example 100x100km tile')
ctx.add_basemap(axs[0], source=ctx.providers.GeoportailFrance.orthos, crs=3413, zoom=8)
ctx.add_basemap(axs[1], source=ctx.providers.GeoportailFrance.orthos, crs=3413, zoom=8)
../_images/example_notebooks_cryotempo_point_product_15_0.png

The plot shows all point data available in our 100x100km tile within the time window we specified for the query, and demonstrates the spatial coverage achieved using swath processing.

In the figure on the left, the scatter points are colour-coded by their elevations. On the right, the colour shows the uncertainty associated with each point.
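
As a rough, back-of-the-envelope measure of this coverage, we can also express the returned points as an average density over the tile; the sketch below simply treats the tile as exactly 100x100km.

# Average point density across the 100x100km tile for the three-month window.
tile_area_km2 = 100 * 100
print(f'{len(point_product_data) / tile_area_km2:.1f} points per km^2 over 3 months')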

For more information on CryoSat-2 and swath processing, see https://earth.esa.int/eogateway/news/cryosats-swath-processing-technique.

Explore data quality

First, we calculate the mean and median uncertainty, as well as the standard deviation.

[7]:
med_unc = np.nanmedian(point_product_data['uncertainty'])
mean_unc = np.nanmean(point_product_data['uncertainty'])
std_unc = np.nanstd(point_product_data['uncertainty'])

Next, we set up the histogram bins: 40 evenly spaced bin edges spanning the uncertainty range.

[8]:
bins = np.linspace(np.nanmin(point_product_data['uncertainty']),
                   np.nanmax(point_product_data['uncertainty']), 40)

Finally, we plot a histogram of the uncertainty values for this point product dataset, and display the mean and median values, as well as the standard deviation of uncertainties.

[9]:
fig, axs = plt.subplots(1, 1, figsize=(20, 10))

a = axs.hist(point_product_data['uncertainty'], bins=bins,
             facecolor='paleturquoise', edgecolor='k', alpha=0.6)

axs.vlines(mean_unc, 0, 12e4, color='k', linestyle='dashed',
           label='Mean Uncertainty = {:0.2f}'.format(mean_unc))
axs.vlines(med_unc, 0, 12e4, color='b', linestyle='dashed',
           label='Median Uncertainty = {:0.2f}'.format(med_unc))
axs.vlines(mean_unc - std_unc, 0, 12e4, color='orange', linestyle='dashed')
axs.vlines(mean_unc + std_unc, 0, 12e4, color='orange',
           linestyle='dashed', label=r'$\sigma$ = {:0.2f}'.format(std_unc))

axs.set_ylim(0, 12e4)
axs.set_xlabel('Uncertainty [m]')
axs.set_ylabel('Frequency')
axs.set_title('Uncertainty distribution for CryoTEMPO-EOLIS point product')
plt.legend()
[9]:
<matplotlib.legend.Legend at 0x7f8dc1b64700>
../_images/example_notebooks_cryotempo_point_product_23_1.png
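
Having quantified the uncertainty distribution, a possible follow-up is to screen out the noisiest points before further analysis. The threshold below, the mean plus one standard deviation, is purely an illustrative choice rather than a recommendation from the product documentation.

# Keep only points whose uncertainty is below an illustrative threshold.
threshold = mean_unc + std_unc
filtered = point_product_data[point_product_data['uncertainty'] < threshold]
print(f'{len(filtered)} of {len(point_product_data)} points retained '
      f'below {threshold:.2f} m uncertainty')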