{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# CryoTEMPO-EOLIS Point product - quality and coverage"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This example investigates the coverage and quality of the [CryoTEMPO-EOLIS Point Product](https://cryotempo-eolis.org/point-product/)\n",
    "while using Specklia to retrieve the data.\n",
    "\n",
    "You can view the code below or [click here](https://colab.research.google.com/github/earthwave/specklia_demo_notebooks/blob/main/cryotempo_point_product.ipynb/) to run it yourself in Google Colab!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Environment Setup\n",
    "To run this notebook, you will need to make sure that the folllowing packages are installed in your python environment (all can be installed via pip/conda):\n",
    "- matplotlib\n",
    "- geopandas\n",
    "- contextily\n",
    "- ipykernel\n",
    "- shapely\n",
    "- specklia\n",
    "\n",
    "If you are using the Google Colab environment, these packages will be installed in the next cell. Please note this step may take a few minutes."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "\n",
    "if \"google.colab\" in sys.modules:\n",
    "    %pip install rasterio --no-binary rasterio\n",
    "    %pip install specklia\n",
    "    %pip install matplotlib\n",
    "    %pip install geopandas\n",
    "    %pip install contextily\n",
    "    %pip install shapely\n",
    "    %pip install python-dotenv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "nbsphinx": "hidden"
   },
   "outputs": [],
   "source": [
    "# fix an issue that can sometimes occur with rasterio using the wrong version of proj\n",
    "import os\n",
    "\n",
    "import pyproj\n",
    "\n",
    "os.environ[\"PROJ_LIB\"] = pyproj.datadir.get_data_dir()\n",
    "\n",
    "import datetime\n",
    "\n",
    "import contextily as ctx\n",
    "import geopandas as gpd\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import shapely\n",
    "from dotenv import load_dotenv\n",
    "\n",
    "from specklia import Specklia\n",
    "\n",
    "# load a demonstration API key from a .env file.\n",
    "load_dotenv()\n",
    "\n",
    "# Specify polygon colors to match another Earthwave product, https://cs2eo.org.\n",
    "data_coverage_color = np.array([52, 211, 153]) / 255\n",
    "geo_filter_colour = np.array([96, 165, 250]) / 255\n",
    "\n",
    "# To run this code yourself, first generate your own key using https://specklia.earthwave.co.uk.\n",
    "if \"DEMO_API_KEY\" in os.environ:\n",
    "    client = Specklia(os.environ[\"DEMO_API_KEY\"])\n",
    "else:\n",
    "    user_api_key = input(\n",
    "        \"Please generate your own key using https://specklia.earthwave.co.uk/ApiKeys and paste it here:\"\n",
    "    )\n",
    "    client = Specklia(user_api_key)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Explore dataset source files"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The CryoTEMPO-EOLIS Point Product [documentation](https://cryotempo-eolis.org/point-product/) tells us that it is\n",
    "generated at a monthly temporal resolution, one product file per 100x100km tile.\n",
    "\n",
    "Let's run a small query and use the returned source files to determine the bounding polygon of an example tile."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "svalbard_polygon = shapely.Polygon(([3.9, 79.88], [34.64, 80.97], [36.86, 77.03], [14.57, 75.51], [3.9, 79.88]))\n",
    "\n",
    "dataset_name = \"CryoTEMPO-EOLIS Point Product\"\n",
    "available_datasets = client.list_datasets()\n",
    "point_product_dataset = available_datasets[available_datasets[\"dataset_name\"] == dataset_name].iloc[0]\n",
    "\n",
    "point_product_data, sources = client.query_dataset(\n",
    "    dataset_id=point_product_dataset[\"dataset_id\"],\n",
    "    epsg4326_polygon=svalbard_polygon,\n",
    "    min_datetime=datetime.datetime(2023, 1, 1),\n",
    "    max_datetime=datetime.datetime(2023, 1, 2),\n",
    "    columns_to_return=[\"timestamp\", \"elevation\", \"uncertainty\"],\n",
    ")\n",
    "\n",
    "print(f\"Query complete, {len(point_product_data)} points returned, drawn from {len(sources)} original sources.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We pick a source at random, and retrieve its geospatial bounding box."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "source_info = sources[1][\"source_information\"]\n",
    "\n",
    "tile_x_min = source_info[\"geospatial_x_min\"]\n",
    "tile_x_max = source_info[\"geospatial_x_max\"]\n",
    "tile_y_min = source_info[\"geospatial_y_min\"]\n",
    "tile_y_max = source_info[\"geospatial_y_max\"]\n",
    "\n",
    "tile_polygon_points = [\n",
    "    (tile_x_min, tile_y_min),\n",
    "    (tile_x_min, tile_y_max),\n",
    "    (tile_x_max, tile_y_max),\n",
    "    (tile_x_max, tile_y_min),\n",
    "]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We must query Specklia with a polygon in the EPSG4326 projection, so let's transform the information we have just\n",
    "retrieved and plot it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "source_projection = pyproj.CRS.from_proj4(source_info[\"geospatial_projection\"])\n",
    "desired_projection = pyproj.CRS.from_epsg(\"4326\")\n",
    "transformer = pyproj.Transformer.from_proj(source_projection, desired_projection, always_xy=True)\n",
    "\n",
    "tile_polygon = shapely.Polygon([transformer.transform(p[0], p[1]) for p in tile_polygon_points])\n",
    "\n",
    "tile_spatial_coverage = gpd.GeoDataFrame(geometry=[tile_polygon], crs=4326)\n",
    "\n",
    "ax = tile_spatial_coverage.to_crs(epsg=3413).plot(\n",
    "    figsize=(5, 5), alpha=0.5, color=data_coverage_color, edgecolor=data_coverage_color\n",
    ")\n",
    "ax.set_xlabel(\"x [m]\")\n",
    "ax.set_ylabel(\"y [m]\")\n",
    "ax.set_title(\"Example 100x100km product tile\")\n",
    "ctx.add_basemap(ax, source=ctx.providers.Esri.WorldImagery, crs=3413, zoom=8)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Query dataset\n"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have the bounding box of our 100x100km tile, let's use it for a query.\n",
    "\n",
    "According to the definition of the point product, querying the dataset over a single tile over a time window of 3 months\n",
    "should return data from 3 original sources. Let's check!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "point_product_data, sources = client.query_dataset(\n",
    "    dataset_id=point_product_dataset[\"dataset_id\"],\n",
    "    epsg4326_polygon=tile_polygon,\n",
    "    min_datetime=datetime.datetime(2023, 1, 1),\n",
    "    max_datetime=datetime.datetime(2023, 3, 31),\n",
    "    columns_to_return=[\"timestamp\", \"elevation\", \"uncertainty\", \"x\", \"y\"],\n",
    ")\n",
    "\n",
    "print(f\"Query complete, {len(point_product_data)} points returned, drawn from {len(sources)} original sources.\")"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Plot coverage"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, let's plot the data coverage."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from mpl_toolkits.axes_grid1 import make_axes_locatable\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(20, 10), sharex=True)\n",
    "point_product_data.dropna(inplace=True)\n",
    "point_product_data_3413 = point_product_data.to_crs(epsg=3413)\n",
    "divider = make_axes_locatable(axs[0])\n",
    "cax = divider.append_axes(\"right\", size=\"5%\", pad=0.1)\n",
    "point_product_data_3413.plot(\n",
    "    ax=axs[0],\n",
    "    column=point_product_data[\"elevation\"],\n",
    "    s=1,\n",
    "    cmap=\"viridis\",\n",
    "    legend=True,\n",
    "    legend_kwds={\"cax\": cax, \"label\": \"Elevation [m]\"},\n",
    ")\n",
    "\n",
    "axs[0].set_xlabel(\"x [m]\")\n",
    "axs[0].set_ylabel(\"y [m]\")\n",
    "\n",
    "divider = make_axes_locatable(axs[1])\n",
    "cax = divider.append_axes(\"right\", size=\"5%\", pad=0.1)\n",
    "point_product_data_3413.plot(\n",
    "    ax=axs[1],\n",
    "    column=point_product_data[\"uncertainty\"],\n",
    "    s=1,\n",
    "    cmap=\"viridis\",\n",
    "    legend=True,\n",
    "    legend_kwds={\"cax\": cax, \"label\": \"Uncertainty [m]\"},\n",
    ")\n",
    "axs[1].set_xlabel(\"x [m]\")\n",
    "\n",
    "plt.ticklabel_format(axis=\"x\", style=\"sci\", scilimits=(0, 0))\n",
    "t = plt.suptitle(\"Coverage of CryoTEMPO-EOLIS point product over example 100x100km tile\")\n",
    "ctx.add_basemap(axs[0], source=ctx.providers.Esri.WorldImagery, crs=3413, zoom=8)\n",
    "ctx.add_basemap(axs[1], source=ctx.providers.Esri.WorldImagery, crs=3413, zoom=8)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The plot shows all point data available in our 100x100km tile in the timeframe that we specified for the query,\n",
    "and demonstrates the spatial coverage achieved using swath processing.\n",
    "\n",
    "In the figure on the left, the scatter points are colour-coded by their elevations. On the right, the\n",
    "colour shows the uncertainty associated with each point.\n",
    "\n",
    "For more information on CryoSat-2 and swath processing, see https://earth.esa.int/eogateway/news/cryosats-swath-processing-technique."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Explore data quality"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Firstly we calculate the mean and median uncertainties, as well as the standard deviation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "med_unc = np.nanmedian(point_product_data[\"uncertainty\"])\n",
    "mean_unc = np.nanmean(point_product_data[\"uncertainty\"])\n",
    "std_unc = np.std(point_product_data[\"uncertainty\"])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We next set up some histogram bins, requiring 40 bins in the uncertainty range."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "bins = np.linspace(np.nanmin(point_product_data[\"uncertainty\"]), np.nanmax(point_product_data[\"uncertainty\"]), 40)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we plot a histogram of the uncertainty values for this point product dataset, and display the mean and median values, as well as the standard deviation of uncertainties."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fig, axs = plt.subplots(1, 1, figsize=(20, 10))\n",
    "\n",
    "a = axs.hist(point_product_data[\"uncertainty\"], bins=bins, facecolor=\"paleturquoise\", edgecolor=\"k\", alpha=0.6)\n",
    "\n",
    "axs.vlines(mean_unc, 0, 12e4, color=\"k\", linestyle=\"dashed\", label=\"Mean Uncertainty = {:0.2f}\".format(mean_unc))\n",
    "axs.vlines(med_unc, 0, 12e4, color=\"b\", linestyle=\"dashed\", label=\"Median Uncertainty = {:0.2f}\".format(med_unc))\n",
    "axs.vlines(mean_unc - std_unc, 0, 12e4, color=\"orange\", linestyle=\"dashed\")\n",
    "axs.vlines(mean_unc + std_unc, 0, 12e4, color=\"orange\", linestyle=\"dashed\", label=\"$\\sigma$ = {:0.2f}\".format(std_unc))\n",
    "\n",
    "axs.set_ylim(0, 12e4)\n",
    "axs.set_xlabel(\"Uncertainty [m]\")\n",
    "axs.set_ylabel(\"Frequency\")\n",
    "axs.set_title(\"Uncertainty distribution for CryoTEMPO-EOLIS point product\")\n",
    "plt.legend()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "specklia",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}