Recreating the ICEP from Sarelli et al (2018) with CMEMS

Introduction

Overview

This notebook aims to reproduce the results in Sarelli et al (2018): creating an Index of Coastal Eutrophication (ICEP) using the Copernicus Marine Environment Monitoring Service (CMEMS). The authors developped an automated methodology to estimate ICEP from the environmental variables:

  • nitrate concentrations
  • phosphate concentrations
  • silica concentrations
  • chlorophyll concentrations
  • the euphotic zone depth

Motivations

One of the United Nations (UN) Sustainable Development Goal (SDG) is focuses on the oceans: "SDG 14 - Conserve and sustainably use the oceans, seas and marine resources for sustainable development". The very first of its 10 targets is "Indicator 14.1.1: Index of coastal eutrophication and floating plastic debris density". The ICEP is however not precisely defined and Sarelli et al (2018), among others, have proposed methodologies to estimate ICEP from Earth Observation (EO) systems like Copernicus' Marine Service.

The Index for Coastal Eutrophication Potential (ICEP) is an indicator for the potential of riverine nutrient export to sustain new production of non-diatoms phytoplankton biomass, including Harmful Algal Blooms. It uses nitrogen (N), phosphorus (P) and silica (Si) values of riverine loadings.

Another reference is “3.10 Coastal Eutrophication Potential” in Hofste et al, 2019 (World Resources Institute).

Note that the Copernicus data is only available at a resolution of 1/12$^\circ$: it is too coarse to quantity most conservation efforts (unless they are at the oceanic mesoscale, which is rare). It is however a great approach to obtain a baseline ICEP for a region.

Workflow: 01 Retrieving the CMEMS data

First we retrieved the CMEMS data: this can be manually downloaded via CMEMS My Ocean Viewer using the “Atlantic-Iberian Biscay Irish- Ocean Biogeochemical Analysis and Forecast” product.

This is the product page: NON ASSIMILATIVE Hindcast - IBI_MULTIYEAR_BGC_005_003.

  • [x] To do: retrieve the data using the Copernicus API through the Motu client.

Here, we use the motuclient to retrieve the data. NOTE: you need a registered username!

References

Sarelli, A., Sykas, D., Miltiadou, M., Bliziotis, D., Spastra, Y. and Ieronymaki, M., 2018, August. A novel automated methodology that estimates the United Nations (UN) Sustainable Development Goal (SDG) 14.1. 1.: index of coastal eutrophication using the Copernicus Marine Environment Monitoring Service (CMEMS). In Sixth International Conference on Remote Sensing and Geoinformation of the Environment (RSCy2018) (Vol. 10773, p. 1077302). International Society for Optics and Photonics.doi:10.1117/12.2326160

Hofste, R.W., Kuzma, S., Walker, S., Sutanudjaja, E.H., Bierkens, M.F., Kuijper, M.J., Sanchez, M.F., Van Beek, R., Wada, Y., Rodríguez, S.G. and Reig, P., 2019. Aqueduct 3.0: Updated decision-relevant global water risk indicators. World Resources Institute: Washington, DC, USA. Available online at: https://www.wri.org/publication/aqueduct-30.

In [1]:
import os

Source directory where CMEMS file(s) are saved

In [2]:
source_dir = "data_CMEMS"
if not os.path.exists(source_dir):
    os.mkdir(source_dir)
print("The downloaded files will be saved in the following directory:")
print(source_dir)
The downloaded files will be saved in the following directory:
data_CMEMS
In [3]:
# utilities packages
import datetime
from netCDF4 import Dataset
import pandas as pd
In [4]:
# plotting packages
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
import cmocean
import seaborn as sns
import cartopy.crs as ccrs
import cartopy.feature as cfeature

ICEP calculations definitions

In [5]:
from eutrophication_utils import *

Year as variable to save results to their respective folders

In [6]:
year = 2017

Out directory to save results

In [7]:
out_dir = "Sarelli_et_al2018_Aug" + str(year) + "_example_out"
if not os.path.exists(out_dir):
    os.mkdir(out_dir)
In [8]:
save_plot = False # saves plots to out_dir
show_all = True # show all the plots, at all steps

Workflow

01 Retrieving the CMEMS data

We use the motuclient:

--motu https://my.cmems-du.eu/motu-web/Motu \
--service-id IBI_MULTIYEAR_BGC_005_003-TDS \
--product-id cmems_mod_ibi_bgc_my_0.083deg-3D_P1D-m \

This refers to the specific CMEMS product mentioned above. Note that for dates between 2020-05-02 12:00:00 and 2022-05-14 12:00:00 we would use IBI_ANALYSISFORECAST_BGC_005_004-TDS instead.

To illustrate how to recreate the results from Sarelli et al (2018), we first focus on the "Week 1" subpanel in Figure 6: the first week of August 2017. The authors did not provide exact spatio-temporal bounds, so I guessed the following:

--date-min '2017-08-03 00:00:00' \
--date-max '2017-08-10 23:59:59' \
--longitude-min -7 --longitude-max 3.5 --latitude-min 49.5 --latitude-max 55.0 \

We also download surface data only:

--depth-min 0.5057600140571594 --depth-max 0.5057600140571594

We download all the following variables at once:

--variable no3 \
--variable po4 \
--variable si \
--variable chl \
--variable zeu \

Finally we save the downloaded file to the specified out-dir and out-name.

In [ ]:
!python3 -m motuclient \
--motu https://my.cmems-du.eu/motu-web/Motu \
--service-id IBI_MULTIYEAR_BGC_005_003-TDS \
--product-id cmems_mod_ibi_bgc_my_0.083deg-3D_P1D-m \
--date-min '2017-08-03 00:00:00' \
--date-max '2017-08-10 23:59:59' \
--depth-min 0.5057600140571594 \
--depth-max 0.5057600140571594 \
--variable no3 \
--variable po4 \
--variable si \
--variable chl \
--variable zeu \
--out-dir 'data_CMEMS' \
--out-name 'chl_no3_po4_si_zeu_201708.nc' \
--longitude-min -7 --longitude-max 3.5 --latitude-min 49.5 --latitude-max 55.0 \
--user <YOUR_USERNAME> --pwd <YOUR_PASSWORD>

Import local data from source_dir.

In [15]:
fnames = os.listdir(source_dir)

We want to import all the variables from all the files automatically. We know that latitude, longitude, depth and time will be recurrent independent variables. We want to import the data as "the variables that are NOT these independent variables".

In [16]:
indep_var = ["depth", "latitude", "time", "longitude"]

We loop through the files, then loop through the netCDF4 variables, find the variable of interest and make a dictionary out of it. Note: in a netCDF4 file with Python, we can get all the variables names with fh.variables.keys(). Same for dimensions.

In [17]:
for fname in fnames:
    
    # load the netCDF4 file
    fh = Dataset(os.path.join(source_dir,fname), 'r')
    
    # now we loop through individual variables
    for variable in fh.variables.keys():
        
        # if this is the variable (eg chlorophyll, nitrate)
        if variable not in indep_var:
            # transform the dict_keys intro attributes with a 'variable" dictionary
            exec(variable+' = xr.open_dataset(os.path.join(source_dir,fname))')
            print("The variable " + variable + " was imported.")
The variable no3 was imported.
The variable po4 was imported.
The variable si was imported.
The variable chl was imported.
The variable zeu was imported.
In [18]:
print("The resolution is: ")
print("* " + "{:.4f}".format((chl.latitude[1]-chl.latitude[0]).data) + " degrees of latitude;")
print("* " + "{:.4f}".format(((chl.latitude[1]-chl.latitude[0]).data)*111) + " km;")
print("* " + "{:.4f}".format(((chl.latitude[1]-chl.latitude[0]).data)*111*0.6213712) + " mi.")
The resolution is: 
* 0.0833 degrees of latitude;
* 9.2499 km;
* 5.7476 mi.
In [19]:
# This is helpful in e.g. case po4.latitude.shape != chl.latitude.shape
# dims = fh.dimensions
# # now we get individual dimensions
# for dim in dims.keys():
#     temp_dim = fh.dimensions[dim].name
#     temp_num = fh.dimensions[dim].size
#     exec('N'+temp_dim + '= ' + str(temp_num))
#     print(dim)

02 Water Area Classification Workflow

"The first step of the methodology contains the normalization of all the products to the parameters used in the algorithm. Concentration of chlorophyll-a and nutrients (silica, phosphate and nitrates) are provided directly from the CMEMS product. For the calculation of Secchi Depth (SD) as a measure of water transparency, a formula that combines euphotic zone depth with the SD of a water region is used according to the equations"

For the euphotic zone depth ($Z_{eu}$) and the Secchi Depth ($Z_{SD}$) in meters as a measure of water transparency:

In [20]:
linear_fn = True

2.1 Get index for nitrate, phosphorus, silicate concentrations

The concentrations of nitrogen N ($C_N$), of silica Si ($C_{Si}$) and of phosphorus ($C_P$) are in mmol/m$^3$.

In [21]:
print(po4.po4.long_name + ": " + po4.po4.units)
print(no3.no3.long_name + ": " + no3.no3.units)
print(si.si.long_name + ": " + si.si.units)
Mole Concentration of Phosphate in Sea Water: mmol.m-3
Mole Concentration of Nitrate in Sea Water: mmol.m-3
Mole Concentration of Silicate in Sea Water: mmol.m-3

We now verify that the coordinates are the same. We can inspect visually with plt.plot or just use boolean.

In [22]:
if (po4.longitude == no3.longitude).all() & (po4.longitude == si.longitude).all(): 
    print("The longitude coordinates are compatible!")
else:
    print("Issues with longitude compatibility. Check out https://docs.xarray.dev/en/stable/generated/xarray.combine_by_coords.html to auto-magically combine the datasets!")

if (po4.latitude == no3.latitude).all() & (po4.latitude == si.latitude).all():
    print("The latitude coordinates are compatible!")
else:
    print("Issues with latitude compatibility. Check out https://docs.xarray.dev/en/stable/generated/xarray.combine_by_coords.html to auto-magically combine the datasets!")
    
if (po4.depth == no3.depth).all() & (po4.depth == si.depth).all():
    print("The depth coordinates are compatible!")
else:
    print("Issues with depth coordinate compatibility. Check out https://docs.xarray.dev/en/stable/generated/xarray.combine_by_coords.html to auto-magically combine the datasets!")
The longitude coordinates are compatible!
The latitude coordinates are compatible!
The depth coordinates are compatible!
In [23]:
ind_N = get_indN(no3.no3.isel(depth = 0, time=0), 
                 si.si.isel(depth = 0, time=0),
                po4.po4.isel(depth = 0, time=0))

2.2 Get index for chlorophyll

Mass concentration of chlorophyll ($chl$) in mg/m$^3$.

In [24]:
print(chl.chl.long_name + ": " + chl.chl.units)
Mass Concentration of Chlorophyll in Sea Water: mg.m-3
In [25]:
ind_chl = get_ind_chl(chl.chl.isel(depth = 0, time=0))

2.3 Get index for Secchi Depth

First we get the Secchi Depth from the Euphotic Zone Depth.

In [26]:
Z_SD = get_Z_SD(zeu);

It is likely that this is the main source of discrepancy between our results and the plots from Sarelli et al (2018).

In [27]:
ind_Z_SD = get_ind_ZSD(Z_SD.zeu.isel(time=0))

03 Get eutrophication index

In [28]:
index_eutroph = get_index(ind_N, ind_chl, ind_Z_SD)

Save output to out_dir.

In [29]:
index_eutroph.to_netcdf(out_dir + "/" + "index_eutroph_" + str(year) + ".nc")

04 Plots

Create longitude, latitude, datetime arrays etc. for ease of plotting.

In [30]:
Lon, Lat = np.meshgrid(zeu.longitude, zeu.latitude)
In [31]:
ts = pd.to_datetime(str(zeu.time[0].time.data)) 
In [32]:
d = ts.strftime('%Y-%m-%d')

Plots of input variables

Discover the beauty of xarray.

In [33]:
if show_all:
    po4.po4.isel(depth = 0).plot(cmap = cmocean.cm.amp, col="time", col_wrap=4);
In [34]:
if show_all:
    no3.no3.isel(depth = 0).plot(cmap = cmocean.cm.matter, col="time", col_wrap=4, vmax = 50);
In [35]:
if show_all:
    si.si.isel(depth = 0).plot(cmap = cmocean.cm.turbid, col="time", col_wrap=4);
In [36]:
if show_all:
    chl.chl.isel(depth = 0).plot(cmap = cmocean.cm.algae, col="time", col_wrap=4);
In [37]:
if show_all:
    zeu.zeu.plot(cmap = sns.cubehelix_palette(start=.5, rot=-.75, as_cmap=True), col="time", col_wrap=4);
In [38]:
if show_all:
    Z_SD.zeu.plot(cmap = sns.cubehelix_palette(start=.5, rot=-.75, as_cmap=True), col="time", col_wrap=4);

Plots of the different indices

In [39]:
plt.pcolor(ind_N, cmap = sns.cubehelix_palette(as_cmap=True))
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.tight_layout()
plt.title("2.1 Index for nutrients", fontsize = 20);
In [40]:
plt.pcolor(ind_chl, cmap = sns.cubehelix_palette(start=2, rot=0, dark=0, as_cmap=True))
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.tight_layout()
plt.title("2.2 Index for chlorophyll", fontsize = 20);
In [41]:
plt.pcolor(ind_chl, cmap = sns.cubehelix_palette(start=.5, rot=-.5, as_cmap=True))
plt.gca().set_aspect('equal', adjustable='box')
plt.colorbar()
plt.tight_layout()
plt.title("2.3 Index for euphotic depth", fontsize = 20);

ICEP plots

In [42]:
pcm = plt.pcolor(index_eutroph, cmap = 'Spectral_r');
plt.colorbar(pcm);
plt.title("Out-of-the-box results for ICEP");
In [43]:
fig, axs = plt.subplots(2, 3, figsize=(20,10))

# Ind N
pcm = axs[0,0].pcolor(ind_N, cmap = sns.cubehelix_palette(as_cmap=True))
axs[0,0].set_title("Index N")
axs[0,0].set_aspect('equal', adjustable='box')
fig.colorbar(pcm, ax=axs[0,0], orientation="horizontal")

# Ind chlorophyll
pcm = axs[0,1].pcolor(ind_chl, cmap = sns.cubehelix_palette(start=.5, rot=-.75, as_cmap=True))
axs[0,1].set_title("Index Chlorophyll")
axs[0,1].set_aspect('equal', adjustable='box')
fig.colorbar(pcm, ax=axs[0,1], orientation="horizontal")

# Ind depth
pcm = axs[0,2].pcolor(ind_Z_SD, cmap = sns.cubehelix_palette(start=.5, rot=-.5, as_cmap=True))
axs[0,2].set_title("Index Secchi depth")
axs[0,2].set_aspect('equal', adjustable='box')
fig.colorbar(pcm, ax=axs[0,2], orientation="horizontal")

# make single eutrophication index plot
axs[1,0].remove()
axs[1,2].remove()
pcm = axs[1,1].pcolor(index_eutroph, cmap = 'Spectral_r')
axs[1,1].set_title("Eutrophication level")
axs[1,1].set_aspect('equal', adjustable='box')
fig.colorbar(pcm, ax=axs[1,1], orientation="horizontal")

plt.suptitle("Water quality indices and ICEP", fontsize = 20);

fig.tight_layout()

if save_plot:
    plt.savefig(out_dir + "/" + "Water quality indices and ICEP")
In [44]:
# This plots the matrix with ALL indices, including ICEP==zero (e.g. offshore)
# following the colormap from Sarelli et al (2018).
# This does not highlight actionable data for eutrophication as ICEP is coastal
# and offshore values will tend to zero. It however helps check the subsequent
# calculations when we aim to create a time series.
if show_all:
    plt.figure(figsize=(10,7))

    # Creates the map
    mymap = plt.axes(projection=ccrs.PlateCarree())

    mymap.add_feature(cfeature.LAND, alpha=0.5)
    mymap.add_feature(cfeature.OCEAN, alpha=0.5)
    mymap.add_feature(cfeature.BORDERS, linestyle='-', alpha=0.5)
    mymap.add_feature(cfeature.RIVERS)

    mymap.xaxis.set_visible(True)
    mymap.yaxis.set_visible(True)

    # Plots the data onto map
    plt.scatter(Lon[index_eutroph<=2], Lat[index_eutroph<=2],
                marker = 's', s= 10, 
                transform=ccrs.PlateCarree(),
                c = 'limegreen')
    plt.scatter(Lon[index_eutroph==3], Lat[index_eutroph==3],
                marker = 's', s= 10, 
                transform=ccrs.PlateCarree(),
                c = 'yellow')
    plt.scatter(Lon[(index_eutroph>3)&(index_eutroph<=5)], Lat[(index_eutroph>3)&(index_eutroph<=5)], 
                marker = 's', s= 10, 
                transform=ccrs.PlateCarree(),
                c = 'orange')
    plt.scatter(Lon[index_eutroph>5], Lat[index_eutroph>5],
                marker = 's', s= 10, 
                transform=ccrs.PlateCarree(),
                c = 'red')

    # Plot labels
    plt.ylabel("Latitude", fontsize=14)
    plt.xlabel("Longitude", fontsize=14)
    plt.legend()
No artists with labels found to put in legend.  Note that artists whose label start with an underscore are ignored when legend() is called with no argument.
In [45]:
# Another custom plot for fun.

plt.figure(figsize=(20,14))

# Creates the map
mymap = plt.axes(projection=ccrs.PlateCarree())

mymap.add_feature(cfeature.LAND, alpha=0.5)
mymap.add_feature(cfeature.OCEAN, alpha=0.5)
mymap.add_feature(cfeature.BORDERS, linestyle='-', alpha=0.5)
mymap.add_feature(cfeature.RIVERS)

mymap.xaxis.set_visible(True)
mymap.yaxis.set_visible(True)

# Plots the data onto map
pcm = plt.scatter(Lon[index_eutroph>0], Lat[index_eutroph>0], c = index_eutroph.data[index_eutroph>0], 
                 cmap = 'RdYlGn_r', transform=ccrs.PlateCarree())
plt.title("Eutrophication level on " + d, fontsize = 20);

plt.legend(*pcm.legend_elements(), title = "Index");

# Plot labels
plt.ylabel("Latitude", fontsize=14);
plt.xlabel("Longitude", fontsize=14);

if save_plot:
    plt.savefig(out_dir + "/" + "ICEP_map_custom-legend")
In [46]:
# This plots the matrix with selected indices, as we try to recreate
# the colormap in Figure 6 from Sarelli et al (2018).

plt.figure(figsize=(20,14))

# Creates the map
mymap = plt.axes(projection=ccrs.PlateCarree())

mymap.add_feature(cfeature.LAND, alpha=0.5)
mymap.add_feature(cfeature.OCEAN, alpha=0.5)
mymap.add_feature(cfeature.BORDERS, linestyle='-', alpha=0.5)
mymap.add_feature(cfeature.RIVERS)

mymap.xaxis.set_visible(True)
mymap.yaxis.set_visible(True)

# Plots the data onto map
plt.scatter(Lon[(index_eutroph>0)&(index_eutroph<=2)], Lat[(index_eutroph>0)&(index_eutroph<=2)],
            marker = 's', s= 10, 
            transform=ccrs.PlateCarree(),
            c = 'limegreen')
plt.scatter(Lon[index_eutroph==3], Lat[index_eutroph==3],
            marker = 's', s= 10, 
            transform=ccrs.PlateCarree(),
            c = 'yellow')
plt.scatter(Lon[(index_eutroph>3)&(index_eutroph<=5)], Lat[(index_eutroph>3)&(index_eutroph<=5)], 
            marker = 's', s= 10, 
            transform=ccrs.PlateCarree(),
            c = 'orange')
plt.scatter(Lon[index_eutroph>5], Lat[index_eutroph>5],
            marker = 's', s= 10, 
            transform=ccrs.PlateCarree(),
            c = 'red')

#Custom legend
custom_lines = [Line2D([0], [0], color='limegreen', lw=4),
                Line2D([0], [0], color='yellow', lw=4),
                Line2D([0], [0], color='orange', lw=4),
                Line2D([0], [0], color='red', lw=4)]


plt.legend(custom_lines, ['Index <= 2: non-problem areas', 
                          'Index = 3: tendency for eutrophication events',
                          '4 <= Index <=5: possibility of eutrophication events',
                          'Index = 6: problem areas'],
          title = "Legend")

# Plot labels
plt.ylabel("Latitude", fontsize=14);
plt.xlabel("Longitude", fontsize=14);

if save_plot:
    plt.savefig(out_dir + "/" + "ICEP_map_SarelliEtAl2018-legend")

I think we are very close!

In [47]:
import session_info
session_info.show()
Out[47]:
Click to view session information
-----
cartopy                     0.20.2
cmocean                     2.0
eutrophication_utils        NA
matplotlib                  3.5.1
netCDF4                     1.5.8
numpy                       1.22.3
pandas                      1.4.2
seaborn                     0.11.2
session_info                1.0.0
xarray                      0.16.1
-----
Click to view modules imported as dependencies
PIL                 9.1.0
appnope             0.1.3
asttokens           NA
backcall            0.2.0
beta_ufunc          NA
binom_ufunc         NA
certifi             2020.06.20
cffi                1.15.0
cftime              1.6.0
click               8.1.2
cloudpickle         2.0.0
cycler              0.10.0
cython_runtime      NA
cytoolz             0.11.2
dask                2022.02.1
dateutil            2.8.1
debugpy             1.6.0
decorator           4.4.2
defusedxml          0.6.0
distributed         2022.2.1
executing           0.8.3
fsspec              2022.02.0
hypergeom_ufunc     NA
ipykernel           6.9.1
ipython_genutils    0.2.0
ipywidgets          7.5.1
jedi                0.18.1
jinja2              2.11.2
kiwisolver          1.4.2
markupsafe          2.0.1
matplotlib_inline   NA
mpl_toolkits        NA
msgpack             1.0.3
nbinom_ufunc        NA
packaging           21.3
parso               0.8.0
pexpect             4.8.0
pickleshare         0.7.5
pkg_resources       NA
prompt_toolkit      3.0.8
psutil              5.9.0
ptyprocess          0.7.0
pure_eval           0.2.2
pydev_ipython       NA
pydevconsole        NA
pydevd              2.8.0
pydevd_file_utils   NA
pydevd_plugins      NA
pydevd_tracing      NA
pygments            2.7.1
pyparsing           3.0.8
pyproj              3.3.0
pytz                2022.1
scipy               1.8.0
setuptools          62.1.0
shapefile           2.2.0
shapely             1.8.0
six                 1.15.0
sortedcontainers    2.4.0
stack_data          0.2.0
statsmodels         0.13.2
tblib               1.7.0
tlz                 0.11.2
toolz               0.11.2
tornado             6.1
traitlets           5.1.1
typing_extensions   NA
wcwidth             0.2.5
yaml                6.0
zmq                 22.3.0
-----
IPython             8.2.0
jupyter_client      6.1.7
jupyter_core        4.9.2
notebook            6.4.11
-----
Python 3.10.4 | packaged by conda-forge | (main, Mar 24 2022, 17:39:37) [Clang 12.0.1 ]
macOS-12.1-arm64-arm-64bit
-----
Session information updated at 2022-05-10 10:57
In [ ]: