5  Geospatial Data

This Section follows the steps shown in Figure 1 for geospatial data. It explains hands-on how to arrive at and process the required data for later inclusion in the hydrological-hydraulic model RSMinerve.

Why bother first deriving the catchment shape of a basin under consideration, its subbasins, rivers, and junctions? While it is true that we do not need to take this detour in RSMinerve and can start to assemble the model components piece by piece by hand, such an approach quickly gets very tedious and labor-intensive. Imagine, for a moment, a large watershed such as the Gunt River basin, which covers a total area of more than 13’000 square kilometers (km2) over an elevation range of 4’000 meters (m), approximately. If we decide to manually enter information on 20 elevation bands that act as hydrological response units for each of the four subbasins, the task quickly becomes unmanageable.

This is where the automatic derivation of the model structure from GIS shapefiles comes in handy and can significantly make our lives easier. This is why we are focusing on preparing these layers, as shown in the Figure 5.1 in this Section. The requirements for GIS layers to be imported in RSMinerve are summarized, showing the concept for a small, stylized catchment with four subbasins that each contribute to flow generation without further subdivisions, e.g., into elevation bands.

Figure 5.1: Required GIS input files for RS Minerve. See the corresponding user manual for more information.

RSMinerve connects subbasins to junctions via the corresponding naming of the shapefile fields. River stretches run from the upstream junction to the downstream junction. Consistent naming of rivers and junctions must be ensured to import the shapefiles correctly in RSMinerve.

This Chapter will demonstrate how to derive such GIS layers for a demonstration catchment. First, the data prerequisites are quickly introduced. Second, it illustrates how the reader can delineate the outlines of a catchment that contributes to discharge at the location of a given discharge station. Third, a QGIS Graphical Modeler model that we have prepared for the students takes the DEM, the delineated basin shapefile, and the gauge point shapefile to compute subbasins, junctions, and rivers.

Note

Go through the steps in this Chapter to generate the GIS layers of your sample catchment for later import into RSMinerve and processing.

The reader should know that generating these GIS files is a time-consuming and iterative process and requires manual post-processing after their automatic generation. The result will significantly simplify the hydrological model generation in the next step. Hence, it is essential to familiarize yourself with all the necessary preparatory steps properly.

5.1 Geospatial Data Prerequisites

Here, a local installation of QGIS and a basic understanding of Geographic Information Systems are required. Please see ?sec-study-guide-materials for more information on how to install QGIS.

Tip

Appendix C in the Appendix walks you through in a detailed manner of many of the required steps in the Chapter. Therefore, please also consult the resources there.

There are very many online resources that can be consulted for learning QGIS. You can Google them or start with a video tutorial like the following one.

To process the data for your case study pack, make sure that you downloaded the corresponding files via this link. Depending on the catchment you have chosen, the files are either in the folder ./AMUDARYA, ./CHIRCHIK, ./CHU or ./SYRDAYA.

5.2 Catchment Delineation

Assuming you have downloaded the catchment data folder of your choice, the corresponding DEM file should be loaded in a new, empty QGIS project. Before doing this, make sure that the project’s projection is in UTM (How to change the projection of a project). Check the projection of the DEM layer and reproject it if necessary (how to).

If you do not have a DEM available to start with, the process of downloading one is straightforward. Probably the easiest way is to install the QGIS SRTM3 plugin (how to). An alternative way is to download it via the USGS Earth Explorer (how to). Both solutions require you to register an Earth Explorer Account (how to register). Finally, after downloading the DEM tiles for the region of interest, the tiles need to be merged (how to merge DEM tiles).

Typically, geospatial data from open sources is stored in the standard projection WGS84 (EPSG:4326). The WGS84 is in degrees, minutes, and seconds, but having spatial layers in the UTM projection for hydrological analysis is more convenient for hydrological analysis. Look up in which UTM Zone your catchment lies and re-project to that zone. In the example of the Chirchiq basin, the preferred UTM zone is 42N, i.e., EPSG:32642.

As a next step, we will load the shapefile of the discharge gauge station location. This file contains the point location where discharge is measured and from which we want to delineate the upstream contributing area. The data is stored in the ./GaugeData folder and is called XXXXX_Gauge.shp, where the XXXXX are placeholders for the five-digit code uniquely identifying your station. Once the shapefile is loaded, check in Properties/Information that it is correctly projected. If this is not the case, reproject to the relevant UTM zone as described above.

Now, we are ready to start with the catchment delineation.

Instead of just loading the existing catchment shapefile from the corresponding ./Basin folder (how to add a vector layer to a QGIS project), it is better to actually go through the steps step-by-step to learn how to derive them. The steps are also described in this online tutorial:

To derive the catchment area upstream of the discharge gauge that we have loaded in the previous step, the DEM is traced upstream until elevation values no longer increase, i.e., until the watershed’s boundary is reached. The SRTM DEM contains the rounded average elevation in each cell of about 25 - 30 meters [m] resolution. Due to the averaging and rounding it may happen, that in a SRTM DEM, an upstream river cell contains lower elevations than the downstream river cell. If the catchment delineation algorithm reaches such a situation, it stops, thinking it has reached the watershed’s boundary. To avoid this, cells that form local depressions should be filled to create a river bed that is continuously increasing in elevation in the upstream direction until it reaches the watershed boundary.

In QGIS, several methods can fill terrain depressions. We are using the r.fill.dir algorithm to perform this task. Figure 5.2 shows the process in detail. First, we make sure that we have loaded the DEM in the correct projection. Second, we open the Processing Toolbox and type r.fill.dir in the search bar, and then double-click on the corresponding algorithm. Third, in the open window, we ensure that the DEM is selected under the Elevation header. On top of that, the Flow Direction and Problem Areas output files are not needed, and their checkboxes can be left unchecked.

Figure 5.2: Filling sinks in a DEM with the r.fill.dir algorithm.

If the algorithm has run, a new entry for the Depressionless DEM will appear in the layers panel with the correct DEM. This raster can now be used to delineate the basin area. The following steps need to be carried out for this;

  1. Use the r.watershed algorithm to obtain the flow accumulation and drainage direction rasters.
  2. Ensure that the gauge is correctly located on the DEM
  3. Use the r.water.outlet algorithm to delineate the upstream area
  4. Convert the resulting Basin raster into a polygon via the Conversion/Polygonize method.

We show the process in detail here.

Figure 5.3: Running the r.watershed algorithm. Select the sink-filled DEM, set the minimum size of the exterior watershed to 200’000 and check the box Use positive flow accumulation even for likely underestimation.

Figure 5.3 shows step-by-step how to run the r.watershed algorithm. First, make sure that the Depressionless DEM raster is selected in the Elevation field. Fill in 200’000 as the minimum size of the exterior watershed (this is a guessed number that normally yields good results). Also, check the box to use positive flow accumulation even for likely underestimates. Finally, just select the top two output files, i.e., Number of cell that drain through each cell and Drainage direction. The remaining resulting rasters are not required during the next steps.

Once you click run, the resulting raster layers will be computed. The flow accumulation raster (called Number of cells that drain through each cell) contains a large range of values. The largest value is obviously in the one cell that drains the largest area of the DEM.

Tip

Use the raster calculator to compute the logarithm of the flow accumulation raster. This helps to visualize the raster better. This raster can ensure that the gauge from which the upstream area will be delineated is actually correctly located on the streams. If misplacement is evident, we can edit the gauge shapefile and relocate the gauge to the correct river stretch.

Once it is ensured that the gauge is correctly located on the stream, we are ready to delineate the upstream area by using the r.water.outlet algorithm. For this, zoom in to display a close-up of the gauge and select and open the r.water.outlet window. The algorithm only needs two inputs, i.e., the Drainage direction raster and the coordinates of the outlet point. Instead of manually entering them, you can just click on the map to select the precise point of the gauge location. This action will transfer the coordinates to the algorithm’s interface. After pressing run, the Basin raster will be computed. Figure 5.4 shows the process.

Figure 5.4: Basin delineation using the r.water.outlet algorithm. The resulting Basin raster can easily be polygonized.

As a final step, you should polygonize the Basin raster. We are now ready to run the process model, which generates the required shapefiles for further processing in RSMinerve. The resulting vectorized basin can be stored in the corresponding folder on the local computer.

The process can be repeated for any other basin in a similar manner. Note also that if you have several gauges to map the upstream areas from, the Flow accumulation and the Drainage direction rasters can be computed once, stored, and reused again for all gauges.

We now have all the necessary files to process them in the Graphical Modeler model that is provided with the learning pack.

5.3 Preparation of RSMinerve Input files

You can download the Graphical Modeler model from the Students_CaseStudyPacks online repository. The model file is called rsminerve_gis_files_preparation_2022_win_os.model3 and should be downloaded in your local working directory. The model can then be loaded via the Processing/Graphical Modeler menu in QGIS. Once you have loaded the model, you should see the model pop up on your screen, as is shown in a similar fashion in Figure 5.5.

Note that the present version of the model file has been developed under QGIS 3.24 and may not run error-free in a different QGIS release.

Warning

Warning (14.02.2023): The model does currently not run on QGIS 3.34. Efforts are underway to address and resolve the issue.

Figure 5.5: The Graphical Modeler model rsminerve_gis_files_preparation_2022_win_os.model3 is shown in a graphical manner in the right window (1). When you click Run, the parameters specification window on the left will pop up (2). See text for further explanations

Figure 5.5 shows nothing more than a graphical representation of a script. This script is like a recipe to execute algorithms in QGIS sequentially where an output of one algorithm feeds as input into the following algorithm. The yellow highlighted elements in window (1) are input parameters that are specified in the window (2) before the execution of the script. The green elements in window (1) are the results that are stored during and after the execution of the algorithm and available for further processing then.

As explained above, the rsminerve_gis_files_preparation_2022_win_os.model3 model, in a nutshell, prepares input files for RSMinerve using the DEM, the basin shapefile, and the gauge’s location as input. The key output elements are called SUBBASINS, RIVERS, and JUNCTIONS.

Before the execution of the model (script), the parameters have to be set in a careful manner. For some of the parameter values, a first best guess might produce outcomes that are not at the required level of detail or, alternatively, too detailed. Depending on the basin under consideration, an iterative approach typically must be followed to arrive at the desired output, as explained in the following.

In Figure 5.6, we show a meaningful parameter selection for the Chatkal River basin in the Case Studies pack (Gauge 16279). The basin and gauge shapefiles, as well as the DEM, are the geospatial assets that need to be specified, as shown in (1), (5), and (4). The following parameters are important to set correctly.

Figure 5.6: Parameter selection menu of the Graphical Modeler rsminerve_gis_files_preparation_2022_win_os.model3 model. The detailed description of the input elements as well as parameter values chosen is explained in the main text.
  • BasinShapeBuffer_meters (2): This is a buffer value to be specified in meters. The default value of 500 m is a value that can be used for a wide range of different catchment sizes and does not need to be changed.
  • Channel Network Cutoff Value (3): This is a crucial parameter for determining the subbasin granularity. If many the main basin is to be subdivided into a larger number of subbasins, this value needs to be chosen to be around 107 - 108. For Chatkal, as our example shows, a perfect value for the subdivision into the main four subbasins is 109.
  • Elevation Bands Table (4): Figure 5.7 shows the table that can be customized according to user needs and wants. The table specifies the granuarity with which the basin domain will be subdivided into elevation bands (hydrological response units, i.e. HRUs). As a rule of thumb, in climate change impact studies and for snow melt-dominated basins, a spacing of 100 m - 200 m is an optimal choice for HBV models in RSMinerve. Note, however, that the finer the domain is, the larger the number of HRUs to be modeled. This proportionally increases the computational demand of the hydrological model.
  • River Network Level (8): Here, you set a number between 1 and 8. The lower the number, the higher the number of tributaries and sub-tributaries that will be delineated for the RIVERS output shapefile. If you choose 8, this means that normally, only the river’s main stem will be delineated. This is relevant for the case where no further subdivisions into small subbasins are desired for modeling in RSMinerve. 7 is a robust choice if only the main tributaries should be delineated.
Figure 5.7: The elevation bands table is shown. It shows the user-customizable elevation band intervals. The example shows the specification of 9 elevation bands (HRUs) of 500 m spacing between each. For each HRU and each subbasin, a corrsponding hydrological model will be configured in RSMinerve
Tip

You will likely have to iterate and run the model a couple of times to achieve the desired results. Try it yourself! Note that if you get an error message that POLYGONS.shp was not found, try increasing the Channel Network Cutoff Value by 50 % - 70 %.

Sample output results for our demonstration catchment are shown in Figure 5.8. The resulting layers for this demonstration are available in the data pack. We are, however, not quite done yet.

Figure 5.8: Resulting output of the Graphical Modeler model for Chatkal River basin. The next step is to clean up manually the SUBBASINS, RIVERS and JUNCTIONS shapefiles.

5.4 Post-Processing Results Shapefiles

As mentioned in the previous Section, we need to post-process the SUBBASINS, RIVERS, and JUNCTIONS shapefiles so that they correctly contain the required fields, as shown in Figure 5.1. A good way to post-process the files is to start with the RIVERS layer, continue with the JUNCTIONS layer, and end with the SUBBASINS layer.

5.4.1 JUNCTIONS and RIVER Layers

In editing the junctions layer, the basic idea is to edit the junctions layer in a way to remove junctions that are not needed, given the desired subdivision of the larger basin into subbasins. For the demo catchment shown in Figure 5.9, the situation is explained in greater detail.

Figure 5.9: The Figure shows the output files of the Graphical Modeler model. For this demonstration catchment, the only junction that we want to keep (apart from the outflow junction located at the gauge) is the one highlighted with the red arrow which marks the confluence of the two major upstream tributaries.
Tip

If you struggle with shapefile editing, please consult the detailed step-by-step guide as shown in the corresponding Quick Guide in the Appendix.

Once we have removed the junctions that are not needed, we can very easily also remove the river lines that are not required. Generally, we do not need river sections above the most upstream junctions for any given subbasin within the river basin or the river basin itself.

If we follow these guides, we end up with the results shown in Figure 5.10. After editing the attribute tables of these shapefiles, the files are ready for import in RSMinerve.

Figure 5.10: Final edited shapefiles with the unnecessary junctions and river stretches removed. The two red arrows point to the two remaining junctions, and the blue arrow shows the river stretch that is left.

What is left now is to ensure that the attribute tables contain the correct fields and that these fields contain the correct labels. Figure 5.1 shows the required fields. These are

  • JUNCTIONS: Junctions Name (junct_name), Junctions ID (junct_id), Rivers ID (riv_id)
  • RIVERS: Rivers Name (riv_name), Rivers ID (riv_id), Junctions ID (junct_id)

The abbreviated field names are used in QGIS since field names are limited in shapefiles. The fields and values of the final junctions shapefile are shown in Figure 5.11. Compare this with the edited final attribute table of the rivers shapefile, as shown in Figure 5.12.

Figure 5.11: The final attribute table of the demo junctions shapefile is shown. (1) denotes the fields for the donwstream outlet junction and (2) shows the values for the upstream junction. The naming convention has to be understood in the context of the naming convention using in the RIVERS shapefile
Figure 5.12: The final rivers shapefile attribute table is shown with the only river highlighted (1). Very importantly, the junt_id field referes to the downstream junction where the river drains into.

5.4.2 SUBBASINS Layer

Now, we are working on the subbasins. The major subbasins are highlighted in the following Figure 5.13. We call them the Chatkal River downstream, the Chatkal River upstream, and the Sandalash River. These big subbasins are the hydrological response units (HRUs) for hydrological modeling.

Figure 5.13: The major subbasins of the Chatkal River basin above Gauge 16279 are shown. (1) is Chatkal downstream, (2) is Sandalash River, and (3) is Chatkal upstream.

However, do not forget that we have classified the entire basin into zones of elevation bands. These are shown in the Figure 5.13 as polygons with black outlines from the shapefile layer BasinElevationBands_poly_fixed.shp. Exactly 7 elevation bands (or, in other words, HRUs) were delineated with the Graphical Modeler like this. We must somehow intersect the subbasin HRUs with the elevation bands to generate elevation band HRUs for each subbasin. We will show the steps required to achieve this in the following.

First, we prepare the attribute table of the subbasins shapefile. The way this is done correctly for the demonstration catchment is shown in Figure 5.14. The attribute table contains two fields, i.e., basin_name and junct_id. We name the downstream and upstream sections of the Chatkal River with chatkal_ds and chatkal_us correspondingly. The cells of the field junct_id are populated with the name of the corresponding downstream junction where the subbasin drains. Since both, the subbasin chatkal_us and sandalash drain into the junction upstream_junct, the latter appears in both cells of the junct_id field for these rivers.

Figure 5.14: Subbasins attribute table after adding the required fields and entires as per the requirements of RSMinerve. (1) is for the downstream part of Chatkal River, (2) is for the upstream part of Chatkal River, and (3) is for Sandalash River.

Second, we clean up and prepare the attribute table for intersecting with the subbasins layer in a third step. The attribute table of the layer BasinElevationbands_Poly_fixed contains three fields, i.e., ID, VALUE, and NAME, from which you can safely delete the ID and VALUE fields. What remains is the NAME field that contains a unique identifier for the elevation bands in ascending order, starting from the lowest. Figure 5.15 shows the attribute table.

Figure 5.15: The attribute table of the elevation bands shapefile is shown after the fields ID and VALUE have been removed (1). When an elevation band is selected in the edit mode, the corresponding geometry is highlighted on the map as is shown by the arrows (2).

Third, we must intersect the elevation bands shapefile with the subbasins shapefile. This can easily be accomplished using the Vector/Geoprocessing/Intersection tool in QGIS. When you select the Intersection Algorithm, make sure that you set the elevation bands shapefile as the Input layer and set subbasins shapefile as the Overlay Layer. When executed, you receive an attribute table that looks like the one shown in Figure 5.16.

Figure 5.16: The attribute table resulting from the Intersection Algorithm. You now have to manually add the numbers of the Name field as a suffix to the existing names of the basin_name field.

As a final step, you must manually add the numbers stored in the NAME field to the basin names in the field basin_name. Alternatively, you can open the field calculator and overwrite the basin_name field using the formula: basin_name + ‘_’ + NAME. After this is done, you can safely delete the NAME field and have proudly completed the required steps to set up the GIS shapefiles that can be used as input to the RSMinerve hydrological-hydraulic modeling tool. Figure 5.17 shows the result.

Figure 5.17: The final attribute table of the subbasins file is shown. The elevation band identifiers have been added as suffixes with “_x” to the subbasin names where x denotes the number of the subbasin-specific elevation band.

Using the tool Zonal Statistics from the Processing Toolbox, you can extract the mean elevation for each hydrological response unit. Save the elevation in a field named Z. The resulting GIS layers produced with the above-described steps are available in the data pack

Intersecting with glacier outlines

If glaciers are present in the basin, discharge from glacier melt can be simulated using the GSM model in RS MINERVE. The SUBBASINS layer can be intersected with the glacier outlines from the Randolph Glacier Inventory (RGI) v6.0 data set see section on glacier outlines. The following step-by-step instructions show how to further process the SUBBASINS layer to for later automated model generation.

In the data pack of the book, you will find a GIS layer named 16279_RGI_Glaciers with the glacier outlines from the Randolph Glacier Inventory (RGI), clipped to the case study basin. The RGI glacier outlines are not always consistent with the basin boundaries derived using the process modeler in QGIS as described above because the two data sets are based on different data sources and spatial resolution. We, therefore, start out by estimating the glacier area and volume for each of the RGI glaciers in the basin. Open the field calculator in the attribute table of the RGI glaciers layer and estimate the glacier area in \(m^2\) for each glacier using the formula $area (see Figure 5.18).

Figure 5.18: Calculate the glacier area using the field calculator. Use descriptive output field names and add units where possible (in this example, we choose the name Area_m2). Make sure to select decimal numbers for the output field type

You can then use an empirical function to estimate glacier volume in the field calculator; for example, following Erasov (1968): \(V = 0.027 \cdot A^{1.5}\) with Volume \(V\) in \(km^3\) and Area \(A\) in \(km^2\)) as shown in ?fig-glacier-volume-field-calculator.

Estimate the glacier volume using an area-volume scaling relationship. Here an example using the empirical scaling relationsihp by Erasov. Note that the area field is in $m^2$ and needs to be converted to $km^2$. We now have an estimate of the water volume stored in each glacier. The goal is to estimate the glacier volume in each of the subbasins and to edit the elevation bands layer so that automated model creation in RS MINERVE can create an appropriate GSM object for each glaciated sub-basin. In the first step, we clip the glacier areas from the elevation bands layer BasinElevationbands_Poly_fixed using the tool Vector/Geoprocessing/Difference. That layer now only contains HRUs for HBV model objects. You can save this layer to your GIS folder, as it will later be merged with the layers for the GSM objects and be your final layer for import to RS MINERVE.

We now prepare the layer for the GSM objects. Intersect the glaciers layer with the elevation bands layer BasinElevationbands_Poly_fixed using the tool Vector/Geoprocessing/Intersection. From the glaciers layer you only need the fields RGIId, Area_m2, and Volume_m3. This will give you a layer with the glacier outlines and the elevation bands on the glacier areas. For each of the glaciers, you now see in which elevation band it lies.

We now aggregate the glacier area and volume in each elevation band using the field calculator and the formulas sum(Area_m2, group_by := basin_name) and sum(Volume_m3, group_by := basin_name). The layer is then dissolved using the tool Vector/Geoprocessing/Dissolve with the dissolve field basin_name. This merges all features with the same basin_name into one geometric feature of the layer. The fields RGIId, Area_m2, and Volume_m3 are now obsolete and can be removed. Thus, we now have the total glacier area and volume for each glaciated HRU in the study basin. For simulating runoff formation with the GSM model, we still need an initial glacier thickness, which we obtain using the field calculator and dividing the total glacier volume in the HRU by the total glacier area in the HRU. This gives us a total glacier thickness in the basin, which is available for melting.

Now, we can clean up the layer and merge it with the HBV elevation bands layer we prepared and saved earlier. You may want to edit the names of the features for the GSM objects with the prefix glacier or gl so that you recognize which hydrological model to assign to which HRU in RS MINERVE (see Section 8.6.2). You then merge the two vector layers and are ready to import them into RS MINERVE for automated model generation. We save the layer under the name 16297_Chatkal_RSM_HRU.

5.5 References

Erasov, N. V. 1968. “Method for Determining of Volume of Mountain Glaciers.” MGI, no. 14: 307–8.