diff --git a/01-spatial-data.qmd b/01-spatial-data.qmd
index a6f29490..26110581 100644
--- a/01-spatial-data.qmd
+++ b/01-spatial-data.qmd
@@ -268,7 +268,7 @@ gdf.geometry.crs

 Many geometry operations, such as calculating the centroid, buffer, or bounding box of each feature, involve just the geometry.
 Applying this type of operation on a `GeoDataFrame` is therefore basically a shortcut to applying it on the `GeoSeries` object in the geometry column.
-For example, the two following commands return exactly the same result, a `GeoSeries` with country bounding box polygons (using the [`.envelope`](https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoSeries.envelope.html) method).
+For example, the two following commands return exactly the same result, a `GeoSeries` with country bounding box polygons (using the `.envelope` method).

 ```{python}
 gdf.envelope
@@ -886,7 +886,7 @@ In this case, since the rasters are arbitrary, we also set up an arbitrary trans
 - The origin ($x_{min}$, $y_{max}$) is at `-1.5,1.5`
 - The raster resolution ($delta_{x}$, $delta_{y}$) is `0.5,-0.5`

-We can add this information using [`rasterio.transform.from_origin`](rasterio.transform.from_origin), and specifying `west`, `north`, `xsize`, and `ysize` parameters.
+We can add this information using `rasterio.transform.from_origin`, and specifying `west`, `north`, `xsize`, and `ysize` parameters.
 The resulting transformation matrix object is hereby named `new_transform`.

 ```{python}
@@ -1061,7 +1061,7 @@ zion = gpd.read_file('data/zion.gpkg')
 zion.crs
 ```

-We can also illustrate the difference between a geographic and a projected CRS by plotting the `zion` data in both CRSs (@fig-zion-crs). Note that we are using the [`.grid`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.grid.html) method of **matplotlib** to draw grid lines on top of the plot.
+We can also illustrate the difference between a geographic and a projected CRS by plotting the `zion` data in both CRSs (@fig-zion-crs). Note that we are using the `.grid` method of **matplotlib** to draw grid lines on top of the plot.

 ```{python}
 #| label: fig-zion-crs
diff --git a/02-attribute-operations.qmd b/02-attribute-operations.qmd
index a132587e..9910e73a 100644
--- a/02-attribute-operations.qmd
+++ b/02-attribute-operations.qmd
@@ -141,7 +141,7 @@ To remove specific columns we need to add an extra argument, `axis=1` (i.e., col
 world.drop(['name_long', 'continent'], axis=1)
 ```

-We can also rename columns using the [`.rename`](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rename.html) method, in which we pass a dictionary with items of the form `old_name:new_name` to the `columns` argument.
+We can also rename columns using the `.rename` method, in which we pass a dictionary with items of the form `old_name:new_name` to the `columns` argument.

 ```{python}
 world[['name_long', 'pop']].rename(columns={'pop': 'population'})
@@ -205,7 +205,7 @@ world[
 .loc[:, ['name_long', 'continent']]
 ```

-However, specifically, expressions combining multiple comparisons with `==` combined with `|` can be replaced with the [`.isin`](https://pandas.pydata.org/docs/reference/api/pandas.Series.isin.html) method and a `list` of values to compare with.
+However, specifically, expressions combining multiple comparisons with `==` combined with `|` can be replaced with the `.isin` method and a `list` of values to compare with.
 The advantage of `.isin` is more concise and easy to manage code, especially when the number of comparisons is large.
 For example, the following expression gives the same result as above.

@@ -312,7 +312,7 @@ world_agg4

 Combining data from different sources is a common task in data preparation.
 Joins do this by combining tables based on a shared "key" variable.
-**pandas** has a function named [`pd.merge`](https://pandas.pydata.org/docs/reference/api/pandas.merge.html) for joining `(Geo)DataFrames` based on common column(s) that follows conventions used in the database language SQL [@grolemund_r_2016].
+**pandas** has a function named `pd.merge` for joining `(Geo)DataFrames` based on common column(s) that follows conventions used in the database language SQL [@grolemund_r_2016].
 The `pd.merge` result can be either a `DataFrame` or a `GeoDataFrame` object, depending on the inputs.

 A common type of attribute join on spatial data is to join `DataFrames` to `GeoDataFrames`.
@@ -393,7 +393,7 @@ world2
 ```

 The resulting `GeoDataFrame` object has a new column called `con_reg` representing the continent and region of each country, e.g., `'South America:Americas'` for Argentina and other South America countries.
-The opposite operation, splitting one column into multiple columns based on a separator string, is done using the [`.str.split`](https://pandas.pydata.org/docs/reference/api/pandas.Series.str.split.html) method.
+The opposite operation, splitting one column into multiple columns based on a separator string, is done using the `.str.split` method.
 As a result we go back to the previous state of two separate `continent` and `region_un` columns (only that their position is now last, since they are newly created).
 The `str.split` method returns a column of `list`s by default; to place the strings into separate `str` columns we use the `expand=True` argument.

@@ -489,7 +489,7 @@ elev3d
 ```

 ::: callout-note
-You can see that the above array is three-dimensional according to the number of brackets `[`, or check explicitly using `.shape` or [`.ndim`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.ndim.html).
+You can see that the above array is three-dimensional according to the number of brackets `[`, or check explicitly using `.shape` or `.ndim`.
 :::

 In three-dimensional arrays, we access cell values using three indices, keeping in mind that dimensions order is `(layers,rows, columns)`
@@ -537,7 +537,7 @@ np.nanmean(elev1)
 ```

 Raster value statistics can be visualized in a variety of ways.
-One approach is to "flatten" the raster values into a one-dimensional array (using `.flatten`), then use a graphical function such as [`plt.hist`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html) or [`plt.boxplot`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.boxplot.html) (from **matplotlib.pyplot**).
+One approach is to "flatten" the raster values into a one-dimensional array (using `.flatten`), then use a graphical function such as `plt.hist` or `plt.boxplot` (from **matplotlib.pyplot**).
 For example, the following code section shows the distribution of values in `elev` using a histogram (@fig-raster-hist).

 ```{python}
@@ -554,7 +554,7 @@ grain = src_grain.read(1)
 grain
 ```

-To calculate the frequency of unique values in an array, we use the [`np.unique`](https://numpy.org/doc/stable/reference/generated/numpy.unique.html) with the `return_counts=True` option.
+To calculate the frequency of unique values in an array, we use the `np.unique` with the `return_counts=True` option.
 The result is a `tuple` with two corresponding arrays: the unique values, and their counts.

 ```{python}
@@ -562,7 +562,7 @@ freq = np.unique(grain, return_counts=True)
 freq
 ```

-These two arrays can be passed to the [`plt.bar`](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.bar.html) function to draw a barplot, as shown in @fig-raster-bar.
+These two arrays can be passed to the `plt.bar` function to draw a barplot, as shown in @fig-raster-bar.
 ```{python}
 #| label: fig-raster-bar
diff --git a/03-spatial-operations.qmd b/03-spatial-operations.qmd
index 28fca03e..3f781a14 100644
--- a/03-spatial-operations.qmd
+++ b/03-spatial-operations.qmd
@@ -1080,7 +1080,7 @@ Other names for this operation are spatial filtering and convolution [@burrough_

 ![Input raster (left) and resulting output raster (right) due to a focal operation---finding the minimum value in $3 \times 3$ moving windows.](images/04_focal_example.png){#fig-focal-filter}

-In Python, the [**scipy.ndimage**](https://docs.scipy.org/doc/scipy/tutorial/ndimage.html) [@scipy] package has a comprehensive collection of [functions](https://docs.scipy.org/doc/scipy/reference/ndimage.html#filters) to perform filtering of **numpy** arrays, such as:
+In Python, the **scipy.ndimage** [@scipy] package has a comprehensive collection of [functions](https://docs.scipy.org/doc/scipy/reference/ndimage.html#filters) to perform filtering of **numpy** arrays, such as:

 - `scipy.ndimage.minimum_filter`,
 - `scipy.ndimage.maximum_filter`,