Add block-averaged sources to EquivalentSources #260

Merged
merged 14 commits on Oct 27, 2021
16 changes: 16 additions & 0 deletions CITATION.rst
@@ -4,8 +4,24 @@ Citing Harmonica
This is research software **made by scientists**. Citations help us justify the
effort that goes into building and maintaining this project.

Citing the software
-------------------

If you used this software in your research, please consider
citing the following source: https://doi.org/10.5281/zenodo.3628741

The link above includes full citation information and export formats (BibTeX,
Mendeley, etc).

Citing the methods
------------------

Harmonica implements some methods that have been published in scientific
journals. If you made use of them, we appreciate citations of these
publications as well.

* :class:`harmonica.EquivalentSources` with ``block_size`` set (block-averaged
sources):

Soler, S. R. and Uieda, L. (2021). Gradient-boosted equivalent sources, Geophysical Journal International.
doi:`10.1093/gji/ggab297 <https://doi.org/10.1093/gji/ggab297>`__
1 change: 1 addition & 0 deletions doc/references.rst
@@ -12,5 +12,6 @@ References
.. [Hofmann-WellenhofMoritz2006] Hofmann-Wellenhof, B., & Moritz, H. (2006). Physical Geodesy (2nd, corr. ed. 2006 edition ed.). Wien ; New York: Springer.
.. [Nagy2000] Nagy, D., Papp, G. & Benedek, J.(2000). The gravitational potential and its derivatives for the prism. Journal of Geodesy 74: 552. doi:`10.1007/s001900000116 <https://doi.org/10.1007/s001900000116>`__
.. [Nagy2002] Nagy, D., Papp, G. & Benedek, J.(2002). Corrections to "The gravitational potential and its derivatives for the prism". Journal of Geodesy. doi:`10.1007/s00190-002-0264-7 <https://doi.org/10.1007/s00190-002-0264-7>`__
.. [Soler2021] Soler, S. R. and Uieda, L. (2021). Gradient-boosted equivalent sources, Geophysical Journal International. doi:`10.1093/gji/ggab297 <https://doi.org/10.1093/gji/ggab297>`__
.. [Vajda2004] Vajda, P., Vaníček, P., Novák, P. and Meurers, B. (2004). On evaluation of Newton integrals in geodetic coordinates: Exact formulation and spherical approximation. Contributions to Geophysics and Geodesy, 34(4), 289-314.
.. [TurcotteSchubert2014] Turcotte, D. L., & Schubert, G. (2014). Geodynamics (3 edition). Cambridge, United Kingdom: Cambridge University Press.
126 changes: 126 additions & 0 deletions examples/equivalent_sources/block_averaged_sources.py
@@ -0,0 +1,126 @@
# Copyright (c) 2018 The Harmonica Developers.
# Distributed under the terms of the BSD 3-Clause License.
# SPDX-License-Identifier: BSD-3-Clause
#
# This code is part of the Fatiando a Terra project (https://www.fatiando.org)
#
"""
Gridding with block-averaged equivalent sources
===============================================

By default, the :class:`harmonica.EquivalentSources` class locates one point
source beneath each data point during the fitting process. Alternatively, we
can use another strategy: the *block-averaged sources*, introduced in
[Soler2021]_.

This method divides the survey region (defined by the data) into square blocks
of equal size, computes the median coordinates of the data points that fall
inside each block, and locates one source beneath each averaged position. This
way, we define one equivalent source per block; empty blocks get no source.

This method has two main benefits:

- It lowers the number of sources involved in the interpolation, reducing the
  memory requirements and the computation time of the fitting and prediction
  processes.
- It can help avoid aliasing in the output grids, especially for surveys with
  oversampling along a particular direction, like airborne ones.

We can make use of the block-averaged sources within the
:class:`harmonica.EquivalentSources` class by passing a value to the
``block_size`` parameter, which controls the size of the blocks. We recommend
using a ``block_size`` not larger than the desired resolution of the
interpolation grid.

The depth of the sources can be set analogously to the regular equivalent
sources: we can use a ``constant`` depth (every source is located at the same
depth) or a ``relative`` depth (where each source is located at a constant
shift beneath the median location obtained during the block-averaging process).
The depth of the sources and which strategy to use can be set up through the
``depth`` and the ``depth_type`` parameters, respectively.
"""
import matplotlib.pyplot as plt
import numpy as np
import pyproj
import verde as vd
import harmonica as hm


# Fetch the sample total-field magnetic anomaly data from Great Britain
data = hm.datasets.fetch_britain_magnetic()

# Slice a smaller portion of the survey data to speed up calculations for this
# example
region = [-5.5, -4.7, 57.8, 58.5]
inside = vd.inside((data.longitude, data.latitude), region)
data = data[inside]
print("Number of data points:", data.shape[0])
print("Mean height of observations:", data.altitude_m.mean())

# Since this is a small area, we'll project our data and use Cartesian
# coordinates
projection = pyproj.Proj(proj="merc", lat_ts=data.latitude.mean())
easting, northing = projection(data.longitude.values, data.latitude.values)
coordinates = (easting, northing, data.altitude_m)

# Create the equivalent sources.
# We'll use block-averaged sources at a constant depth beneath the observation
# points. We will interpolate on a grid with a resolution of 500m, so we will
# use blocks of the same size. The damping parameter helps smooth the predicted
# data and ensure stability.
eqs = hm.EquivalentSources(depth=1000, damping=1, block_size=500, depth_type="constant")

# Fit the source coefficients to the observed magnetic anomaly.
eqs.fit(coordinates, data.total_field_anomaly_nt)

# Evaluate the data fit by calculating an R² score against the observed data.
# This is a measure of how well the sources fit the data, NOT how good the
# interpolation will be.
print("R² score:", eqs.score(coordinates, data.total_field_anomaly_nt))

# Interpolate data on a regular grid with 500 m spacing. The interpolation
# requires the height of the grid points (upward coordinate). By passing in
# 1500 m, we're effectively upward-continuing the data (mean flight height is
# 500 m).
grid = eqs.grid(upward=1500, spacing=500, data_names=["magnetic_anomaly"])

# The grid is an xarray.Dataset with values, coordinates, and metadata
print("\nGenerated grid:\n", grid)

# Plot original magnetic anomaly and the gridded and upward-continued version
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(12, 9), sharey=True)

# Get the maximum absolute value between the original and gridded data so we
# can use the same color scale for both plots and have 0 centered at the white
# color.
maxabs = vd.maxabs(data.total_field_anomaly_nt, grid.magnetic_anomaly.values)

ax1.set_title("Observed magnetic anomaly data")
tmp = ax1.scatter(
easting,
northing,
c=data.total_field_anomaly_nt,
s=20,
vmin=-maxabs,
vmax=maxabs,
cmap="seismic",
)
plt.colorbar(tmp, ax=ax1, label="nT", pad=0.05, aspect=40, orientation="horizontal")
ax1.set_xlim(easting.min(), easting.max())
ax1.set_ylim(northing.min(), northing.max())

ax2.set_title("Gridded and upward-continued")
tmp = grid.magnetic_anomaly.plot.pcolormesh(
ax=ax2,
add_colorbar=False,
add_labels=False,
vmin=-maxabs,
vmax=maxabs,
cmap="seismic",
)
plt.colorbar(tmp, ax=ax2, label="nT", pad=0.05, aspect=40, orientation="horizontal")
ax2.set_xlim(easting.min(), easting.max())
ax2.set_ylim(northing.min(), northing.max())

plt.show()
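To see why block-averaging shrinks the problem, the core idea of the example above can be sketched with plain NumPy alone (a toy illustration with synthetic coordinates, not Harmonica's implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
# 1000 synthetic observation points scattered in a 4 km x 4 km region (meters)
easting = rng.uniform(0, 4000, size=1000)
northing = rng.uniform(0, 4000, size=1000)

block_size = 500  # same block size used in the example above
# Label each point with the index of the square block it falls in
cols = (easting // block_size).astype(int)
rows = (northing // block_size).astype(int)
labels = cols * 8 + rows  # 8 blocks per side: 4000 / 500

# One source per non-empty block, at the median of the points inside it
sources = [
    (np.median(easting[labels == label]), np.median(northing[labels == label]))
    for label in np.unique(labels)
]
print("Number of data points:", easting.size)  # 1000
print("Number of sources:", len(sources))      # at most 64 (8 x 8 blocks)
```

With a dense survey, the number of sources is bounded by the number of blocks rather than the number of observations, which is what makes the fitting step cheaper.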
107 changes: 85 additions & 22 deletions harmonica/equivalent_sources/cartesian.py
@@ -46,14 +46,39 @@ class EquivalentSources(vdb.BaseGridder):
* Reduction to the pole of magnetic total field anomaly data
* Analytical derivative calculations

Point sources are located beneath the observed potential-field measurement
points by default [Cooper2000]_. Custom source locations can be used by
specifying the *points* argument. Coefficients associated with each point
source are estimated through linear least-squares with damping (Tikhonov
0th order) regularization.
By default, the point sources are located beneath the observed
potential-field measurement points [Cooper2000]_ that are passed as
arguments to the :meth:`EquivalentSources.fit` method, producing the same
number of sources as data points.
Alternatively, we can reduce the number of sources by using block-averaged
sources [Soler2021]_: we divide the data region into blocks of equal size and
compute the median location of the observation points that fall inside each
block. Then, we locate one point source beneath each of these locations. The
size of the blocks, which indirectly controls how many sources will be
created, can be specified through the ``block_size`` argument.
We recommend choosing a ``block_size`` no larger than the resolution of the
grid where interpolations will be carried out.

The depth of the sources can be controlled by the ``depth`` argument.
If ``depth_type`` is set to ``"relative"``, then each source is located
beneath each data point or block-averaged location at a depth equal to its
elevation minus the value of the ``depth`` argument.
If ``depth_type`` is set to ``"constant"``, then every source is located at
a constant depth given by the ``depth`` argument.
In both cases a positive value of ``depth`` locates sources *beneath* the
data points or the block-averaged locations, while a negative ``depth`` will
put the sources *above* them.

Custom source locations can be chosen by specifying the ``points``
argument, in which case the ``depth_type``, ``block_size`` and ``depth``
arguments will be ignored.

The corresponding coefficient for each point source is estimated through
linear least-squares with damping (Tikhonov 0th order) regularization.

The Green's function for point mass effects used is the inverse Euclidean
distance between the grid coordinates and the point source:
distance between the observation points and the point sources:

.. math::

@@ -78,24 +103,26 @@ class EquivalentSources(vdb.BaseGridder):
depth : float
Parameter used to control the depth at which the point sources will be
located.
If ``depth_type`` is equal to ``"relative"``, the ``depth`` specifies
the relative depth at which the point sources are placed beneath the
observation points. Each source point will be set beneath each data
point at a depth calculated as the elevation of the data point minus
this *depth*. Use positive numbers (negative numbers would mean point
sources are above the data points).
If ``depth_type`` is equal to ``"constant"``, the ``depth`` specifies
the constant depth at which the point sources are placed beneath the
observation points. Every source point will be located at this *depth*.
Use positive numbers (negative numbers would mean point sources are
located above the zeroth level).
If ``depth_type`` is ``"constant"``, each source is located at the same
depth specified through the ``depth`` argument.
If ``depth_type`` is ``"relative"``, each source is located beneath
each data point (or block-averaged location) at a depth equal to its
elevation minus the ``depth`` value.
This parameter is ignored if *points* is specified.
Defaults to 500.
depth_type : str
Strategy used for setting the depth of the point sources.
The two available strategies are ``"constant"`` and ``"relative"``.
This parameter is ignored if *points* is specified.
Defaults to ``"relative"``.
block_size : float, tuple = (s_north, s_east) or None
Size of the blocks used for block-averaged equivalent sources.
If a single value is passed, the blocks will have a square shape.
Alternatively, the dimensions of the blocks in the South-North and
West-East directions can be specified by passing a tuple.
If None, no block-averaging is applied.
This parameter is ignored if *points* is specified.
Defaults to None.
parallel : bool
If True any predictions and Jacobian building is carried out in
parallel through Numba's ``jit.prange``, reducing the computation time.
Expand All @@ -111,6 +138,10 @@ class EquivalentSources(vdb.BaseGridder):
The boundaries (``[W, E, S, N]``) of the data used to fit the
interpolator. Used as the default region for the
:meth:`~harmonica.EQLHarmonic.grid` method.

References
----------
[Soler2021]_
"""

# Set the default dimension names for generated outputs
@@ -130,13 +161,15 @@ def __init__(
points=None,
depth=500,
depth_type="relative",
block_size=None,
parallel=True,
**kwargs,
):
self.damping = damping
self.points = points
self.depth = depth
self.depth_type = depth_type
self.block_size = block_size
self.parallel = parallel
# Define Green's function for Cartesian coordinates
self.greens_function = greens_func_cartesian
@@ -198,13 +231,14 @@ def _build_points(self, coordinates):
"""
Generate coordinates of point sources based on the data points

Locate the point sources following the chosen ``depth_type`` strategy.
Locate the point sources following the chosen ``depth_type`` strategy
and apply block-averaging if ``block_size`` is not None.
If ``depth_type`` is equal to ``"relative"``, the point sources will be
placed beneath the observation points at a depth calculated as the
elevation of the data point minus the ``depth``.
placed beneath the (averaged) observation points at a depth calculated
as the elevation of the data point minus the ``depth``.
If ``depth_type`` is equal to ``"constant"``, the point sources will be
placed beneath the observation points at the same height equal to minus
``depth``.
placed beneath the (averaged) observation points at a constant height
equal to minus ``depth``.

Parameters
----------
@@ -220,6 +254,8 @@ def _build_points(self, coordinates):
Tuple containing the coordinates of the equivalent point sources,
in the following order: (``easting``, ``northing``, ``upward``).
"""
if self.block_size is not None:
coordinates = self._block_average_coordinates(coordinates)
if self.depth_type == "relative":
return (
coordinates[0],
@@ -234,6 +270,33 @@ def _build_points(self, coordinates):
)
return None
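The two depth strategies handled by ``_build_points`` can be sketched in plain NumPy (a hedged illustration with made-up heights, not Harmonica's actual code):

```python
import numpy as np

# Heights of three (block-averaged) observation points, in meters
upward = np.array([100.0, 250.0, 400.0])
depth = 500.0

# "relative": each source sits `depth` meters below its own observation
# point, so the source heights follow the data topography
relative_sources = upward - depth  # array([-400., -250., -100.])

# "constant": every source sits at the same height, minus `depth`
constant_sources = np.full_like(upward, -depth)  # array([-500., -500., -500.])
```

With the ``"relative"`` strategy the sources mirror the observation surface, while ``"constant"`` flattens them onto a single horizontal plane.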

def _block_average_coordinates(self, coordinates):
"""
Run a block-averaging process on observation points

Apply a median as the reduction function. The blocks will have the size
specified through the ``block_size`` argument of the constructor.

Parameters
----------
coordinates : tuple of arrays
Arrays with the coordinates of each data point. Should be in the
following order: (``easting``, ``northing``, ``upward``, ...).

Returns
-------
blocked_coords : tuple of arrays
Tuple containing the coordinates of the block-averaged observation
points.
"""
reducer = vd.BlockReduce(
spacing=self.block_size, reduction=np.median, drop_coords=False
)
# We must pass a dummy data array to BlockReduce.filter(), so we pass an
# array of zeros and ignore the returned reduced dummy array.
blocked_coords, _ = reducer.filter(coordinates, np.zeros_like(coordinates[0]))
return blocked_coords
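The effect of passing ``drop_coords=False`` can be sketched with plain NumPy: the median reduction is applied to the upward coordinate as well, so each block-averaged location keeps a meaningful height (a minimal illustration with made-up values for a single block, not Harmonica's code):

```python
import numpy as np

# Coordinates of observation points that fall inside one block
easting = np.array([10.0, 20.0, 30.0])
northing = np.array([5.0, 5.0, 8.0])
upward = np.array([100.0, 120.0, 90.0])

# The median is taken over every coordinate, including upward, so the
# block's source location has a height consistent with the data in it
block_coords = tuple(np.median(c) for c in (easting, northing, upward))
print(block_coords)  # (20.0, 5.0, 100.0)
```

Without the upward coordinate, the ``"relative"`` depth strategy would have no per-block height to offset the sources from.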

def predict(self, coordinates):
"""
Evaluate the estimated equivalent sources on the given set of points.