ADD: add notebook for MCH data retrieval
wolfidan committed Aug 29, 2024
1 parent 68bf2ad commit a5705ed
Showing 2 changed files with 269 additions and 3 deletions.
266 changes: 266 additions & 0 deletions doc/source/notebooks/retrieve_meteoswiss_data_from_cscs.ipynb
@@ -0,0 +1,266 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Retrieving MeteoSwiss products from CSCS with pyrad"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most MeteoSwiss products can be retrieved from the CSCS or MeteoSwiss servers using the functions documented [here](https://meteoswiss.github.io/pyrad/API/generated/pyrad.util.html#data-retrieval-utilities).\n",
"\n",
"The reading/writing functions for these files are available in the [MeteoSwiss Py-ART fork](https://github.com/MeteoSwiss/pyart). Please also check the following [notebook](https://meteoswiss.github.io/pyart/notebooks/read_mch_metranet_data.html) for an example of reading/writing such files.\n",
"\n",
"Please also check the [internal MeteoSwiss confluence page]()."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Most products can be fetched with the *retrieve_mch_prod* function. You need to specify an output directory; make sure there is sufficient space in it before you run the command!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 1: reading polar data from Albis at low-resolution (500m)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"<frozen importlib._bootstrap>:530: DeprecationWarning: the load_module() method is deprecated and slated for removal in Python 3.12; use exec_module() instead\n",
"/users/wolfensb/pyrad/src/pyart/pyart/io/nexrad_level3.py:11: DeprecationWarning: 'xdrlib' is deprecated and slated for removal in Python 3.13\n",
" from xdrlib import Unpacker\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"## You are using the Python ARM Radar Toolkit (Py-ART), an open source\n",
"## library for working with weather radar data. Py-ART is partly\n",
"## supported by the U.S. Department of Energy as part of the Atmospheric\n",
"## Radiation Measurement (ARM) Climate Research Facility, an Office of\n",
"## Science user facility.\n",
"##\n",
"## If you use this software to prepare a publication, please cite:\n",
"##\n",
"## JJ Helmus and SM Collis, JORS 2016, doi: 10.5334/jors.119\n",
"\n",
"Welcome to PyDDA 2.0.0\n",
"If you are using PyDDA in your publications, please cite:\n",
"Jackson et al. (2020) Journal of Open Research Science\n",
"Detecting Jax...\n",
"Jax/JaxOpt are not installed on your system, unable to use Jax engine.\n",
"Detecting TensorFlow...\n",
"Unable to load both TensorFlow and tensorflow-probability. TensorFlow engine disabled.\n",
"No module named 'tensorflow'\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"/users/wolfensb/pyrad/src/pyrad_proc/pyrad/flow/flow_aux.py:54: UserWarning: Memory profiler not available\n",
" warn(\"Memory profiler not available\")\n",
"/users/wolfensb/pyrad/src/pyrad_proc/pyrad/flow/flow_control.py:56: UserWarning: dask not available: The processing will not be parallelized\n",
" warn(\"dask not available: The processing will not be parallelized\")\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"60\n",
"/scratch/wolfensb//MLA2413110100U.001\n",
"dict_keys(['reflectivity', 'signal_to_noise_ratio', 'reflectivity_vv', 'differential_reflectivity', 'uncorrected_cross_correlation_ratio', 'uncorrected_differential_phase', 'velocity', 'spectrum_width', 'reflectivity_hh_clut'])\n"
]
}
],
"source": [
"from pyrad.util import retrieve_mch_prod\n",
"import pyart\n",
"import datetime\n",
"\n",
"OUTPUT_DIRECTORY = '/scratch/wolfensb/' # Adjust to your needs\n",
"T0 = datetime.datetime(2024,5,10,10,10)\n",
"T1 = datetime.datetime(2024,5,10,10,20)\n",
"\n",
"files = retrieve_mch_prod(OUTPUT_DIRECTORY, T0, T1, product_name='MLA')\n",
"\n",
"print(len(files))\n",
"print(files[0])\n",
"\n",
"radar = pyart.aux_io.read_metranet(files[0])\n",
"print(radar.fields.keys())\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 2: same as Example 1, but retrieving only certain sweeps"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"12\n",
"[np.str_('/scratch/wolfensb//MLA2413110100U.001'), np.str_('/scratch/wolfensb//MLA2413110100U.002'), np.str_('/scratch/wolfensb//MLA2413110100U.003'), np.str_('/scratch/wolfensb//MLA2413110100U.004')]\n"
]
}
],
"source": [
"files = retrieve_mch_prod('/scratch/wolfensb/',T0, T1, product_name='MLA', sweeps = [1,2,3,4])\n",
"print(len(files))\n",
"print(files[0:4])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 3: getting a Cartesian product, RZC = radar QPE"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5\n",
"[np.str_('/scratch/wolfensb//RZC241311010VL.001'), np.str_('/scratch/wolfensb//RZC241311012VL.001'), np.str_('/scratch/wolfensb//RZC241311015VL.001'), np.str_('/scratch/wolfensb//RZC241311017VL.001')]\n"
]
}
],
"source": [
"files = retrieve_mch_prod('/scratch/wolfensb/',T0, T1, product_name='RZC')\n",
"print(len(files))\n",
"print(files)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 4: getting a Cartesian product, CPC = radar-gauge QPE, but only at hourly accumulation"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2\n",
"[np.str_('/scratch/wolfensb//CPC/CPC2413110100_00060.801.gif'), np.str_('/scratch/wolfensb//CPC/CPC2413110200_00060.801.gif')]\n"
]
}
],
"source": [
"files = retrieve_mch_prod('/scratch/wolfensb/',T0, T1, product_name='CPC', pattern='*00060*')\n",
"print(len(files))\n",
"print(files)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example 5: more complex filtering of retrieved files with regex"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Extracting all CPC files except those at 5-minute resolution (CPC*_00005.801.gif)."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"74\n",
"[np.str_('/scratch/wolfensb//CPC/CPC2413108100_00060.801.gif'), np.str_('/scratch/wolfensb//CPC/CPC2413108100_00180.801.gif'), np.str_('/scratch/wolfensb//CPC/CPC2413108100_00360.801.gif'), np.str_('/scratch/wolfensb//CPC/CPC2413108100_00720.801.gif')]\n"
]
}
],
"source": [
"T0 = datetime.datetime(2024,5,10,8,10)\n",
"T1 = datetime.datetime(2024,5,10,10,20)\n",
"files = retrieve_mch_prod('/scratch/wolfensb/',T0, T1, product_name='CPC', pattern=r'^(?!.*00005\\.801\\.gif$).*\\.gif$', pattern_type='regex')\n",
"\n",
"print(files)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "pyart_new",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
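The notebook filters retrieved files either with shell-style wildcards (the default, as in `pattern='*00060*'`) or with regular expressions (`pattern_type='regex'`). A minimal standalone sketch of both filtering styles using only the standard library — the file names below are hypothetical examples mimicking the CPC products above:

```python
import fnmatch
import re

# Hypothetical CPC file names, mimicking those retrieved in the notebook.
names = [
    "CPC2413110100_00005.801.gif",
    "CPC2413110100_00060.801.gif",
    "CPC2413110100_00180.801.gif",
]

# Shell-style filtering, as with pattern='*00060*' (the default pattern_type).
hourly = fnmatch.filter(names, "*00060*")

# Regex filtering, as with pattern_type='regex': keep every .gif except the
# 5-minute accumulation files. A raw string avoids invalid-escape warnings.
rx = re.compile(r"^(?!.*00005\.801\.gif$).*\.gif$")
not_5min = [n for n in names if rx.match(n)]

print(hourly)    # ['CPC2413110100_00060.801.gif']
print(not_5min)  # ['CPC2413110100_00060.801.gif', 'CPC2413110100_00180.801.gif']
```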
6 changes: 3 additions & 3 deletions src/pyrad_proc/pyrad/util/data_retrieval_utils.py
@@ -465,7 +465,7 @@ def retrieve_mch_prod_RT(
content_zip = [
c
for c in content_zip
if re.match(os.path.basename(c), pattern) is not None
if re.match(pattern, os.path.basename(c)) is not None
]
else:
raise ValueError('Unknown pattern_type, must be either "shell" or "regex".')
@@ -540,7 +540,7 @@ def _retrieve_prod_daily(
content_zip = [
c
for c in content_zip
if re.match(os.path.basename(c), pattern) is not None
if re.match(pattern, os.path.basename(c)) is not None
]
else:
raise ValueError('Unknown pattern_type, must be either "shell" or "regex".')
@@ -590,7 +590,7 @@ def _retrieve_prod_daily(
subprocess.call(cmd, shell=True)

files = sorted(np.array([folder_out + c for c in content_zip[conditions]]))

return files


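The hunks in this diff swap the arguments of `re.match`: the signature is `re.match(pattern, string)`, so the old code matched each file name *as a pattern* against the pattern string, silently filtering out essentially every file. A small sketch of the failure mode — the file name and pattern are illustrative, not taken from the repository:

```python
import re

pattern = r"CPC.*\.gif"
filename = "CPC2413110100_00060.801.gif"

# Correct argument order: pattern first, then the string to match against.
assert re.match(pattern, filename) is not None

# Reversed order treats the file name as the pattern; the literal digits in
# the file name cannot match the text "CPC.*\.gif", so nothing is kept.
assert re.match(filename, pattern) is None
```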