Write to .monitoring index on elasticsearch if xpack.enabled is true on standalone metricbeat #28365

Merged
merged 5 commits into from
Nov 4, 2021

Conversation

sayden
Contributor

@sayden sayden commented Oct 12, 2021

What does this PR do?

This PR makes the Stack Monitoring modules write to the .monitoring indices when xpack.enabled: true is set. When it is false, they write to metricbeat-*, unless Metricbeat is running under Agent, in which case the target is a data stream.
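
The routing described above can be sketched roughly as follows. This is an illustrative Python model, not the Metricbeat implementation (which is written in Go); the function name `target_index`, its parameters, and the exact index and data stream name formats are assumptions made for the example.

```python
from datetime import date


def target_index(product: str, dataset: str, xpack_enabled: bool,
                 under_agent: bool, day: date, namespace: str = "default") -> str:
    """Return a hypothetical write target for a Stack Monitoring event."""
    if xpack_enabled:
        # Assumption: name format modelled on ".monitoring-es-7-mb-2021.11.08".
        return f".monitoring-{product}-7-mb-{day:%Y.%m.%d}"
    if under_agent:
        # Assumption: data stream named after the type-dataset-namespace scheme.
        return f"metrics-{dataset}-{namespace}"
    # Standalone Metricbeat without xpack.enabled: the regular metricbeat-* indices.
    # Assumption: a plain daily suffix; real index names carry more detail.
    return f"metricbeat-{day:%Y.%m.%d}"


print(target_index("es", "elasticsearch.cluster.stats", True, False, date(2021, 11, 8)))
```

The order of the checks follows the wording of the description literally; how xpack.enabled: true behaves under Agent is not stated here.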

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Closes #25043

@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Oct 12, 2021
@sayden sayden self-assigned this Oct 12, 2021
@mergify
Contributor

mergify bot commented Oct 12, 2021

This pull request does not have a backport label. Could you fix it @sayden? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-v./d./d./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Oct 12, 2021
@elasticmachine
Collaborator

elasticmachine commented Oct 12, 2021

💚 Build Succeeded

Build stats

  • Duration: 150 min 19 sec

❕ Flaky test report

No test was executed to be analysed.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

@andresrc andresrc added Feature:Stack Monitoring Team:Integrations Label for the Integrations team labels Nov 3, 2021
@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Nov 3, 2021
@sayden sayden added the bug label Nov 3, 2021
@sayden sayden changed the title WIP Write to old .monitoring index on elasticsearch Write to old .monitoring index on elasticsearch if xpack.enabled is true Nov 3, 2021
@sayden sayden changed the title Write to old .monitoring index on elasticsearch if xpack.enabled is true Write to .monitoring index on elasticsearch if xpack.enabled is true onb standalone metricbeat Nov 3, 2021
@sayden sayden force-pushed the sm/elasticsearch/write_to_old_index branch from e163621 to 138b247 Compare November 3, 2021 16:44
@sayden sayden changed the title Write to .monitoring index on elasticsearch if xpack.enabled is true onb standalone metricbeat Write to .monitoring index on elasticsearch if xpack.enabled is true on standalone metricbeat Nov 3, 2021
@sayden sayden marked this pull request as ready for review November 3, 2021 16:45
@elasticmachine
Collaborator

Pinging @elastic/integrations (Team:Integrations)

@elasticmachine
Collaborator

Pinging @elastic/stack-monitoring (Stack monitoring)

@sayden
Contributor Author

sayden commented Nov 4, 2021

/test

Member

@ChrsMark ChrsMark left a comment

lgtm

@sayden sayden merged commit 462f42f into elastic:master Nov 4, 2021
@matschaffer
Contributor

I gave this a try today hoping it'd get Stack Monitoring working correctly on kibana+metricbeat master, but it looks like the indices are missing mappings for at least metricset.name.

Without that, I'm not sure how we should query for ES cluster stats docs, which is currently what the UI uses to pull a list of ES clusters.

Is there another issue tracking fixing that or should I open one?

@matschaffer
Contributor

It also looks like the Kibana mapping has source_node but the data doesn't.

@sayden
Contributor Author

sayden commented Nov 5, 2021

@matschaffer can you explain what you expected and what doesn't work? To give some context, there are a lot of mappings that may or may not have related data. This is because, according to the Kibana folks working on the SM UI, it was impossible to know in Kibana which fields were being accessed, so I had to map every possible field "just in case". This is partly because many fields are accessed outside the query (Kibana gets the _doc and knows the structure of each doc).

Modules also have a _meta/fields.yml whose entries are aliases to the fields in [metricset]/_meta/fields.yml. Kibana could not refactor its queries, so we had to do that refactoring in Metricbeat so that Kibana would not have to change anything, which introduces technical debt.

Across the 15-20 metricsets of the 4 modules, I'd expect fewer than 10 possible mapping errors, which is very optimistic given that I had to manually alias around a thousand fields to a mapping of around a thousand ECS fields.

I hope this clarification helps 🙂

@matschaffer
Contributor

@sayden I was expecting to be able to load the Stack Monitoring UI using Metricbeat from master to monitor Kibana main via the config in https:/elastic/beats/blob/master/metricbeat/modules.d/kibana-xpack.yml.disabled

This is similar to how I might use metricbeat 7.15 to monitor kibana.

Before this PR, the data would land in metricbeat-*, so of course the UI won't load. With this PR the data lands in the .monitoring indices, but not in a shape the UI understands.

I'm wondering if we have an issue to get metricbeat shipping the 7.15 doc structure again that I should be following (cc @jasonrhodes in case he has context on this)

@jasonrhodes
Member

Before this PR, the data would land in metricbeat-*, so of course the UI won't load. With this PR the data lands in the .monitoring indices, but not in a shape the UI understands.

I'm wondering if we have an issue to get metricbeat shipping the 7.15 doc structure again that I should be following (cc @jasonrhodes in case he has context on this)

FWIW I expect this PR to put the data in .monitoring and to install aliases for the relevant .monitoring-{product}-mb index template(s). That may mean that to test it you have to trigger a rollover so that a new index is created with the right mappings. Users wouldn't have to worry about this because a new Kibana version would start writing to a new versioned index, IIUC.

@sayden can you confirm/help @matschaffer confirm this works?

@matschaffer
Contributor

Just testing on elastic/kibana@fca8cbf and 84e668c locally. Fresh yarn es, kibana clean, metricbeat with just a few config modifications to hit my local ES & kibana.

I see data in the expected indices:
[Image: Screen Shot 2021-11-08 at 14 43 06]

But SM UI doesn't load:
[Image: Screen Shot 2021-11-08 at 14 43 24]

The SM clusters API queries like this:

POST *:.monitoring-es-6-*,*:.monitoring-es-7-*,.monitoring-es-6-*,.monitoring-es-7-*,metricbeat-*,*:metricbeat-*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "type": "cluster_stats"
                }
              },
              {
                "term": {
                  "metricset.name": "cluster_stats"
                }
              }
            ]
          }
        },
        {
          "range": {
            "timestamp": {
              "gte": "now-15m"
            }
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "cluster_uuid"
  },
  "sort": {
    "timestamp": {
      "order": "desc",
      "unmapped_type": "long"
    }
  }
}

But it doesn't get any hits. I can see that the mapping probably has at least enough fields:

{
  ".monitoring-es-7-mb-2021.11.08" : {
    "mappings" : {
      "cluster_uuid" : {
        "full_name" : "cluster_uuid",
        "mapping" : {
          "cluster_uuid" : {
            "type" : "keyword"
          }
        }
      },
      "type" : {
        "full_name" : "type",
        "mapping" : {
          "type" : {
            "type" : "keyword"
          }
        }
      },
      "timestamp" : {
        "full_name" : "timestamp",
        "mapping" : {
          "timestamp" : {
            "type" : "date",
            "format" : "date_time"
          }
        }
      }
    }
  }
}

But the structure of the docs in the index is quite different. Here's a random example. I can't actually pull the most recent one, since the docs use @timestamp but the mapping only has timestamp.

GET .monitoring-es-7-mb-2021.11.08/_doc/2oEe_nwBovl19JwQViIv
{
  "_index" : ".monitoring-es-7-mb-2021.11.08",
  "_id" : "2oEe_nwBovl19JwQViIv",
  "_version" : 1,
  "_seq_no" : 102,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "@timestamp" : "2021-11-08T05:56:12.649Z",
    "host" : {
      "hostname" : "matschaffer-mbp2019.lan",
      "architecture" : "x86_64",
      "os" : {
        "version" : "11.6.1",
        "family" : "darwin",
        "name" : "macOS",
        "kernel" : "20.6.0",
        "build" : "20G224",
        "type" : "macos",
        "platform" : "darwin"
      },
      "id" : "815EB661-322B-5B8B-A0BA-C83E911AC99A",
      "ip" : [
        "fe80::aede:48ff:fe00:1122",
        "fe80::1828:42ad:ec20:58d4",
        "192.168.86.27",
        "fe80::48e5:17ff:fe07:b4bb",
        "fe80::48e5:17ff:fe07:b4bb",
        "fe80::2b35:5154:b3c2:84f9",
        "fe80::6f74:bd52:cda2:f936",
        "fe80::e50b:d4d:1e8e:406"
      ],
      "name" : "matschaffer-mbp2019.lan",
      "mac" : [
        "ac:de:48:00:11:22",
        "3e:22:fb:a2:28:5f",
        "3c:22:fb:a2:28:5f",
        "4a:e5:17:07:b4:bb",
        "4a:e5:17:07:b4:bb",
        "82:c1:12:47:1c:00",
        "82:c1:12:47:1c:01",
        "82:c1:12:47:1c:05",
        "82:c1:12:47:1c:04",
        "82:c1:12:47:1c:01"
      ]
    },
    "agent" : {
      "id" : "499ba867-33ed-47ca-afd2-067cde5b680f",
      "name" : "matschaffer-mbp2019.lan",
      "type" : "metricbeat",
      "version" : "8.1.0",
      "ephemeral_id" : "3821e274-0fd5-4fc7-89c6-403b4615d3fa"
    },
    "service" : {
      "address" : "http://localhost:9200",
      "type" : "elasticsearch"
    },
    "elasticsearch" : {
      "cluster" : {
        "name" : "elasticsearch",
        "id" : "Q3Lnz1DZT6SbwJab1tuZOQ",
        "stats" : {
          "nodes" : {
            "count" : 1,
            "master" : 1,
            "fs" : {
              "total" : {
                "bytes" : 1000240963584
              },
              "available" : {
                "bytes" : 799146590208
              }
            },
            "jvm" : {
              "max_uptime" : {
                "ms" : 1058386
              },
              "memory" : {
                "heap" : {
                  "used" : {
                    "bytes" : 299940336
                  },
                  "max" : {
                    "bytes" : 1610612736
                  }
                }
              }
            },
            "versions" : [
              "8.1.0"
            ]
          },
          "stack" : {
            "xpack" : {
              "ccr" : {
                "enabled" : true,
                "available" : true
              }
            }
          },
          "license" : {
            "type" : "trial",
            "expiry_date_in_millis" : 1.638941935363E12,
            "status" : "active"
          },
          "expiry_date_in_millis" : 1638941935363,
          "state" : {
            "master_node" : "Nz-zb0pKSRizekSCpukQrQ",
            "state_uuid" : "-uH9cBGRTZyQCYZSp1sVXg",
            "nodes" : {
              "Nz-zb0pKSRizekSCpukQrQ" : {
                "transport_address" : "127.0.0.1:9300",
                "attributes" : {
                  "ml.max_jvm_size" : "1610612736",
                  "ml.machine_memory" : "68719476736",
                  "xpack.installed" : "true"
                },
                "roles" : [
                  "data",
                  "data_cold",
                  "data_content",
                  "data_frozen",
                  "data_hot",
                  "data_warm",
                  "ingest",
                  "master",
                  "ml",
                  "remote_cluster_client",
                  "transform"
                ],
                "name" : "matschaffer-mbp2019.lan",
                "ephemeral_id" : "PxaqNpLYTyuZ8wcVZWh9gw"
              }
            },
            "nodes_hash" : -1921010294
          },
          "indices" : {
            "fielddata" : {
              "memory" : {
                "bytes" : 0
              }
            },
            "docs" : {
              "total" : 2953
            },
            "total" : 13,
            "shards" : {
              "primaries" : 13,
              "count" : 13
            },
            "store" : {
              "size" : {
                "bytes" : 8699865
              }
            }
          },
          "status" : "yellow"
        }
      },
      "version" : 102
    },
    "event" : {
      "module" : "elasticsearch",
      "duration" : 17695033,
      "dataset" : "elasticsearch.cluster.stats"
    },
    "metricset" : {
      "name" : "cluster_stats",
      "period" : 10000
    },
    "ecs" : {
      "version" : "8.0.0"
    }
  }
}

@jasonrhodes you say "trigger a rollover" but this index doesn't appear to be ILM managed at the moment. So I tried deleting it and it came back with the same mappings.
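
To make the mismatch above concrete, here is a small Python sketch that compares the fields the clusters query relies on with the fields present in the mapping and in the stored document. The helper `flatten` and the trimmed-down sample data are made up for illustration; the field names come from the query, mapping, and document shown above.

```python
def flatten(doc: dict, prefix: str = "") -> set:
    """Collect the dotted paths of all leaf fields in a nested document."""
    paths = set()
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            paths |= flatten(value, prefix=f"{path}.")
        else:
            paths.add(path)
    return paths


# Fields the clusters query filters, collapses, or sorts on.
query_fields = {"type", "metricset.name", "timestamp", "cluster_uuid"}

# Fields reported by the field-mapping response above.
mapped_fields = {"cluster_uuid", "type", "timestamp"}

# Trimmed copy of the stored document's _source.
source = {
    "@timestamp": "2021-11-08T05:56:12.649Z",
    "elasticsearch": {"cluster": {"id": "Q3Lnz1DZT6SbwJab1tuZOQ"}},
    "metricset": {"name": "cluster_stats", "period": 10000},
}
doc_fields = flatten(source)

# Query fields that are both mapped and present in the document.
usable = query_fields & mapped_fields & doc_fields
print(sorted(usable))  # prints []
```

Assuming unmapped fields are not searchable in this index, none of the fields the query uses can match: metricset.name exists only in the document, while type, timestamp, and cluster_uuid exist only in the mapping.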

@matschaffer
Contributor

matschaffer commented Nov 8, 2021

If I add these fields to the index mapping directly, I can get something that works sort of like the cluster API query:

PUT .monitoring-es-7-mb-2021.11.08/_mapping
(... original mapping from index ...)
    "@timestamp": {
      "type": "date",
      "format": "date_time"
    },
    "metricset": {
      "properties": {
        "name": {
          "type": "keyword"
        }
      }
    },
    "elasticsearch": {
      "properties": {
        "cluster": {
          "properties": {
            "id": {
              "type": "keyword"
            }
          }
        }
      }
    }

GET .monitoring-es-7-*/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "type": "cluster_stats"
                }
              },
              {
                "term": {
                  "metricset.name": "cluster_stats"
                }
              }
            ]
          }
        },
        {
          "range": {
            "@timestamp": {
              "gte": "now-15m"
            }
          }
        }
      ]
    }
  },
  "collapse": {
    "field": "elasticsearch.cluster.id"
  },
  "sort": {
    "@timestamp": {
      "order": "desc",
      "unmapped_type": "long"
    }
  }
}

Response:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [
      {
        "_index" : ".monitoring-es-7-mb-2021.11.08",
        "_id" : "YoE3_nwBovl19JwQ0UTi",
        "_score" : null,
        "_source" : {
          "@timestamp" : "2021-11-08T06:24:02.726Z",
          "metricset" : {
            "name" : "cluster_stats",
            "period" : 10000
          },
          "service" : {
            "address" : "http://localhost:9200",
            "type" : "elasticsearch"
          },
          "elasticsearch" : {
            "cluster" : {
              "name" : "elasticsearch",
              "id" : "Q3Lnz1DZT6SbwJab1tuZOQ",
              "stats" : {
                "indices" : {
                  "docs" : {
                    "total" : 8473
                  },
                  "total" : 13,
                  "shards" : {
                    "primaries" : 13,
                    "count" : 13
                  },
                  "store" : {
                    "size" : {
                      "bytes" : 10944059
                    }
                  },
                  "fielddata" : {
                    "memory" : {
                      "bytes" : 0
                    }
                  }
                },
                "status" : "yellow",
                "nodes" : {
                  "jvm" : {
                    "max_uptime" : {
                      "ms" : 2728383
                    },
                    "memory" : {
                      "heap" : {
                        "max" : {
                          "bytes" : 1610612736
                        },
                        "used" : {
                          "bytes" : 315065344
                        }
                      }
                    }
                  },
                  "versions" : [
                    "8.1.0"
                  ],
                  "count" : 1,
                  "master" : 1,
                  "fs" : {
                    "total" : {
                      "bytes" : 1000240963584
                    },
                    "available" : {
                      "bytes" : 798194380800
                    }
                  }
                },
                "stack" : {
                  "xpack" : {
                    "ccr" : {
                      "available" : true,
                      "enabled" : true
                    }
                  }
                },
                "license" : {
                  "status" : "active",
                  "type" : "trial",
                  "expiry_date_in_millis" : 1.638941935363E12
                },
                "expiry_date_in_millis" : 1638941935363,
                "state" : {
                  "nodes" : {
                    "Nz-zb0pKSRizekSCpukQrQ" : {
                      "roles" : [
                        "data",
                        "data_cold",
                        "data_content",
                        "data_frozen",
                        "data_hot",
                        "data_warm",
                        "ingest",
                        "master",
                        "ml",
                        "remote_cluster_client",
                        "transform"
                      ],
                      "name" : "matschaffer-mbp2019.lan",
                      "ephemeral_id" : "PxaqNpLYTyuZ8wcVZWh9gw",
                      "transport_address" : "127.0.0.1:9300",
                      "attributes" : {
                        "ml.machine_memory" : "68719476736",
                        "xpack.installed" : "true",
                        "ml.max_jvm_size" : "1610612736"
                      }
                    }
                  },
                  "nodes_hash" : -1921010294,
                  "master_node" : "Nz-zb0pKSRizekSCpukQrQ",
                  "state_uuid" : "DqItJY6QQG-B0ujVY_32Yw"
                }
              }
            },
            "version" : 109
          },
          "host" : {
            "os" : {
              "type" : "macos",
              "platform" : "darwin",
              "version" : "11.6.1",
              "family" : "darwin",
              "name" : "macOS",
              "kernel" : "20.6.0",
              "build" : "20G224"
            },
            "id" : "815EB661-322B-5B8B-A0BA-C83E911AC99A",
            "ip" : [
              "fe80::aede:48ff:fe00:1122",
              "fe80::1828:42ad:ec20:58d4",
              "192.168.86.27",
              "fe80::48e5:17ff:fe07:b4bb",
              "fe80::48e5:17ff:fe07:b4bb",
              "fe80::2b35:5154:b3c2:84f9",
              "fe80::6f74:bd52:cda2:f936",
              "fe80::e50b:d4d:1e8e:406"
            ],
            "mac" : [
              "ac:de:48:00:11:22",
              "3e:22:fb:a2:28:5f",
              "3c:22:fb:a2:28:5f",
              "4a:e5:17:07:b4:bb",
              "4a:e5:17:07:b4:bb",
              "82:c1:12:47:1c:00",
              "82:c1:12:47:1c:01",
              "82:c1:12:47:1c:05",
              "82:c1:12:47:1c:04",
              "82:c1:12:47:1c:01"
            ],
            "name" : "matschaffer-mbp2019.lan",
            "hostname" : "matschaffer-mbp2019.lan",
            "architecture" : "x86_64"
          },
          "agent" : {
            "name" : "matschaffer-mbp2019.lan",
            "type" : "metricbeat",
            "version" : "8.1.0",
            "ephemeral_id" : "3821e274-0fd5-4fc7-89c6-403b4615d3fa",
            "id" : "499ba867-33ed-47ca-afd2-067cde5b680f"
          },
          "ecs" : {
            "version" : "8.0.0"
          },
          "event" : {
            "module" : "elasticsearch",
            "duration" : 36039175,
            "dataset" : "elasticsearch.cluster.stats"
          }
        },
        "fields" : {
          "elasticsearch.cluster.id" : [
            "Q3Lnz1DZT6SbwJab1tuZOQ"
          ]
        },
        "sort" : [
          1636352642726
        ]
      }
    ]
  }
}
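
The manual rewrite of the query above amounts to renaming the legacy fields to the names Metricbeat actually writes. A hedged Python sketch of that renaming follows; `RENAMES` and `rename_fields` are hypothetical names, and the table only covers the two fields changed in this comment.

```python
RENAMES = {
    "timestamp": "@timestamp",
    "cluster_uuid": "elasticsearch.cluster.id",
}


def rename_fields(node):
    """Recursively rewrite legacy field names in a query body.

    Dict keys are treated as field names, as are the string values of
    "field" entries (as used by "collapse").
    """
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            if key == "field" and isinstance(value, str):
                out[key] = RENAMES.get(value, value)
            else:
                out[RENAMES.get(key, key)] = rename_fields(value)
        return out
    if isinstance(node, list):
        return [rename_fields(item) for item in node]
    return node


legacy = {
    "query": {"bool": {"filter": [{"range": {"timestamp": {"gte": "now-15m"}}}]}},
    "collapse": {"field": "cluster_uuid"},
    "sort": {"timestamp": {"order": "desc", "unmapped_type": "long"}},
}
rewritten = rename_fields(legacy)
print(rewritten["collapse"])  # prints {'field': 'elasticsearch.cluster.id'}
```

Renaming every matching key is naive, but it is enough for a query of this shape.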

@matschaffer
Contributor

Mind you I'm just working on assumptions here. If the idea is that metricbeat should be publishing this way, and SM UI needs updates that's fine. Just not sure where that work is filed.

Though it does seem strange that the document structure doesn't line up with the index mapping very well.

@jasonrhodes
Member

Mind you I'm just working on assumptions here. If the idea is that metricbeat should be publishing this way, and SM UI needs updates that's fine. Just not sure where that work is filed.

Though it does seem strange that the document structure doesn't line up with the index mapping very well.

This all should be powered by field aliases applied to these mappings, so all of our investigation should be pointed at that. No Stack Monitoring UI changes are expected, but source data also won't look exactly right all the time. Queries that rely on fields for aggregations, filters, etc. should work though, because the aliases should be in place.

If the UI doesn't load with these changes, we still have work to do, because I expected that it would.
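
As a rough illustration of why aliases help queries but not reads from _source, the Python sketch below models alias resolution. It is a toy model, not Elasticsearch code; `resolve`, `get_path`, `search_value`, and the sample mapping are assumptions for the example.

```python
MAPPING = {
    "cluster_uuid": {"type": "alias", "path": "elasticsearch.cluster.id"},
    "elasticsearch.cluster.id": {"type": "keyword"},
}

SOURCE = {"elasticsearch": {"cluster": {"id": "Q3Lnz1DZT6SbwJab1tuZOQ"}}}


def resolve(field: str) -> str:
    """Follow an alias to its target; concrete fields resolve to themselves."""
    entry = MAPPING.get(field, {})
    return entry["path"] if entry.get("type") == "alias" else field


def get_path(doc: dict, path: str):
    """Read a dotted path from a nested document, or None if it is absent."""
    node = doc
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def search_value(field: str):
    """What a query or aggregation on `field` sees: the alias target's value."""
    return get_path(SOURCE, resolve(field))


print(search_value("cluster_uuid"))      # prints Q3Lnz1DZT6SbwJab1tuZOQ
print(get_path(SOURCE, "cluster_uuid"))  # prints None
```

The second call mirrors code that reads cluster_uuid straight from _source: the alias is not materialized there, which is the "source data also won't look exactly right" caveat.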

@jasonrhodes
Member

@sayden @masci can you all confirm that this PR is meant to resolve the updated AC of #26480 ? If so, can we link them in the description? Thanks! I will help @matschaffer and team dig into this to see if it does what I expected for powering the UI in 8.0/main.

@matschaffer
Contributor

@jasonrhodes gotcha. I didn't see any field aliases in my testing and SM UI did not load. Seems like a disconnect somewhere. Good to know that SM UI code isn't expected to change so we'll need to work out what's up on the beats end.

@matschaffer
Contributor

Hey y'all, per @andresrc's request I re-ran the test in #28365 (comment) and got the same results on the latest master/main for beats & kibana.

Then I also tested with xpack.enabled: false. With this configuration we get the aliases we need on the metricbeat-* index.

[Image: Screen Shot 2021-11-11 at 11 22 36]

But I can't find any cluster_stats documents, which is what the SM UI expects in order to identify possible clusters.

[Image: Screen Shot 2021-11-11 at 11 23 00]

I also recorded a video, I'll send that to the email thread we have going on this issue as well.

@matschaffer
Contributor

Just wanted to record that I tried an override template:

PUT _template/.monitoring-es-mb
{
  "order": 1,
  "version": 7140099,
  "index_patterns": [
    ".monitoring-es-*-mb-*"
  ],
  "mappings": {
    "properties": {
      "cluster_uuid": {
        "type": "alias",
        "path": "elasticsearch.cluster.id"
      },
      "elasticsearch": {
        "properties": {
          "cluster": {
            "properties": {
              "id": {
                "type": "keyword"
              }
            }
          }
        }
      },
      "timestamp": {
        "type": "alias",
        "path": "@timestamp"
      },
      "@timestamp": {
        "type": "date"
      }
    }
  }
}

This fails because the .monitoring-es template (installed by ES) has timestamp mapped with a format attribute, which the alias field type won't accept.
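
One way to picture this failure, assuming legacy templates that match the same index are merged in order, with later ones overlaying earlier ones field by field: the timestamp entry ends up carrying both the alias definition and the inherited format. The Python sketch below models that merge; `deep_merge` and `check_alias` are illustrative helpers, and the rule that an alias accepts only type and path is my reading of the behaviour described here rather than a quote from the Elasticsearch source.

```python
def deep_merge(base: dict, overlay: dict) -> dict:
    """Return base with overlay applied recursively (overlay wins on conflicts)."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def check_alias(name: str, field: dict) -> None:
    """Reject alias fields that carry parameters other than type and path."""
    extra = set(field) - {"type", "path"}
    if field.get("type") == "alias" and extra:
        raise ValueError(
            f"alias field [{name}] has unsupported parameters {sorted(extra)}"
        )


# Lower-order template, modelled on the timestamp mapping shown earlier.
es_template = {"timestamp": {"type": "date", "format": "date_time"}}

# Higher-order override from this comment.
override = {
    "timestamp": {"type": "alias", "path": "@timestamp"},
    "@timestamp": {"type": "date"},
}

merged = deep_merge(es_template, override)
try:
    check_alias("timestamp", merged["timestamp"])
except ValueError as err:
    print(err)
```

Under this model the override cannot drop format, because a merge only adds or replaces keys; it never removes them.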

@kvch kvch added the backport-v8.0.0 Automated backport with mergify label Dec 7, 2021
kvch pushed a commit that referenced this pull request Dec 7, 2021
…on standalone metricbeat (#28365)

(cherry picked from commit 462f42f)
@jasonrhodes
Member

Updating this ticket since we were talking here about missing mappings/aliases:

I spoke with @jbaiera and we are going to let Elasticsearch install these mappings/aliases for us, targeted only at the Metricbeat standalone indices. We just need to provide ES with the JSON files for the templates we need (they apparently need to be component templates composed into an index template, since these .monitoring indices will really be data streams, but the ES APIs for adding templates at startup allow for component templates).

@sayden is working now on getting the JSON together for this and will provide it as soon as it's ready. ping @matschaffer / @klacabane
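
For orientation, here is a hedged sketch of what such a pair of template bodies might look like, written as Python dictionaries that mirror the Elasticsearch component template and composable index template request bodies. The component template name and the two aliased fields are placeholders chosen for illustration, and the index pattern is borrowed from the override template tried earlier; the JSON that actually ships may differ.

```python
import json

# Hypothetical component template holding the mappings and aliases.
component_template = {
    "template": {
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "timestamp": {"type": "alias", "path": "@timestamp"},
                "elasticsearch": {
                    "properties": {
                        "cluster": {"properties": {"id": {"type": "keyword"}}}
                    }
                },
                "cluster_uuid": {
                    "type": "alias",
                    "path": "elasticsearch.cluster.id",
                },
            }
        }
    }
}

# Hypothetical index template that composes the component above and
# marks matching indices as data streams.
index_template = {
    "index_patterns": [".monitoring-es-*-mb-*"],
    "data_stream": {},
    "composed_of": ["monitoring-es-mb-mappings"],  # placeholder name
}

print(json.dumps(index_template, indent=2))
```

Because the aliases live in a template that applies only to the Metricbeat-written indices, they do not have to merge with the existing timestamp mapping that carries a format.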

kvch pushed a commit that referenced this pull request Dec 7, 2021
…on standalone metricbeat (#28365) (#29314)

(cherry picked from commit 462f42f)

Co-authored-by: Mario Castro <[email protected]>
@sayden sayden deleted the sm/elasticsearch/write_to_old_index branch August 25, 2022 12:50
Labels
backport-skip Skip notification from the automated backport with mergify backport-v8.0.0 Automated backport with mergify bug Feature:Stack Monitoring Team:Integrations Label for the Integrations team v8.0.0
Projects
None yet
7 participants