[Bug]: "failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node 114" #36593
Comments
@1292253144 could you please try searching with the collection name instead of the collection alias? That would help us find out whether the alias is what's not working.
/assign @1292253144
My guess is that func (m *MetaCache) update(ctx context.Context, database, collectionName string, collectionID UniqueID) (*collectionInfo, error) is not handling aliases correctly. @SimFG please help with it.
/assign @SimFG
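To make the guess above concrete, here is a minimal Go sketch of how a cache keyed by the request name could keep returning an old node after an alias is re-pointed. Everything in it is hypothetical (the cache type, collection IDs 1 and 2, node 115); it is not the real MetaCache code, and only alias aa and node 114 come from this issue.

```go
package main

import "fmt"

// cache is a hypothetical, simplified stand-in for a proxy-side cache.
// It is NOT Milvus's MetaCache; it only illustrates the suspected failure mode.
type cache struct {
	aliases       map[string]int64   // alias -> collection ID it currently points to
	leadersByName map[string][]int64 // shard leaders keyed by the name used in the request
	leadersByID   map[int64][]int64  // shard leaders keyed by resolved collection ID
}

// lookupByName returns whatever was cached under the request name, so a
// re-pointed alias keeps yielding the previous collection's nodes.
func (c *cache) lookupByName(name string) []int64 {
	return c.leadersByName[name]
}

// lookupByID resolves the alias first and then looks up leaders by collection
// ID, so it follows the alias to its current target.
func (c *cache) lookupByID(name string) ([]int64, bool) {
	id, ok := c.aliases[name]
	if !ok {
		return nil, false
	}
	nodes, ok := c.leadersByID[id]
	return nodes, ok
}

func main() {
	c := &cache{
		aliases:       map[string]int64{"aa": 1},         // aa -> yesterday's collection (hypothetical ID 1)
		leadersByName: map[string][]int64{"aa": {114}},   // cached while aa pointed at collection 1
		leadersByID:   map[int64][]int64{1: {114}, 2: {115}}, // node 115 is made up
	}

	// The alias is re-pointed to today's collection (hypothetical ID 2).
	c.aliases["aa"] = 2

	fmt.Println("by name:", c.lookupByName("aa")) // still the old collection's node
	nodes, _ := c.lookupByID("aa")
	fmt.Println("by ID:  ", nodes) // follows the alias to the new collection
}
```

Whether the proxy actually behaves like lookupByName here is exactly what would need to be confirmed in the code.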
it is easy to reproduce if there are multiple replicas... @SimFG
the error is different, and the search requests recover in 1 second
is this a recoverable error?
Yeah, it'll recover on its own in about half an hour
What might be the cause of this, and how can I avoid it?
There is actually a bug in rocksmq (standalone only): every 200 ms it consumes only 1k messages. Since your cluster hasn't had any inserts for a very long time, all the data in rocksmq is timeticks, so it takes a relatively long time to consume them all. With this bug fixed, the watch DML should recover in 1-2 minutes.
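For a rough sense of scale, the rate described above (at most 1k messages per 200 ms) can be turned into a back-of-the-envelope drain-time estimate. The backlog of 9,000,000 timeticks below is an assumed figure, picked only because it works out to the half hour reported in this thread; the function name drainTime is made up for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// drainTime estimates how long a consumer needs to work through a backlog
// when it handles at most batchSize messages per interval.
func drainTime(backlog, batchSize int64, interval time.Duration) time.Duration {
	if backlog <= 0 || batchSize <= 0 {
		return 0
	}
	batches := (backlog + batchSize - 1) / batchSize // ceiling division
	return time.Duration(batches) * interval
}

func main() {
	// 1k messages per 200 ms is the rate mentioned in this thread.
	// The 9,000,000-message backlog is an assumption, not a measured value.
	fmt.Println(drainTime(9_000_000, 1000, 200*time.Millisecond)) // prints 30m0s
}
```

At that rate the consumer gets through about 5,000 messages per second, so even a backlog made up purely of timeticks can take a long time to clear.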
In which version was this bug fixed?
@1292253144
Ignore me, I thought this was a bug related to aliases where some meta cache is not updated
@1292253144 could you please help collect an etcd backup for investigation? Please refer to this doc: https:/milvus-io/birdwatcher to back up etcd with birdwatcher /assign @1292253144
According to the information in the log and issue description, can you help confirm:
@1292253144 the search has been reporting this error for half an hour, right?
The duration of the error isn't certain, but it is more than half an hour
@1292253144 Could you please confirm the three questions above? Also, could you provide a complete log of the error period?
Is there an existing issue for this?
Environment
Current Behavior
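Accessing the collection vector_info_day_2024_09_29 through the alias aa fails with: failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0. However, Attu shows that this collection does not use query node ID 114 at all.
But the collection that alias aa was previously bound to, vector_info_day_2024_09_28, did use query node ID 114.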
Expected Behavior
No response
Steps To Reproduce
No response
Milvus Log
[2024/09/29 02:09:47.858 +00:00] [WARN] [proxy/lb_policy.go:169] ["search/query channel failed, node not available"] [traceID=b71e1ccf9e642c9247dc8286866a195d] [collectionID=452131967330518986] [collectionName=ads_aic_app_album_photo_vector_info_day] [channelName=by-dev-rootcoord-dml_12_452131967330518986v0] [nodeID=114] [error="can not find client of node 114"]
[2024/09/29 02:09:47.858 +00:00] [WARN] [retry/retry.go:46] ["retry func failed"] [traceID=b71e1ccf9e642c9247dc8286866a195d] [retried=
0] [error="failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node 114"] [er
rorVerbose="failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node 114\n(1)
attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1\n |
t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:176\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/go/src
/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRet
ry\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:154\n | github.com/milvus-io/milvus/internal/proxy.(*LBPoli
cyImpl).Execute.func2\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:218\n | golang.org/x/sync/errgroup.(*Gro
up).Go.func1\n | \t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75\n | runtime.goexit\n | \t/usr/local/go/src/runtime/
asm_amd64.s:1598\nWraps: (2) failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0\nWraps: (3) can not
find client of node 114\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]
[2024/09/29 02:09:48.035 +00:00] [INFO] [proxy/meta_cache.go:395] ["meta update success"] [database=default] [collectionName=ads_aic_app_smallvideo_title_vector_info_day_v2_2024_09_29] [collectionID=452131967344872926]
[2024/09/29 02:09:48.038 +00:00] [INFO] [proxy/meta_cache.go:395] ["meta update success"] [database=default] [collectionName=ads_aic_app_smallvideo_title_vector_info_day_v2_2024_09_29] [collectionID=452131967344872926]
[2024/09/29 02:09:48.040 +00:00] [INFO] [proxy/meta_cache.go:395] ["meta update success"] [database=default] [collectionName=ads_aic_app_smallvideo_title_vector_info_day_v2_2024_09_29] [collectionID=452131967344872926]
[2024/09/29 02:09:48.042 +00:00] [INFO] [proxy/meta_cache.go:395] ["meta update success"] [database=default] [collectionName=ads_aic_app_smallvideo_title_vector_info_day_v2_2024_09_29] [collectionID=452131967344872926]
[2024/09/29 02:09:48.059 +00:00] [INFO] [proxy/meta_cache.go:994] ["clearing shard cache for collection"] [collectionName=ads_aic_app_album_photo_vector_info_day]
[2024/09/29 02:09:48.061 +00:00] [WARN] [proxy/lb_policy.go:126] ["no available shard delegator found"] [traceID=b71e1ccf9e642c9247dc8286866a195d] [collectionID=452131967330518986] [collectionName=ads_aic_app_album_photo_vector_info_day] [channelName=by-dev-rootcoord-dml_12_452131967330518986v0] [nodes="[114]"] [excluded="[114]"]
[2024/09/29 02:09:48.061 +00:00] [WARN] [proxy/lb_policy.go:157] ["failed to select node for shard"] [traceID=b71e1ccf9e642c9247dc8286866a195d] [collectionID=452131967330518986] [collectionName=ads_aic_app_album_photo_vector_info_day] [channelName=by-dev-rootcoord-dml_12_452131967330518986v0] [nodeID=-1] [error="channel not available[channel=no available shard delegator found]"]
[2024/09/29 02:09:48.061 +00:00] [WARN] [proxy/task_search.go:511] ["search execute failed"] [traceID=b71e1ccf9e642c9247dc8286866a195d
] [nq=1] [error="failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node 114
"] [errorVerbose="failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node 11
4\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1
n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:176\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/
go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/proxy.(LBPolicyImpl).ExecuteW
ithRetry\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:154\n | github.com/milvus-io/milvus/internal/proxy.(
LBPolicyImpl).Execute.func2\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_policy.go:218\n | golang.org/x/sync/errgroup
.(*Group).Go.func1\n | \t/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75\n | runtime.goexit\n | \t/usr/local/go/src/ru
ntime/asm_amd64.s:1598\nWraps: (2) failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0\nWraps: (3) ca
n not find client of node 114\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]
[2024/09/29 02:09:48.062 +00:00] [WARN] [proxy/task_scheduler.go:469] ["Failed to execute task: "] [traceID=b71e1ccf9e642c9247dc828686
6a195d] [error="failed to search: failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find c
lient of node 114"] [errorVerbose="failed to search: failed to get delegator 114 for channel by-dev-rootcoord-dml_12_45213196733051898
6v0: can not find client of node 114\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*se
archTask).Execute\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:512\n | github.com/milvus-io/milvus/intern
al/proxy.(*taskScheduler).processTask\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:466\n | github.com/
milvus-io/milvus/internal/proxy.(*taskScheduler).queryLoop.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_schedu
ler.go:545\n | github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n | \t/go/src/github.com/milvus-io/milvus/pkg/uti
l/conc/pool.go:81\n | [...repeated from below...]\nWraps: (2) failed to search\nWraps: (3) attached stack trace\n -- stack trace:\n
| github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1\n | \t/go/src/github.com/milvus-io/milvus/intern
al/proxy/lb_policy.go:176\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/retry
/retry.go:44\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry\n | \t/go/src/github.com/milvus-io/milv
us/internal/proxy/lb_policy.go:154\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).Execute.func2\n | \t/go/src/github
.com/milvus-io/milvus/internal/proxy/lb_policy.go:218\n | golang.org/x/sync/errgroup.(*Group).Go.func1\n | \t/go/pkg/mod/golang.org/
x/[email protected]/errgroup/errgroup.go:75\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (4) failed to get
delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0\nWraps: (5) can not find client of node 114\nError types: (1) *
withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *errors.errorString"]
[2024/09/29 02:09:48.062 +00:00] [WARN] [proxy/impl.go:2919] ["Search failed to WaitToFinish"] [traceID=b71e1ccf9e642c9247dc8286866a19
5d] [role=proxy] [db=default] [collection=ads_aic_app_album_photo_vector_info_day] [partitions="[]"] [dsl="series_name in ["胜达"]"] [le
n(PlaceholderGroup)=3084] [OutputFields="[photo_url,photo_weight,id]"] [search_params="[{"key":"anns_field","value":"vector"},
{"key":"topk","value":"5"},{"key":"metric_type","value":"L2"},{"key":"round_decimal","value":"-1"},{"key":"
offset","value":"0"},{"key":"params","value":"{\"nprobe\": 10}"}]"] [guarantee_timestamp=1727575782844] [nq=1] [error
="failed to search: failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not find client of node
114"] [errorVerbose="failed to search: failed to get delegator 114 for channel by-dev-rootcoord-dml_12_452131967330518986v0: can not f
ind client of node 114\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/proxy.(*searchTask).Exec
ute\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_search.go:512\n | github.com/milvus-io/milvus/internal/proxy.(*tas
kScheduler).processTask\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:466\n | github.com/milvus-io/milv
us/internal/proxy.(*taskScheduler).queryLoop.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/task_scheduler.go:545\n
| github.com/milvus-io/milvus/pkg/util/conc.(*Pool[...]).Submit.func1\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/conc/pool.go
:81\n | [...repeated from below...]\nWraps: (2) failed to search\nWraps: (3) attached stack trace\n -- stack trace:\n | github.com/
milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry.func1\n | \t/go/src/github.com/milvus-io/milvus/internal/proxy/lb_po
licy.go:176\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/go/src/github.com/milvus-io/milvus/pkg/util/retry/retry.go:44\n
| github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).ExecuteWithRetry\n | \t/go/src/github.com/milvus-io/milvus/internal/pr
oxy/lb_policy.go:154\n | github.com/milvus-io/milvus/internal/proxy.(*LBPolicyImpl).Execute.func2\n | \t/go/src/github.com/milvus-io
/milvus/internal/proxy/lb_policy.go:218\n | golang.org/x/sync/errgroup.(*Group).Go.func1\n | \t/go/pkg/mod/golang.org/x/[email protected]/
errgroup/errgroup.go:75\n | runtime.goexit\n | \t/usr/local/go/src/runtime/asm_amd64.s:1598\nWraps: (4) failed to get delegator 114
for channel by-dev-rootcoord-dml_12_452131967330518986v0\nWraps: (5) can not find client of node 114\nError types: (1) *withstack.with
Stack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *errors.errorString"]
[2024/09/29 02:09:48.067 +00:00] [INFO] [proxy/meta_cache.go:395] ["meta update success"] [database=default] [collectionName=ads_aic_app_smallvideo_text_vector_info_day_v2_2024_09_29] [collectionID=452131967332546943]
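When reading logs like the ones above, it can help to pull out the bracketed key=value fields (nodeID, channelName, and so on). A minimal sketch, assuming fields look like [key=value] or [key="quoted value"]; parseFields and the regular expression are illustrative helpers, not part of Milvus, and fields with unescaped nested quotes (such as dsl in the "Search failed to WaitToFinish" entry) will not parse cleanly.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fieldRe matches [key=value] where value is either a double-quoted string
// (with backslash escapes) or a run of characters up to the closing bracket.
var fieldRe = regexp.MustCompile(`\[(\w+)=("(?:[^"\\]|\\.)*"|[^\]]*)\]`)

// parseFields collects the key=value fields of one log line into a map.
// Surrounding double quotes are stripped from quoted values.
func parseFields(line string) map[string]string {
	fields := make(map[string]string)
	for _, m := range fieldRe.FindAllStringSubmatch(line, -1) {
		fields[m[1]] = strings.Trim(m[2], `"`)
	}
	return fields
}

func main() {
	// Shortened copy of the first WARN entry in the log above.
	line := `[2024/09/29 02:09:47.858 +00:00] [WARN] [proxy/lb_policy.go:169] ` +
		`["search/query channel failed, node not available"] ` +
		`[channelName=by-dev-rootcoord-dml_12_452131967330518986v0] ` +
		`[nodeID=114] [error="can not find client of node 114"]`

	f := parseFields(line)
	fmt.Println("node:   ", f["nodeID"])
	fmt.Println("channel:", f["channelName"])
	fmt.Println("error:  ", f["error"])
}
```

The timestamp, level, source location, and message brackets are skipped because they contain no key= prefix.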
Anything else?
No response