
Build error: internal-storage-create is not ready yet. Please wait for a moment! #4951

Closed
cdd1993 opened this issue Oct 12, 2020 · 20 comments · Fixed by #4962

@cdd1993

cdd1993 commented Oct 12, 2020

Hi all,
When I execute quick-start-kubespray.sh, the following error appears:
[Screenshot: WeChat Screenshot_20201012171623]
[Screenshot: WeChat Screenshot_20201012171640]

Hope to get an answer from you all; much appreciated.

@hchoi405

I met the same error for v1.2.0...

@hzy46
Contributor

hzy46 commented Oct 13, 2020

Hi, could you provide the following material?

  1. On your master node, run df -h and give me the result.

  2. On your dev box machine, run

kubectl logs `kubectl get po  | grep internal-storage | awk '{print $1}'`

and give me the log.

@cdd1993
Author

cdd1993 commented Oct 13, 2020

Hi, could you provide the following material?

  1. On your master node, run df -h and give me the result.
  2. On your dev box machine, run

kubectl logs `kubectl get po  | grep internal-storage | awk '{print $1}'`

and give me the log.

Hi hzy46, the outputs of the commands are as follows; I am not sure whether the second one is what you want:
[Screenshot: master]
[Screenshot: dier]

@hzy46
Contributor

hzy46 commented Oct 13, 2020

Could you run kubectl get ds? Maybe the pod is not created yet.
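For reference, a minimal sketch of how to dig a bit deeper, assuming the daemonset is named internal-storage-create as in the error message (adjust the name to whatever kubectl get ds actually reports):

kubectl get ds
kubectl describe ds internal-storage-create

The describe output includes the Node-Selector the daemonset requires, which matters for the label check below.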

@cdd1993
Author

cdd1993 commented Oct 13, 2020

Could you run kubectl get ds? Maybe the pod is not created yet.

Hi Zhiyuan, the output is as follows:
[Screenshot: output]

@hzy46
Contributor

hzy46 commented Oct 13, 2020

The ds is created but the pod is not created. Maybe your nodes are not correctly labeled. Please help me confirm by running kubectl get node --show-labels and give me the result.
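A hedged sketch of what to compare, assuming the daemonset is named internal-storage-create as in the error message: a node only gets a pod if its labels contain every key=value pair in the daemonset's nodeSelector.

kubectl get node --show-labels
kubectl get ds internal-storage-create -o jsonpath='{.spec.template.spec.nodeSelector}'; echo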

@cdd1993
Author

cdd1993 commented Oct 13, 2020

kubectl get node --show-labels
Hi Zhiyuan, thanks for your reply. The output of kubectl get node --show-labels is as follows:
[Screenshot: output1]

@hzy46
Contributor

hzy46 commented Oct 13, 2020

The root cause is that the nodes are not correctly labeled. I have fixed this issue in #4925. Are you installing v1.2.0? I think v1.2.0 won't have this problem.
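For completeness, a hedged sketch of how a missing label would be applied by hand; <node-name>, <label-key>, and <label-value> are placeholders to be filled in from your own output of the nodeSelector check above, and the cleaner fix is the version change discussed below:

kubectl label node <node-name> <label-key>=<label-value>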

@cdd1993
Author

cdd1993 commented Oct 13, 2020

The root cause is that the nodes are not correctly labeled. I have fixed this issue in #4925. Are you installing v1.2.0? I think v1.2.0 won't have this problem.

I think the version I installed is v1.0.0, so I need to change the branch_name and docker_image_tag in the config?

@hzy46
Contributor

hzy46 commented Oct 13, 2020

Yes, please change them and re-run the installation script.
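A minimal sketch of the edit, with assumed values: branch_name and docker_image_tag are the fields named in this thread, while pai-1.2.y and v1.2.0 are assumptions about the release's branch/tag names, and the config file's name and location depend on how you set up the quick start.

# in the cluster config used by the quick-start script, set something like:
#   branch_name: pai-1.2.y        # assumed branch name for the v1.2.0 release
#   docker_image_tag: v1.2.0      # assumed image tag for the v1.2.0 release
# then re-run the installation the same way as before:
/bin/bash quick-start-kubespray.sh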

@cdd1993
Author

cdd1993 commented Oct 14, 2020

Yes, please change them and re-run the installation script.

Hi Zhiyuan, I have changed the version to v1.2.0, but got another error; the details are shown in the picture:
[Screenshot: output3]

@hzy46
Contributor

hzy46 commented Oct 14, 2020

@Binyang2014 Could you have a look at this issue? I think it is related to the resource calculation.

@Binyang2014
Contributor

Binyang2014 commented Oct 14, 2020

@cdd1993 This is a bug. Could you change the code like this:

reserved_cpu = min(node_resource_allocatable[key]["cpu-resource"] * Decimal(PAI_RESERVE_RESOURCE_PERCENTAGE), Decimal(PAI_MAX_RESERVE_CPU_PER_NODE))
reserved_mem = min(node_resource_allocatable[key]["mem-resource"] * Decimal(PAI_RESERVE_RESOURCE_PERCENTAGE), Decimal(PAI_MAX_RESERVE_MEMORY_PER_NODE))
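As a hedged illustration with made-up numbers (neither constant's real value appears in this thread): if a node has 24 allocatable CPUs, PAI_RESERVE_RESOURCE_PERCENTAGE is 0.1, and PAI_MAX_RESERVE_CPU_PER_NODE is 2, the reserved CPU becomes min(24 * 0.1, 2) = 2 instead of 2.4, i.e. the percentage-based reservation is capped at the configured per-node maximum.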
Here is the related PR: #4962
The diff can be found at https://github.com/microsoft/pai/pull/4962/files

@hzy46
Contributor

hzy46 commented Oct 14, 2020

@Binyang2014 Could you provide a branch (v1.2.0 + your commit) for the user? With the quick start script, it's hard for the user to modify the code.

@Binyang2014
Contributor

Binyang2014 commented Oct 14, 2020

@hzy46 I created a branch named binyli/pai-1.2-config-generate.
@cdd1993 Please try branch binyli/pai-1.2-config-generate to deploy PAI.
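A minimal sketch of deploying from that branch, assuming the dev box already has a clone of this repository and the quick start is run from it, in the same working directory as before:

git fetch origin
git checkout binyli/pai-1.2-config-generate
/bin/bash quick-start-kubespray.sh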

@fanyangCS
Contributor

@Binyang2014, should we deploy a hotfix?

fanyangCS reopened this Oct 15, 2020
@Binyang2014
Contributor

@fanyangCS I will sync with @yiyione about this. Since it's a deployment issue and the 1.3 release is still under development, it's better to provide a hot-fix.

@fanyangCS
Contributor

OK, please do so.

Binyang2014 added a commit that referenced this issue Oct 15, 2020
@Binyang2014
Contributor

@cdd1993 We provided a hot-fix in the 1.2.1 release. Please try this version.
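A hedged sketch of switching to it, same shape as the earlier config change; v1.2.1 as the image tag is an assumption, so take the exact branch/tag names from the 1.2.1 release notes:

#   docker_image_tag: v1.2.1      # assumed tag for the 1.2.1 release
/bin/bash quick-start-kubespray.sh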

@fanyangCS
Contributor

Feel free to reopen this if you run into further issues.
