Automatic GC times out #3436

Closed
ghost opened this issue Nov 29, 2016 · 4 comments · Fixed by #3494
Labels
kind/bug A bug in existing code (including security flaws) topic/perf Performance topic/repo Topic repo
Milestone

Comments

@ghost

ghost commented Nov 29, 2016

Version information: 0.4.5-dev-e4be1b2

Type: Bug

Priority: P2

Description:

The automatic GC times out. StorageMax is 54G and the repo is 60G in size, so the timeout for GC should be something like 6 minutes.

From jupiter:

23:41:29.807 WARNI   corerepo: pre-GC: Maximum storage limit exceeded. Maybe unpin some files? gc.go:187
23:41:29.807  INFO   corerepo: Watermark exceeded. Starting repo GC... gc.go:191
23:42:58.626 ERROR   corerepo: context deadline exceeded gc.go:165

Note how the context times out after less than 1.5 minutes already.
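
For reference, a minimal sketch of the expected-timeout arithmetic described above, assuming the deadline is derived from the storage slack at roughly one minute per gigabyte (the actual derivation is quoted from gc.go later in this thread); the variable names here are illustrative only:

	package main

	import (
		"fmt"
		"time"
	)

	func main() {
		// 6 GB of slack between the GC watermark and StorageMax, per the thread.
		slackGB := 6
		// One minute of GC budget per GB of slack, per the comment in gc.go.
		timeout := time.Duration(slackGB) * time.Minute
		fmt.Println(timeout) // prints "6m0s", yet the log above shows the context expiring after ~1.5 minutes
	}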

@ghost ghost added kind/bug A bug in existing code (including security flaws) topic/perf Performance topic/repo Topic repo labels Nov 29, 2016
@ghost ghost self-assigned this Nov 29, 2016
@whyrusleeping whyrusleeping added the status/ready Ready to be worked label Nov 29, 2016
@ghost ghost mentioned this issue Dec 9, 2016
@ghost ghost added this to the ipfs 0.4.5 milestone Dec 9, 2016
@ghost
Author

ghost commented Dec 9, 2016

The context that's timing out is in GC.maybeGC():

		log.Info("Watermark exceeded. Starting repo GC...")

		// 1 minute is sufficient for ~1GB unlink() blocks each of 100kb in SSD
		_ctx, cancel := context.WithTimeout(ctx, time.Duration(gc.SlackGB)*time.Minute)
		defer cancel()

		if err := GarbageCollect(gc.Node, _ctx); err != nil {
			return err
		}

StorageMax is 60G, and StorageGC is 54G, so SlackGB is 6G, and thus the timeout should hit after 6 minutes. Yet the time between the "starting gc" log line and the context timeout was only about 1.5 minutes when I witnessed this live.

@whyrusleeping @kevina @Kubuxu what would you say about just upping the timeout? HDDs and Flash will be much, much slower than what's anticipated here, and apparently even DigitalOcean's shared SSDs are too slow.
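
For concreteness, "upping the timeout" could mean nothing more than widening the per-GB budget in the quoted line, e.g. ten minutes per GB of slack instead of one. A sketch only, not the change that actually landed:

		_ctx, cancel := context.WithTimeout(ctx, time.Duration(gc.SlackGB)*10*time.Minute)
		defer cancel()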

@ghost
Author

ghost commented Dec 9, 2016

Or maybe just make it configurable.
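
A minimal sketch of what "make it configurable" might look like, reading the timeout from the node's config and falling back to the existing slack-based default. "GCTimeout" is a hypothetical option name, not an existing go-ipfs config key, and cfg stands in for the loaded config:

		timeout := time.Duration(gc.SlackGB) * time.Minute
		if s := cfg.Datastore.GCTimeout; s != "" {
			if d, err := time.ParseDuration(s); err == nil {
				timeout = d
			}
		}
		_ctx, cancel := context.WithTimeout(ctx, timeout)
		defer cancel()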

@whyrusleeping
Member

That metric just seems so wrong? Why does the slack space get to dictate how long this has to run?

Why does this even need a timeout? That seems odd...

@ghost
Author

ghost commented Dec 9, 2016

Okay, I'll look into removing the timeout, or making it something absurdly high.
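
For completeness, dropping the timeout entirely would reduce the snippet quoted above to something like the following. This is a sketch of that option only, not necessarily the exact change that landed in #3494:

		log.Info("Watermark exceeded. Starting repo GC...")

		// No per-GC deadline: let the collection run until it finishes or the
		// caller's context is cancelled.
		if err := GarbageCollect(gc.Node, ctx); err != nil {
			return err
		}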

@ghost ghost added status/in-progress In progress and removed status/ready Ready to be worked labels Dec 9, 2016
@whyrusleeping whyrusleeping removed the status/in-progress In progress label Dec 16, 2016
@ghost ghost mentioned this issue Dec 16, 2016
@ajnavarro ajnavarro mentioned this issue Aug 24, 2022