Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitswap: don't re-provide blocks we've provided very recently #3105

Merged
merged 1 commit into from
Aug 25, 2016

Conversation

whyrusleeping
Copy link
Member

Lots of operations will store the same block multiple times in rapid succession, we have optimizations to avoid putting those to disk multiple times, but nothing to avoid providing them multiple times.

This is a solution. The other option is to use the same logic as the blockstore (if we have the block already, noop) for providing. The issue with that is that you would no longer be able to re-add a file as a way to re-provide it.

cc @jbenet @Kubuxu for feedback

License: MIT
Signed-off-by: Jeromy [email protected]

@whyrusleeping
Copy link
Member Author

@jbenet the fact that adding a file again makes a reprovide happen is currently a side effect, and isnt exactly documented expected behaviour. I don't believe that changing that, and adding a separate command to reprovide a hash would count as a 'breaking change' do you?

@jbenet
Copy link
Member

jbenet commented Aug 21, 2016

I don't believe that changing that, and adding a separate command to reprovide a hash would count as a 'breaking change' do you?

It's not officially one, no. it's undocumented behavior. that said, it's THE main way that i get something to work often-- meaning many times adding again will cause provides to actually get to more nodes and help the gateway find me. I would worry that this solution would decrease significantly the UX, i guess making it clear to use the other command may be ok. In general we just need to improve content routing as a whole.

@jbenet
Copy link
Member

jbenet commented Aug 21, 2016

We should quantify how much BW this is expected to improve on real workloads.

@whyrusleeping
Copy link
Member Author

whyrusleeping commented Aug 21, 2016

In a few tests i ran, it made less than half the number of provide RPCs over the course of a workload modeled after the one @parkan described in that other issue. I can measure bandwidth soon.

@whyrusleeping
Copy link
Member Author

in a few tests i ran using iptb, in a cluster of 10 nodes, where one of them adds the go-ipfs source directory:

  • on master (averages over three runs)
    • outgoing bandwidth: 12.2MB
    • incoming bandwidth: 4.5MB
  • on this branch:
    • outgoing bandwidth: 7.9MB
    • incoming bandwidth: 2.9MB

@whyrusleeping
Copy link
Member Author

whyrusleeping commented Aug 24, 2016

Ran more detailed tests, The process was a fresh repo each time, connecting to the main network and adding the ipfs source tree. Bandwidth utilization was recorded by nethogs, i ran the test two times per ipfs version (would have done more, but results were fairly stable and thats a lot of data)

ipfs version in out
0.4.2 1.3GB 297MB
0.4.3-rc3 951MB 153MB
This PR 217MB 50MB

@whyrusleeping
Copy link
Member Author

@Kubuxu CR?

@Kubuxu
Copy link
Member

Kubuxu commented Aug 25, 2016

LGTM.

@whyrusleeping
Copy link
Member Author

travis CI OSX tests appear to be failling for all our repos even though they don't report any actual failures.

@whyrusleeping whyrusleeping merged commit 5a8b79b into master Aug 25, 2016
@whyrusleeping whyrusleeping deleted the feat/bitswap/dup-prov-cache branch August 25, 2016 19:31
@whyrusleeping whyrusleeping removed the status/in-progress In progress label Aug 25, 2016
@whyrusleeping whyrusleeping mentioned this pull request Sep 21, 2016
11 tasks
if err != nil {
return nil, err
}

var ks []key.Key
for _, b := range bs {
for _, b := range toput {
if err := s.Exchange.HasBlock(b); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@whyrusleeping, this change modifies the return value so that only the keys that don't already exist are returned. Was this intentional?

kevina added a commit that referenced this pull request Sep 23, 2016
Redo how pull request #3105 is implemented.  Instead of calling Has()
is the blockservice modify the blockstore interface to return the
blocks actually added.

License: MIT
Signed-off-by: Kevin Atkinson <[email protected]>
@ghost ghost mentioned this pull request Dec 23, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
need/review Needs a review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants