Neighbor sampler behaves differently in CPU and GPU #4583

Closed
RManLuo opened this issue Sep 19, 2022 · 1 comment

RManLuo commented Sep 19, 2022

🐛 Bug

I am using dgl.dataloading.MultiLayerFullNeighborSampler to sample blocks for a set of seed nodes that contains duplicates. When I sample on the CPU, the returned MFG is internally inconsistent, as shown in #4512: the block reports one destination node, but its dstdata still holds both IDs. When I sample on the GPU, the duplicated seed nodes are not removed at all.

To Reproduce

Steps to reproduce the behavior:

import torch
import dgl

src = torch.LongTensor(
    [0, 0, 0, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10,
     1, 2, 3, 3, 3, 4, 5, 5, 6, 5, 8, 6, 8, 9, 8, 11, 11, 10, 11])
dst = torch.LongTensor(
    [1, 2, 3, 3, 3, 4, 5, 5, 6, 5, 8, 6, 8, 9, 8, 11, 11, 10, 11,
     0, 0, 0, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10])
g = dgl.graph((src, dst))

sampler = dgl.dataloading.MultiLayerFullNeighborSampler(2)

# Sample on CPU
idx = torch.LongTensor([8, 8])
src_nodes, dst_nodes, mfgs = sampler.sample_blocks(g, idx)
print(dst_nodes) # tensor([8, 8])
print(mfgs[-1].num_dst_nodes()) # 1
print(mfgs[-1].dstdata) # {'_ID': tensor([8, 8])}
# Inconsistent: num_dst_nodes() is 1, but dstdata['_ID'] has 2 entries

# Sample on GPU
device = torch.device('cuda:0')
src_nodes, dst_nodes, mfgs = sampler.sample_blocks(g.to(device), idx.to(device))
print(dst_nodes) # tensor([8, 8], device='cuda:0')
print(mfgs[-1].num_dst_nodes()) # 2
print(mfgs[-1].dstdata) # {'_ID': tensor([8, 8], device='cuda:0')}
# Consistent: num_dst_nodes() is 2 and dstdata['_ID'] has 2 entries
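
The inconsistency in the CPU case can be stated as a simple invariant: the number of destination nodes a block reports should equal the number of IDs it stores. Below is a minimal sketch of that check. Note that block_is_consistent is a hypothetical helper, not part of DGL, and it takes plain Python values so that it can be applied directly to the outputs reported in the comments above.

```python
def block_is_consistent(num_dst_nodes, dst_ids):
    """Return True when the reported node count matches the stored IDs.

    num_dst_nodes: the value of mfgs[-1].num_dst_nodes()
    dst_ids: the contents of mfgs[-1].dstdata['_ID'] as a list of ints
    (hypothetical helper for illustration; not a DGL function)
    """
    return num_dst_nodes == len(dst_ids)


# Values reported for the CPU run: 1 destination node, but two stored IDs.
cpu_ok = block_is_consistent(1, [8, 8])
# Values reported for the GPU run: 2 destination nodes and two stored IDs.
gpu_ok = block_is_consistent(2, [8, 8])

print(cpu_ok)  # False
print(gpu_ok)  # True
```

With real blocks, the arguments would presumably be mfgs[-1].num_dst_nodes() and mfgs[-1].dstdata['_ID'].tolist().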

Expected behavior

Sampling results should be the same regardless of the device.
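
Until the behavior is unified, one possible way to sidestep the problem is to deduplicate the seed nodes before sampling and keep an index that maps each original seed back to its unique entry. The sketch below uses plain Python lists so that it is self-contained. Note that dedup_seeds is a hypothetical helper, not a DGL function. For tensors, torch.unique with return_inverse=True returns a comparable (unique, inverse) pair.

```python
def dedup_seeds(seeds):
    """Return (unique, inverse) such that unique[inverse[i]] == seeds[i].

    Hypothetical helper for illustration; first-occurrence order is kept.
    """
    position = {}  # seed ID -> index of that ID in `unique`
    unique = []
    inverse = []
    for s in seeds:
        if s not in position:
            position[s] = len(unique)
            unique.append(s)
        inverse.append(position[s])
    return unique, inverse


unique, inverse = dedup_seeds([8, 8])
print(unique)   # [8]
print(inverse)  # [0, 0]

# Expanding per-unique-seed values back to the original seed order.
restored = [unique[i] for i in inverse]
print(restored)  # [8, 8]
```

Passing the unique seeds to the sampler avoids handing it duplicates on either device, and inverse can be used to expand per-seed outputs afterwards. This only sidesteps the duplicate-seed case; it does not address the underlying difference between devices.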

Environment

  • DGL Version (e.g., 1.0): dgl-cuda11.3 0.9.0
  • Backend Library & Version (e.g., PyTorch 0.4.1, MXNet/Gluon 1.3): pytorch 1.11.0
  • OS (e.g., Linux): Linux
  • How you installed DGL (conda, pip, source): conda
  • Build command you used (if compiling from source):
  • Python version: 3.8
  • CUDA/cuDNN version (if applicable): cuda 11.3 cudnn 8.2.0_0
  • GPU models and configuration (e.g. V100): RTX3090
  • Any other relevant information:

Additional context

RManLuo changed the title from "Neighbor sampler behavior differently in CPU and GPU" to "Neighbor sampler behaves differently in CPU and GPU" on Sep 19, 2022
BarclayII (Collaborator) commented

Duplicate of #4512. Let's discuss there.
