Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NCCL Library Integration #156

Merged
merged 1 commit into from
Jun 3, 2016

Conversation

drnikolaev
Copy link

No description provided.

@@ -118,6 +118,7 @@ function(caffe_print_configuration_summary)
caffe_status(" USE_LEVELDB : ${USE_LEVELDB}")
caffe_status(" USE_LMDB : ${USE_LMDB}")
caffe_status(" ALLOW_LMDB_NOLOCK : ${ALLOW_LMDB_NOLOCK}")
caffe_status(" USE_NCCL : ${USE_NCCL}")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add this down below with the other CUDA options?
https:/NVIDIA/caffe/blob/v0.15.1/cmake/Summary.cmake#L144-L148

@lukeyeager
Copy link
Member

CMake didn't auto-detect and use the deb package installation of NCCL automatically. I really like that feature for cuDNN (one less flag to set).

@drnikolaev
Copy link
Author

@lukeyeager NCCL does not come as a Deb package. This is old school make && sudo make install thing. And Cmake does find it well.

@@ -424,7 +424,7 @@ RegisterBrewFunction(time);

int main(int argc, char** argv) {
// Print output to stderr (while still logging).
FLAGS_alsologtostderr = 1;
FLAGS_alsologtostderr = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this causing a problem?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, just a purist habit... :) This flag is boolean.

@lukeyeager
Copy link
Member

lukeyeager commented Jun 2, 2016

NCCL does not come as a Deb package.

https:/NVIDIA/nccl/releases

Cmake does find it well.

It can find it, but it doesn't use it until I specify -DUSE_NCCL=On. And since multi-GPU is disabled if you don't have NCCL, we should definitely use it by default if it's available.

@slayton58
Copy link

Agreed, if found it should be the default. If not found it should probably warn at that point that multi gpu will be disabled.

On Jun 1, 2016, at 8:25 PM, Luke Yeager [email protected] wrote:

NCCL does not come as a Deb package.

https:/NVIDIA/nccl/releases

Cmake does find it well.

It can find it, but it doesn't use it until I specify -DUSE_NCCL=On. And since multi-GPU is disabled if you don't have NCCL, we should definitely use it by default if it's available.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

@drnikolaev
Copy link
Author

Notes from Luke and Simon have been addressed, thank you.

# ---[ NCCL
if(USE_NCCL)
add_definitions(-DUSE_NCCL)
find_package(NCCL REQUIRED)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I can't build Caffe with CUDA unless I have NCCL. I'd like to see this behavior (same as with USE_CUDNN):

USE_NCCL found NCCL? use?
On Yes Yes
On No ERROR
unset Yes Yes
unset No No
Off * No

@lukeyeager
Copy link
Member

CMake Error at CMakeLists.txt:40 (caffe_option):
caffe_option Function invoked with incorrect arguments for function named:
caffe_option

@drnikolaev
Copy link
Author

@lukeyeager yes, I've seen it and fixing it now.

@drnikolaev drnikolaev force-pushed the caffe-0.15-nccl2 branch 3 times, most recently from bb8c03f to ad43d82 Compare June 3, 2016 06:09
@drnikolaev
Copy link
Author

All redundant changes removed, CMake is green.

@drnikolaev drnikolaev merged commit a3b0deb into NVIDIA:caffe-0.15 Jun 3, 2016
@drnikolaev drnikolaev deleted the caffe-0.15-nccl2 branch June 3, 2016 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants