The memory usage skyrockets each time it saves #70

Open
ljyloo opened this issue Dec 22, 2016 · 3 comments

ljyloo commented Dec 22, 2016

And it doesn't free the memory afterwards. I executed this bash command:

th train.lua --cuda --dataset 50000 --hiddenSize 1000

The first epoch consumed 2 GiB of RAM, the second 5 GiB, then 10 GiB, and by the 11th epoch my memory was full. (My computer has 32 GiB of RAM.)

The issue disappeared when I commented out lines 156 to 171 in train.lua (the RAM usage then stays at 1.2 GiB):

  if minMeanError == nil or errors:mean() < minMeanError then
    print("\n(Saving model ...)")
    params, gradParams = nil,nil
    collectgarbage()
    -- Model is saved as CPU
    model:float()
    torch.save("data/model.t7", model)
    collectgarbage()
    if options.cuda then
      model:cuda()
    elseif options.opencl then
      model:cl()
    end
    collectgarbage()
    minMeanError = errors:mean()
  end

So I conclude that the saving process may be the problem.

@Namburgesas

The leak seems to occur in the calls to model:float(). My workaround was to just save in GPU format:

  if minMeanError == nil or errors:mean() < minMeanError then
    print("\n(Saving model ...)")
    params, gradParams = nil,nil
    collectgarbage()
    torch.save("data/model.t7", model)
    collectgarbage()
    minMeanError = errors:mean()
  end

I then added require 'cudnn' to the top of eval.lua in order to be able to load the saved model. If you want to save the model in CPU format, you could write a quick script to load the model, call model:float(), and save it again.
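
For reference, such a conversion script could look roughly like this (untested sketch; the output file name and the cudnn require are assumptions based on this setup, not something from the repo):

  -- convert_to_cpu.lua (hypothetical helper, not part of the repo)
  require 'nn'
  require 'cunn'   -- needed to deserialize CUDA tensors
  require 'cudnn'  -- needed if the saved model contains cudnn modules

  -- Load the GPU-format checkpoint written by train.lua.
  local model = torch.load("data/model.t7")

  -- Drop cached intermediate tensors and move the weights to the CPU.
  model:clearState()
  model:float()

  -- Write a CPU-format copy (output path chosen arbitrarily here).
  torch.save("data/model_cpu.t7", model)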


ljyloo commented Jan 10, 2017

Thanks for your simple solution, @Namburgesas. I hope there's a fix in the future.

@biggerlambda

Did you try calling clearState() before model:float()? It clears the intermediate states in the model, which are not needed for prediction.
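
For example, the save block quoted above could be adjusted roughly like this (untested sketch; only the clearState() call is new):

  if minMeanError == nil or errors:mean() < minMeanError then
    print("\n(Saving model ...)")
    params, gradParams = nil, nil
    collectgarbage()
    -- Drop cached intermediate tensors (outputs, gradInputs) before converting.
    model:clearState()
    -- Model is saved as CPU
    model:float()
    torch.save("data/model.t7", model)
    collectgarbage()
    if options.cuda then
      model:cuda()
    elseif options.opencl then
      model:cl()
    end
    collectgarbage()
    minMeanError = errors:mean()
  end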
