add_generator performance #221

Closed
mtbakerguy opened this issue Nov 12, 2019 · 4 comments

Comments

@mtbakerguy

When a simulation uses a large number of generators, creating the simmer object can dwarf the simulation runtime. For example:


library(simmer)

numagents <- 12500

testsystem <- simmer(log_level = 0)

testsystem %>%
  add_resource("phone", capacity = 25000, queue_size = 0)

workflow <- trajectory("Workflow") %>% timeout(1)

# Register one generator per agent; this loop is where the setup time goes.
for (sfx in seq_len(numagents)) {
  name <- paste("Agent", sfx, sep = "-")
  testsystem %>% add_generator(name, workflow, function() ceiling(runif(1, 1, 1200)))
}

ts <- testsystem %>% run(until = 3600)

On a reasonably current MacBook, profiling shows that the example above takes approximately 20 seconds to run, with the majority of the time spent in add_generator_. Looking at the source, beyond the check_args work it appears to be a short drop into the C++ code, so it's not obvious how to make this faster.
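One way to separate setup cost from simulation cost is to time the two phases independently. The following is a minimal sketch using base R's system.time; the reduced agent count and the variable names setup_time and run_time are choices made for this illustration, and the actual numbers will vary by machine.

```r
library(simmer)

numagents <- 500  # reduced from 12500 so the sketch finishes quickly

testsystem <- simmer(log_level = 0)
workflow <- trajectory("Workflow") %>% timeout(1)

# Time the setup phase: one add_generator call per agent, as in the report.
setup_time <- system.time(
  for (sfx in seq_len(numagents)) {
    name <- paste("Agent", sfx, sep = "-")
    testsystem %>% add_generator(name, workflow, function() ceiling(runif(1, 1, 1200)))
  }
)

# Time the simulation itself.
run_time <- system.time(testsystem %>% run(until = 3600))

# Elapsed wall-clock seconds for each phase.
print(c(setup = setup_time[["elapsed"]], run = run_time[["elapsed"]]))
```

Comparing the two elapsed values shows whether setup or the run dominates for a given number of generators.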

@Enchufa2
Member

Yep, with that many generators it takes time to set them up. Most of the time is spent in:

  • Argument checking with check_args. This is obviously necessary, not only to ensure type correctness and raise meaningful errors, but also because it performs some conversions that are necessary for the backend (e.g., Inf -> -1 and things like that).

  • Making generators resettable with make_resetable. This is also necessary in general. If you define a generator that depends on some global variable (e.g. function() rexp(1, x), where x is defined somewhere in a parent environment), make_resetable copies the initial value of x (and any other variable needed) and sets up a mechanism to restore that value. Then, when you call reset(), you can start your simulation over and everything works as expected.
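To illustrate the idea behind resettable generators, here is a rough sketch in base R. It is not simmer's make_resetable; the helper snapshot_env and the counter-based generator are hypothetical names invented for this example. It only shows the general pattern of copying the initial values a function depends on and restoring them later.

```r
# Hypothetical helper (NOT simmer's make_resetable): snapshot the variables
# in a function's enclosing environment and return a closure that restores
# them to their initial values.
snapshot_env <- function(f) {
  env <- environment(f)
  saved <- mget(ls(env), envir = env)  # copy of the initial values
  function() {
    list2env(saved, envir = env)       # write the saved values back
    invisible(NULL)
  }
}

# A generator whose output depends on state (x) held in a parent environment.
make_counter <- function() {
  x <- 0
  function() {
    x <<- x + 1
    x
  }
}

gen <- make_counter()
restore <- snapshot_env(gen)

gen()                 # advances the state
gen()                 # advances it again
restore()             # put x back to its initial value
after_reset <- gen()  # same value as the very first call
print(after_reset)
```

Without the snapshot, a second run would start from whatever state the first run left behind, which is the situation reset() is meant to avoid.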

You could bypass these functions if you know what you are doing and you understand the implications, but I don't recommend it at all. These are simmer internals, subject to change.

One thing you can do is avoid the pipe inside loops: use add_generator(testsystem, ...) instead of testsystem %>% add_generator(...) to save some time.
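As a sketch, the loop from the original report could be written without the pipe like this. The agent count is reduced for illustration, and the size of the saving is not guaranteed.

```r
library(simmer)

numagents <- 500  # reduced from 12500 to keep the sketch quick

testsystem <- simmer(log_level = 0)
workflow <- trajectory("Workflow") %>% timeout(1)

for (sfx in seq_len(numagents)) {
  name <- paste("Agent", sfx, sep = "-")
  # Direct call: the environment is passed as the first argument,
  # so the pipe is not evaluated on every iteration.
  add_generator(testsystem, name, workflow, function() ceiling(runif(1, 1, 1200)))
}
```

The environment is modified in place, so the result of add_generator does not need to be assigned back, just as in the piped version.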


What we can do is enhance add_generator to accept vectors of parameters. In this case, we could avoid the loop entirely with something like:

testsystem %>%
  add_generator(paste("Agent", seq_len(numagents), sep="-"), 
                workflow, function() ceiling(runif(1,1,1200)))

so that check_args and make_resetable are executed once.

@mtbakerguy
Author

Two comments:

  1. Thanks for the suggestion to avoid the pipe. This resulted in ~10% improved performance.
  2. Your idea for an enhancement looks like the correct one to me. It would fit reasonably well with the current architecture and the semantics of other calls.

@Enchufa2
Member

Thanks, you can test it from the master branch.

@mtbakerguy
Author

I haven't finished testing it yet, but the runtime of my actual simulation (not the test case I posted above) dropped to almost 10% of what it was.
