Motivation

Memoization (or caching in general) is a common technique for optimizing algorithms such as recursive Fibonacci calculators, pathfinding routines, etc. In simple cases it is fairly obvious where memoization can be applied and whether it is worthwhile. In other cases, however, it can be unintuitive and hard to tell whether any of the calls can be memoized. Our tool, MemoFinder, aims to help users understand whether memoization (or caching in general) can optimize their algorithm. The tool performs dynamic code analysis that tracks function calls and processing time, and uses that data to create a visualization that helps the user understand where memoization may be appropriate. Furthermore, the tool adds suggestions as to where memoization is most likely to optimize the code.
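As a concrete illustration (our own example, not MemoFinder output), here is the classic Fibonacci case in Python: a naive recursive version, and the same function memoized with the standard-library functools.lru_cache.

```python
from functools import lru_cache

def fib(n):
    # Naive recursion: fib(k) is recomputed many times for the same k.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    # Each distinct n is computed once; repeated calls hit the cache.
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```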

The main reason we went with the dynamic path is that in some cases it is impossible to know whether memoization will help until we actually run the code. For instance, take a recursive, brute-force solution for calculating Fibonacci numbers. If the application only ever calculates the first two numbers, memoization will have no effect at all. This could be seen as an edge case, but it happens very often with web applications, or any application that serves data to users. For example, say you have a video streaming service and it takes a long time to render videos on the server. You decide to cache some of the videos, but how can you know which ones to cache? Our application can track usage and suggest the correct places for caching so your service runs smoothly.

User Studies

User Study 1

Notes

The main goal of the first user study was to evaluate the usefulness of the tool and whether it actually helps users understand where memoization is beneficial.

Summary

  • The tree graph was successful in helping users understand whether memoization is necessary
  • Users liked the graph's ease of use and intuitiveness
  • However, they found the graph hard to use when it became too big or wide

Changes Based on Feedback

  • We decided to add a new grid view
  • The grid consisted of nodes grouped by function calls with the same arguments and return values
  • The size of each node represented the number of calls and the execution time
  • Extra information also appeared on hover
  • This should solve the case where the tree graph gets too big to use
  • We also added a table view that lists the information from our analysis
  • This should give the user even more information to work with
  • After the tool has finished running, we also add a comment on the line that causes the function to require memoization

User Study 2

Notes

The second user study aimed to evaluate the usefulness of the new grid view and table view.

Summary

  • Users found the grid view not very intuitive at first, but after hovering over nodes for extra information, they quickly understood its use
  • All the users found the table view extremely helpful and intuitive
  • Some users were confused by the meaning of the Memoization Score
  • The grid and tree views were hard to read when function names and argument lists were too long

Changes Based on Feedback

  • Added text wrapping to both the grid and tree views to ensure readability with long function names
  • Added a description of how we determine which functions need to be memoized
  • Changed the table view to be more readable and intuitive: the memoization score is now hidden, and the user gets a direct suggestion about which functions need to be memoized
  • Changed the analysis comments to be more verbose and informative

Implementation Overview

The analysis happens in 3 stages:

  1. We augment the Python AST to track function calls and processing time. The overhead is minimal, as this stage only uses Python built-in modules
  2. Once the data is available, it is filtered and evaluated. Several metrics are calculated at this stage, such as the memoization score, which represents how much memoizing (or caching) a function call would actually improve the performance of the algorithm. More on how this is done here. A sketch of this tracking-and-scoring idea follows this list
  3. Using the evaluated data, we produce different types of graphs that should allow for a better understanding of the function calls. We also insert comments at the position we believe is the most appropriate place to memoize the code
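MemoFinder gathers this data by rewriting the AST itself, but the underlying idea of stages 1 and 2 can be sketched with a plain decorator. This is only a minimal illustration under our own assumptions; the names track, crude_score, and stats are hypothetical and not part of the tool's actual API.

```python
import time
from collections import defaultdict
from functools import wraps

# (function name, argument tuple) -> [number of calls, total seconds]
stats = defaultdict(lambda: [0, 0.0])

def track(func):
    """Record call count and wall-clock time per distinct argument tuple."""
    @wraps(func)
    def wrapper(*args):
        key = (func.__name__, args)
        start = time.perf_counter()
        result = func(*args)
        stats[key][0] += 1
        stats[key][1] += time.perf_counter() - start
        return result
    return wrapper

def crude_score(key):
    """Rough lower bound on the time a cache would have saved: every call
    after the first returns instantly. Recorded times include nested calls,
    so this is only an estimate."""
    calls, total = stats[key]
    return 0.0 if calls < 2 else total * (calls - 1) / calls

@track
def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

fib(10)
# Rank (function, args) pairs by how much caching would have helped.
for key in sorted(stats, key=crude_score, reverse=True)[:3]:
    print(key, "calls:", stats[key][0], "estimated saving: %.6fs" % crude_score(key))
```

Here a pair like ('fib', (8,)) shows up with multiple calls, which is exactly the signal that caching would pay off.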

Final Evaluation

We found the final version of the tool to be easy to use, versatile, and fast. All the views were intuitive, and the extra hover information gives additional insight. Although the tool was originally meant for memoization, we were able to make it work for cases where general caching is required (see the Video Service problem as an example). Finally, because the tool introduces very little overhead to the original code, it is extremely fast.

There are some points we wish we could have worked on more, but time constraints did not allow it. Here are some of them:

  • Our tool does not consider the case where a function changes global state. If a function changes global state, we cannot memoize or cache it, as doing so might change the final output (see the sketch after this list)
  • The tree view could use some work, such as ensuring it is always zoomed in at the center and adding extra tools to keep it usable when the tree is extremely big. However, since it is still usable in its current state, we decided to focus on the other views
  • Our current estimate for time savings is a very crude lower bound. A more sophisticated algorithm could have given us a tighter lower bound; however, we found that it would be too slow, especially for larger programs. Still, a better estimate could have been found given more time
  • Our probability model for determining whether a function is memoized, and our memoization score, were not rigorously proven to be the best models for these questions. As such, there are times when we get false positives and false negatives. If we had more time, we could have further fine-tuned these to get more consistent results
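To illustrate the first limitation, here is a hypothetical function (our example, not tool output) whose result depends on mutable global state; a cache keyed only on its arguments would return stale values:

```python
counter = 0

def next_id(prefix):
    # Reads and mutates global state: two calls with the same argument must
    # return different values, so caching on the argument alone would be wrong.
    global counter
    counter += 1
    return f"{prefix}-{counter}"

print(next_id("user"))  # user-1
print(next_id("user"))  # user-2 (a cache keyed on "user" would wrongly repeat user-1)
```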

Team Management

The team decided to split into sub-teams, each focusing on a specific aspect of the implementation. Approximate team setup:

  • AST team - create the Python AST augmentation algorithm
    • Kobi
    • Akim
  • API/backend
    • Valdi
  • Frontend
    • Asad
    • Alex

Although everyone was separated into sub-teams, we continued to help each other through code reviews and debugging.