Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: run SaveGameStats in a job #5934

Draft
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

mwerle
Copy link
Contributor

@mwerle mwerle commented Oct 9, 2024

##' Bugs / Feature requests

Attempts to fix #5929

Background

The idea is to run this expensive task in an asynchronous job and notify the Lua side once it is completed. This is to prevent the stutter from occurring every time the SaveGameStats API is called from Lua.

Details

The code was modified to call the supposedly expensive part of the SaveGameStats API (loading and parsing the game file) into a Job. Once finished, the data is marshalled into a LuaTable and a LuaEvent is triggered to pass the data to the Lua side.

The Lua side registers an Event listener, which parses the data and updates the SaveGame cache entry for the data.

In practise, the jobs all seem to complete almost instantaneously which then results in a mass of LuaEvents sent to Lua at the same time. So instead of a set of small stutters, the interface experiences one big stutter.

So it appears as if the cause of the stutter wasn't the disk access and JSON parsing in the C++ side, but the logic invocation and data transfer between C++ and Lua.

As always, it's better to profile first and optimise later.. that said, I learned a lot.

Copy link
Member

@Web-eWorks Web-eWorks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial feedback - instead of adding the new LuaEvent class to piecewise construct an argument table, construct a LuaTable object inside of l_game_savegame_stats_callback, add values to it as normal, and pass that LuaTable handle as the args parameter to LuaEvent::Queue. (Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

If that doesn't currently compile, let me know and I'll do the template metaprogrammery to get it to work.

A note - we tend to reserve the l_<namespace>_<method> naming convention for functions following the Lua_Function API prototype, i.e. static int l_name(lua_State *l). This is just a "regular" callback and can have any particular name.

Regarding profiling, we have an in-process profiler for C++ which can be used via the PROFILE_SCOPED() macro (with the appropriate include of profiler/Profiler.h). Details on usage are in the dev docs.

src/lua/LuaGame.cpp Outdated Show resolved Hide resolved
@Web-eWorks
Copy link
Member

Thanks for working on this! We unfortunately don't have good (or any) Lua-side profiling utilities at current, but I'll pull the branch and take a closer look to see what's responsible for the stutters.

@Web-eWorks
Copy link
Member

Finally, as a suggestion - you may want to move the JobSet into a more central API like Pi::App() or Lua::manager as a shared "Lua Jobs" interface. That would avoid the need to keep a static object at file scope, and allow object deinitialization to be controlled as well as initialization.

@Web-eWorks
Copy link
Member

Hacked in a manual lua-side profiler to test performance. Note that printing to the console has an overhead of about 0.065ms - I've disabled most of the print statements for the sake of measuring real performance.

With hot caches:

Info: PROFILE | SaveLoadWindow:makeFilteredList() took 0.0085ms
Info: PROFILE | SaveLoadWindow:makeFileList() took 0.1471ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0033ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0036ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0038ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0040ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0039ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0050ms
Info: error in JSON file 'savefiles/_autosave9': [json.exception.parse_error.110] parse error at 1: unexpected end of input
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0049ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0038ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0029ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0041ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0054ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0034ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0034ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0041ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0113ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0042ms

With fully cold cache (echo 3 | sudo tee /proc/sys/vm/drop_caches):

Info: PROFILE | SaveLoadWindow:makeFilteredList() took 0.0092ms
Info: PROFILE | SaveLoadWindow:makeFileList() took 5.8037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0030ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0033ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0043ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0036ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0028ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0048ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0069ms
Info: error in JSON file 'savefiles/_autosave9': [json.exception.parse_error.110] parse error at 1: unexpected end of input
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0060ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0037ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0035ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0027ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0031ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0045ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0039ms
Info: PROFILE | SaveLoadWindow:onSaveGameStats() took 0.0042ms

Looking at the profile from that run (Engine.RequestProfileFrame()), we're almost entirely disk bound inside of SaveGameManager::ListSaves.
image

With the manual rate-limiting removed from SaveLoadWindow:update(), loading information about saves is almost instant as all saves are processed at once. I'm not seeing the major stutter you mentioned either - even with cold caches loading the list of saves from disk (with an N of 18) causes at most a single frame skip at 60Hz.

This does raise the follow-on question of also moving Game.ListSaves() to an off-thread operation - as I imagine a user with a decent number of savefiles (say, 150) in combination with a legacy HDD or eMMC boot drive could potentially have an observable stutter in that function lasting a significant fraction of a second.

Would you like me to push my profiler implementation to this branch?

@mwerle
Copy link
Contributor Author

mwerle commented Oct 10, 2024

Thank you for having a look and providing feedback!

Initial feedback - instead of adding the new LuaEvent class to piecewise construct an argument table, construct a LuaTable object inside of l_game_savegame_stats_callback, add values to it as normal, and pass that LuaTable handle as the args parameter to LuaEvent::Queue.

I'm afraid I still have extremely limited knowledge of how the Lua stack works and what can and can't be constructed on it and passed around. It's not really made any clearer (to me) by the wrappers you guys have developed which have various restrictions such as the ScopedTable. If there is a book or other training material you can point me at so I can learn more about it that would be appreciated.

So I didn't think a LuaTable could be constructed on the stack and then passed around and modified etc before invoking a Lua function while other parts of the code also potentially modify the Lua stack (ie, another Job in another thread).

Inside l_game_savegame_stats_callback I don't have a lua_State which can be used to create a table.

In the "normal" functions the lua_State comes directly as a parameter, but I didn't know if that state would survive being passed into another thread and used/modified there while other Lua functions run in other threads. I kinda figured the Lua stuff would probably all need to be on the same thread without adding a lot of synchronisation.

This is why I made that LuaEvent class to construct the table and manage the lua_State. But looking at it now, it probably boils down to the same thing.. as I said, C++ isn't my forte, and the whole Lua thing is brand-new to me.

(Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

???

A note - we tend to reserve the l_<namespace>_<method> naming convention for functions following the Lua_Function API prototype, i.e. static int l_name(lua_State *l). This is just a "regular" callback and can have any particular name.

Noted.

Regarding profiling, we have an in-process profiler for C++ which can be used via the PROFILE_SCOPED() macro (with the appropriate include of profiler/Profiler.h). Details on usage are in the dev docs.

I saw the macro, but haven't had time to look into it. Thanks for the link, I'll have a read.

@mwerle
Copy link
Contributor Author

mwerle commented Oct 10, 2024

Would you like me to push my profiler implementation to this branch?

Yes please. No need for me to reinvent the wheel..

@Web-eWorks
Copy link
Member

I'm afraid I still have extremely limited knowledge of how the Lua stack works and what can and can't be constructed on it and passed around.

This is partially why the Lua*** wrappers were developed, to simplify interacting with Lua from the programmer's perspective. I'm sorry they're not clear and easy to use either - we (at this point, I) probably should write a primer on interacting with Lua and add it to the dev docs.

So I didn't think a LuaTable could be constructed on the stack and then passed around and modified etc before invoking a Lua function while other parts of the code also potentially modify the Lua stack (ie, another Job in another thread).

but I didn't know if that state would survive being passed into another thread and used/modified there while other Lua functions run in other threads.

Lua is implicitly and explicitly single-threaded only. Only the main thread is allowed to call any Lua API functions on the main Lua state (Lua::manager->GetLuaState()). Inside of a Job / Task's main execution method, you generally will never see any Lua API being called.

I should have mentioned this when I was originally describing the problem to you, sorry. 😅

The OnFinish method of the Job class specifically always runs on the main thread for this reason, to allow interaction with non-threaded resources like OpenGL or Lua.

Other Lua states may be constructed for use on separate threads, but they operate under a share-nothing model in relation to the main Lua state from both a C and Lua-side perspective.

As mentioned in another reply, from the "main" thread you can lua_State *l = Lua::manager->GetLuaState(); to interact with the game's Lua context at (almost*) any time.

* Startup and shutdown excluded, for obvious reasons.

This is why I made that LuaEvent class to construct the table and manage the lua_State. But looking at it now, it probably boils down to the same thing.. as I said, C++ isn't my forte, and the whole Lua thing is brand-new to me.

Don't worry about it at all - this is why I'm taking the time to review and provide feedback in such depth on your PRs. My hope is to help smooth along the learning process; feel free to mention me (as sturnclaw) in IRC with questions if that's a more expedient way to learn.

(Don't forget to dispose of the underlying table from the stack - a LuaTable doesn't auto-pop its table.)

???

Here's an annotated example:

// Create a new table on the Lua stack.
// This is a shorthand to call lua_newtable(l); LuaTable(l, -1).
// Passing a stack index to the constructor instead aliases an existing table.
LuaTable tab { l };

// Interact with the table as normal, don't need to bother with the underlying Lua API.
tab.Set("myKey", myVal);

// Pass this table to a Lua event. The LuaTable wrapper aliases the underlying table on
// the Lua stack when copied.
// LuaEvent::Queue pushes all parameters to Lua and then destroys its arguments. You
// generally don't have to worry about lifetime concerns as long as you're not passing
// a ScopedTable handle.
// LuaEvent::Queue is guaranteed to leave the stack in the same state as when you called it.
LuaEvent::Queue("myEvent", tab);

// Because LuaTable aliases a table on the stack, it does not remove that value from
// the stack when it is destroyed. Instead, you have to manually call lua_pop() to remove
// the table implicitly created via LuaTable(lua_State *l).
lua_pop(l, 1);

data/pigui/modules/saveloadgame.lua Outdated Show resolved Hide resolved
src/lua/LuaEvent.h Outdated Show resolved Hide resolved
data/pigui/modules/saveloadgame.lua Outdated Show resolved Hide resolved
@mwerle
Copy link
Contributor Author

mwerle commented Oct 11, 2024

Finally, as a suggestion - you may want to move the JobSet into a more central API like Pi::App() or Lua::manager as a shared "Lua Jobs" interface. That would avoid the need to keep a static object at file scope, and allow object deinitialization to be controlled as well as initialization.

Sure; I first wanted to see if it would work at all so just hacked it into the LuaGame.cpp.

I moved the Job itself into the SaveGameManager class and simply exposed a function to create a new job.

I moved the jobset into Lua::manager as suggested.

The issue is that the Lua::Manager implementation now has a #include "Pi.h" which I seem to remember is undesired..?

@Web-eWorks
Copy link
Member

I moved the Job itself into the SaveGameManager class and simply exposed a function to create a new job.

Looks good. Naming is a little stilted, I might personally try to name it something like LoadGameToJsonAsync and allow the return value type to do the talking on what it returns.

I moved the jobset into Lua::manager as suggested.

The issue is that the Lua::Manager implementation now has a #include "Pi.h" which I seem to remember is undesired..?

this is fine

More seriously, this isn't ideal. I suspect the editor will likely crash on startup because of this, as Pi::App doesn't exist outside of "main game" contexts. Just one of those details about the codebase you'll trip over in the process of learning it.

Preferably, the LuaManager constructor takes a JobQueue *asyncJobQueue parameter instead. As a knock-on effect, this means that Lua::Init() now needs to either be passed a LuaManager for it to take ownership of on startup, or be passed a pointer to the running application's AsyncJobQueue to then pass to LuaManager. The latter is likely much easier to implement and thus what I'd suggest.

@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

Preferably, the LuaManager constructor takes a JobQueue *asyncJobQueue parameter instead. As a knock-on effect, this means that Lua::Init() now needs to either be passed a LuaManager for it to take ownership of on startup, or be passed a pointer to the running application's AsyncJobQueue to then pass to LuaManager. The latter is likely much easier to implement and thus what I'd suggest.

This is proving to be marginally challenging as the other Applications do not have any Job Queues..

@Web-eWorks
Copy link
Member

JobQueue *GetAsyncJobQueue();

mwerle and others added 7 commits October 15, 2024 08:22
The idea is to run this expensive task in an asynchronous job
and notify the Lua side once it is completed. This is to prevent
the stutter from occurring every time the SaveGameStats API is
called from Lua.

In practise, the jobs all seem to complete almost instantaneously
which then results in a mass of LuaEvents sent to Lua at the same
time. So instead of a set of small stutters, the interface
experiences one big stutter.

So it appears as if the cause of the stutter wasn't the disk access
and JSON parsing in the C++ side, but the logic invocation and data
transfer between C++ and Lua.

As always, it's better to profile first and optimise later..
- Simple manual profiler for checking the time cost of certain Lua methods
- Add Engine.nowTime, which returns the exact time since an unspecified epoch at the moment it's evaluated
Remove the superfluous LuaEvent class and instead construct the LuaTable
directly in the callback.

Also pass Json objects by reference to avoid expensive memory copies.
* Remove debug output
* Comment out profiler
* Optimise lua table access
* remove dead-code
The jobset has been moved into the Lua::Manager class and can generically
run any job on the "Lua" job queue.

The job has been moved into the SaveGameManager class so that any code
can load game data as Json in a job.
Require a JobQueue to instantiate a LuaManager object. This removes the
requirement to include "Pi.h" in LuaManager.cpp and ensures that the
other applications (Editor) can also run.
@mwerle mwerle force-pushed the refactor/savegame_stats_job branch from 20a7010 to 63c95d1 Compare October 14, 2024 23:40
@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

JobQueue *GetAsyncJobQueue();

No idea how or why I missed this.. I was grubbing around inside the implementation files and must have missed the base class.

Proposed fix in 63c95d1

@mwerle
Copy link
Contributor Author

mwerle commented Oct 14, 2024

Looks good. Naming is a little stilted, I might personally try to name it something like LoadGameToJsonAsync and allow the return value type to do the talking on what it returns.

Fixed in 1a871c8

@Web-eWorks
Copy link
Member

No idea how or why I missed this.. I was grubbing around inside the implementation files and must have missed the base class.

Proposed fix in 63c95d1

Looks good. Will give this a whirl tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improvement - move calls to Game.SaveGameStats into Jobs
2 participants