refactor batch item file management #51
Merged
Refactor all algos and extensions so that output data corresponding to a batch item is stored in that item's subdir within the batch dir (the dir that contains the batch file). Raw input movies remain where they are.
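As a rough sketch of the layout described above (not the project's actual code), assuming each batch item is identified by a string id and that `item_output_dir` and `create_item_output_dir` are hypothetical helpers introduced only for illustration:

```python
from pathlib import Path


def item_output_dir(batch_file, item_id: str) -> Path:
    """Return the subdir that holds the outputs of one batch item.

    Hypothetical helper: the batch dir is taken to be the directory that
    contains the batch file, and each item gets a subdir named after its id.
    """
    batch_dir = Path(batch_file).parent
    return batch_dir / item_id


def create_item_output_dir(batch_file, item_id: str) -> Path:
    """Create the item's output subdir if it does not exist and return it."""
    out_dir = item_output_dir(batch_file, item_id)
    out_dir.mkdir(parents=True, exist_ok=True)
    return out_dir
```

With this layout, moving the batch dir moves the batch file and all item outputs together, while the raw movies stay untouched.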
Created `PathsDataFrameExtension` and `PathsSeriesExtension` in the `batch_utils` module to simplify resolving or splitting paths of input and output files for batch items. These extensions automatically resolve the full path, first searching within the batch dir and then within the parent dir of the current `parent_raw_data_path`: `df.paths.resolve(<path>)`. This also works for a Series, for example: `df.iloc[0].resolve(<path>)`
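The search order can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the extension's implementation: `resolve_path` is a hypothetical standalone function, and both search locations are passed in explicitly as `batch_dir` and `raw_data_dir`.

```python
from pathlib import Path


def resolve_path(path, batch_dir, raw_data_dir) -> Path:
    """Resolve a relative path, checking the batch dir before the raw data dir.

    Hypothetical sketch: returns the first existing candidate, otherwise raises.
    """
    path = Path(path)
    # Order matters: outputs in the batch dir take precedence over raw data.
    for root in (Path(batch_dir), Path(raw_data_dir)):
        candidate = root / path
        if candidate.exists():
            return candidate
    raise FileNotFoundError(
        f"{path} not found in {batch_dir} or {raw_data_dir}"
    )
```

Checking the batch dir first means an output file is found even when a file with the same relative path also exists next to the raw movies.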
This functionality requires the current `main` branch of `pandas` as of June 19, 2022. Specifically, it requires that the `dict` `pd.DataFrame.attrs` gets propagated to child `pd.Series.attrs`. This functionality will apparently be released in pandas v1.5.0 on June 30, 2022. For more info see: pandas-dev/pandas#46101
Example:
Also, `df.caiman.run()` now uses the `subprocess` backend by default and doesn't require the `batch_path` as a kwarg; it gets it from `df.paths.get_batch_path()`.