Added support for dataframes from the `polars` package. It is available in the following modules: `data` (`Dataset`, `SequenceTokenizer`, `SequentialDataset`) for working with transformers, as well as in metrics, preprocessing, and splitters. The new format achieves a multi-fold speedup of calculations relative to Pandas and PySpark dataframes. See the examples for more usage details.
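A minimal sketch of the new path, assuming `Dataset` accepts the polars frame directly as `interactions`; the column names and schema below are illustrative:

```python
import polars as pl

from replay.data import Dataset, FeatureHint, FeatureInfo, FeatureSchema, FeatureType

# Interactions as a polars DataFrame -- no conversion to Pandas or PySpark needed.
interactions = pl.DataFrame(
    {
        "user_id": [0, 0, 1, 2],
        "item_id": [3, 7, 3, 5],
        "timestamp": [1, 2, 1, 3],
    }
)

feature_schema = FeatureSchema(
    [
        FeatureInfo(column="user_id", feature_type=FeatureType.CATEGORICAL, feature_hint=FeatureHint.QUERY_ID),
        FeatureInfo(column="item_id", feature_type=FeatureType.CATEGORICAL, feature_hint=FeatureHint.ITEM_ID),
        FeatureInfo(column="timestamp", feature_type=FeatureType.NUMERICAL, feature_hint=FeatureHint.TIMESTAMP),
    ]
)

dataset = Dataset(feature_schema=feature_schema, interactions=interactions)
```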
Removed the dependencies on `seaborn` and `matplotlib`. Removed the functions `replay.utils.distributions.plot_item_dist` and `replay.utils.distributions.plot_user_dist`.
Added functions to get and set embeddings in transformers: `get_all_embeddings`, `set_item_embeddings_by_size`, `set_item_embeddings_by_tensor`, and `append_item_embeddings`. See the examples for more usage details.
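A hedged sketch of how these can be used, assuming they are methods of a trained transformer model (e.g. `SasRec`); the hidden size of 64 and the item counts below are illustrative:

```python
import torch

# `model` is assumed to be a trained transformer, e.g. replay.models.nn.sequential.SasRec.
embeddings = model.get_all_embeddings()                    # all embedding tables of the model
model.set_item_embeddings_by_size(1200)                    # resize the item embedding table to 1200 items
model.set_item_embeddings_by_tensor(torch.rand(1200, 64))  # replace the table with an explicit tensor
model.append_item_embeddings(torch.rand(5, 64))            # append rows for 5 new (cold) items
```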
Added `QueryEmbeddingsPredictionCallback` to obtain query embeddings at the inference stage of transformers. See the examples for more usage details.
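A sketch under the assumption that the callback is attached to a Lightning `Trainer` and exposes its accumulated result via `get_result` (the accessor name is an assumption), with `model` and `predict_dataloader` prepared elsewhere:

```python
from pytorch_lightning import Trainer  # or lightning.pytorch, depending on your stack

from replay.models.nn.sequential.callbacks import QueryEmbeddingsPredictionCallback

callback = QueryEmbeddingsPredictionCallback()
trainer = Trainer(callbacks=[callback])
trainer.predict(model, dataloaders=predict_dataloader)

query_embeddings = callback.get_result()  # embeddings collected during inference
```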
Added support for numerical features in `SequenceTokenizer` and `TorchSequentialDataset`, which makes it possible to use numerical features inside transformers.
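A minimal sketch of declaring a numerical sequential feature; the `TensorFeatureInfo` constructor arguments shown here (in particular `tensor_dim`) and the column names are assumptions:

```python
from replay.data import FeatureHint, FeatureType
from replay.data.nn import SequenceTokenizer, TensorFeatureInfo, TensorSchema

schema = TensorSchema(
    [
        TensorFeatureInfo(
            name="item_id",
            is_seq=True,
            feature_type=FeatureType.CATEGORICAL,
            feature_hint=FeatureHint.ITEM_ID,
        ),
        # Numerical features can now live in the schema alongside categorical ones.
        TensorFeatureInfo(
            name="price",
            is_seq=True,
            feature_type=FeatureType.NUMERICAL,
            tensor_dim=1,
        ),
    ]
)

tokenizer = SequenceTokenizer(schema)
sequential_dataset = tokenizer.fit_transform(dataset)  # `dataset` is a replay.data.Dataset
```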
Added auto padding at the inference stage of transformer-based models in single-user mode.
Added a callback to calculate `cardinality` in `TensorSchema`. It is no longer necessary to pass the `cardinality` parameter; the value is calculated automatically.
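Illustratively (same assumed constructor as in the sketch above), a categorical feature can now be declared without an explicit cardinality:

```python
from replay.data import FeatureHint, FeatureType
from replay.data.nn import TensorFeatureInfo, TensorSchema

schema = TensorSchema(
    [
        TensorFeatureInfo(
            name="item_id",
            is_seq=True,
            feature_type=FeatureType.CATEGORICAL,
            feature_hint=FeatureHint.ITEM_ID,
            # cardinality=3706,  # previously required; now inferred automatically
        ),
    ]
)
```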
Added the `core_count` parameter to `replay.utils.session_handler.get_spark_session`. If it is not specified, the environment variables `REPLAY_SPARK_CORE_COUNT` and `REPLAY_SPARK_MEMORY` are taken into account. If those are not set either, the value defaults to `-1`.
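For example (the environment-variable values are illustrative):

```python
import os

from replay.utils.session_handler import get_spark_session

# Explicit resource request:
spark = get_spark_session(core_count=4)

# Or rely on the environment; the value falls back to -1 when
# neither the parameter nor the variables are set.
os.environ["REPLAY_SPARK_CORE_COUNT"] = "8"
os.environ["REPLAY_SPARK_MEMORY"] = "32"
spark = get_spark_session()
```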
Corrected the behavior of the `item_count` parameter in `ValidationMetricsCallback`. If you are not going to calculate the `Coverage` metric, this parameter no longer needs to be passed.
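A hedged sketch; the metric names and the argument list of `ValidationMetricsCallback` below are assumptions:

```python
from replay.models.nn.sequential.callbacks import ValidationMetricsCallback

item_count = 3706  # illustrative catalog size

# Coverage needs the total item count to be computable:
with_coverage = ValidationMetricsCallback(metrics=["ndcg", "coverage"], ks=[10], item_count=item_count)

# Without Coverage, item_count may now be omitted:
without_coverage = ValidationMetricsCallback(metrics=["ndcg", "recall"], ks=[10])
```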
Aligned the calculation of the `Coverage` metric between Pandas and PySpark.
Removed the conversion from PySpark to Pandas in some models. Added the `allow_collect_to_master` parameter (`False` by default).
Achieved 100% test coverage.
Fixed type detection during `fit` in `LabelEncoder`. The problem occurred when the data contained multiple tuples with null values.
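A minimal reproduction of the fixed case, using a tuple-valued column with nulls; the data is illustrative:

```python
import pandas as pd

from replay.preprocessing import LabelEncoder, LabelEncodingRule

# Tuples mixed with null values previously broke type detection during fit.
df = pd.DataFrame({"feature": [("a", 1), ("b", None), ("a", 1), ("c", 2)]})

encoder = LabelEncoder([LabelEncodingRule("feature")])
encoded = encoder.fit_transform(df)
```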
Changes in the experimental part:
- Supported Python 3.10.
- Updated interfaces to match the new d3rlpy version.
- Added a DecisionTransformer.
This discussion was created from the release v0.16.0.