
Commit

Updated documentation to reflect the genericity of the function arguments.
boborbt committed Apr 10, 2024
1 parent b1ebd22 commit b1ca7ba
Showing 10 changed files with 90 additions and 140 deletions.
2 changes: 1 addition & 1 deletion docs/build/.documenter-siteinfo.json
Original file line number Diff line number Diff line change
@@ -1 +1 @@
{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-08T09:25:36","documenter_version":"1.3.0"}}
{"documenter":{"julia_version":"1.10.2","generation_timestamp":"2024-04-10T02:16:19","documenter_version":"1.3.0"}}
2 changes: 1 addition & 1 deletion docs/build/examples/example/index.html

Large diffs are not rendered by default.

91 changes: 52 additions & 39 deletions docs/build/index.html

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion docs/build/objects.inv
@@ -1,5 +1,5 @@
# Sphinx inventory version 2
# Project: PartitionedLS.jl
# Version: 1.0.1
# Version: 1.0.3
# The remainder of this file is compressed using zlib.
2 changes: 1 addition & 1 deletion docs/build/search_index.js

Large diffs are not rendered by default.

7 changes: 5 additions & 2 deletions docs/src/index.md
@@ -20,8 +20,11 @@ The Partitioned Least Squares model is formally defined as:

where:

- ``\mathbf{X}`` is ``N \times M`` data matrix;
- ``\mathbf{P}`` is a user-defined partition matrix having ``K`` columns (one for each element of the partition), ``M`` rows, and containing ``1`` in ``P_{i,j}`` if the ``i``-th attribute belongs to the ``j``-th partition and ``0`` otherwise;
* ``\mathbf{X}`` is an ``N × M`` matrix or table with `Continuous` element scitype containing the
examples for which the predictions are sought. Check column scitypes
of a table `X` with `schema(X)`.
* ``\mathbf{y}`` is an ``N`` vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
* ``\mathbf{P}`` is an ``M × K`` `Int` matrix specifying how to partition the ``M`` attributes into ``K`` subsets. ``P_{m,k}`` should be 1 if attribute number ``m`` belongs to partition ``k``.
- ``\mathbf{\beta}`` is a vector weighting the importance of each set of attributes in the partition;
- ``\mathbf{\alpha}`` is a vector weighting the importance of each attribute within one of the sets in the partition. Note that the constraints imply that for each set in the partition the weights of the corresponding ``\alpha`` variables are all positive and sum to ``1``.
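To make the roles of ``\mathbf{P}``, ``\mathbf{\alpha}`` and ``\mathbf{\beta}`` concrete, here is a small base-Julia sketch of the prediction rule implied by the definitions above, ``\hat{y} = X(\alpha \circ P\beta) + t`` (an illustration inferred from this description, not the package's code):

```julia
# Illustrative only: combine per-partition weights β with
# within-partition weights α through the partition matrix P.
X = [1.0 2.0 3.0;
     3.0 3.0 4.0]          # N = 2 examples, M = 3 attributes
P = [1 0;
     1 0;
     0 1]                  # attributes 1,2 -> partition 1; attribute 3 -> partition 2
α = [0.25, 0.75, 1.0]      # within each partition, the α entries sum to 1
β = [2.0, -1.0]            # one weight per partition
t = 0.0                    # intercept

w = α .* (P * β)           # per-attribute weights: [0.5, 1.5, -1.0]
ŷ = X * w .+ t             # predictions: [0.5, 2.0]
```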

98 changes: 14 additions & 84 deletions src/PartitionedLS.jl
@@ -64,7 +64,8 @@ Rewrites X and P in homogeneous coordinates. The result is a tuple (Xo, Po) where Xo is the
homogeneous version of X and Po is the homogeneous version of P.
## Arguments
- `X`: the data matrix
- `X`: any matrix or table with `Continuous` element scitype.
Check column scitypes of a table `X` with `schema(X)`.
- `P`: the partition matrix
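The construction itself is not shown in this diff; one plausible sketch consistent with the description (append a constant column to `X` and give the intercept its own singleton partition in `P`) is the following — the package's actual implementation may differ:

```julia
# Hypothetical sketch: append a constant attribute to X and a matching
# singleton partition to P so the intercept is fit like any other weight.
function homogeneous_coords(X::AbstractMatrix, P::AbstractMatrix{<:Integer})
    N, M = size(X)
    K = size(P, 2)
    Xo = [X ones(N)]                 # N × (M+1): extra constant column
    Po = [P zeros(Int, M);           # (M+1) × (K+1): block form, the new
          zeros(Int, 1, K) 1]        # attribute alone in a new partition
    return Xo, Po
end
```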
## Return
@@ -84,8 +85,9 @@ objective function as a sum of squares of the α variables. The regularization
parameter η controls the strength of the regularization.
## Arguments
- `X`: the data matrix
- `y`: the target vector
- `X`: any matrix or table with `Continuous` element scitype.
Check column scitypes of a table `X` with `schema(X)`.
- `y`: any vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
- `P`: the partition matrix
- `η`: the regularization parameter
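Combining this with the model definition, the regularized problem can be sketched as follows, where ``\circ`` denotes the element-wise product (an inferred summary of the surrounding description, not the code's literal formulation):

```math
\min_{\boldsymbol{\alpha}, \boldsymbol{\beta}, t}\;
  \left\| \mathbf{X} \left( \boldsymbol{\alpha} \circ \mathbf{P}\boldsymbol{\beta} \right) + t\mathbf{1} - \mathbf{y} \right\|_2^2
  + \eta \left\| \boldsymbol{\alpha} \right\|_2^2
\qquad \text{s.t. } \boldsymbol{\alpha} \ge 0, \quad \mathbf{P}^\top \boldsymbol{\alpha} = \mathbf{1}
```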
@@ -139,7 +141,9 @@ Make predictions for the dataset `X` using the PartitionedLS model `model`.
## Arguments
- `model`: a [PartLSFitResult](@ref)
- `X`: the data containing the examples for which the predictions are sought
- `X`: any matrix or table with `Continuous` element scitype containing the
examples for which the predictions are sought. Check column scitypes
of a table `X` with `schema(X)`.
## Return
the predictions of the given model on examples in X.
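A usage sketch mirroring the package's own example later in this diff (`result.model` holds the fitted `PartLSFitResult`):

```julia
# Sketch: fit with the optimal algorithm, then predict on the same data.
result = fit(Opt, X, y, P, η = 0.0)
ŷ = predict(result.model, X)   # one prediction per row of X
```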
@@ -179,8 +183,9 @@ In MLJ or MLJBase, bind an instance `model` to data with
where
- `X`: any matrix with element type `<:AbstractFloat`, or any table with columns of type `<:AbstractFloat`
- `X`: any matrix or table with `Continuous` element scitype.
Check column scitypes of a table `X` with `schema(X)`.
Train the machine using `fit!(mach)`.
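A minimal MLJ workflow sketch. The `@load` name and `pkg` are assumptions inferred from the `load_path = "PartitionedLS.PartLS"` registration visible later in this diff; `X`, `y`, `P`, and `Xnew` are placeholder data:

```julia
using MLJ

# Hypothetical load call: name/pkg inferred from load_path = "PartitionedLS.PartLS".
PartLS = @load PartLS pkg=PartitionedLS

model = PartLS(P = P)        # P: user-defined M × K partition matrix
mach = machine(model, X, y)  # bind the model to data
fit!(mach)                   # train
ŷ = predict(mach, Xnew)      # predict on new examples
```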
## Hyper-parameters
@@ -301,8 +306,9 @@ It conforms to the MLJ interface.
## Arguments
- `m`: A [`PartLS`](@ref) model to fit
- `verbosity`: the verbosity level
- `X`: the data matrix
- `y`: the target vector
- `X`: any matrix or table with `Continuous` element scitype.
Check column scitypes of a table `X` with `schema(X)`.
- `y`: any vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
"""
function MMI.fit(m::PartLS, verbosity, X, y)
@@ -332,7 +338,6 @@ end
Make predictions for the dataset `X` using the PartitionedLS model `model`.
It conforms to the MLJ interface.
"""

function MMI.predict(model::PartLS, fitresult, X)
X = MMI.matrix(X)
return PartitionedLS.predict(fitresult, X)
@@ -355,78 +360,3 @@ MMI.metadata_model(PartLS,
load_path = "PartitionedLS.PartLS"
)
end


"""
$(MMI.doc_header(PartLS))
Use this model to fit a partitioned least squares model to data.
# Training data
In MLJ or MLJBase, bind an instance `model` to data with
mach = machine(model, X, y)
where
- `X`: any matrix with element type `<:AbstractFloat`, or any table with columns of type `<:AbstractFloat`
Train the machine using `fit!(mach)`.
# Hyper-parameters
- `Optimizer`: the optimization algorithm to use. It can be `Opt`, `Alt` or `BnB`.
- `P`: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column
corresponds to a feature. The element `P_{k, i} = 1` if feature `i` belongs to partition `k`.
- `η`: the regularization parameter. It controls the strength of the regularization.
- `ϵ`: the tolerance parameter. It is used to determine when the Alt optimization algorithm has converged. Only used by the `Alt` algorithm.
- `T`: the maximum number of iterations. The Alt optimization algorithm stops after at most `T` alternating iterations. Only used by the `Alt` algorithm.
- `rng`: the random number generator to use.
- If `nothing`, the global random number generator `rand` is used.
- If an integer, the global random number generator is used after seeding it with the given integer.
- If an object of type `AbstractRNG`, the given random number generator is used.
# Operations
- `predict(mach, Xnew)`: return the predictions of the model on new data `Xnew`
# Fitted parameters
The fields of `fitted_params(mach)` are:
- `α`: the values of the α variables. For each partition `k`, the corresponding α variables
are positive and satisfy ``\\sum_{i \\in P_k} \\alpha_{i} = 1``.
- `β`: the values of the β variables. For each partition `k`, `β_k` is the coefficient that multiplies the features in the k-th partition.
- `t`: the intercept term of the model.
- `P`: the partition matrix. It is a binary matrix where each row corresponds to a partition and each column
corresponds to a feature. The element `P_{k, i} = 1` if feature `i` belongs to partition `k`.
# Examples
```julia
PartLS = @load PartLS pkg=PartitionedLS
X = [[1. 2. 3.];
[3. 3. 4.];
[8. 1. 3.];
[5. 3. 1.]]
y = [1.;
1.;
2.;
3.]
P = [[1 0];
[1 0];
[0 1]]
# fit using the optimal algorithm
result = fit(Opt, X, y, P, η = 0.0)
y_hat = predict(result.model, X)
```
"""
8 changes: 5 additions & 3 deletions src/PartitionedLSAlt.jl
@@ -29,9 +29,11 @@ more numerically stable with respect to `fit(Alt, ...)`.
## Arguments
* `X`: \$N × M\$ matrix describing the examples
* `y`: \$N\$ vector with the output values for each example
* `P`: \$M × K\$ matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to partition \$k\$.
* `X`: \$N × M\$ matrix or table with `Continuous` element scitype containing the
examples for which the predictions are sought. Check column scitypes
of a table `X` with `schema(X)`.
* `y`: \$N\$ vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
* `P`: \$M × K\$ `Int` matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to partition \$k\$.
* `η`: regularization factor, higher values implies more regularized solutions. Default is 0.0.
* `T`: number of alternating loops to be performed. Default is 100.
* `ϵ`: minimum relative improvement in the objective function before stopping the optimization. Default is 1e-6
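An illustrative call using these stopping controls (keyword-argument form assumed; the values shown are the documented defaults):

```julia
# Sketch: alternating-optimization fit with explicit stopping controls.
# T and ϵ only affect the Alt algorithm, as documented above.
result = fit(Alt, X, y, P; η = 0.0, T = 100, ϵ = 1e-6)
```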
9 changes: 5 additions & 4 deletions src/PartitionedLSBnB.jl
@@ -7,10 +7,11 @@ Implements the Branch and Bound algorithm to fit a Partitioned Least Squares model
## Arguments
* `X`: \$N × M\$ matrix describing the examples
* `y`: \$N\$ vector with the output values for each example
* `P`: \$M × K\$ matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to
partition \$k\$.
* `X`: \$N × M\$ matrix or table with `Continuous` element scitype containing the
examples for which the predictions are sought. Check column scitypes
of a table `X` with `schema(X)`.
* `y`: \$N\$ vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
* `P`: \$M × K\$ `Int` matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to partition \$k\$.
* `η`: regularization factor, higher values implies more regularized solutions (default: 0.0)
* `nnlsalg`: the kind of nnls algorithm to be used during solving. Possible values are `:pivot`, `:nnls`, `:fnnls` (default: `:nnls`)
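An illustrative call (keyword-argument form assumed; `:nnls` is the documented default backend):

```julia
# Sketch: Branch and Bound fit with an explicit NNLS backend choice.
result = fit(BnB, X, y, P; η = 0.0, nnlsalg = :nnls)
```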
9 changes: 5 additions & 4 deletions src/PartitionedLSOpt.jl
@@ -51,10 +51,11 @@ It uses a complete enumeration strategy which is exponential in K, but guarantees
## Arguments
* `X`: \$N × M\$ matrix describing the examples
* `y`: \$N\$ vector with the output values for each example
* `P`: \$M × K\$ matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to
partition \$k\$.
* `X`: \$N × M\$ matrix or table with `Continuous` element scitype containing the
examples for which the predictions are sought. Check column scitypes
of a table `X` with `schema(X)`.
* `y`: \$N\$ vector with `Continuous` element scitype. Check scitype with `scitype(y)`.
* `P`: \$M × K\$ `Int` matrix specifying how to partition the \$M\$ attributes into \$K\$ subsets. \$P_{m,k}\$ should be 1 if attribute number \$m\$ belongs to partition \$k\$.
* `η`: regularization factor, higher values implies more regularized solutions (default: 0.0)
* `returnAllSolutions`: if true an additional output is appended to the resulting tuple containing all solutions found during the algorithm.
* `nnlsalg`: the kind of nnls algorithm to be used during solving. Possible values are `:pivot`, `:nnls`, `:fnnls` (default: `:nnls`)
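An illustrative call (keyword-argument form assumed). With `returnAllSolutions = true`, the documented extra output is appended to the returned tuple:

```julia
# Sketch: complete-enumeration (optimal) fit, keeping every solution found.
result = fit(Opt, X, y, P; η = 0.0, returnAllSolutions = true, nnlsalg = :pivot)
```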

2 comments on commit b1ca7ba

@boborbt
Collaborator Author


@JuliaRegistrator

Error while trying to register: Version 1.0.3 already exists
