MinimumNaNSplit

class MinimumNaNSplit(n_splits: int, n_repeats: int = 10, random_state: int = None, min_non_nan: int = 2, which: str = 'train')

A repeated stratified K-fold iterator that splits the data into sections.

This class splits the data into sections, checking that the set selected by which (the training set by default) never has fewer than the specified number of non-NaN values.

Examples

>>> import numpy as np
>>> from ieeg.calc.oversample import MinimumNaNSplit
>>> np.random.seed(0)
>>> X = np.vstack((np.arange(1, 9).reshape(4, 2), np.full((4, 2), np.nan)))
>>> y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
>>> msn = MinimumNaNSplit(2, 3)
>>> for train, test in msn.split(X, y):
...     print("train:", train, "test:", test)
train: [2 3 4 5] test: [0 1 6 7]
train: [0 1 6 7] test: [2 3 4 5]
train: [2 3 4 5] test: [0 1 6 7]
train: [0 1 6 7] test: [2 3 4 5]
train: [2 3 4 5] test: [0 1 6 7]
train: [0 1 6 7] test: [2 3 4 5]
>>> msn = MinimumNaNSplit(2, 3, which='test', min_non_nan=1)
>>> for train, test in msn.split(X, y):
...     print("train:", train, "test:", test)
train: [1 3 4 7] test: [0 2 5 6]
train: [0 2 5 6] test: [1 3 4 7]
train: [0 3 5 7] test: [1 2 4 6]
train: [1 2 4 6] test: [0 3 5 7]
train: [1 2 5 6] test: [0 3 4 7]
train: [0 3 4 7] test: [1 2 5 6]
Parameters:
  • n_splits (int) – The number of splits.

  • n_repeats (int, optional) – The number of times to repeat the splits, by default 10.

  • random_state (int, optional) – The random state to use, by default None.

  • min_non_nan (int, optional) – The minimum number of non-NaN values required in the checked set, by default 2.

  • which (str, optional) – Which set the check applies to, such as 'train' or 'test', by default 'train'.
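
The minimum non-NaN check described above can be verified by hand on any split. The following is a minimal sketch, not the library's implementation: count_non_nan_rows is a hypothetical helper, and it assumes that a row counts as NaN when any of its values is NaN. It is applied to the first train/test pair printed in the example.

```python
import numpy as np


def count_non_nan_rows(X, indices):
    """Count the rows of X[indices] that contain no NaN values.

    Hypothetical helper for illustration; the library may define
    "non-NaN" differently (for example, all values NaN vs. any).
    """
    rows = np.asarray(X, dtype=float)[np.asarray(indices)]
    return int((~np.isnan(rows).any(axis=1)).sum())


# Same data as the class example: four valid rows, then four NaN rows.
X = np.vstack((np.arange(1, 9).reshape(4, 2), np.full((4, 2), np.nan)))

# First split printed in the example above.
train = np.array([2, 3, 4, 5])
test = np.array([0, 1, 6, 7])

n_train = count_non_nan_rows(X, train)
n_test = count_non_nan_rows(X, test)
print("non-NaN rows in train:", n_train, "in test:", n_test)
```

Both sets hold two valid rows here, so the split satisfies the default min_non_nan of 2 for the training set.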

static oversample(arr: numpy.ndarray, func: callable = mixup, axis: int = 1, copy: bool = True, seed=None) -> ndarray

Oversample NaN rows using func.

Parameters:
  • arr (array) – The data to oversample.

  • func (callable) – The function to use to oversample the data.

  • axis (int) – The axis along which to apply func.

  • copy (bool) – Whether to copy the data before oversampling.

  • seed (optional) – Seed for the random number generator, by default None.

Return type:

ndarray

Examples

>>> from ieeg.calc.oversample import MinimumNaNSplit, mixup, norm
>>> np.random.seed(0)
>>> arr = np.array([[1, 2], [4, 5], [7, 8],
... [float("nan"), float("nan")]])
>>> MinimumNaNSplit.oversample(arr, norm, 0)
array([[1.        , 2.        ],
       [4.        , 5.        ],
       [7.        , 8.        ],
       [8.32102813, 5.98018098]])
>>> MinimumNaNSplit.oversample(arr, mixup, 0, seed=42)
array([[1.        , 2.        ],
       [4.        , 5.        ],
       [7.        , 8.        ],
       [5.24946679, 6.24946679]])
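
To illustrate the idea of filling NaN rows from the valid ones, here is a minimal mixup-style sketch. It is an assumption about the general technique, not the library's mixup function: fill_nan_rows_mixup and its alpha parameter are hypothetical, and each NaN row is replaced by a convex combination of two randomly chosen valid rows with a Beta-distributed weight.

```python
import numpy as np


def fill_nan_rows_mixup(arr, alpha=1.0, seed=None, copy=True):
    """Replace rows containing NaN with a convex mix of two valid rows.

    Hypothetical sketch; the library's own mixup may differ in how it
    picks rows and weights.
    """
    out = arr.astype(float) if copy else arr  # astype returns a new array
    rng = np.random.default_rng(seed)
    nan_rows = np.isnan(out).any(axis=1)
    valid = out[~nan_rows]  # boolean indexing yields a copy
    if len(valid) == 0:
        raise ValueError("need at least one row without NaN")
    for i in np.flatnonzero(nan_rows):
        # Draw two valid rows (with replacement) and a mixing weight.
        a, b = valid[rng.integers(len(valid), size=2)]
        lam = rng.beta(alpha, alpha)
        out[i] = lam * a + (1 - lam) * b
    return out


arr = np.array([[1, 2], [4, 5], [7, 8], [float("nan"), float("nan")]])
filled = fill_nan_rows_mixup(arr, seed=42)
print(filled)
```

Because the new row is a convex combination, it always lies between the smallest and largest valid rows, and with copy=True the input array is left untouched.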
shuffle_labels(arr: ndarray, labels: ndarray, trials_ax: int = 0, min_trials: int = 1)

Shuffle the labels in place while ensuring that the minimum number of non-NaN trials is kept.

Parameters:
  • arr (array) – The data to shuffle.

  • labels (array) – The labels to shuffle.

  • trials_ax (int) – The axis of arr that indexes trials.

  • min_trials (int) – The minimum number of non-NaN trials to keep, by default 1.

Examples

>>> np.random.seed(0)
>>> arr = np.array([[[1, 2], [4, 5], [7, 8],
... [float("nan"), float("nan")]]])
>>> labels = np.array([0, 0, 1, 1])
>>> MinimumNaNSplit(1).shuffle_labels(arr, labels, 1, 1)
>>> labels
array([1, 1, 0, 0])
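
One way to picture the constraint is a shuffle that is retried until every label still has enough valid trials. The sketch below is an assumption, not the method's actual algorithm: shuffle_labels_checked, max_tries, and the rule that a trial is NaN when any of its values is NaN are all hypothetical.

```python
import numpy as np


def shuffle_labels_checked(arr, labels, trials_ax=0, min_trials=1,
                           seed=None, max_tries=1000):
    """Shuffle labels in place until each label keeps min_trials valid trials.

    Hypothetical sketch for illustration only.
    """
    rng = np.random.default_rng(seed)
    # Reduce over every axis except the trials axis.
    other = tuple(i for i in range(arr.ndim) if i != trials_ax)
    valid = ~np.isnan(arr).any(axis=other)
    for _ in range(max_tries):
        rng.shuffle(labels)  # in-place permutation
        counts = [int(np.sum(valid & (labels == c)))
                  for c in np.unique(labels)]
        if min(counts) >= min_trials:
            return
    raise RuntimeError("no valid shuffle found within max_tries")


arr = np.array([[[1, 2], [4, 5], [7, 8], [float("nan"), float("nan")]]])
labels = np.array([0, 0, 1, 1])
shuffle_labels_checked(arr, labels, trials_ax=1, min_trials=1, seed=0)
print(labels)
```

The function returns nothing and modifies labels directly, mirroring the in-place behavior shown in the example above.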
split(X, y=None, groups=None)

Generate indices to split data into training and test set.

Parameters:
  • X (array-like of shape (n_samples, n_features)) –

    Training data, where n_samples is the number of samples and n_features is the number of features.

    Note that providing y is sufficient to generate the splits and hence np.zeros(n_samples) may be used as a placeholder for X instead of actual training data.

  • y (array-like of shape (n_samples,)) – The target variable for supervised learning problems. Stratification is done based on the y labels.

  • groups (object) – Always ignored, exists for compatibility.

Yields:
  • train (ndarray) – The training set indices for that split.

  • test (ndarray) – The testing set indices for that split.

Notes

Randomized CV splitters may return different results for each call of split. You can make the results identical by setting random_state to an integer.
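
The effect of random_state can be shown with scikit-learn's RepeatedStratifiedKFold, which this class is described as a variant of. This sketch uses the scikit-learn splitter directly rather than MinimumNaNSplit, so it demonstrates the reproducibility note only, not the NaN check; collect_splits is a hypothetical helper.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold

y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
X = np.zeros(len(y))  # placeholder: y alone determines the splits


def collect_splits(random_state):
    """Return all (train, test) index lists for one splitter instance."""
    cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=3,
                                 random_state=random_state)
    return [(train.tolist(), test.tolist()) for train, test in cv.split(X, y)]


first = collect_splits(0)
second = collect_splits(0)
print("number of splits:", len(first))
print("identical with the same random_state:", first == second)
```

With n_splits=2 and n_repeats=3 the generator yields six train/test pairs, and fixing random_state to an integer makes repeated calls return the same pairs.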

Examples using ieeg.calc.oversample.MinimumNaNSplit

PCA-LDA Decoding
