onlinehd.OnlineHD
-
class onlinehd.OnlineHD(classes: int, features: int, dim: int = 4000)
Bases: object
Hyperdimensional classification algorithm. OnlineHD stores its model as a (c, d) tensor initialized with zeros, where c is the number of classes and d is dim. Each d-sized row of this matrix is the high-dimensional representation of one class, called a class hypervector.
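- Parameters
classes (int) – The number of classes.
features (int) – The number of features of the input data.
dim (int) – The dimensionality of the hypervectors.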
Example
>>> import torch
>>> import onlinehd
>>> dim = 10000
>>> n_samples = 1000
>>> features = 100
>>> classes = 5
>>> x = torch.randn(n_samples, features)  # dummy data
>>> y = torch.randint(0, classes, [n_samples])  # dummy data
>>> model = onlinehd.OnlineHD(classes, features, dim=dim)
>>> if torch.cuda.is_available():
...     print('Training on GPU!')
...     model = model.to('cuda')
...     x = x.to('cuda')
...     y = y.to('cuda')
...
Training on GPU!
>>> model.fit(x, y, epochs=10)
>>> ypred = model(x)
>>> ypred.size()
torch.Size([1000])
-
__call__(x: torch.Tensor, encoded: bool = False)
Returns the predicted class of each data point in x.
- Parameters
x (torch.Tensor) – The data points to predict. Must have size (n?, features) if encoded=False, otherwise must have size (n?, dim).
encoded (bool) – Specifies whether the input data is already encoded.
- Returns
The predicted class of each data point. Has size (n?,).
- Return type
torch.Tensor
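As a rough illustration of the prediction step, the sketch below assumes that the predicted class is the index of the highest similarity score in each row; this page does not spell out the decision rule. It uses plain Python lists instead of tensors, and `predict_from_scores` is a hypothetical helper, not part of onlinehd.

```python
def predict_from_scores(scores):
    """Return the index of the best-scoring class for each row.

    `scores` is a list of n rows, each holding one similarity per class.
    Assumption: the predicted class is the arg-max of its row.
    """
    return [max(range(len(row)), key=row.__getitem__) for row in scores]


# Two data points scored against three classes.
scores = [
    [0.10, 0.80, 0.30],
    [0.70, 0.20, 0.65],
]
predictions = predict_from_scores(scores)
```

The result holds one class index per input row, which matches the documented output size of (n?,).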
-
encode(x: torch.Tensor)
Encodes input data.
See also
onlinehd.Encoder for more information.
-
fit(x: torch.Tensor, y: torch.Tensor, encoded: bool = False, lr: float = 0.035, epochs: int = 120, batch_size: Union[int, None, float] = 1024, one_pass_fit: bool = True, bootstrap: Union[float, str] = 0.01)
Starts the learning process, using the data points x as inputs and y as their labels.
- Parameters
x (torch.Tensor) – Input data points. Must have size (n?, features) if encoded=False, otherwise must have size (n?, dim).
y (torch.Tensor) – Labels of the input data points.
encoded (bool) – Specifies whether the input data is already encoded.
lr (float, > 0) – Learning rate.
epochs (int, > 0) – Max number of epochs allowed.
batch_size (int, > 0 and <= n?, or float, > 0 and <= 1, or None) – If int, the number of samples to use in each batch. If float, the fraction of the samples to use in each batch. If None, the whole dataset is used in each epoch (equivalent to passing 1.0 or n?).
one_pass_fit (bool) – Whether to use the one-pass learning process. If True, the iterative method is still run after the one-pass fit for the specified number of epochs.
bootstrap (float, > 0 and <= 1, or 'single-per-class') – To initialize the class hypervectors, OnlineHD performs naive accumulation over a small fraction of the data. The size of that fraction is set by this argument. If 'single-per-class' is used, a single data point per class is used as the starting class hypervector.
Warning
Using one_pass_fit is not advisable for very large datasets or when running on a GPU. This option is expected to use a lot of memory, and it does not benefit from parallelization.
- Returns
self
- Return type
OnlineHD
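The batch_size and bootstrap options described above can be read as in the following minimal sketch. It is not the library's implementation: `resolve_batch_size` and `bootstrap_model` are hypothetical helpers, plain lists stand in for tensors, and three choices are assumptions made for illustration (int values above n are clamped, fractions are rounded to the nearest sample, and the bootstrap fragment is taken from the start of the data).

```python
def resolve_batch_size(batch_size, n):
    """Map the documented batch_size options to a number of samples."""
    if batch_size is None:
        return n  # whole dataset per epoch
    if isinstance(batch_size, float):
        if not 0.0 < batch_size <= 1.0:
            raise ValueError("a float batch_size must be in (0, 1]")
        # Assumption: round to the nearest sample, with at least one sample.
        return max(1, round(batch_size * n))
    if batch_size <= 0:
        raise ValueError("an int batch_size must be positive")
    # Assumption: values above n are clamped rather than rejected.
    return min(batch_size, n)


def bootstrap_model(encoded, labels, classes, bootstrap):
    """Build initial class hypervectors by naive accumulation."""
    dim = len(encoded[0])
    model = [[0.0] * dim for _ in range(classes)]  # (c, d) zeros
    if bootstrap == 'single-per-class':
        seen = set()
        for h, label in zip(encoded, labels):
            if label not in seen:  # first data point of each class
                seen.add(label)
                model[label] = list(h)
        return model
    # Assumption: the fragment is the first `cut` samples.
    cut = max(1, round(bootstrap * len(encoded)))
    for h, label in zip(encoded[:cut], labels[:cut]):
        for j, value in enumerate(h):
            model[label][j] += value
    return model


encoded = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
labels = [0, 1, 0, 1]
half = bootstrap_model(encoded, labels, classes=2, bootstrap=0.5)
single = bootstrap_model(encoded, labels, classes=2,
                         bootstrap='single-per-class')
```

With these helpers, `resolve_batch_size(None, n)`, `resolve_batch_size(1.0, n)` and `resolve_batch_size(n, n)` all yield n, mirroring the equivalence stated for batch_size.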
-
predict(x: torch.Tensor, encoded: bool = False)
Returns the predicted class of each data point in x. See __call__() for details.
-
probabilities(x: torch.Tensor, encoded: bool = False)
Returns, for each data point in x, the probability of belonging to each class.
- Parameters
x (torch.Tensor) – The data points to use. Must have size (n?, features) if encoded=False, otherwise must have size (n?, dim).
encoded (bool) – Specifies whether the input data is already encoded.
- Returns
The class probability of each data point. Has size (n?, classes).
- Return type
torch.Tensor
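This page does not say how similarity scores are turned into probabilities. One common choice, shown here purely as an assumption, is a softmax over each row of scores, which yields non-negative values that sum to 1 per data point. `softmax_rows` is a hypothetical helper working on plain lists.

```python
import math


def softmax_rows(scores):
    """Apply a softmax to every row of a score matrix.

    Assumption: probabilities come from a softmax over the per-class
    scores. The source page does not confirm this normalization.
    """
    probabilities = []
    for row in scores:
        shift = max(row)  # subtract the max for numerical stability
        exps = [math.exp(value - shift) for value in row]
        total = sum(exps)
        probabilities.append([e / total for e in exps])
    return probabilities


# Two data points, three classes.
scores = [[0.9, 0.1, -0.2], [0.0, 0.0, 0.0]]
probs = softmax_rows(scores)
```

The output keeps the documented (n?, classes) layout: one row per data point and one column per class.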
-
scores(x: torch.Tensor, encoded: bool = False)
Returns the pairwise cosine similarity between the data points in x and each class hypervector. Calling model.scores(x, encoded=True) is the same as spatial.cos_cdist(x, model.model).
- Parameters
x (torch.Tensor) – The data points to score. Must have size (n?, features) if encoded=False, otherwise must have size (n?, dim).
encoded (bool) – Specifies whether the input data is already encoded.
- Returns
The cosine similarity between encoded input data and class hypervectors.
- Return type
torch.Tensor
See also
spatial.cos_cdist() for details.
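To make the pairwise cosine similarity concrete, here is a small pure-Python sketch. `cosine_cdist_sketch` is a hypothetical stand-in for illustration, not the library's spatial.cos_cdist(). Returning 0.0 for zero-norm vectors is an assumption, chosen so that class hypervectors that are still all zeros do not cause a division by zero.

```python
import math


def cosine_cdist_sketch(x, model):
    """Pairwise cosine similarity between rows of x and rows of model.

    Entry [i][j] compares encoded data point x[i] with class
    hypervector model[j], so the result has one row per data point
    and one column per class.
    """
    def norm(vector):
        return math.sqrt(sum(a * a for a in vector))

    result = []
    for h in x:
        row = []
        for c in model:
            dot = sum(a * b for a, b in zip(h, c))
            denom = norm(h) * norm(c)
            # Assumption: a zero-norm vector scores 0.0 against anything.
            row.append(dot / denom if denom > 0 else 0.0)
        result.append(row)
    return result


x = [[1.0, 0.0], [1.0, 1.0]]                  # two encoded data points
model = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # three class hypervectors
scores = cosine_cdist_sketch(x, model)
```

Here the first data point is identical in direction to the first class hypervector and orthogonal to the second, so its scores are 1 and 0 respectively.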
-
to(*args)
Moves the data to the specified device (e.g. cuda or cpu) or changes the dtype of the data representation (e.g. half or double). Because the internal data is stored as a torch.Tensor, the arguments can be anything that torch accepts. The change is done in place.
- Parameters
device (str or torch.device) – The device to move the data to.
- Returns
self
- Return type
OnlineHD