Logic.core.classification package#

Logic.core.classification.basic_classifier module#

class BasicClassifier#

Bases: object

fit(x, y)#
get_percent_of_positive_reviews(sentences)#

Get the percentage of positive reviews in the given sentences.

Parameters:

sentences (list) – The list of sentences to get the percentage of positive reviews for

Returns:

The percentage of positive reviews

Return type:

float
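
The final step of this method can be sketched as below. This is a minimal illustration, not the class's actual implementation: the helper name percent_positive, the convention that label 1 means "positive", and the handling of empty input are all assumptions.

```python
import numpy as np

def percent_positive(predicted_labels, positive_label=1):
    """Return the share of predictions equal to positive_label, as a percentage."""
    labels = np.asarray(predicted_labels)
    if labels.size == 0:
        return 0.0  # assumption: an empty input counts as 0% positive
    # Mean of a boolean mask is the fraction of positive predictions.
    return float(np.mean(labels == positive_label) * 100.0)

print(percent_positive([1, 0, 1, 1]))  # 75.0
```

In the real class the labels would presumably come from predict() applied to the embedded sentences.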

predict(x)#
prediction_report(x, y)#

Logic.core.classification.data_loader module#

class ReviewLoader(file_path: str)#

Bases: object

get_embeddings()#

Get the embeddings for the reviews using the fasttext model.

load_data()#

Load the data from the CSV file and preprocess the text, then store the normalized tokens and the sentiment labels. Also load the fasttext model.

split_data(test_data_ratio=0.2)#

Split the data into training and testing data.

Parameters:

test_data_ratio (float) – The ratio of the test data

Returns:

The training and testing data for the embeddings and the sentiments, in the order x_train, x_test, y_train, y_test

Return type:

np.ndarray, np.ndarray, np.ndarray, np.ndarray
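
A split with this return order can be sketched with NumPy alone. The function name split_arrays, the seed argument, and the shuffle-then-slice strategy are assumptions for illustration; the source does not say how ReviewLoader performs the split.

```python
import numpy as np

def split_arrays(x, y, test_data_ratio=0.2, seed=0):
    """Shuffle row indices, then slice off a test portion of the given ratio."""
    x = np.asarray(x)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)      # seeded for a repeatable split
    order = rng.permutation(len(x))
    n_test = int(round(len(x) * test_data_ratio))
    test_idx, train_idx = order[:n_test], order[n_test:]
    # Same order as documented: x_train, x_test, y_train, y_test
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]

x = np.arange(20).reshape(10, 2)   # 10 docs, embedding size 2
y = np.arange(10) % 2              # alternating 0/1 labels
x_train, x_test, y_train, y_test = split_arrays(x, y)
print(x_train.shape, x_test.shape)  # (8, 2) (2, 2)
```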

Logic.core.classification.deep module#

class DeepModelClassifier(in_features, num_classes, batch_size, num_epochs=50)#

Bases: BasicClassifier

fit(x, y)#

Fit the model on the given train_loader and test_loader for num_epochs epochs. You have to call set_test_dataloader before calling the fit function.

Parameters:
  • x (np.ndarray) – The training embeddings

  • y (np.ndarray) – The training labels

Return type:

self

predict(x)#

Predict the labels on the given test_loader.

Parameters:

x (np.ndarray) – The test embeddings

Returns:

predicted_labels – The predicted labels

Return type:

list

prediction_report(x, y)#

Get the classification report on the given test set.

Parameters:
  • x (np.ndarray) – The test embeddings

  • y (np.ndarray) – The test labels

Returns:

The classification report

Return type:

str
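
The source does not say how the report string is built; scikit-learn's classification_report is one common choice. To show what such a report contains, here is a NumPy-only sketch. The function name simple_report and its column layout are assumptions, not the package's output format.

```python
import numpy as np

def simple_report(y_true, y_pred):
    """Build a plain-text table of per-class precision, recall, F1 and support."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    lines = [f"{'class':>8} {'precision':>9} {'recall':>9} {'f1':>9} {'support':>9}"]
    for label in np.unique(y_true):
        tp = int(np.sum((y_pred == label) & (y_true == label)))
        fp = int(np.sum((y_pred == label) & (y_true != label)))
        fn = int(np.sum((y_pred != label) & (y_true == label)))
        # Guard against division by zero when a class is never predicted.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        lines.append(f"{str(label):>8} {precision:>9.2f} {recall:>9.2f} {f1:>9.2f} {tp + fn:>9d}")
    accuracy = float(np.mean(y_true == y_pred))
    lines.append(f"{'accuracy':>8} {accuracy:>9.2f}")
    return "\n".join(lines)

report = simple_report([0, 0, 1, 1], [0, 1, 1, 1])
print(report)
```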

set_test_dataloader(X_test, y_test)#

Set the test dataloader. This is used to evaluate the model on the test set while training.

Parameters:
  • X_test (np.ndarray) – The test embeddings

  • y_test (np.ndarray) – The test labels

Returns:

Returns self

Return type:

self
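
Because both set_test_dataloader and fit are documented as returning self, the calls can be chained in the required order. The sketch below uses a hypothetical stand-in class, ChainDemo, only to illustrate that calling contract. It is not DeepModelClassifier and does no training, and raising RuntimeError when the test data is missing is an assumption about how the precondition might be enforced.

```python
import numpy as np

class ChainDemo:
    """Hypothetical stand-in that mimics the documented call order only."""

    def __init__(self):
        self.test_data = None
        self.fitted = False

    def set_test_dataloader(self, X_test, y_test):
        self.test_data = (np.asarray(X_test), np.asarray(y_test))
        return self  # returning self allows chaining

    def fit(self, x, y):
        if self.test_data is None:
            # Assumption: the precondition is enforced with an exception.
            raise RuntimeError("call set_test_dataloader before fit")
        self.fitted = True  # a real model would run its training loop here
        return self

x_train, y_train = np.zeros((4, 3)), np.array([0, 1, 0, 1])
x_test, y_test = np.zeros((2, 3)), np.array([0, 1])
model = ChainDemo().set_test_dataloader(x_test, y_test).fit(x_train, y_train)
print(model.fitted)  # True
```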

class MLPModel(*args: Any, **kwargs: Any)#

Bases: Module

forward(xb)#

class ReviewDataSet(*args: Any, **kwargs: Any)#

Bases: Dataset

Logic.core.classification.knn module#

class KnnClassifier(n_neighbors)#

Bases: BasicClassifier

fit(x, y)#

Fit the model using x as training data and y as target values. The Euclidean distance is used to find the k nearest neighbors.

Warning: You may need to reduce the size of x to avoid memory errors.

Parameters:
  • x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

Returns:

Returns self as a classifier

Return type:

self

predict(x)#
Parameters:

x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

Returns:

Return the predicted class for each doc with the highest probability (argmax)

Return type:

np.ndarray
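
Prediction by Euclidean nearest neighbors with a majority vote can be sketched as follows. This is an illustration under assumptions, not the class's code: the function name knn_predict, the brute-force distance computation, and the tie-breaking rule (lowest label wins) are all chosen for brevity.

```python
import numpy as np

def knn_predict(x_train, y_train, x_query, n_neighbors=3):
    """Label each query row by majority vote among its nearest training rows."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train)
    x_query = np.asarray(x_query, dtype=float)
    # Broadcasting builds a (k, m, n) array of differences. This is the step
    # that can exhaust memory when the training set is large.
    diffs = x_query[:, None, :] - x_train[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=2))          # (k, m) Euclidean distances
    nearest = np.argsort(dists, axis=1)[:, :n_neighbors]  # indices of the closest rows
    predictions = []
    for row in nearest:
        labels, counts = np.unique(y_train[row], return_counts=True)
        predictions.append(labels[np.argmax(counts)])     # most frequent neighbor label
    return np.array(predictions)

x_train = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(x_train, y_train, np.array([[0.05, 0.05], [4.9, 5.0]])))  # [0 1]
```

The intermediate diffs array holds k * m * n floats, which is why reducing the training set (or processing queries in chunks) helps with the memory warning on fit.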

prediction_report(x, y)#
Parameters:
  • x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str

Logic.core.classification.naive_bayes module#

class NaiveBayes(count_vectorizer, alpha=1)#

Bases: BasicClassifier

fit(x, y)#

Fit the features and the labels. Calculate the prior and feature probabilities.

Parameters:
  • x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

Returns:

Returns self as a classifier

Return type:

self

get_percent_of_positive_reviews(sentences)#

Overrides the base-class method because this class uses a different embedding method.

predict(x)#
Parameters:

x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

Returns:

Return the predicted class for each doc with the highest probability (argmax)

Return type:

np.ndarray
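
The prior and feature probabilities computed in fit, and the argmax in predict, can be sketched as below. The count_vectorizer and alpha arguments suggest a multinomial model with additive smoothing, but the source does not confirm this, so treat the variant as an assumption. The function names nb_fit and nb_predict are hypothetical.

```python
import numpy as np

def nb_fit(x, y, alpha=1.0):
    """Estimate log priors and smoothed per-class log feature probabilities."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    # Prior: fraction of training docs in each class.
    log_prior = np.log(np.array([np.mean(y == c) for c in classes]))
    # Feature counts per class, with alpha added to every feature (smoothing).
    counts = np.array([x[y == c].sum(axis=0) for c in classes]) + alpha
    log_likelihood = np.log(counts / counts.sum(axis=1, keepdims=True))
    return classes, log_prior, log_likelihood

def nb_predict(x, classes, log_prior, log_likelihood):
    """Return the class with the highest log posterior score for each row."""
    scores = np.asarray(x, dtype=float) @ log_likelihood.T + log_prior
    return classes[np.argmax(scores, axis=1)]

# Toy count matrix: 4 docs, 3 vocabulary terms.
x_train = np.array([[3, 0, 1], [2, 0, 0], [0, 3, 1], [0, 2, 2]])
y_train = np.array([1, 1, 0, 0])
model = nb_fit(x_train, y_train)
print(nb_predict(np.array([[2, 0, 0], [0, 1, 1]]), *model))  # [1 0]
```

Working in log space avoids underflow when many small probabilities are multiplied together.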

prediction_report(x, y)#
Parameters:
  • x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str

Logic.core.classification.svm module#

class SVMClassifier#

Bases: BasicClassifier

fit(x, y)#
Parameters:
  • x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

predict(x)#
Parameters:

x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

Returns:

Return the predicted class for each doc with the highest probability (argmax)

Return type:

np.ndarray

prediction_report(x, y)#
Parameters:
  • x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size

  • y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str