Logic.core.classification package#

Logic.core.classification.basic_classifier module#

class BasicClassifier#

Bases: object

fit(x, y)#

get_percent_of_positive_reviews(sentences)#

Get the percentage of positive reviews in the given sentences.

Parameters:: sentences (list) – The list of sentences for which to compute the percentage of positive reviews

Returns:: The percentage of positive reviews
Return type:: float
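The final step of this method can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name `percent_positive`, the label encoding (1 = positive, 0 = negative), the 0 to 100 scale, and the handling of empty input are all assumptions, since the source only states that a float percentage is returned.

```python
from typing import Sequence


def percent_positive(predictions: Sequence[int], positive_label: int = 1) -> float:
    """Return the share of predictions equal to positive_label, as a percentage."""
    if len(predictions) == 0:
        return 0.0  # assumption: an empty input yields 0 rather than an error
    positives = sum(1 for p in predictions if p == positive_label)
    return 100.0 * positives / len(predictions)


# Hypothetical predicted labels for four sentences (1 = positive, 0 = negative).
print(percent_positive([1, 0, 1, 1]))  # prints 75.0
```

In a real subclass, the predictions would presumably come from embedding the sentences and calling predict on the result.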

predict(x)#

prediction_report(x, y)#

Logic.core.classification.data_loader module#

class ReviewLoader(file_path: str)#

Bases: object

get_embeddings()#

Get the embeddings for the reviews using the fasttext model.

load_data()#

Load the data from the csv file and preprocess the text. Then save the normalized tokens and the sentiment labels. Also, load the fasttext model.

split_data(test_data_ratio=0.2)#

Split the data into training and testing data.

Parameters:: test_data_ratio (float) – The ratio of the test data
Returns:: The training and testing data for the embeddings and the sentiments, in the order x_train, x_test, y_train, y_test
Return type:: np.ndarray, np.ndarray, np.ndarray, np.ndarray
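The split described above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: it uses plain lists instead of np.ndarray to stay dependency-free, and the `seed` parameter and the shuffle-then-slice strategy are hypothetical choices that the source does not specify.

```python
import random
from typing import List, Sequence, Tuple


def split_data(
    embeddings: Sequence[Sequence[float]],
    sentiments: Sequence[int],
    test_data_ratio: float = 0.2,
    seed: int = 0,  # hypothetical parameter, not part of the documented signature
) -> Tuple[List, List, List, List]:
    """Shuffle the row indices once, then hold out the last test_data_ratio share."""
    if len(embeddings) != len(sentiments):
        raise ValueError("embeddings and sentiments must have the same length")
    indices = list(range(len(embeddings)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = int(round(len(indices) * test_data_ratio))
    split_at = len(indices) - n_test
    train_idx, test_idx = indices[:split_at], indices[split_at:]
    # Index both sequences with the same indices so rows and labels stay aligned.
    x_train = [embeddings[i] for i in train_idx]
    x_test = [embeddings[i] for i in test_idx]
    y_train = [sentiments[i] for i in train_idx]
    y_test = [sentiments[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Ten toy two-dimensional "embeddings" with alternating labels.
x = [[float(i), float(i) + 0.5] for i in range(10)]
y = [i % 2 for i in range(10)]
x_train, x_test, y_train, y_test = split_data(x, y, test_data_ratio=0.2)
print(len(x_train), len(x_test))  # prints 8 2
```

The return order matches the documented one: x_train, x_test, y_train, y_test.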

Logic.core.classification.deep module#

class DeepModelClassifier(in_features, num_classes, batch_size, num_epochs=50)#

Bases: BasicClassifier

fit(x, y)#

Fit the model on the given train_loader and test_loader for num_epochs epochs. You have to call set_test_dataloader before calling the fit function.

Parameters:

x (np.ndarray) – The training embeddings
y (np.ndarray) – The training labels

Return type:: self

predict(x)#

Predict the labels on the given test_loader.

Parameters:: x (np.ndarray) – The test embeddings

Returns:: predicted_labels – The predicted labels
Return type:: list

prediction_report(x, y)#

Get the classification report on the given test set.

Parameters:

x (np.ndarray) – The test embeddings
y (np.ndarray) – The test labels

Returns:: The classification report
Return type:: str

set_test_dataloader(X_test, y_test)#

Set the test dataloader. This is used to evaluate the model on the test set while training.

Parameters:

X_test (np.ndarray) – The test embeddings
y_test (np.ndarray) – The test labels

Returns:: Returns self
Return type:: self

class MLPModel(*args: Any, **kwargs: Any)#

Bases: Module

forward(xb)#

class ReviewDataSet(*args: Any, **kwargs: Any)#

Bases: Dataset

Logic.core.classification.knn module#

class KnnClassifier(n_neighbors)#

Bases: BasicClassifier

fit(x, y)#

Fit the model using x as training data and y as target values. Use the Euclidean distance to find the k nearest neighbors.

Warning: You may need to reduce the size of x to avoid memory errors.

Parameters:

x (np.ndarray) – An m * n matrix, where m is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

Returns:

Returns self as a classifier

Return type:

self

predict(x)#

Parameters:: x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
Returns:: The predicted class for each doc, i.e. the class with the highest probability (argmax)
Return type:: np.ndarray
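The fit and predict steps described above can be sketched as follows. `TinyKnn` is a hypothetical stand-in written for illustration, not the package's KnnClassifier: it works on plain lists and resolves the neighbors by majority vote, which is an assumption about how the "highest probability" class is chosen.

```python
import math
from collections import Counter
from typing import List, Sequence


class TinyKnn:
    """Hypothetical k-nearest-neighbors classifier; not the package's code."""

    def __init__(self, n_neighbors: int) -> None:
        self.n_neighbors = n_neighbors
        self.x_train: List[Sequence[float]] = []
        self.y_train: List[int] = []

    def fit(self, x, y) -> "TinyKnn":
        # k-NN is a lazy learner: fitting only stores the training data.
        self.x_train = list(x)
        self.y_train = list(y)
        return self

    def predict(self, x) -> List[int]:
        predictions = []
        for row in x:
            # Euclidean distance from this row to every stored training row.
            distances = [
                (math.dist(row, train_row), label)
                for train_row, label in zip(self.x_train, self.y_train)
            ]
            distances.sort(key=lambda pair: pair[0])
            nearest = [label for _, label in distances[: self.n_neighbors]]
            # Majority vote among the k nearest labels.
            predictions.append(Counter(nearest).most_common(1)[0][0])
        return predictions


x_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]
knn = TinyKnn(n_neighbors=3).fit(x_train, y_train)
print(knn.predict([[0.2, 0.1], [5.5, 5.2]]))  # prints [0, 1]
```

Looping over one query row at a time keeps memory use proportional to the training set. A fully vectorized version would build a k * m distance matrix at once, which is presumably what the memory warning on fit refers to.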

prediction_report(x, y)#

Parameters:

x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str
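To make the idea of a classification report concrete, here is a dependency-free sketch that computes per-class precision, recall and F1 and formats them as text. The function names `per_class_metrics` and `format_report` are hypothetical, and the layout is only loosely modeled on reports such as the one scikit-learn's classification_report produces. The source does not say which implementation the package uses.

```python
from typing import Dict, Sequence


def per_class_metrics(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Dict[int, Dict[str, float]]:
    """Compute precision, recall and F1 for every label seen in y_true or y_pred."""
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Fall back to 0.0 when a denominator would be zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def format_report(metrics: Dict[int, Dict[str, float]]) -> str:
    """Render the metrics as a fixed-width text table, one row per label."""
    lines = [f"{'label':>5} {'precision':>9} {'recall':>9} {'f1':>9}"]
    for label, m in metrics.items():
        lines.append(
            f"{label:>5} {m['precision']:>9.2f} {m['recall']:>9.2f} {m['f1']:>9.2f}"
        )
    return "\n".join(lines)


y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
report = format_report(per_class_metrics(y_true, y_pred))
print(report)
```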

Logic.core.classification.naive_bayes module#

class NaiveBayes(count_vectorizer, alpha=1)#

Bases: BasicClassifier

fit(x, y)#

Fit the model on the features and the labels by calculating the prior and feature probabilities.

Parameters:

x (np.ndarray) – An m * n matrix, where m is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

Returns:

Returns self as a classifier

Return type:

self
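The prior and feature probabilities mentioned above can be illustrated with a small multinomial naive Bayes sketch. `TinyNaiveBayes` is hypothetical and not the package's NaiveBayes class: the multinomial form, the use of word-count rows as input, and the reading of alpha as additive (Laplace) smoothing are assumptions based only on the constructor signature `NaiveBayes(count_vectorizer, alpha=1)`.

```python
import math
from typing import List, Sequence


class TinyNaiveBayes:
    """Hypothetical multinomial naive Bayes on count features; not the package's code."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # additive smoothing; must be > 0 so every log is defined
        self.classes: List[int] = []
        self.log_prior: List[float] = []
        self.log_likelihood: List[List[float]] = []

    def fit(self, x: Sequence[Sequence[int]], y: Sequence[int]) -> "TinyNaiveBayes":
        n_features = len(x[0])
        self.classes = sorted(set(y))
        self.log_prior, self.log_likelihood = [], []
        for c in self.classes:
            rows = [row for row, label in zip(x, y) if label == c]
            # Prior: share of training docs that belong to class c.
            self.log_prior.append(math.log(len(rows) / len(x)))
            # Smoothed per-feature counts within the class, normalized to probabilities.
            counts = [sum(row[j] for row in rows) + self.alpha for j in range(n_features)]
            total = sum(counts)
            self.log_likelihood.append([math.log(v / total) for v in counts])
        return self

    def predict(self, x: Sequence[Sequence[int]]) -> List[int]:
        predictions = []
        for row in x:
            # Log-posterior up to a constant: log prior + sum of count * log likelihood.
            scores = [
                prior + sum(count * ll for count, ll in zip(row, likelihood))
                for prior, likelihood in zip(self.log_prior, self.log_likelihood)
            ]
            predictions.append(self.classes[scores.index(max(scores))])  # argmax
        return predictions


# Toy vocabulary: columns count the words ["good", "great", "bad", "awful"].
x_train = [[2, 1, 0, 0], [1, 2, 0, 0], [0, 0, 2, 1], [0, 0, 1, 2]]
y_train = [1, 1, 0, 0]
model = TinyNaiveBayes(alpha=1).fit(x_train, y_train)
print(model.predict([[1, 1, 0, 0], [0, 0, 1, 1]]))  # prints [1, 0]
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.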

get_percent_of_positive_reviews(sentences)#

This method has to be overridden here because this class uses a different embedding method.

predict(x)#

Parameters:: x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
Returns:: The predicted class for each doc, i.e. the class with the highest probability (argmax)
Return type:: np.ndarray

prediction_report(x, y)#

Parameters:

x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str

Logic.core.classification.svm module#

class SVMClassifier#

Bases: BasicClassifier

fit(x, y)#

Parameters:

x (np.ndarray) – An m * n matrix, where m is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

predict(x)#

Parameters:: x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
Returns:: The predicted class for each doc, i.e. the class with the highest probability (argmax)
Return type:: np.ndarray

prediction_report(x, y)#

Parameters:

x (np.ndarray) – A k * n matrix, where k is the number of docs and n is the embedding size
y (np.ndarray) – The real class label for each doc

Returns:

Return the classification report

Return type:

str
