Logic.core.classification package#
Logic.core.classification.basic_classifier module#
- class BasicClassifier#
Bases:
object
- fit(x, y)#
- get_percent_of_positive_reviews(sentences)#
Get the percentage of positive reviews in the given sentences.
- Parameters:
sentences (list) – The list of sentences to get the percentage of positive reviews for
- Returns:
The percentage of positive reviews
- Return type:
- predict(x)#
- prediction_report(x, y)#
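As a rough illustration of what get_percent_of_positive_reviews computes, the sketch below derives a percentage from a list of predicted labels. The helper name percent_of_positive, the convention that label 1 means "positive", and the empty-input behavior are all assumptions made for this example; the source does not state them.

```python
def percent_of_positive(predicted_labels, positive_label=1):
    """Return the share of positive predictions as a percentage (0-100).

    Hypothetical helper: the label encoding (1 = positive) is an assumption.
    """
    if not predicted_labels:
        return 0.0  # assumption: empty input yields 0 instead of raising
    positives = sum(1 for label in predicted_labels if label == positive_label)
    return 100.0 * positives / len(predicted_labels)


# Hypothetical output of predict() for four sentences.
labels = [1, 0, 1, 1]
print(percent_of_positive(labels))  # prints 75.0
```

In the real class the labels would presumably come from predict() applied to the embedded sentences.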
Logic.core.classification.data_loader module#
- class ReviewLoader(file_path: str)#
Bases:
object
- get_embeddings()#
Get the embeddings for the reviews using the fasttext model.
- load_data()#
Load the data from the csv file and preprocess the text. Then save the normalized tokens and the sentiment labels. Also, load the fasttext model.
- split_data(test_data_ratio=0.2)#
Split the data into training and testing data.
- Parameters:
test_data_ratio (float) – The ratio of the test data
- Returns:
The training and testing data for the embeddings and the sentiments, in the order x_train, x_test, y_train, y_test
- Return type:
np.ndarray, np.ndarray, np.ndarray, np.ndarray
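The return order of split_data can be illustrated with a minimal, dependency-free sketch. It uses plain lists instead of np.ndarray, and the seed parameter, the shuffling step, and the rounding of the test-set size are assumptions for this example, not documented behavior of ReviewLoader.

```python
import random


def split_data(x, y, test_data_ratio=0.2, seed=0):
    """Shuffle paired samples and split them into train and test parts.

    Illustrative stand-in only; `seed` is a hypothetical parameter.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = int(round(len(indices) * test_data_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [x[i] for i in train_idx]
    x_test = [x[i] for i in test_idx]
    y_train = [y[i] for i in train_idx]
    y_test = [y[i] for i in test_idx]
    # Same order as documented: x_train, x_test, y_train, y_test
    return x_train, x_test, y_train, y_test


embeddings = [[float(i), float(i)] for i in range(10)]  # toy "embeddings"
sentiments = list(range(10))                            # toy labels
x_train, x_test, y_train, y_test = split_data(embeddings, sentiments)
print(len(x_train), len(x_test))  # prints 8 2
```

Because the same shuffled indices are used for both x and y, each embedding stays paired with its label after the split.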
Logic.core.classification.deep module#
- class DeepModelClassifier(in_features, num_classes, batch_size, num_epochs=50)#
Bases:
BasicClassifier
- fit(x, y)#
Fit the model on the given train_loader and test_loader for num_epochs epochs. You have to call set_test_dataloader before calling the fit function.
- Parameters:
x (np.ndarray) – The training embeddings
y (np.ndarray) – The training labels
- Return type:
self
- predict(x)#
Predict the labels on the given test_loader.
- Parameters:
x (np.ndarray) – The test embeddings
- Returns:
predicted_labels – The predicted labels
- Return type:
- prediction_report(x, y)#
Get the classification report on the given test set.
- Parameters:
x (np.ndarray) – The test embeddings
y (np.ndarray) – The test labels
- Returns:
The classification report
- Return type:
- set_test_dataloader(X_test, y_test)#
Set the test dataloader. This is used to evaluate the model on the test set while training.
- Parameters:
X_test (np.ndarray) – The test embeddings
y_test (np.ndarray) – The test labels
- Returns:
Returns self
- Return type:
self
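Since both set_test_dataloader and fit are documented as returning self, the calls can be chained in the required order. The sketch below shows that pattern with a hypothetical stand-in class (ChainableClassifier); it is not the real DeepModelClassifier, it trains nothing, and raising RuntimeError when fit is called first is an assumption about how such a contract could be enforced.

```python
class ChainableClassifier:
    """Hypothetical stand-in that mimics only the call-order contract."""

    def __init__(self):
        self.test_data = None
        self.is_fitted = False

    def set_test_dataloader(self, X_test, y_test):
        # Store the evaluation data and return self so calls can be chained.
        self.test_data = (X_test, y_test)
        return self

    def fit(self, x, y):
        if self.test_data is None:
            # Assumption: the real class may fail differently, or not at all.
            raise RuntimeError("call set_test_dataloader before fit")
        self.is_fitted = True  # a real model would run its training loop here
        return self


# Test data first, then training data, in one chained expression.
clf = ChainableClassifier().set_test_dataloader([[0.5]], [0]).fit(
    [[0.1], [0.9]], [0, 1]
)
print(clf.is_fitted)  # prints True
```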
- class ReviewDataSet(*args: Any, **kwargs: Any)#
Bases:
Dataset
Logic.core.classification.knn module#
- class KnnClassifier(n_neighbors)#
Bases:
BasicClassifier
- fit(x, y)#
Fit the model using x as training data and y as target values. Use the Euclidean distance to find the k nearest neighbors. Warning: you may need to reduce the size of x to avoid memory errors.
- Parameters:
x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size
y (np.ndarray) – The real class label for each doc
- Returns:
Returns self as a classifier
- Return type:
self
- predict(x)#
- Parameters:
x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size
- Returns:
Return the predicted class for each doc with the highest probability (argmax)
- Return type:
np.ndarray
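The k-nearest-neighbors idea described above (Euclidean distance, then the most frequent class among the nearest neighbors) can be sketched in a few lines. This is a minimal illustration with plain lists and a hypothetical knn_predict function, not the implementation behind KnnClassifier; majority voting is assumed as the way the "highest probability" class is chosen.

```python
import math
from collections import Counter


def knn_predict(x_train, y_train, x_query, n_neighbors=3):
    """Predict a label for each query row by majority vote of its neighbors."""
    predictions = []
    for query in x_query:
        # Euclidean distance from the query to every training row.
        distances = sorted(
            ((math.dist(query, row), label) for row, label in zip(x_train, y_train)),
            key=lambda pair: pair[0],
        )
        nearest_labels = [label for _, label in distances[:n_neighbors]]
        # The most common label among the k nearest neighbors wins.
        predictions.append(Counter(nearest_labels).most_common(1)[0][0])
    return predictions


x_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]
print(knn_predict(x_train, y_train, [[0.2, 0.1], [5.5, 5.5]]))  # prints [0, 1]
```

Every query is compared against every training row, which is why the fit docstring warns about memory and why reducing the size of the training set can help.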
Logic.core.classification.naive_bayes module#
- class NaiveBayes(count_vectorizer, alpha=1)#
Bases:
BasicClassifier
- fit(x, y)#
Fit the features and the labels. Calculate prior and feature probabilities.
- Parameters:
x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size
y (np.ndarray) – The real class label for each doc
- Returns:
Returns self as a classifier
- Return type:
self
- get_percent_of_positive_reviews(sentences)#
You have to override this method because we are using a different embedding method in this class.
- predict(x)#
- Parameters:
x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size
- Returns:
Return the predicted class for each doc with the highest probability (argmax)
- Return type:
np.ndarray
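To make "prior and feature probabilities" and the argmax prediction concrete, here is a minimal sketch of a multinomial Naive Bayes over count features with additive smoothing. The multinomial formulation, the function names fit_naive_bayes and predict_naive_bayes, and the use of log probabilities are assumptions for this example; the source only confirms a count_vectorizer and an alpha parameter.

```python
import math


def fit_naive_bayes(x, y, alpha=1):
    """Estimate log priors and smoothed log feature probabilities per class."""
    n_features = len(x[0])
    log_prior, log_likelihood = {}, {}
    for c in sorted(set(y)):
        rows = [row for row, label in zip(x, y) if label == c]
        log_prior[c] = math.log(len(rows) / len(x))
        # Total count of each feature (term) within class c.
        totals = [sum(row[j] for row in rows) for j in range(n_features)]
        denom = sum(totals) + alpha * n_features  # additive smoothing
        log_likelihood[c] = [math.log((t + alpha) / denom) for t in totals]
    return log_prior, log_likelihood


def predict_naive_bayes(x, log_prior, log_likelihood):
    """Return, for each row, the class with the highest log posterior score."""
    predictions = []
    for row in x:
        scores = {
            c: log_prior[c]
            + sum(count * ll for count, ll in zip(row, log_likelihood[c]))
            for c in log_prior
        }
        predictions.append(max(scores, key=scores.get))  # argmax over classes
    return predictions


# Toy term counts: feature 0 is typical of class 1, feature 1 of class 0.
x_train = [[2, 0], [3, 0], [0, 2], [0, 3]]
y_train = [1, 1, 0, 0]
priors, likelihoods = fit_naive_bayes(x_train, y_train, alpha=1)
print(predict_naive_bayes([[4, 0], [0, 1]], priors, likelihoods))  # prints [1, 0]
```

With alpha greater than zero, a term never seen in a class still gets a small nonzero probability, so a single unseen word cannot drive a class score to minus infinity.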
Logic.core.classification.svm module#
- class SVMClassifier#
Bases:
BasicClassifier
- fit(x, y)#
- Parameters:
x (np.ndarray) – An m * n matrix - m is count of docs and n is embedding size
y (np.ndarray) – The real class label for each doc
- predict(x)#
- Parameters:
x (np.ndarray) – A k * n matrix - k is count of docs and n is embedding size
- Returns:
Return the predicted class for each doc with the highest probability (argmax)
- Return type:
np.ndarray