Logic.core package

Subpackages

class SearchEngine

Bases: object

aggregate_scores(weights, scores, final_scores)

Aggregates the scores of the fields.

Parameters:
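As a rough illustration of what field-level aggregation could look like, the sketch below assumes that `scores` maps each field name to a `{doc_id: score}` dictionary and that `final_scores` is filled in place with a weighted sum. Both the data layout and the weighted-sum rule are assumptions, not taken from the package, and `aggregate_scores_sketch` is a hypothetical stand-alone function rather than the method itself.

```python
def aggregate_scores_sketch(weights, scores, final_scores):
    """Accumulate weighted per-field scores into final_scores (in place).

    Assumed layout (hypothetical): scores = {field: {doc_id: score}}.
    """
    for field, field_scores in scores.items():
        weight = weights.get(field, 0)
        if weight == 0:
            continue  # fields without weight contribute nothing
        for doc_id, score in field_scores.items():
            final_scores[doc_id] = final_scores.get(doc_id, 0.0) + weight * score
    return final_scores


weights = {"stars": 1.0, "genres": 2.0, "summaries": 0.0}
scores = {
    "stars": {"d1": 0.5, "d2": 1.0},
    "genres": {"d1": 0.25},
    "summaries": {"d2": 9.0},  # ignored because its weight is 0
}
final = aggregate_scores_sketch(weights, scores, {})
print(final)
```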

find_scores_with_safe_ranking(query, method, weights, scores)

Finds the scores of the documents using the safe ranking method.

Parameters:

query (List[str]) – The query to be scored.
method (str ((n|l)(n|t)(n|c).(n|l)(n|t)(n|c)) | OkapiBM25) – The method to use for searching.
weights (dict) – The weights of the fields.
scores (dict) – The scores of the documents.
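The `method` string follows the pattern `(n|l)(n|t)(n|c).(n|l)(n|t)(n|c)` or is the literal `OkapiBM25`. The sketch below shows one way such a string could be validated and split. In standard SMART notation the first triple describes document weighting and the second describes query weighting; whether this package uses the same order is an assumption, and `parse_method` is a hypothetical helper, not part of the package.

```python
import re

# Pattern copied from the documented form (n|l)(n|t)(n|c).(n|l)(n|t)(n|c).
SMART_PATTERN = re.compile(r"^([nl])([nt])([nc])\.([nl])([nt])([nc])$")


def parse_method(method):
    """Split a method string into its parts, or raise ValueError."""
    if method == "OkapiBM25":
        return {"kind": "OkapiBM25"}
    match = SMART_PATTERN.match(method)
    if match is None:
        raise ValueError(f"Unsupported method: {method!r}")
    letters = match.groups()
    return {
        "kind": "vector_space",
        # Assumed order: document scheme first, query scheme second.
        "document": "".join(letters[:3]),
        "query": "".join(letters[3:]),
    }


parsed = parse_method("lnc.ltc")
print(parsed)
```

Rejecting malformed strings up front keeps a typo such as `lnx.ltc` from silently falling through to a default scoring scheme.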

find_scores_with_unigram_model(query, smoothing_method, weights, scores, alpha=0.5, lamda=0.5)

Calculates the scores for each document based on the unigram model.

Parameters:

query (str) – The query to search for.
smoothing_method (str (bayes | naive | mixture)) – The method used for smoothing the probabilities in the unigram model.
weights (dict) – A dictionary mapping each field (e.g., ‘stars’, ‘genres’, ‘summaries’) to its weight in the final score. Fields with a weight of 0 are ignored.
scores (dict) – The scores of the documents.
alpha (float, optional) – The parameter used in the Bayesian smoothing method. Defaults to 0.5.
lamda (float, optional) – The parameter used in some smoothing methods to balance between the document probability and the collection probability. Defaults to 0.5.
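To make the roles of `alpha` and `lamda` concrete, here is a minimal sketch of how a single term probability could be smoothed. The formulas are textbook language-model smoothing (Dirichlet-style for `bayes`, linear interpolation for `mixture`) and are assumptions about what the package does. Treating `naive` as the unsmoothed maximum-likelihood estimate is also an assumption. `smoothed_probability` and its count arguments are hypothetical names.

```python
def smoothed_probability(tf, doc_len, cf, collection_len,
                         smoothing_method, alpha=0.5, lamda=0.5):
    """Estimate P(term | document) under one of three smoothing schemes.

    tf / doc_len: term count and length of the document (or field).
    cf / collection_len: term count and total length of the collection.
    """
    p_doc = tf / doc_len if doc_len else 0.0
    p_coll = cf / collection_len if collection_len else 0.0

    if smoothing_method == "naive":
        # Assumed: plain maximum-likelihood estimate, no smoothing.
        return p_doc
    if smoothing_method == "bayes":
        # Assumed: Bayesian (Dirichlet-style) smoothing with prior weight alpha.
        return (tf + alpha * p_coll) / (doc_len + alpha)
    if smoothing_method == "mixture":
        # Assumed: lamda balances document and collection probabilities.
        return lamda * p_doc + (1 - lamda) * p_coll
    raise ValueError(f"Unknown smoothing method: {smoothing_method!r}")


# A term seen 2 times in a 4-token document and 10 times in a 100-token collection.
p_naive = smoothed_probability(2, 4, 10, 100, "naive")
p_bayes = smoothed_probability(2, 4, 10, 100, "bayes", alpha=0.5)
p_mixture = smoothed_probability(2, 4, 10, 100, "mixture", lamda=0.5)
```

Under these assumptions, a document's query score would typically combine the per-term probabilities (for example as a sum of logarithms), which is why a smoothed estimate matters for terms the document does not contain.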

find_scores_with_unsafe_ranking(query, method, weights, max_results, scores)

Finds the scores of the documents with the unsafe ranking method, which uses the tiered index.

Parameters:

query (List[str]) – The query to be scored.
method (str ((n|l)(n|t)(n|c).(n|l)(n|t)(n|c)) | OkapiBM25) – The method to use for searching.
weights (dict) – The weights of the fields.
max_results (int) – The maximum number of results to return.
scores (dict) – The scores of the documents.
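The idea behind tiered retrieval can be sketched as follows: consult the best tier first and fall through to lower tiers only while fewer than `max_results` candidates have been found. The tier layout (a list of `{term: [doc_id, ...]}` dictionaries, best tier first) and the helper name `collect_candidates` are assumptions for illustration. The package's actual index structure is not described here.

```python
def collect_candidates(tiers, query_terms, max_results):
    """Gather candidate documents tier by tier, stopping once enough are found."""
    candidates = []
    seen = set()
    for tier in tiers:  # assumed order: highest-quality tier first
        for term in query_terms:
            for doc_id in tier.get(term, []):
                if doc_id not in seen:
                    seen.add(doc_id)
                    candidates.append(doc_id)
        if len(candidates) >= max_results:
            break  # lower tiers are never consulted
    return candidates


tiers = [
    {"spider": ["d1"], "man": ["d1", "d2"]},
    {"spider": ["d3"], "man": ["d4"]},
    {"spider": ["d5"]},
]
found = collect_candidates(tiers, ["spider", "man"], max_results=3)
print(found)
```

Because lower tiers may be skipped, a document that would score well can be missed, which is the sense in which this style of ranking is "unsafe".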

merge_scores(scores1, scores2)

Merges two dictionaries of scores.

Parameters:

Returns:

The merged dictionary of scores.

Return type:

dict
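One plausible reading of merging two score dictionaries is to add the scores of documents that appear in both and keep the rest unchanged. The sketch below assumes exactly that. The summing rule is an assumption (the package could, for instance, take the maximum instead), and `merge_scores_sketch` is a hypothetical stand-alone function.

```python
def merge_scores_sketch(scores1, scores2):
    """Return a new dict with the scores of shared document IDs added together."""
    merged = dict(scores1)  # copy so neither input is modified
    for doc_id, score in scores2.items():
        merged[doc_id] = merged.get(doc_id, 0.0) + score
    return merged


merged = merge_scores_sketch({"d1": 1.0, "d2": 0.5}, {"d2": 0.25, "d3": 2.0})
print(merged)
```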

search(query, method, weights, safe_ranking=True, max_results=10, smoothing_method=None, alpha=0.5, lamda=0.5)

Searches for the query in the indexes.

Parameters:

query (str) – The query to search for.
method (str ((n|l)(n|t)(n|c).(n|l)(n|t)(n|c)) | OkapiBM25 | Unigram) – The method to use for searching.
weights (dict) – The weights of the fields.
safe_ranking (bool) – If True, the search engine searches the whole index and then ranks the results. If False, it searches the tiered index.
max_results (int) – The maximum number of results to return. If None, all results are returned.
smoothing_method (str (bayes | naive | mixture)) – The method used for smoothing the probabilities in the unigram model.
alpha (float, optional) – The parameter used in the Bayesian smoothing method. Defaults to 0.5.
lamda (float, optional) – The parameter used in some smoothing methods to balance between the document probability and the collection probability. Defaults to 0.5.

Returns:

A list of (document ID, score) tuples, sorted by score.

Return type:

list
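The documented return shape, a score-sorted list of tuples truncated to `max_results`, could be produced from a `{doc_id: score}` dictionary roughly as follows. Sorting from highest to lowest score is an assumption, since the direction is not stated, and `top_results` is a hypothetical helper rather than part of the package.

```python
def top_results(scores, max_results=10):
    """Turn {doc_id: score} into a list of (doc_id, score) tuples, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    # None means "no limit", matching the documented max_results behaviour.
    return ranked if max_results is None else ranked[:max_results]


results = top_results({"d1": 0.2, "d2": 0.9, "d3": 0.5}, max_results=2)
print(results)
```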