memoryscope.core.utils

class memoryscope.core.utils.DatetimeHandler(dt: datetime | str | int | float | None = None)[source]

Bases: object

Handles operations related to datetime such as parsing, extraction, and formatting, with support for both Chinese and English contexts including weekday names and specialized text parsing for date components.

__init__(dt: datetime | str | int | float | None = None)[source]

Initialize the DatetimeHandler instance with a datetime object, string, integer, or float representation of a timestamp. If no argument is provided, the current time is used.

Parameters: dt (datetime.datetime | str | int | float, optional) – The datetime to be handled. Can be a datetime object, a timestamp string, or a numeric timestamp. Defaults to None, which sets the instance to the current datetime.

self._dt

The internal datetime representation of the input.

Type: datetime.datetime

self._dt_info_dict

A dictionary containing parsed datetime information, defaults to None.

Type: dict | None
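
The accepted input types can be illustrated with a minimal sketch. The helper name to_datetime is hypothetical, and treating numbers as POSIX seconds and strings as ISO 8601 is an assumption rather than the library's confirmed behavior.

```python
from datetime import datetime


def to_datetime(dt=None):
    """Hypothetical helper: coerce the documented input types to a datetime."""
    if dt is None:
        # No argument: fall back to the current time, as the docstring states.
        return datetime.now()
    if isinstance(dt, datetime):
        return dt
    if isinstance(dt, (int, float)):
        # Assumption: numeric input is a POSIX timestamp in seconds.
        return datetime.fromtimestamp(dt)
    if isinstance(dt, str):
        # Assumption: string input is ISO 8601; the library may accept more formats.
        return datetime.fromisoformat(dt)
    raise TypeError(f"Unsupported type: {type(dt).__name__}")


parsed = to_datetime("2024-05-03T08:30:00")
```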

get_dt_info_dict(language: LanguageEnum)[source]

Returns the dictionary of parsed datetime information for the given language. If the cached dictionary is None, it is first initialized using _parse_dt_info.

Returns: A dictionary with parsed datetime information.
Return type: dict

classmethod extract_date_parts_cn(input_string: str) → dict[source]

Extracts various components of a date (year, month, day, etc.) from an input string based on Chinese formats.

This method identifies year, month, day, weekday, and hour components within the input string based on predefined patterns. It supports relative terms like ‘每’ (every) and translates weekday names into numeric representations.

Parameters:

input_string (str) – The Chinese text containing date and time information.

Returns:

A dictionary with keys ‘year’, ‘month’, ‘day’, ‘weekday’, and ‘hour’, each holding the corresponding extracted value. If a component is not found, it is not included in the dictionary. For relative terms like ‘每’ (every), the value is set to -1.

Return type:

dict
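
The extraction rules above can be sketched as follows. The regular expressions, the weekday table (Monday = 1), and the function name are assumptions chosen for illustration; the library's own patterns may differ.

```python
import re

# Assumed weekday numbering: Monday = 1 ... Sunday = 7.
WEEKDAY_CN = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5, "六": 6, "日": 7, "天": 7}

# Assumed patterns: a number or the relative term 每 followed by a unit character.
PATTERNS_CN = {
    "year": r"(\d+|每)年",
    "month": r"(\d+|每)月",
    "day": r"(\d+|每)[日号]",
    "weekday": r"(?:周|星期)([一二三四五六日天])",
    "hour": r"(\d+|每)[点时]",
}


def extract_date_parts_cn_sketch(input_string):
    parts = {}
    for key, pattern in PATTERNS_CN.items():
        match = re.search(pattern, input_string)
        if match is None:
            continue  # components that are not found are left out
        value = match.group(1)
        if value == "每":
            parts[key] = -1  # relative term "every"
        elif key == "weekday":
            parts[key] = WEEKDAY_CN[value]
        else:
            parts[key] = int(value)
    return parts
```

With these assumed patterns, "每年5月" yields a year of -1 and a month of 5, and a string without date words yields an empty dictionary.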

classmethod extract_date_parts_en(input_string: str) → dict[source]

Extracts various components of a date (year, month, day, etc.) from an input string based on English formats.

This method employs regex patterns to identify and parse different date and time elements within the provided text. It supports extraction of year, month name, day, 12-hour and 24-hour time formats, and weekdays.

Parameters: input_string (str) – The English text containing date and time information.
Returns: A dictionary containing the extracted date parts with default values of -1 where components are not found. Keys include ‘year’, ‘month’, ‘day’, ‘hour’, ‘minute’, ‘second’, and ‘weekday’.
Return type: dict
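
A minimal sketch of this kind of regex-based extraction is shown below. The patterns, the Monday = 1 weekday numbering, and all helper names are assumptions for illustration only.

```python
import re

MONTHS = ["january", "february", "march", "april", "may", "june",
          "july", "august", "september", "october", "november", "december"]
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def _lookup(names, text):
    """Return the 1-based index of the first listed name found in text, else -1."""
    for index, name in enumerate(names, start=1):
        if re.search(rf"\b{name}\b", text):
            return index
    return -1


def extract_date_parts_en_sketch(input_string):
    text = input_string.lower()
    keys = ("year", "month", "day", "hour", "minute", "second", "weekday")
    parts = dict.fromkeys(keys, -1)  # -1 marks a component that was not found

    year = re.search(r"\b(\d{4})\b", text)
    if year:
        parts["year"] = int(year.group(1))
    parts["month"] = _lookup(MONTHS, text)
    parts["weekday"] = _lookup(WEEKDAYS, text)

    day = re.search(r"\b(\d{1,2})(?:st|nd|rd|th)\b", text)
    if day:
        parts["day"] = int(day.group(1))

    # Matches 24-hour times (14:05:09) and 12-hour times with am/pm (5:30 pm).
    clock = re.search(r"\b(\d{1,2}):(\d{2})(?::(\d{2}))?\s*(am|pm)?", text)
    if clock:
        hour = int(clock.group(1))
        if clock.group(4) == "pm" and hour < 12:
            hour += 12
        elif clock.group(4) == "am" and hour == 12:
            hour = 0
        parts["hour"] = hour
        parts["minute"] = int(clock.group(2))
        if clock.group(3) is not None:
            parts["second"] = int(clock.group(3))
    return parts
```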

classmethod extract_date_parts(input_string: str, language: LanguageEnum) → dict[source]

Extracts various date components from the input string using the parser for the given language.

This method dynamically selects a language-specific function to parse the input string and extract date parts such as year, month, and day. If no function exists for the given language, a warning is logged and an empty dictionary is returned.

Parameters:

input_string (str) – The string containing date information to be parsed.
language (LanguageEnum) – The language that selects the parsing function.

Returns:

A dictionary containing extracted date components, or an empty dictionary if parsing fails.

Return type:

dict
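
The dynamic selection described above could look like the sketch below. The class name, the use of plain strings instead of LanguageEnum, and the name-based lookup via getattr are assumptions.

```python
import logging

logger = logging.getLogger(__name__)


class DatePartsDispatcher:
    """Hypothetical stand-in showing name-based dispatch to a language parser."""

    @classmethod
    def extract_date_parts_en(cls, input_string):
        # Placeholder parser; a real one would return extracted components.
        return {"source": "en", "text": input_string}

    @classmethod
    def extract_date_parts(cls, input_string, language):
        func = getattr(cls, f"extract_date_parts_{language}", None)
        if func is None:
            # No parser for this language: warn and return an empty dictionary.
            logger.warning("No date parser for language %r", language)
            return {}
        return func(input_string)
```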

classmethod has_time_word_cn(query: str, datetime_word_list: List[str]) → bool[source]

Checks whether the input query contains any datetime-related words in the Chinese (cn) language context.

Parameters:

query (str) – The input string to check for datetime-related words.
datetime_word_list (list[str]) – datetime keywords

Returns:

True if the query contains at least one datetime-related word, False otherwise.

Return type:

bool

classmethod has_time_word_en(query: str, datetime_word_list: List[str]) → bool[source]

Checks whether the input query contains any datetime-related words in the English (en) language context.

Parameters:

query (str) – The input string to check for datetime-related words.
datetime_word_list (list[str]) – datetime keywords

Returns:

True if the query contains at least one datetime-related word, False otherwise.

Return type:

bool
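
The two language-specific checks can be sketched as follows. Substring matching for Chinese and whole-word matching for English are assumptions about a reasonable approach, not the confirmed implementation.

```python
import re


def has_time_word_cn_sketch(query, datetime_word_list):
    # Chinese text has no word separators, so a substring test is used here.
    return any(word in query for word in datetime_word_list)


def has_time_word_en_sketch(query, datetime_word_list):
    # Tokenize on letters so that "day" does not match inside "daylight".
    tokens = set(re.findall(r"[a-z]+", query.lower()))
    return any(word.lower() in tokens for word in datetime_word_list)
```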

classmethod has_time_word(query: str, language: LanguageEnum) → bool[source]

datetime_format(dt_format: str = '%Y%m%d') → str[source]

Format the stored datetime object into a string based on the provided format.

Parameters: dt_format (str, optional) – The datetime format string. Defaults to “%Y%m%d”.
Returns: A formatted datetime string.
Return type: str
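
The default format string uses strftime directives. Assuming the method delegates to datetime.strftime, the output for a sample date would look like this:

```python
from datetime import datetime

dt = datetime(2024, 5, 3, 8, 30)

default_style = dt.strftime("%Y%m%d")         # "20240503"
custom_style = dt.strftime("%Y-%m-%d %H:%M")  # "2024-05-03 08:30"
```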

string_format(string_format: str, language: LanguageEnum) → str[source]

Format the datetime information stored in the instance using a custom string format.

Parameters:

string_format (str) – A format string where placeholders are keys from dt_info_dict.
language (LanguageEnum) – The current language.

Returns:

A formatted datetime string.

Return type:

str
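
As a rough illustration of placeholder substitution, the sketch below fills a template from a dictionary. The keys shown are hypothetical, and the use of str.format-style placeholders is an assumption; the actual keys come from dt_info_dict.

```python
# Hypothetical contents; the real keys are produced by the handler.
dt_info_dict = {"year": 2024, "month": 5, "day": 3, "weekday": "Friday"}

template = "{weekday}, {year}-{month:02d}-{day:02d}"
formatted = template.format(**dt_info_dict)
print(formatted)  # Friday, 2024-05-03
```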

property timestamp: int

Get the timestamp representation of the stored datetime.

Returns: A timestamp value.
Return type: int

class memoryscope.core.utils.Logger(name: str, level: int = 20, format_style: str = '%(asctime)s %(levelname)s [%(module)s:%(lineno)d] %(message)s', date_format_style: str = '%Y-%m-%d %H:%M:%S', to_stream: bool = False, to_file: bool = True, file_mode: str = 'w', file_type: str = 'log', dir_path: str = 'log', max_bytes: int = 1073741824, backup_count: int = 10)[source]

Bases: logging.Logger

The Logger class handles informational and error messages emitted at runtime.

__init__(name: str, level: int = 20, format_style: str = '%(asctime)s %(levelname)s [%(module)s:%(lineno)d] %(message)s', date_format_style: str = '%Y-%m-%d %H:%M:%S', to_stream: bool = False, to_file: bool = True, file_mode: str = 'w', file_type: str = 'log', dir_path: str = 'log', max_bytes: int = 1073741824, backup_count: int = 10)[source]

Initializes the Logger instance, setting up handlers for console and file logging based on provided parameters.

Parameters:

name (str) – Identifier for the logger.
level (int, optional) – Logging level. Defaults to logging.INFO.
format_style (str, optional) – Log message format. Defaults to LOG_FORMAT constant.
date_format_style (str, optional) – Date format for logs. Defaults to DATE_FORMAT constant.
to_stream (bool, optional) – Enables console logging. Defaults to False.
to_file (bool, optional) – Enables file logging. Defaults to True.
file_mode (str, optional) – File open mode. Defaults to ‘w’.
file_type (str, optional) – Log file extension type. Defaults to ‘log’.
dir_path (str, optional) – Directory for log files. Defaults to ‘log’.
max_bytes (int, optional) – Maximum log file size in bytes before rotation. Defaults to 1073741824 (1 GiB).
backup_count (int, optional) – Number of rotated log files to retain. Defaults to 10.
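
The parameters above map naturally onto the standard library's handlers. The following sketch assumes a RotatingFileHandler is used for file output; build_logger and the log file naming are hypothetical.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

LOG_FORMAT = "%(asctime)s %(levelname)s [%(module)s:%(lineno)d] %(message)s"
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"


def build_logger(name, dir_path="log", level=logging.INFO, to_stream=False,
                 to_file=True, file_type="log", max_bytes=1024 ** 3,
                 backup_count=10):
    """Hypothetical helper wiring console and rotating file handlers."""
    logger = logging.Logger(name, level)
    formatter = logging.Formatter(LOG_FORMAT, DATE_FORMAT)
    if to_stream:
        stream_handler = logging.StreamHandler()
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    if to_file:
        os.makedirs(dir_path, exist_ok=True)
        file_handler = RotatingFileHandler(
            os.path.join(dir_path, f"{name}.{file_type}"),
            maxBytes=max_bytes, backupCount=backup_count, encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


# Usage: write one record into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    demo = build_logger("sketch", dir_path=tmp)
    demo.info("hello")
    for handler in demo.handlers:
        handler.close()
    with open(os.path.join(tmp, "sketch.log"), encoding="utf-8") as f:
        logged = f.read()
```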

log_dictionary_info(dictionary, title='')[source]

format_current_context(context, title='')[source]

wrap_in_box(context)[source]

format_chat_message(message)[source]

format_rank_message(model_response)[source]

close()[source]

Closes all handlers associated with this logger instance.

This method iterates over the handlers attached to the logger and calls their close method to ensure that any system resources used by the handlers are freed properly.

clear()[source]: Clears all handlers from the logger.

set_trace_id(trace_id: str)[source]

Sets the trace ID for the logger. If the provided trace ID is longer than 8 characters, it will be truncated to the first 8 characters.

Parameters: trace_id (str) – The trace identifier to be associated with the logs.

makeRecord(name, level, fn, lno, msg, args, exc_info, func=None, extra=None, sinfo=None)[source]

Creates a log record with additional trace_id included in the extra information.

This method extends the default behavior of creating a log record by adding a trace_id from the logger instance to the record’s extra data, allowing for traceability within logged data.

Parameters:

name (str) – The name of the logger.
level (int) – The logging level of the record.
fn (str) – The path of the source file containing the logging call.
lno (int) – The line number at which the logging call was made.
msg (str) – The logged message, before formatting.
args (tuple) – The arguments to the log message.
exc_info (tuple) – Exception information or None.
func (str) – The name of the function where the logging call was made. Defaults to None.
extra (dict) – Additional information for the log record. Defaults to None.
sinfo (str) – Stack trace information or None.

Returns:

The created log record with potentially enriched ‘extra’ field.

Return type:

logging.LogRecord
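
The trace ID truncation and the makeRecord extension can be sketched together. TraceLogger is a hypothetical stand-in built on logging.Logger; storing the ID in an attribute named trace_id is an assumption.

```python
import logging


class TraceLogger(logging.Logger):
    """Hypothetical logger that stamps every record with a trace ID."""

    def __init__(self, name, level=logging.INFO):
        super().__init__(name, level)
        self.trace_id = ""

    def set_trace_id(self, trace_id):
        # Keep at most the first 8 characters.
        self.trace_id = trace_id[:8]

    def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
                   func=None, extra=None, sinfo=None):
        # Copy the caller's extra data and add the trace ID before delegating.
        extra = dict(extra or {})
        extra["trace_id"] = self.trace_id
        return super().makeRecord(name, level, fn, lno, msg, args, exc_info,
                                  func, extra, sinfo)


trace_logger = TraceLogger("demo")
trace_logger.set_trace_id("0123456789abcdef")
record = trace_logger.makeRecord("demo", logging.INFO, "file.py", 1,
                                 "message", (), None)
```

Because the value is placed in extra, it becomes an attribute of the record, so a format string could reference it as %(trace_id)s.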

classmethod get_logger(name: str | None = None, **kwargs)[source]

Retrieves or creates a logger instance with the specified name and configurations.

If no name is provided, it defaults to the first registered logger’s name or ‘default’ if none exist. This method ensures that only one logger instance exists per name by reusing existing instances stored in LOGGER_DICT.

Parameters:

name (str, optional) – The name of the logger. Defaults to None, which triggers auto-naming logic.
**kwargs – Additional keyword arguments to configure the logger.

Returns:

The requested or newly created logger instance.

Return type:

Logger
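
The one-instance-per-name behavior can be sketched with a module-level dictionary. MiniLogger is a hypothetical stand-in; only the LOGGER_DICT name comes from the description above.

```python
LOGGER_DICT = {}


class MiniLogger:
    """Hypothetical class illustrating the per-name instance cache."""

    def __init__(self, name, **kwargs):
        self.name = name
        self.kwargs = kwargs

    @classmethod
    def get_logger(cls, name=None, **kwargs):
        if name is None:
            # First registered name, or "default" when nothing is registered.
            name = next(iter(LOGGER_DICT), "default")
        if name not in LOGGER_DICT:
            LOGGER_DICT[name] = cls(name, **kwargs)
        return LOGGER_DICT[name]
```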

class memoryscope.core.utils.PromptHandler(class_path: str, language: LanguageEnum | str, prompt_file: str = '', prompt_dict: dict | None = None, **kwargs)[source]

Bases: object

The PromptHandler class manages prompt messages by loading them from YAML or JSON files and dictionaries, supporting language selection based on a context, and providing dictionary-like access to the prompt messages.

__init__(class_path: str, language: LanguageEnum | str, prompt_file: str = '', prompt_dict: dict | None = None, **kwargs)[source]

Initializes the PromptHandler with paths to prompt sources and additional keyword arguments.

Parameters:

class_path (str) – The path to the class where prompts are utilized.
prompt_file (str, optional) – The path to an external file containing prompts. Defaults to “”.
prompt_dict (dict, optional) – A dictionary directly containing prompt definitions. Defaults to None.
language (LanguageEnum | str) – The language context used to select prompts.
**kwargs – Additional keyword arguments that might be used in prompt handling.

static file_path_completion(file_path: str, raise_exception: bool = True) → str[source]

Attempts to complete the given file path by appending either a .yaml or .json extension, depending on which file exists. If neither exists and raise_exception is True, a RuntimeError is raised.

Parameters:

file_path (str) – The base path of the file to be completed.
raise_exception (bool) – Whether to raise an error if the file cannot be found. Defaults to True.

Returns:

The completed file path with the appropriate extension.

Return type:

str

Raises:

RuntimeError – If neither the .yaml nor .json file exists at the given path.
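
A minimal sketch of the extension lookup follows. The order in which extensions are tried and the return value when raise_exception is False are assumptions, since neither is specified above.

```python
import os
import tempfile


def file_path_completion_sketch(file_path, raise_exception=True):
    # Assumption: .yaml is tried before .json.
    for extension in (".yaml", ".json"):
        candidate = file_path + extension
        if os.path.exists(candidate):
            return candidate
    if raise_exception:
        raise RuntimeError(
            f"Neither {file_path}.yaml nor {file_path}.json exists.")
    # Assumption: without an exception, the path is returned unchanged.
    return file_path


# Usage with a temporary directory containing only prompts.json.
with tempfile.TemporaryDirectory() as tmp:
    base = os.path.join(tmp, "prompts")
    open(base + ".json", "w").close()
    completed = file_path_completion_sketch(base)
    missing = file_path_completion_sketch(os.path.join(tmp, "absent"),
                                          raise_exception=False)
```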

add_prompt_file(file_path: str, raise_exception: bool = True)[source]

Adds prompt messages from a YAML or JSON file to the internal dictionary.

This method supports loading prompts from files ending with ‘.yaml’ or ‘.json’. It uses the respective libraries to parse the content and merge it into the current prompt dictionary.

Parameters:

file_path (str) – The path to the YAML or JSON file containing the prompts.
raise_exception (bool) – Whether to raise an error if the file cannot be found. Defaults to True.

add_prompt_dict(prompt_dict: dict)[source]

Adds prompt messages from a dictionary, ensuring each message has a valid entry for the current language.

Parameters: prompt_dict (dict) – A dictionary where keys represent prompt identifiers and values are nested dictionaries containing language-specific prompt messages.
Raises: RuntimeError – If a prompt message for the current language is not found.
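
The language selection step can be sketched as below. The function name, the explicit store argument, and the use of plain strings as language keys are assumptions for illustration.

```python
def add_prompt_dict_sketch(store, prompt_dict, language):
    """Hypothetical helper: keep only the message for the given language."""
    for key, messages in prompt_dict.items():
        if language not in messages:
            raise RuntimeError(
                f"Prompt {key!r} has no message for language {language!r}.")
        store[key] = messages[language]


prompts = {}
add_prompt_dict_sketch(
    prompts,
    {"greeting": {"en": "Hello", "cn": "你好"}},
    language="en",
)
```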

property prompt_dict: dict

Retrieves the internal dictionary containing all prompt messages.

Returns: The dictionary of prompt messages with keys as identifiers and values as prompt strings.
Return type: dict

class memoryscope.core.utils.Registry(name: str)[source]

Bases: object

A registry to manage and instantiate various modules by their names, ensuring the uniqueness of registered entries. It supports both individual and bulk registration of modules, as well as retrieval of modules by name.

name

The name of the registry.

Type: str

module_dict

A dictionary holding registered modules where keys are module names and values are the modules themselves.

Type: Dict[str, Any]

__init__(name: str)[source]

Initializes the Registry with a given name.

Parameters: name (str) – The name to identify this registry.

register(module_name: str | None = None, module: Any | None = None)[source]

Registers a single module in the registry.

Parameters:

module_name (str, optional) – The name of the module to be registered.
module (Any, optional) – The module to be registered.

Raises:

NotImplementedError – If the input is already registered.

batch_register(modules: List[Any] | Dict[str, Any])[source]

Registers multiple modules in the registry in a single call. Accepts either a list of modules or a dictionary mapping names to modules.

Parameters: modules (List[Any] | Dict[str, Any]) – A list of modules or a dictionary mapping module names to the modules.
Raises: NotImplementedError – If the input is neither a list nor a dictionary.
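
A compact sketch of the registry pattern is given below. MiniRegistry is hypothetical; deriving names from __name__ for list input and the dictionary-style lookup are assumptions, while the exception types mirror the ones documented above.

```python
class MiniRegistry:
    """Hypothetical registry mapping unique names to modules."""

    def __init__(self, name):
        self.name = name
        self.module_dict = {}

    def register(self, module_name, module):
        if module_name in self.module_dict:
            # Mirrors the exception type documented for duplicate entries.
            raise NotImplementedError(
                f"{module_name} is already registered in {self.name}.")
        self.module_dict[module_name] = module

    def batch_register(self, modules):
        if isinstance(modules, list):
            # Assumption: list entries are named after their __name__.
            modules = {module.__name__: module for module in modules}
        if not isinstance(modules, dict):
            raise NotImplementedError("Expected a list or a dictionary.")
        for module_name, module in modules.items():
            self.register(module_name, module)

    def __getitem__(self, module_name):
        return self.module_dict[module_name]


registry = MiniRegistry("containers")
registry.batch_register([dict, list])
registry.batch_register({"unique": set})
```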

class memoryscope.core.utils.ResponseTextParser(response_text: str, language: LanguageEnum, logger_prefix: str = '')[source]

Bases: object

The ResponseTextParser class is designed to parse and process response texts. It provides methods to extract patterns from the text and filter out unnecessary information, while also logging the processing steps and outcomes.

PATTERN_V1 = re.compile('<(.*?)>')

__init__(response_text: str, language: LanguageEnum, logger_prefix: str = '')[source]

parse_v1() → List[List[str]][source]

Extracts the content enclosed in angle brackets from the text.

Returns: The contents that match the pattern.
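
The behavior of PATTERN_V1 can be illustrated with a short sketch. Grouping the matches per line to produce the nested list is an assumption based on the List[List[str]] return type.

```python
import re

PATTERN_V1 = re.compile("<(.*?)>")


def parse_v1_sketch(response_text):
    result = []
    for line in response_text.splitlines():
        # Non-greedy matching returns each bracketed segment separately.
        matches = PATTERN_V1.findall(line)
        if matches:
            result.append(matches)
    return result


parsed_lines = parse_v1_sketch("<1> <likes tea>\nno brackets here\n<2> <lives in Paris>")
```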

parse_v2() → List[str][source]

Extract lines which contain NONE_WORD.

Returns: The contents that match the specific pattern.

class memoryscope.core.utils.Timer(name: str, time_log_type: Literal['end', 'wrap', 'none'] = 'end', use_ms: bool = True, stack_level: int = 2, float_precision: int = 4, **kwargs)[source]

Bases: object

A class used to measure the execution time of code blocks. It supports logging the elapsed time and can be customized to display time in seconds or milliseconds.

__init__(name: str, time_log_type: Literal['end', 'wrap', 'none'] = 'end', use_ms: bool = True, stack_level: int = 2, float_precision: int = 4, **kwargs)[source]

Initializes the Timer instance with the provided arguments and sets up a logger.

Parameters:

name (str) – The log name.
time_log_type (str) – The log type, one of ‘end’, ‘wrap’, or ‘none’. Defaults to ‘end’.
use_ms (bool) – Use ‘ms’ as the timescale or not. Defaults to True.
stack_level (int) – The stack level of log. Defaults to 2.
float_precision (int) – The precision of cost time. Defaults to 4.

property cost_str

Represents the elapsed time as a formatted string.
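
One common way to time a code block is a context manager. The sketch below assumes that usage pattern; MiniTimer, its attributes, and the exact cost_str format are hypothetical, and logging is omitted.

```python
import time


class MiniTimer:
    """Hypothetical context manager measuring elapsed wall-clock time."""

    def __init__(self, name, use_ms=True, float_precision=4):
        self.name = name
        self.use_ms = use_ms
        self.float_precision = float_precision
        self.cost = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.cost = time.perf_counter() - self._start
        return False  # never swallow exceptions from the timed block

    @property
    def cost_str(self):
        if self.use_ms:
            return f"{round(self.cost * 1000, self.float_precision)}ms"
        return f"{round(self.cost, self.float_precision)}s"


with MiniTimer("sum") as timer:
    total = sum(range(1000))
```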

memoryscope.core.utils.underscore_to_camelcase(name: str, is_first_title: bool = True) → str[source]

Converts an underscore_notation string to CamelCase.

Parameters:

name (str) – The underscore_notation string to be converted.
is_first_title (bool) – Whether to capitalize the first word. Defaults to True.

Returns:

A CamelCase formatted string.

Return type:

str

memoryscope.core.utils.camelcase_to_underscore(name: str) → str[source]

Converts a CamelCase string to underscore_notation.

Parameters: name (str) – The CamelCase formatted string to be converted.
Returns: A converted string in underscore_notation.
Return type: str
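
The two naming conversions can be sketched as a round trip. These are illustrative implementations under simple assumptions (ASCII names, no consecutive capitals), not the library's code.

```python
import re


def underscore_to_camelcase_sketch(name, is_first_title=True):
    words = name.split("_")
    head = words[0].title() if is_first_title else words[0].lower()
    return head + "".join(word.title() for word in words[1:])


def camelcase_to_underscore_sketch(name):
    # Insert "_" before every capital letter except one at the very start.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()
```

Note that this simple rule splits acronyms letter by letter, so names with consecutive capitals would need extra handling.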

memoryscope.core.utils.init_instance_by_config(config: dict, default_class_dir: str = 'memoryscope', **kwargs)[source]

Initialize an instance of a class specified in the configuration dictionary.

This function dynamically imports a class from a module path, allowing for user-defined classes or default paths. It supports adding a suffix to the class name, merging additional keyword arguments with the config, and handling nested module paths.

Parameters:

config (dict) – A dictionary containing the configuration, including the ‘class’ key that specifies the class’s module path.
default_class_dir (str, optional) – The default module path prefix to use if not explicitly defined in ‘config’. Defaults to “memoryscope”.
**kwargs – Additional keyword arguments to pass to the class constructor.

Returns:

An instance initialized with the provided config and kwargs.

Return type:

instance
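
The dynamic import step can be sketched with importlib. This simplified version assumes the ‘class’ value is a full "module.ClassName" path and omits the default_class_dir prefix and class-name suffix handling mentioned above.

```python
import importlib


def init_instance_by_config_sketch(config, **kwargs):
    config = dict(config)  # copy so the caller's dictionary is not modified
    class_path = config.pop("class")
    module_path, class_name = class_path.rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), class_name)
    # Remaining config entries and extra kwargs become constructor arguments.
    return cls(**{**config, **kwargs})


half = init_instance_by_config_sketch(
    {"class": "fractions.Fraction", "numerator": 1},
    denominator=2,
)
```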

memoryscope.core.utils.prompt_to_msg(system_prompt: str, few_shot: str, user_query: str, concat_system_prompt: bool = True) → List[Message][source]

Converts input strings into a structured list of message objects suitable for AI interactions.

Parameters:

system_prompt (str) – The system-level instruction or context.
few_shot (str) – An example or demonstration input, often used for illustrating expected behavior.
user_query (str) – The actual user query or prompt to be processed.
concat_system_prompt (bool) – Whether to repeat the system prompt inside the user message, a simple way to improve effectiveness for some LLMs. Defaults to True.

Returns:

A list of Message objects, each representing a part of the conversation setup.

Return type:

List[Message]
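
The message assembly might look like the sketch below. The Message dataclass is a hypothetical stand-in for the library's class, and the newline-joined layout of the user message is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for the library's Message type."""
    role: str
    content: str


def prompt_to_msg_sketch(system_prompt, few_shot, user_query,
                         concat_system_prompt=True):
    user_parts = []
    if concat_system_prompt:
        # Repeat the system prompt at the top of the user message.
        user_parts.append(system_prompt)
    if few_shot:
        user_parts.append(few_shot)
    user_parts.append(user_query)
    return [
        Message(role="system", content=system_prompt),
        Message(role="user", content="\n".join(user_parts)),
    ]


messages = prompt_to_msg_sketch("Be concise.", "Q: 1+1? A: 2", "Q: 2+2?")
```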

memoryscope.core.utils.char_logo(words: str, seed: int = 1732168950976738338, color=None)[source]

Renders the logo text with colors.

Parameters:

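words – The text of the logo.
seed – The random seed used to generate colors when no specific color is given. Defaults to a timestamp-based value; the literal in the signature above is the value captured when this documentation was generated.
color – The specific color. Defaults to None.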

Returns:

A rendered logo

memoryscope.core.utils.md5_hash(input_string: str) → str[source]

Computes an MD5 hash of the given input string.

Parameters: input_string (str) – The string for which the MD5 hash needs to be computed.
Returns: A hexadecimal MD5 hash representation.
Return type: str
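
With the standard library, such a hash can be computed as follows. Encoding the string as UTF-8 before hashing is an assumption; the function name is hypothetical.

```python
import hashlib


def md5_hash_sketch(input_string):
    # hashlib works on bytes, so the string is encoded first (UTF-8 assumed).
    return hashlib.md5(input_string.encode("utf-8")).hexdigest()


digest = md5_hash_sketch("hello")
print(digest)  # 5d41402abc4b2a76b9719d911017c592
```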

memoryscope.core.utils.contains_keyword(text, keywords) → bool[source]

Checks if the given text contains any of the specified keywords, ignoring case.

Parameters:

text (str) – The text to search within.
keywords (List[str]) – A list of keywords to look for in the text.

Returns:

True if any keyword is found in the text, False otherwise.

Return type:

bool
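
A case-insensitive keyword check can be sketched in a few lines. Plain substring matching is an assumption; the library may match differently.

```python
def contains_keyword_sketch(text, keywords):
    lowered = text.lower()
    # Lowercase both sides so the comparison ignores case.
    return any(keyword.lower() in lowered for keyword in keywords)
```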

memoryscope.core.utils.cosine_similarity(query: List[float], documents: List[List[float]])[source]
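
No description accompanies this signature. As a reminder of the underlying formula only, cosine similarity between vectors a and b is their dot product divided by the product of their norms. The sketch below assumes one score is returned per document and that a zero vector scores 0.0; both are assumptions, as is the function name.

```python
import math


def cosine_similarity_sketch(query, documents):
    query_norm = math.sqrt(sum(x * x for x in query))
    scores = []
    for document in documents:
        dot = sum(q * d for q, d in zip(query, document))
        document_norm = math.sqrt(sum(x * x for x in document))
        denominator = query_norm * document_norm
        # Assumption: a zero-length vector yields a score of 0.0.
        scores.append(dot / denominator if denominator else 0.0)
    return scores


scores = cosine_similarity_sketch([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```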