cumulative

IterableCumulativePrincipleGenerator

Bases: IterativePrincipleGenerator

Iterative principle generator that combines evaluation, generation, and clustering.

Attributes:

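| Name | Type | Description |
| --- | --- | --- |
| `reward` | `BaseListWisePrincipleReward` | Reward module for principle-based evaluation |
| `max_epochs` | `int` | Maximum number of iteration cycles (default: 2) |
| `cluster_template` | `Type[BaseGeneratorTemplate]` | Template for clustering principles (default: `PrincipleClusterTemplate`) |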

Source code in rm_gallery/core/reward/principle/cumulative.py
class IterableCumulativePrincipleGenerator(IterativePrincipleGenerator):
    """
    Iterative principle generator that combines evaluation, generation, and clustering.

    Attributes:
        reward: Reward module for principle-based evaluation
        max_epochs: Maximum number of iteration cycles
        cluster_template: Template used to build the principle-clustering prompt
    """

    reward: BaseListWisePrincipleReward = Field(
        default=..., description="reward module"
    )
    max_epochs: int = Field(default=2, description="max epochs")
    cluster_template: Type[BaseGeneratorTemplate] = Field(
        default=PrincipleClusterTemplate,
        description="template for clustering principles",
    )

    def run_batch(
        self,
        samples: List[DataSample],
        thread_pool: ThreadPoolExecutor,
        principles: Dict[str, str] | None = None,
    ) -> Dict[str, str]:
        """
        Executes the iterative principle generation pipeline.

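        Args:
            samples: List of initial data samples
            thread_pool: Executor for parallel processing
            principles: Optional starting principles; when empty or None,
                they are generated by the parent class's run_batch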

        Returns:
            Final optimized principles dictionary after iterations
        """
        if not principles:
            principles = super().run_batch(samples, thread_pool)

        bad_samples = samples

        for i in range(self.max_epochs):
            _samples = self.evaluate(deepcopy(samples), principles, thread_pool)
            bad_samples = self._split_samples(_samples)
            futures = [
                thread_pool.submit(self.generate_with_feedback, sample, principles)
                for sample in bad_samples
            ]
            wait(futures, return_when=ALL_COMPLETED)
            bad_samples = [future.result() for future in futures]
            principles.update(self.cluster_with_feedback(bad_samples, principles))

        return principles
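
The control flow of `run_batch` (evaluate, keep the failing samples, generate candidate principles for them in parallel, then cluster and merge) can be illustrated with a minimal, self-contained sketch. Everything below is a hypothetical stand-in: samples are plain dicts rather than `DataSample` objects, and `evaluate`, `split_samples`, `generate_with_feedback`, and `cluster_with_feedback` are toy functions that only mimic the roles of the real methods, not their implementations.

```python
from concurrent.futures import ALL_COMPLETED, ThreadPoolExecutor, wait
from typing import Any, Dict, List

# Hypothetical stand-in: a sample is a plain dict, not a DataSample.
Sample = Dict[str, Any]


def evaluate(samples: List[Sample], principles: Dict[str, str]) -> List[Sample]:
    """Toy evaluation: a sample passes if some principle mentions its topic."""
    return [
        {**s, "ok": any(s["topic"] in desc for desc in principles.values())}
        for s in samples
    ]


def split_samples(samples: List[Sample]) -> List[Sample]:
    """Keep only the samples the current principles failed on."""
    return [s for s in samples if not s["ok"]]


def generate_with_feedback(sample: Sample, principles: Dict[str, str]) -> Sample:
    """Toy generation: propose one candidate principle for a failed sample."""
    return {**sample, "candidate": f"Check {sample['topic']}"}


def cluster_with_feedback(bad: List[Sample], principles: Dict[str, str]) -> Dict[str, str]:
    """Toy clustering: duplicate candidates collapse onto one dictionary key."""
    return {s["topic"].title(): s["candidate"] for s in bad}


def refine(samples, principles, pool, max_epochs=2):
    principles = dict(principles)  # this sketch works on a copy
    for _ in range(max_epochs):
        bad = split_samples(evaluate(samples, principles))
        # One task per failing sample, run on the shared thread pool.
        futures = [pool.submit(generate_with_feedback, s, principles) for s in bad]
        wait(futures, return_when=ALL_COMPLETED)
        generated = [f.result() for f in futures]
        principles.update(cluster_with_feedback(generated, principles))
    return principles


samples = [{"topic": "accuracy"}, {"topic": "tone"}, {"topic": "tone"}]
with ThreadPoolExecutor(max_workers=2) as pool:
    result = refine(samples, {"Accuracy": "Check accuracy"}, pool)
```

In this toy run the first epoch finds two failing "tone" samples and adds a single merged principle for them; the second epoch finds nothing left to fix, so the dictionary is unchanged.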

run_batch(samples, thread_pool, principles=None)

Executes the iterative principle generation pipeline.

Parameters:

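| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `samples` | `List[DataSample]` | List of initial data samples | required |
| `thread_pool` | `ThreadPoolExecutor` | Executor for parallel processing | required |
| `principles` | `Dict[str, str] \| None` | Optional starting principles; when empty or `None`, they are generated by the parent class's `run_batch` | `None` |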

Returns:

| Type | Description |
| --- | --- |
| `Dict[str, str]` | Final optimized principles dictionary after iterations |
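
Because `run_batch` calls `principles.update(...)` on the dictionary it receives, a non-empty `principles` argument is modified in place and the returned dictionary is the same object. The sketch below illustrates that behavior with a hypothetical stand-in function (`refine_in_place` is not part of the library) and shows one way a caller might keep the original dictionary untouched by passing a copy.

```python
from copy import deepcopy
from typing import Dict


def refine_in_place(principles: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical stand-in that, like run_batch, calls dict.update on its argument."""
    principles.update({"Clarity": "Prefer answers that are easy to follow."})
    return principles


seed = {"Accuracy": "Prefer factually correct answers."}
returned = refine_in_place(seed)
shares_identity = returned is seed  # the caller's dict was modified in place

fresh_seed = {"Accuracy": "Prefer factually correct answers."}
isolated = refine_in_place(deepcopy(fresh_seed))  # fresh_seed stays untouched
```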

PrincipleClusterTemplate

Bases: BaseGeneratorTemplate

Template class for clustering and organizing evaluation principles.

Methods:

| Name | Description |
| --- | --- |
| `format` | Formats a prompt for principle clustering and optimization. |

Source code in rm_gallery/core/reward/principle/cumulative.py
class PrincipleClusterTemplate(BaseGeneratorTemplate):
    """
    Template class for clustering and organizing evaluation principles.

    Methods:
        format: Formats a prompt for principle clustering and optimization.
    """

    @classmethod
    def format(
        cls, examples: str, scenario: str, number: int, principles, **kwargs
    ) -> str:
        """
        Generates a structured prompt for principle clustering analysis.

        Args:
            examples: Pre-generated example principles for reference
            scenario: Contextual description of the evaluation scenario
            number: Maximum number of clustered principles to generate
            principles: Raw principles to be clustered and optimized
            **kwargs: Additional formatting parameters

        Returns:
            Formatted prompt string for principle clustering
        """
        return f"""## Overview
As a principle aggregation and analysis expert, your task is to conduct cluster analysis on a large collection of pre-generated principles for improvements based on examples and provide the optimization principles for each category in the scenario that are different from the original principles.
**Specific Steps:**
1. Organize the provided improvement principles into distinct categories, ensuring that each category is unique and succinct.
2. Summarize the principles within each category into a sample set for that category, while retaining detailed information.

Another assistant will evaluate the completions in the scenario based on these principles.
When consolidating the principles, be sure to maintain the integrity, clarity, and conciseness of each category.

## Requirements for Principles
(1) Principles are presented from most important to least important.
(2) Principles should be as critical as possible.
(3) Each principle should consist of a brief phrase accompanied by a single sentence description.
(4) The number of final principles should be LESS THAN OR EQUAL TO {number}.
(5) Focus on summarizing recurring candidate principles.

## Input
### Scenario
{scenario}

### Original Principles
{principles}

### Examples
{examples}

## Output Format Requirements
{cls.schema(**kwargs)}
"""

format(examples, scenario, number, principles, **kwargs) classmethod

Generates a structured prompt for principle clustering analysis.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `examples` | `str` | Pre-generated example principles for reference | required |
| `scenario` | `str` | Contextual description of the evaluation scenario | required |
| `number` | `int` | Maximum number of clustered principles to generate | required |
| `principles` | | Raw principles to be clustered and optimized | required |
| `**kwargs` | | Additional formatting parameters | `{}` |

Returns:

| Type | Description |
| --- | --- |
| `str` | Formatted prompt string for principle clustering |
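
As a rough illustration of how a template's `format` classmethod assembles the prompt, the sketch below uses a cut-down stand-in. `StubGeneratorTemplate` and `MiniClusterTemplate` are hypothetical, the text returned by `schema` is an assumption (the real `BaseGeneratorTemplate.schema` is not shown on this page), and the prompt is shortened to the sections that interpolate arguments.

```python
class StubGeneratorTemplate:
    """Hypothetical stand-in for BaseGeneratorTemplate; only `schema` is modeled."""

    @classmethod
    def schema(cls, **kwargs) -> str:
        # Assumption: the real schema() returns output-format instructions as text.
        return "Return a mapping from principle name to a one-sentence description."


class MiniClusterTemplate(StubGeneratorTemplate):
    """Shortened version of the clustering prompt, for illustration only."""

    @classmethod
    def format(
        cls, examples: str, scenario: str, number: int, principles: str, **kwargs
    ) -> str:
        return (
            "## Requirements for Principles\n"
            f"The number of final principles should be LESS THAN OR EQUAL TO {number}.\n\n"
            f"## Input\n### Scenario\n{scenario}\n\n"
            f"### Original Principles\n{principles}\n\n"
            f"### Examples\n{examples}\n\n"
            f"## Output Format Requirements\n{cls.schema(**kwargs)}\n"
        )


prompt = MiniClusterTemplate.format(
    examples="Candidate: cite sources for factual claims.",
    scenario="Open-domain question answering",
    number=3,
    principles="Accuracy: Prefer factually correct answers.",
)
```

The arguments are interpolated verbatim, so any structure in `principles` or `examples` (for example, one item per line) has to be prepared by the caller before `format` is invoked.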
