Scheduling Problems¶

Flexible Flow Shop Problem (FFSP)¶

FFSPEnv ¶

FFSPEnv(
    generator: FFSPGenerator = None,
    generator_params: dict = {},
    **kwargs
)

Bases: RL4COEnvBase

Flexible Flow Shop Problem (FFSP) environment. The goal is to schedule a set of jobs on a set of machines such that the makespan is minimized.

Observations

time index
sub time index
batch index
machine index
schedule
machine wait step
job location
job wait step
job duration

Constraints

each job has to be processed on each machine in a specific order
the machine has to be available to process the job
the job has to be available to be processed

Finish Condition

all jobs are scheduled

Reward

(minus) the makespan of the schedule

Parameters:

generator (FFSPGenerator, default: None ) –

FFSPGenerator instance as the data generator
generator_params (dict, default: {} ) –

parameters for the generator

Source code in rl4co/envs/scheduling/ffsp/env.py

def __init__(
    self,
    generator: FFSPGenerator = None,
    generator_params: dict = {},
    **kwargs,
):
    super().__init__(check_solution=False, dataset_cls=FastTdDataset, **kwargs)
    if generator is None:
        generator = FFSPGenerator(**generator_params)
    self.generator = generator

    self.num_stage = generator.num_stage
    self.num_machine = generator.num_machine
    self.num_job = generator.num_job
    self.num_machine_total = generator.num_machine_total
    self.tables = None
    self.step_cnt = None
    self.flatten_stages = generator.flatten_stages

    self._make_spec(generator)

FFSPGenerator ¶

FFSPGenerator(
    num_stage: int = 2,
    num_machine: int = 3,
    num_job: int = 4,
    min_time: int = 2,
    max_time: int = 10,
    flatten_stages: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Flow Shop Scheduling Problem (FFSP).

Parameters:

num_stage (int, default: 2 ) –

number of stages
num_machine (int, default: 3 ) –

number of machines
num_job (int, default: 4 ) –

number of jobs
min_time (int, default: 2 ) –

minimum running time of each job on each machine
max_time (int, default: 10 ) –

maximum running time of each job on each machine
flatten_stages (bool, default: True ) –

whether to flatten the stages

Returns:

–

A TensorDict with the following key: run_time [batch_size, num_job, num_machine, num_stage]: running time of each job on each machine

Note

[IMPORTANT] This version of ffsp requires the number of machines in each stage to be the same

Source code in rl4co/envs/scheduling/ffsp/generator.py

def __init__(
    self,
    num_stage: int = 2,
    num_machine: int = 3,
    num_job: int = 4,
    min_time: int = 2,
    max_time: int = 10,
    flatten_stages: bool = True,
    **unused_kwargs,
):
    self.num_stage = num_stage
    self.num_machine = num_machine
    self.num_machine_total = num_machine * num_stage
    self.num_job = num_job
    self.min_time = min_time
    self.max_time = max_time
    self.flatten_stages = flatten_stages

    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Flexible Job Shop Problem (FJSP)¶

FJSPEnv ¶

FJSPEnv(
    generator: FJSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    check_mask: bool = False,
    stepwise_reward: bool = False,
    **kwargs
)

Bases: RL4COEnvBase

Flexible Job-Shop Scheduling Problem (FJSP) environment At each step, the agent chooses a job-machine combination. The operation to be processed next for the selected job is then executed on the selected machine. The reward is 0 unless the agent scheduled all operations of all jobs. In that case, the reward is (-)makespan of the schedule: maximizing the reward is equivalent to minimizing the makespan.

Observations

time: current time
next_op: next operation per job
proc_times: processing time of operation-machine pairs
pad_mask: specifies padded operations
start_op_per_job: id of first operation per job
end_op_per_job: id of last operation per job
start_times: start time of operation (defaults to 0 if not scheduled)
finish_times: finish time of operation (defaults to INIT_FINISH if not scheduled)
job_ops_adj: adjacency matrix specifying job-operation affiliation
ops_job_map: same as above but using ids of jobs to indicate affiliation
ops_sequence_order: specifies the order in which operations have to be processed
ma_assignment: specifies which operation has been scheduled on which machine
busy_until: specifies until when the machine will be busy
num_eligible: number of machines that can process an operation
job_in_process: whether job is currently being processed
job_done: whether the job is done

Constrains

the agent may not select:

machines that are currently busy
jobs that are done already
jobs that are currently processed
job-machine combinations, where the machine cannot process the next operation of the job

Finish condition

the agent has scheduled all operations of all jobs

Reward

the negative makespan of the final schedule

Parameters:

generator (FJSPGenerator, default: None ) –

FJSPGenerator instance as the data generator
generator_params (dict, default: {} ) –

parameters for the generator
mask_no_ops (bool, default: True ) –

if True, agent may not select waiting operation (unless instance is done)

Source code in rl4co/envs/scheduling/fjsp/env.py

def __init__(
    self,
    generator: FJSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    check_mask: bool = False,
    stepwise_reward: bool = False,
    **kwargs,
):
    super().__init__(check_solution=False, **kwargs)
    if generator is None:
        if generator_params.get("file_path", None) is not None:
            generator = FJSPFileGenerator(**generator_params)
        else:
            generator = FJSPGenerator(**generator_params)
    self.generator = generator
    self._num_mas = generator.num_mas
    self._num_jobs = generator.num_jobs
    self._n_ops_max = generator.max_ops_per_job * self.num_jobs

    self.mask_no_ops = mask_no_ops
    self.check_mask = check_mask
    self.stepwise_reward = stepwise_reward
    self._make_spec(self.generator)

FJSPGenerator ¶

FJSPGenerator(
    num_jobs: int = 10,
    num_machines: int = 5,
    min_ops_per_job: int = 4,
    max_ops_per_job: int = 6,
    min_processing_time: int = 1,
    max_processing_time: int = 20,
    min_eligible_ma_per_op: int = 1,
    max_eligible_ma_per_op: int = None,
    same_mean_per_op: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Flexible Job-Shop Scheduling Problem (FJSP).

Parameters:

num_stage –

number of stages
num_machine –

number of machines
num_job –

number of jobs
min_time –

minimum running time of each job on each machine
max_time –

maximum running time of each job on each machine
flatten_stages –

whether to flatten the stages

Returns:

–

A TensorDict with the following key: start_op_per_job [batch_size, num_jobs]: first operation of each job end_op_per_job [batch_size, num_jobs]: last operation of each job proc_times [batch_size, num_machines, total_n_ops]: processing time of ops on machines pad_mask [batch_size, total_n_ops]: not all instances have the same number of ops, so padding is used

Source code in rl4co/envs/scheduling/fjsp/generator.py

def __init__(
    self,
    num_jobs: int = 10,
    num_machines: int = 5,
    min_ops_per_job: int = 4,
    max_ops_per_job: int = 6,
    min_processing_time: int = 1,
    max_processing_time: int = 20,
    min_eligible_ma_per_op: int = 1,
    max_eligible_ma_per_op: int = None,
    same_mean_per_op: bool = True,
    **unused_kwargs,
):
    self.num_jobs = num_jobs
    self.num_mas = num_machines
    self.min_ops_per_job = min_ops_per_job
    self.max_ops_per_job = max_ops_per_job
    self.min_processing_time = min_processing_time
    self.max_processing_time = max_processing_time
    self.min_eligible_ma_per_op = min_eligible_ma_per_op
    self.max_eligible_ma_per_op = max_eligible_ma_per_op or num_machines
    # determines whether to use a fixed number of total operations or let it vary between instances
    # NOTE: due to the way rl4co builds datasets, we need a fixed size here
    self.n_ops_max = max_ops_per_job * num_jobs
    self.same_mean_per_op = same_mean_per_op
    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Job Shop Scheduling Problem (JSSP)¶

JSSPEnv ¶

JSSPEnv(
    generator: JSSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    **kwargs
)

Bases: FJSPEnv

Job-Shop Scheduling Problem (JSSP) environment At each step, the agent chooses a job. The operation to be processed next for the selected job is then executed on the associated machine. The reward is 0 unless the agent scheduled all operations of all jobs. In that case, the reward is (-)makespan of the schedule: maximizing the reward is equivalent to minimizing the makespan. NOTE: The JSSP is a special case of the FJSP, when the number of eligible machines per operation is equal to one for all operations. Therefore, this environment is a subclass of the FJSP environment. Observations:

- time: current time
- next_op: next operation per job
- proc_times: processing time of operation-machine pairs
- pad_mask: specifies padded operations
- start_op_per_job: id of first operation per job
- end_op_per_job: id of last operation per job
- start_times: start time of operation (defaults to 0 if not scheduled)
- finish_times: finish time of operation (defaults to INIT_FINISH if not scheduled)
- job_ops_adj: adjacency matrix specifying job-operation affiliation
- ops_job_map: same as above but using ids of jobs to indicate affiliation
- ops_sequence_order: specifies the order in which operations have to be processed
- ma_assignment: specifies which operation has been scheduled on which machine
- busy_until: specifies until when the machine will be busy
- num_eligible: number of machines that can process an operation
- job_in_process: whether job is currently being processed
- job_done: whether the job is done

Constrains

the agent may not select:

jobs that are done already
jobs that are currently processed

Finish condition

the agent has scheduled all operations of all jobs

Reward

the negative makespan of the final schedule

Parameters:

generator (JSSPGenerator, default: None ) –

JSSPGenerator instance as the data generator
generator_params (dict, default: {} ) –

parameters for the generator
mask_no_ops (bool, default: True ) –

if True, agent may not select waiting operation (unless instance is done)

Source code in rl4co/envs/scheduling/jssp/env.py

def __init__(
    self,
    generator: JSSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    **kwargs,
):
    if generator is None:
        if generator_params.get("file_path", None) is not None:
            generator = JSSPFileGenerator(**generator_params)
        else:
            generator = JSSPGenerator(**generator_params)

    super().__init__(generator, generator_params, mask_no_ops, **kwargs)

JSSPGenerator ¶

JSSPGenerator(
    num_jobs: int = 6,
    num_machines: int = 6,
    min_ops_per_job: int = None,
    max_ops_per_job: int = None,
    min_processing_time: int = 1,
    max_processing_time: int = 99,
    one2one_ma_map: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Job-Shop Scheduling Problem (JSSP)

Parameters:

num_stage –

number of stages
num_machine –

number of machines
num_job –

number of jobs
min_time –

minimum running time of each job on each machine
max_time –

maximum running time of each job on each machine
flatten_stages –

whether to flatten the stages
one2one_ma_map (bool, default: True ) –

whether each machine should have exactly one operation per job (common in jssp benchmark instances)

Returns:

–

A TensorDict with the following key: start_op_per_job [batch_size, num_jobs]: first operation of each job end_op_per_job [batch_size, num_jobs]: last operation of each job proc_times [batch_size, num_machines, total_n_ops]: processing time of ops on machines pad_mask [batch_size, total_n_ops]: not all instances have the same number of ops, so padding is used

Source code in rl4co/envs/scheduling/jssp/generator.py

def __init__(
    self,
    num_jobs: int = 6,
    num_machines: int = 6,
    min_ops_per_job: int = None,
    max_ops_per_job: int = None,
    min_processing_time: int = 1,
    max_processing_time: int = 99,
    one2one_ma_map: bool = True,
    **unused_kwargs,
):
    self.num_jobs = num_jobs
    self.num_mas = num_machines
    # quite common in jssp to have as many ops per job as there are machines
    self.min_ops_per_job = min_ops_per_job or self.num_mas
    self.max_ops_per_job = max_ops_per_job or self.num_mas
    self.min_processing_time = min_processing_time
    self.max_processing_time = max_processing_time
    self.one2one_ma_map = one2one_ma_map
    if self.one2one_ma_map:
        assert self.min_ops_per_job == self.max_ops_per_job == self.num_mas

    # determines whether to use a fixed number of total operations or let it vary between instances
    # NOTE: due to the way rl4co builds datasets, we need a fixed size here
    self.n_ops_max = self.max_ops_per_job * self.num_jobs

    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Single Machine Total Weighted Tardiness Problem (SMTWTP)¶

SMTWTPEnv ¶

SMTWTPEnv(
    generator: SMTWTPGenerator = None,
    generator_params: dict = {},
    **kwargs
)

Bases: RL4COEnvBase

Single Machine Total Weighted Tardiness Problem environment as described in DeepACO (https://arxiv.org/pdf/2309.14032.pdf) SMTWTP is a scheduling problem in which a set of jobs must be processed on a single machine. Each job i has a processing time, a weight, and a due date. The objective is to minimize the sum of the weighted tardiness of all jobs, where the weighted tardiness of a job is defined as the product of its weight and the duration by which its completion time exceeds its due date. At each step, the agent chooses a job to process. The reward is 0 unless the agent processes all the jobs. In that case, the reward is (-)objective value of the processing order: maximizing the reward is equivalent to minimizing the objective.

Observation

job_due_time: the due time of each job
job_weight: the weight of each job
job_process_time: the process time of each job
current_node: the current node
action_mask: a mask of available actions
current_time: the current time

Constants

num_job: number of jobs
min_time_span: lower bound of jobs' due time. By default, jobs' due time is uniformly sampled from (min_time_span, max_time_span)
max_time_span: upper bound of jobs' due time. By default, it will be set to num_job / 2
min_job_weight: lower bound of jobs' weights. By default, jobs' weights are uniformly sampled from (min_job_weight, max_job_weight)
max_job_weight: upper bound of jobs' weights
min_process_time: lower bound of jobs' process time. By default, jobs' process time is uniformly sampled from (min_process_time, max_process_time)
max_process_time: upper bound of jobs' process time

Finishing condition

All jobs are processed

Reward

The reward is 0 unless the agent processes all the jobs.
In that case, the reward is (-)objective value of the processing order: maximizing the reward is equivalent to minimizing the objective.

Parameters:

generator (SMTWTPGenerator, default: None ) –

FFSPGenerator instance as the data generator
generator_params (dict, default: {} ) –

parameters for the generator

Source code in rl4co/envs/scheduling/smtwtp/env.py

def __init__(
    self,
    generator: SMTWTPGenerator = None,
    generator_params: dict = {},
    **kwargs,
):
    super().__init__(**kwargs)
    if generator is None:
        generator = SMTWTPGenerator(**generator_params)
    self.generator = generator
    self._make_spec(self.generator)

SMTWTPGenerator ¶

SMTWTPGenerator(
    num_job: int = 10,
    min_time_span: float = 0,
    max_time_span: float = None,
    min_job_weight: float = 0,
    max_job_weight: float = 1,
    min_process_time: float = 0,
    max_process_time: float = 1,
    **unused_kwargs
)

Bases: Generator

Data generator for the Single Machine Total Weighted Tardiness Problem (SMTWTP) environment

Parameters:

num_job (int, default: 10 ) –

number of jobs
min_time_span (float, default: 0 ) –

lower bound of jobs' due time. By default, jobs' due time is uniformly sampled from (min_time_span, max_time_span)
max_time_span (float, default: None ) –

upper bound of jobs' due time. By default, it will be set to num_job / 2
min_job_weight (float, default: 0 ) –

lower bound of jobs' weights. By default, jobs' weights are uniformly sampled from (min_job_weight, max_job_weight)
max_job_weight (float, default: 1 ) –

upper bound of jobs' weights
min_process_time (float, default: 0 ) –

lower bound of jobs' process time. By default, jobs' process time is uniformly sampled from (min_process_time, max_process_time)
max_process_time (float, default: 1 ) –

upper bound of jobs' process time

Returns:

–

A TensorDict with the following key: job_due_time [batch_size, num_job + 1]: the due time of each job job_weight [batch_size, num_job + 1]: the weight of each job job_process_time [batch_size, num_job + 1]: the process time of each job

Source code in rl4co/envs/scheduling/smtwtp/generator.py

def __init__(
    self,
    num_job: int = 10,
    min_time_span: float = 0,
    max_time_span: float = None, # will be set to num_job / 2 by default. In DeepACO, it is set to num_job, which would be too simple
    min_job_weight: float = 0,
    max_job_weight: float = 1,
    min_process_time: float = 0,
    max_process_time: float = 1,
    **unused_kwargs
):
    self.num_job = num_job
    self.min_time_span = min_time_span
    self.max_time_span = num_job / 2 if max_time_span is None else max_time_span
    self.min_job_weight = min_job_weight
    self.max_job_weight = max_job_weight
    self.min_process_time = min_process_time
    self.max_process_time = max_process_time

    # SMTWTP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")