Skip to content

Scheduling Problems

Flexible Flow Shop Problem (FFSP)

FFSPEnv

FFSPEnv(
    generator: FFSPGenerator = None,
    generator_params: dict = {},
    **kwargs
)

Bases: RL4COEnvBase

Flexible Flow Shop Problem (FFSP) environment. The goal is to schedule a set of jobs on a set of machines such that the makespan is minimized.

Observations
  • time index
  • sub time index
  • batch index
  • machine index
  • schedule
  • machine wait step
  • job location
  • job wait step
  • job duration
Constraints
  • each job has to be processed on each machine in a specific order
  • the machine has to be available to process the job
  • the job has to be available to be processed
Finish Condition
  • all jobs are scheduled
Reward
  • (minus) the makespan of the schedule

Parameters:

  • generator (FFSPGenerator, default: None ) –

    FFSPGenerator instance as the data generator

  • generator_params (dict, default: {} ) –

    parameters for the generator

Source code in rl4co/envs/scheduling/ffsp/env.py
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
def __init__(
    self,
    generator: FFSPGenerator = None,
    generator_params: dict = {},
    **kwargs,
):
    super().__init__(check_solution=False, dataset_cls=FastTdDataset, **kwargs)
    if generator is None:
        generator = FFSPGenerator(**generator_params)
    self.generator = generator

    self.num_stage = generator.num_stage
    self.num_machine = generator.num_machine
    self.num_job = generator.num_job
    self.num_machine_total = generator.num_machine_total
    self.tables = None
    self.step_cnt = None
    self.flatten_stages = generator.flatten_stages

    self._make_spec(generator)

FFSPGenerator

FFSPGenerator(
    num_stage: int = 2,
    num_machine: int = 3,
    num_job: int = 4,
    min_time: int = 2,
    max_time: int = 10,
    flatten_stages: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Flow Shop Scheduling Problem (FFSP).

Parameters:

  • num_stage (int, default: 2 ) –

    number of stages

  • num_machine (int, default: 3 ) –

    number of machines

  • num_job (int, default: 4 ) –

    number of jobs

  • min_time (int, default: 2 ) –

    minimum running time of each job on each machine

  • max_time (int, default: 10 ) –

    maximum running time of each job on each machine

  • flatten_stages (bool, default: True ) –

    whether to flatten the stages

Returns:

  • A TensorDict with the following key: run_time [batch_size, num_job, num_machine, num_stage]: running time of each job on each machine

Note
  • [IMPORTANT] This version of ffsp requires the number of machines in each stage to be the same
Source code in rl4co/envs/scheduling/ffsp/generator.py
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
def __init__(
    self,
    num_stage: int = 2,
    num_machine: int = 3,
    num_job: int = 4,
    min_time: int = 2,
    max_time: int = 10,
    flatten_stages: bool = True,
    **unused_kwargs,
):
    self.num_stage = num_stage
    self.num_machine = num_machine
    self.num_machine_total = num_machine * num_stage
    self.num_job = num_job
    self.min_time = min_time
    self.max_time = max_time
    self.flatten_stages = flatten_stages

    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Flexible Job Shop Problem (FJSP)

FJSPEnv

FJSPEnv(
    generator: FJSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    check_mask: bool = False,
    stepwise_reward: bool = False,
    **kwargs
)

Bases: RL4COEnvBase

Flexible Job-Shop Scheduling Problem (FJSP) environment At each step, the agent chooses a job-machine combination. The operation to be processed next for the selected job is then executed on the selected machine. The reward is 0 unless the agent scheduled all operations of all jobs. In that case, the reward is (-)makespan of the schedule: maximizing the reward is equivalent to minimizing the makespan.

Observations
  • time: current time
  • next_op: next operation per job
  • proc_times: processing time of operation-machine pairs
  • pad_mask: specifies padded operations
  • start_op_per_job: id of first operation per job
  • end_op_per_job: id of last operation per job
  • start_times: start time of operation (defaults to 0 if not scheduled)
  • finish_times: finish time of operation (defaults to INIT_FINISH if not scheduled)
  • job_ops_adj: adjacency matrix specifying job-operation affiliation
  • ops_job_map: same as above but using ids of jobs to indicate affiliation
  • ops_sequence_order: specifies the order in which operations have to be processed
  • ma_assignment: specifies which operation has been scheduled on which machine
  • busy_until: specifies until when the machine will be busy
  • num_eligible: number of machines that can process an operation
  • job_in_process: whether job is currently being processed
  • job_done: whether the job is done
Constrains

the agent may not select:

  • machines that are currently busy
  • jobs that are done already
  • jobs that are currently processed
  • job-machine combinations, where the machine cannot process the next operation of the job
Finish condition
  • the agent has scheduled all operations of all jobs
Reward
  • the negative makespan of the final schedule

Parameters:

  • generator (FJSPGenerator, default: None ) –

    FJSPGenerator instance as the data generator

  • generator_params (dict, default: {} ) –

    parameters for the generator

  • mask_no_ops (bool, default: True ) –

    if True, agent may not select waiting operation (unless instance is done)

Source code in rl4co/envs/scheduling/fjsp/env.py
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
def __init__(
    self,
    generator: FJSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    check_mask: bool = False,
    stepwise_reward: bool = False,
    **kwargs,
):
    super().__init__(check_solution=False, **kwargs)
    if generator is None:
        if generator_params.get("file_path", None) is not None:
            generator = FJSPFileGenerator(**generator_params)
        else:
            generator = FJSPGenerator(**generator_params)
    self.generator = generator
    self._num_mas = generator.num_mas
    self._num_jobs = generator.num_jobs
    self._n_ops_max = generator.max_ops_per_job * self.num_jobs

    self.mask_no_ops = mask_no_ops
    self.check_mask = check_mask
    self.stepwise_reward = stepwise_reward
    self._make_spec(self.generator)

FJSPGenerator

FJSPGenerator(
    num_jobs: int = 10,
    num_machines: int = 5,
    min_ops_per_job: int = 4,
    max_ops_per_job: int = 6,
    min_processing_time: int = 1,
    max_processing_time: int = 20,
    min_eligible_ma_per_op: int = 1,
    max_eligible_ma_per_op: int = None,
    same_mean_per_op: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Flexible Job-Shop Scheduling Problem (FJSP).

Parameters:

  • num_stage

    number of stages

  • num_machine

    number of machines

  • num_job

    number of jobs

  • min_time

    minimum running time of each job on each machine

  • max_time

    maximum running time of each job on each machine

  • flatten_stages

    whether to flatten the stages

Returns:

  • A TensorDict with the following key: start_op_per_job [batch_size, num_jobs]: first operation of each job end_op_per_job [batch_size, num_jobs]: last operation of each job proc_times [batch_size, num_machines, total_n_ops]: processing time of ops on machines pad_mask [batch_size, total_n_ops]: not all instances have the same number of ops, so padding is used

Source code in rl4co/envs/scheduling/fjsp/generator.py
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
def __init__(
    self,
    num_jobs: int = 10,
    num_machines: int = 5,
    min_ops_per_job: int = 4,
    max_ops_per_job: int = 6,
    min_processing_time: int = 1,
    max_processing_time: int = 20,
    min_eligible_ma_per_op: int = 1,
    max_eligible_ma_per_op: int = None,
    same_mean_per_op: bool = True,
    **unused_kwargs,
):
    self.num_jobs = num_jobs
    self.num_mas = num_machines
    self.min_ops_per_job = min_ops_per_job
    self.max_ops_per_job = max_ops_per_job
    self.min_processing_time = min_processing_time
    self.max_processing_time = max_processing_time
    self.min_eligible_ma_per_op = min_eligible_ma_per_op
    self.max_eligible_ma_per_op = max_eligible_ma_per_op or num_machines
    # determines whether to use a fixed number of total operations or let it vary between instances
    # NOTE: due to the way rl4co builds datasets, we need a fixed size here
    self.n_ops_max = max_ops_per_job * num_jobs
    self.same_mean_per_op = same_mean_per_op
    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Job Shop Scheduling Problem (JSSP)

JSSPEnv

JSSPEnv(
    generator: JSSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    **kwargs
)

Bases: FJSPEnv

Job-Shop Scheduling Problem (JSSP) environment At each step, the agent chooses a job. The operation to be processed next for the selected job is then executed on the associated machine. The reward is 0 unless the agent scheduled all operations of all jobs. In that case, the reward is (-)makespan of the schedule: maximizing the reward is equivalent to minimizing the makespan. NOTE: The JSSP is a special case of the FJSP, when the number of eligible machines per operation is equal to one for all operations. Therefore, this environment is a subclass of the FJSP environment. Observations:

- time: current time
- next_op: next operation per job
- proc_times: processing time of operation-machine pairs
- pad_mask: specifies padded operations
- start_op_per_job: id of first operation per job
- end_op_per_job: id of last operation per job
- start_times: start time of operation (defaults to 0 if not scheduled)
- finish_times: finish time of operation (defaults to INIT_FINISH if not scheduled)
- job_ops_adj: adjacency matrix specifying job-operation affiliation
- ops_job_map: same as above but using ids of jobs to indicate affiliation
- ops_sequence_order: specifies the order in which operations have to be processed
- ma_assignment: specifies which operation has been scheduled on which machine
- busy_until: specifies until when the machine will be busy
- num_eligible: number of machines that can process an operation
- job_in_process: whether job is currently being processed
- job_done: whether the job is done
Constrains

the agent may not select:

  • jobs that are done already
  • jobs that are currently processed
Finish condition
  • the agent has scheduled all operations of all jobs
Reward
  • the negative makespan of the final schedule

Parameters:

  • generator (JSSPGenerator, default: None ) –

    JSSPGenerator instance as the data generator

  • generator_params (dict, default: {} ) –

    parameters for the generator

  • mask_no_ops (bool, default: True ) –

    if True, agent may not select waiting operation (unless instance is done)

Source code in rl4co/envs/scheduling/jssp/env.py
57
58
59
60
61
62
63
64
65
66
67
68
69
70
def __init__(
    self,
    generator: JSSPGenerator = None,
    generator_params: dict = {},
    mask_no_ops: bool = True,
    **kwargs,
):
    if generator is None:
        if generator_params.get("file_path", None) is not None:
            generator = JSSPFileGenerator(**generator_params)
        else:
            generator = JSSPGenerator(**generator_params)

    super().__init__(generator, generator_params, mask_no_ops, **kwargs)

JSSPGenerator

JSSPGenerator(
    num_jobs: int = 6,
    num_machines: int = 6,
    min_ops_per_job: int = None,
    max_ops_per_job: int = None,
    min_processing_time: int = 1,
    max_processing_time: int = 99,
    one2one_ma_map: bool = True,
    **unused_kwargs
)

Bases: Generator

Data generator for the Job-Shop Scheduling Problem (JSSP)

Parameters:

  • num_stage

    number of stages

  • num_machine

    number of machines

  • num_job

    number of jobs

  • min_time

    minimum running time of each job on each machine

  • max_time

    maximum running time of each job on each machine

  • flatten_stages

    whether to flatten the stages

  • one2one_ma_map (bool, default: True ) –

    whether each machine should have exactly one operation per job (common in jssp benchmark instances)

Returns:

  • A TensorDict with the following key: start_op_per_job [batch_size, num_jobs]: first operation of each job end_op_per_job [batch_size, num_jobs]: last operation of each job proc_times [batch_size, num_machines, total_n_ops]: processing time of ops on machines pad_mask [batch_size, total_n_ops]: not all instances have the same number of ops, so padding is used

Source code in rl4co/envs/scheduling/jssp/generator.py
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
def __init__(
    self,
    num_jobs: int = 6,
    num_machines: int = 6,
    min_ops_per_job: int = None,
    max_ops_per_job: int = None,
    min_processing_time: int = 1,
    max_processing_time: int = 99,
    one2one_ma_map: bool = True,
    **unused_kwargs,
):
    self.num_jobs = num_jobs
    self.num_mas = num_machines
    # quite common in jssp to have as many ops per job as there are machines
    self.min_ops_per_job = min_ops_per_job or self.num_mas
    self.max_ops_per_job = max_ops_per_job or self.num_mas
    self.min_processing_time = min_processing_time
    self.max_processing_time = max_processing_time
    self.one2one_ma_map = one2one_ma_map
    if self.one2one_ma_map:
        assert self.min_ops_per_job == self.max_ops_per_job == self.num_mas

    # determines whether to use a fixed number of total operations or let it vary between instances
    # NOTE: due to the way rl4co builds datasets, we need a fixed size here
    self.n_ops_max = self.max_ops_per_job * self.num_jobs

    # FFSP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")

Single Machine Total Weighted Tardiness Problem (SMTWTP)

SMTWTPEnv

SMTWTPEnv(
    generator: SMTWTPGenerator = None,
    generator_params: dict = {},
    **kwargs
)

Bases: RL4COEnvBase

Single Machine Total Weighted Tardiness Problem environment as described in DeepACO (https://arxiv.org/pdf/2309.14032.pdf) SMTWTP is a scheduling problem in which a set of jobs must be processed on a single machine. Each job i has a processing time, a weight, and a due date. The objective is to minimize the sum of the weighted tardiness of all jobs, where the weighted tardiness of a job is defined as the product of its weight and the duration by which its completion time exceeds its due date. At each step, the agent chooses a job to process. The reward is 0 unless the agent processes all the jobs. In that case, the reward is (-)objective value of the processing order: maximizing the reward is equivalent to minimizing the objective.

Observation
  • job_due_time: the due time of each job
  • job_weight: the weight of each job
  • job_process_time: the process time of each job
  • current_node: the current node
  • action_mask: a mask of available actions
  • current_time: the current time
Constants
  • num_job: number of jobs
  • min_time_span: lower bound of jobs' due time. By default, jobs' due time is uniformly sampled from (min_time_span, max_time_span)
  • max_time_span: upper bound of jobs' due time. By default, it will be set to num_job / 2
  • min_job_weight: lower bound of jobs' weights. By default, jobs' weights are uniformly sampled from (min_job_weight, max_job_weight)
  • max_job_weight: upper bound of jobs' weights
  • min_process_time: lower bound of jobs' process time. By default, jobs' process time is uniformly sampled from (min_process_time, max_process_time)
  • max_process_time: upper bound of jobs' process time
Finishing condition
  • All jobs are processed
Reward
  • The reward is 0 unless the agent processes all the jobs.
  • In that case, the reward is (-)objective value of the processing order: maximizing the reward is equivalent to minimizing the objective.

Parameters:

  • generator (SMTWTPGenerator, default: None ) –

    FFSPGenerator instance as the data generator

  • generator_params (dict, default: {} ) –

    parameters for the generator

Source code in rl4co/envs/scheduling/smtwtp/env.py
62
63
64
65
66
67
68
69
70
71
72
def __init__(
    self,
    generator: SMTWTPGenerator = None,
    generator_params: dict = {},
    **kwargs,
):
    super().__init__(**kwargs)
    if generator is None:
        generator = SMTWTPGenerator(**generator_params)
    self.generator = generator
    self._make_spec(self.generator)

SMTWTPGenerator

SMTWTPGenerator(
    num_job: int = 10,
    min_time_span: float = 0,
    max_time_span: float = None,
    min_job_weight: float = 0,
    max_job_weight: float = 1,
    min_process_time: float = 0,
    max_process_time: float = 1,
    **unused_kwargs
)

Bases: Generator

Data generator for the Single Machine Total Weighted Tardiness Problem (SMTWTP) environment

Parameters:

  • num_job (int, default: 10 ) –

    number of jobs

  • min_time_span (float, default: 0 ) –

    lower bound of jobs' due time. By default, jobs' due time is uniformly sampled from (min_time_span, max_time_span)

  • max_time_span (float, default: None ) –

    upper bound of jobs' due time. By default, it will be set to num_job / 2

  • min_job_weight (float, default: 0 ) –

    lower bound of jobs' weights. By default, jobs' weights are uniformly sampled from (min_job_weight, max_job_weight)

  • max_job_weight (float, default: 1 ) –

    upper bound of jobs' weights

  • min_process_time (float, default: 0 ) –

    lower bound of jobs' process time. By default, jobs' process time is uniformly sampled from (min_process_time, max_process_time)

  • max_process_time (float, default: 1 ) –

    upper bound of jobs' process time

Returns:

  • A TensorDict with the following key: job_due_time [batch_size, num_job + 1]: the due time of each job job_weight [batch_size, num_job + 1]: the weight of each job job_process_time [batch_size, num_job + 1]: the process time of each job

Source code in rl4co/envs/scheduling/smtwtp/generator.py
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
def __init__(
    self,
    num_job: int = 10,
    min_time_span: float = 0,
    max_time_span: float = None, # will be set to num_job / 2 by default. In DeepACO, it is set to num_job, which would be too simple
    min_job_weight: float = 0,
    max_job_weight: float = 1,
    min_process_time: float = 0,
    max_process_time: float = 1,
    **unused_kwargs
):
    self.num_job = num_job
    self.min_time_span = min_time_span
    self.max_time_span = num_job / 2 if max_time_span is None else max_time_span
    self.min_job_weight = min_job_weight
    self.max_job_weight = max_job_weight
    self.min_process_time = min_process_time
    self.max_process_time = max_process_time

    # SMTWTP environment doen't have any other kwargs
    if len(unused_kwargs) > 0:
        log.error(f"Found {len(unused_kwargs)} unused kwargs: {unused_kwargs}")