Constructive NonAutoregressive¶
DeepACO¶
Classes:
-
AntSystem
–Implements the Ant System algorithm: https://doi.org/10.1109/3477.484436.
AntSystem
¶
AntSystem(
log_heuristic: Tensor,
n_ants: int = 20,
alpha: float = 1.0,
beta: float = 1.0,
decay: float = 0.95,
Q: Optional[float] = None,
temperature: float = 0.1,
pheromone: Optional[Tensor] = None,
require_logprobs: bool = False,
use_local_search: bool = False,
use_nls: bool = False,
n_perturbations: int = 5,
local_search_params: dict = {},
perturbation_params: dict = {},
start_node: Optional[int] = None,
)
Implements the Ant System algorithm: https://doi.org/10.1109/3477.484436.
Parameters:
-
log_heuristic
(Tensor
) –Logarithm of the heuristic matrix.
-
n_ants
(int
, default:20
) –Number of ants to be used in the algorithm. Defaults to 20.
-
alpha
(float
, default:1.0
) –Importance of pheromone in the decision-making process. Defaults to 1.0.
-
beta
(float
, default:1.0
) –Importance of heuristic information in the decision-making process. Defaults to 1.0.
-
decay
(float
, default:0.95
) –Rate at which pheromone evaporates. Should be between 0 and 1. Defaults to 0.95.
-
Q
(Optional[float]
, default:None
) –Rate at which pheromone deposits. Defaults to
1 / n_ants
. -
temperature
(float
, default:0.1
) –Temperature for the softmax during decoding. Defaults to 0.1.
-
pheromone
(Optional[Tensor]
, default:None
) –Initial pheromone matrix. Defaults to
torch.ones_like(log_heuristic)
. -
require_logprobs
(bool
, default:False
) –Whether to require the log probability of actions. Defaults to False.
-
use_local_search
(bool
, default:False
) –Whether to use local_search provided by the env. Default to False.
-
use_nls
(bool
, default:False
) –Whether to use neural-guided local search provided by the env. Default to False.
-
n_perturbations
(int
, default:5
) –Number of perturbations to be used for nls. Defaults to 5.
-
local_search_params
(dict
, default:{}
) –Arguments to be passed to the local_search.
-
perturbation_params
(dict
, default:{}
) –Arguments to be passed to the perturbation used for nls.
Methods:
-
run
–Run the Ant System algorithm for a specified number of iterations.
-
local_search
–Perform local search on the actions and reward obtained.
-
get_logp
–Get the log probability (logprobs) values recorded during the execution of the algorithm.
Source code in rl4co/models/zoo/deepaco/antsystem.py
38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 |
|
run
¶
run(
td_initial: TensorDict,
env: RL4COEnvBase,
n_iterations: int,
) -> Tuple[TensorDict, Tensor, Tensor]
Run the Ant System algorithm for a specified number of iterations.
Parameters:
-
td_initial
(TensorDict
) –Initial state of the problem.
-
env
(RL4COEnvBase
) –Environment representing the problem.
-
n_iterations
(int
) –Number of iterations to run the algorithm.
Returns:
-
td
(TensorDict
) –The final state of the problem.
-
actions
(Tensor
) –The final actions chosen by the algorithm.
-
reward
(Tensor
) –The final reward achieved by the algorithm.
Source code in rl4co/models/zoo/deepaco/antsystem.py
106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 |
|
local_search
¶
local_search(
td: TensorDict, env: RL4COEnvBase, actions: Tensor
) -> Tuple[Tensor, Tensor]
Perform local search on the actions and reward obtained.
Parameters:
-
td
(TensorDict
) –Current state of the problem.
-
env
(RL4COEnvBase
) –Environment representing the problem.
-
actions
(Tensor
) –Actions chosen by the algorithm.
Returns:
Source code in rl4co/models/zoo/deepaco/antsystem.py
195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 |
|
get_logp
¶
get_logp()
Get the log probability (logprobs) values recorded during the execution of the algorithm.
Returns:
-
results
–Tuple containing the log probability values, actions chosen, rewards obtained, and mask values (if available).
Raises:
-
AssertionError
–If
require_logp
is not enabled.
Source code in rl4co/models/zoo/deepaco/antsystem.py
291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 |
|
Classes:
-
DeepACO
–Implements DeepACO: https://arxiv.org/abs/2309.14032.
DeepACO
¶
DeepACO(
env: RL4COEnvBase,
policy: Optional[DeepACOPolicy] = None,
baseline: REINFORCEBaseline | str = "no",
policy_kwargs: dict = {},
baseline_kwargs: dict = {},
**kwargs
)
Bases: REINFORCE
Implements DeepACO: https://arxiv.org/abs/2309.14032.
Parameters:
-
env
(RL4COEnvBase
) –Environment to use for the algorithm
-
policy
(Optional[DeepACOPolicy]
, default:None
) –Policy to use for the algorithm
-
baseline
(REINFORCEBaseline | str
, default:'no'
) –REINFORCE baseline. Defaults to exponential
-
policy_kwargs
(dict
, default:{}
) –Keyword arguments for policy
-
baseline_kwargs
(dict
, default:{}
) –Keyword arguments for baseline
-
**kwargs
–Keyword arguments passed to the superclass
Source code in rl4co/models/zoo/deepaco/model.py
21 22 23 24 25 26 27 28 29 30 31 32 33 |
|
Classes:
-
DeepACOPolicy
–Implememts DeepACO policy based on :class:
NonAutoregressivePolicy
. Introduced by Ye et al. (2023): https://arxiv.org/abs/2309.14032.
DeepACOPolicy
¶
DeepACOPolicy(
encoder: Optional[NonAutoregressiveEncoder] = None,
env_name: str = "tsp",
temperature: float = 1.0,
aco_class: Optional[Type[AntSystem]] = None,
aco_kwargs: dict = {},
train_with_local_search: bool = True,
n_ants: Optional[int | dict] = None,
n_iterations: Optional[int | dict] = None,
ls_reward_aug_W: float = 0.95,
**encoder_kwargs
)
Bases: NonAutoregressivePolicy
Implememts DeepACO policy based on :class:NonAutoregressivePolicy
. Introduced by Ye et al. (2023): https://arxiv.org/abs/2309.14032.
This policy uses a Non-Autoregressive Graph Neural Network to generate heatmaps,
which are then used to run Ant Colony Optimization (ACO) to construct solutions.
Parameters:
-
encoder
(Optional[NonAutoregressiveEncoder]
, default:None
) –Encoder module. Can be passed by sub-classes
-
env_name
(str
, default:'tsp'
) –Name of the environment used to initialize embeddings
-
temperature
(float
, default:1.0
) –Temperature for the softmax during decoding. Defaults to 0.1.
-
aco_class
(Optional[Type[AntSystem]]
, default:None
) –Class representing the ACO algorithm to be used. Defaults to :class:
AntSystem
. -
aco_kwargs
(dict
, default:{}
) –Additional arguments to be passed to the ACO algorithm.
-
n_ants
(Optional[int | dict]
, default:None
) –Number of ants to be used in the ACO algorithm. Can be an integer or dictionary. Defaults to 20.
-
n_iterations
(Optional[int | dict]
, default:None
) –Number of iterations to run the ACO algorithm. Can be an integer or dictionary. Defaults to
dict(train=1, val=20, test=100)
. -
ls_reward_aug_W
(float
, default:0.95
) –Coefficient to be used for the reward augmentation with the local search. Defaults to 0.95.
-
encoder_kwargs
–Additional arguments to be passed to the encoder.
Methods:
-
forward
–Forward method. During validation and testing, the policy runs the ACO algorithm to construct solutions.
Source code in rl4co/models/zoo/deepaco/policy.py
34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 |
|
forward
¶
forward(
td_initial: TensorDict,
env: Optional[str | RL4COEnvBase] = None,
calc_reward: bool = True,
phase: str = "train",
actions=None,
return_actions: bool = True,
return_hidden: bool = True,
**kwargs
)
Forward method. During validation and testing, the policy runs the ACO algorithm to construct solutions.
See :class:NonAutoregressivePolicy
for more details during the training phase.
Source code in rl4co/models/zoo/deepaco/policy.py
66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 |
|
NAR-GNN¶
Classes:
-
NARGNNPolicy
–Base Non-autoregressive policy for NCO construction methods.
NARGNNPolicy
¶
NARGNNPolicy(
encoder: Optional[NonAutoregressiveEncoder] = None,
decoder: Optional[NonAutoregressiveDecoder] = None,
embed_dim: int = 64,
env_name: str = "tsp",
init_embedding: Optional[Module] = None,
edge_embedding: Optional[Module] = None,
graph_network: Optional[Module] = None,
heatmap_generator: Optional[Module] = None,
num_layers_heatmap_generator: int = 5,
num_layers_graph_encoder: int = 15,
act_fn="silu",
agg_fn="mean",
linear_bias: bool = True,
train_decode_type: str = "multistart_sampling",
val_decode_type: str = "multistart_greedy",
test_decode_type: str = "multistart_greedy",
**constructive_policy_kw
)
Bases: NonAutoregressivePolicy
Base Non-autoregressive policy for NCO construction methods. This creates a heatmap of NxN for N nodes (i.e., heuristic) that models the probability to go from one node to another for all nodes.
The policy performs the following steps
- Encode the environment initial state into node embeddings
- Decode (non-autoregressively) to construct the solution to the NCO problem
Warning
The effectiveness of the non-autoregressive approach can vary significantly across different problem types and configurations. It may require careful tuning of the model architecture and decoding strategy to achieve competitive results.
Parameters:
-
encoder
(Optional[NonAutoregressiveEncoder]
, default:None
) –Encoder module. Can be passed by sub-classes
-
decoder
(Optional[NonAutoregressiveDecoder]
, default:None
) –Decoder module. Note that this moule defaults to the non-autoregressive decoder
-
embed_dim
(int
, default:64
) –Dimension of the embeddings
-
env_name
(str
, default:'tsp'
) –Name of the environment used to initialize embeddings
-
init_embedding
(Optional[Module]
, default:None
) –Model to use for the initial embedding. If None, use the default embedding for the environment
-
edge_embedding
(Optional[Module]
, default:None
) –Model to use for the edge embedding. If None, use the default embedding for the environment
-
graph_network
(Optional[Module]
, default:None
) –Model to use for the graph network. If None, use the default embedding for the environment
-
heatmap_generator
(Optional[Module]
, default:None
) –Model to use for the heatmap generator. If None, use the default embedding for the environment
-
num_layers_heatmap_generator
(int
, default:5
) –Number of layers in the heatmap generator
-
num_layers_graph_encoder
(int
, default:15
) –Number of layers in the graph encoder
-
act_fn
–Activation function to use in the encoder
-
agg_fn
–Aggregation function to use in the encoder
-
linear_bias
(bool
, default:True
) –Whether to use bias in the encoder
-
train_decode_type
(str
, default:'multistart_sampling'
) –Type of decoding during training
-
val_decode_type
(str
, default:'multistart_greedy'
) –Type of decoding during validation
-
test_decode_type
(str
, default:'multistart_greedy'
) –Type of decoding during testing
-
**constructive_policy_kw
–Unused keyword arguments
Source code in rl4co/models/zoo/nargnn/policy.py
50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 |
|
Classes:
-
EdgeHeatmapGenerator
–MLP for converting edge embeddings to heatmaps.
-
NARGNNEncoder
–Anisotropic Graph Neural Network encoder with edge-gating mechanism as in Joshi et al. (2022), and used in DeepACO (Ye et al., 2023).
-
NARGNNNodeEncoder
–In this case, we just use the node embeddings from the graph
EdgeHeatmapGenerator
¶
EdgeHeatmapGenerator(
embed_dim: int,
num_layers: int,
act_fn: str | Callable = "silu",
linear_bias: bool = True,
undirected_graph: bool = True,
)
Bases: Module
MLP for converting edge embeddings to heatmaps.
Parameters:
-
embed_dim
(int
) –Dimension of the embeddings
-
num_layers
(int
) –The number of linear layers in the network.
-
act_fn
(str | Callable
, default:'silu'
) –Activation function. Defaults to "silu".
-
linear_bias
(bool
, default:True
) –Use bias in linear layers. Defaults to True.
-
undirected_graph
(bool
, default:True
) –Whether the graph is undirected. Defaults to True.
Source code in rl4co/models/zoo/nargnn/encoder.py
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 |
|
NARGNNEncoder
¶
NARGNNEncoder(
embed_dim: int = 64,
env_name: str = "tsp",
init_embedding: Optional[Module] = None,
edge_embedding: Optional[Module] = None,
graph_network: Optional[Module] = None,
heatmap_generator: Optional[Module] = None,
num_layers_heatmap_generator: int = 5,
num_layers_graph_encoder: int = 15,
act_fn="silu",
agg_fn="mean",
linear_bias: bool = True,
k_sparse: Optional[int] = None,
)
Bases: NonAutoregressiveEncoder
Anisotropic Graph Neural Network encoder with edge-gating mechanism as in Joshi et al. (2022), and used in DeepACO (Ye et al., 2023). This creates a heatmap of NxN for N nodes (i.e., heuristic) that models the probability to go from one node to another for all nodes. This model utilizes a multi-layer perceptron (MLP) approach to predict edge attributes directly from the input graph features, which are then transformed into a heatmap representation to facilitate the decoding of the solution. The decoding process is managed by a specified strategy which could vary from simple greedy selection to more complex sampling methods.
Tip
This decoder's performance heavily relies on the ability of the MLP to capture the dependencies between different parts of the solution without the iterative refinement provided by autoregressive models. It is particularly useful in scenarios where the solution space can be effectively explored in a parallelized manner or when the solution components are largely independent.
Parameters:
-
embed_dim
(int
, default:64
) –Dimension of the node embeddings
-
env_name
(str
, default:'tsp'
) –Name of the environment used to initialize embeddings
-
num_layers
–Number of layers in the encoder
-
init_embedding
(Optional[Module]
, default:None
) –Model to use for the initial embedding. If None, use the default embedding for the environment
-
edge_embedding
(Optional[Module]
, default:None
) –Model to use for the edge embedding. If None, use the default embedding for the environment
-
graph_network
(Optional[Module]
, default:None
) –Model to use for the graph network. If None, use the default network for the environment
-
heatmap_generator
(Optional[Module]
, default:None
) –Model to use for the heatmap generator. If None, use the default network for the environment
-
num_layers_heatmap_generator
(int
, default:5
) –Number of layers in the heatmap generator
-
num_layers_graph_encoder
(int
, default:15
) –Number of layers in the graph encoder
-
act_fn
–The activation function to use in each GNNLayer, see https://pytorch.org/docs/stable/nn.functional.html#non-linear-activation-functions for available options. Defaults to 'silu'.
-
agg_fn
–The aggregation function to use in each GNNLayer for pooling features. Options: 'add', 'mean', 'max'. Defaults to 'mean'.
-
linear_bias
(bool
, default:True
) –Use bias in linear layers. Defaults to True.
-
k_sparse
(Optional[int]
, default:None
) –Number of edges to keep for each node. Defaults to None.
Methods:
-
forward
–Forward pass of the encoder.
Source code in rl4co/models/zoo/nargnn/encoder.py
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 |
|
forward
¶
forward(td: TensorDict)
Forward pass of the encoder. Transform the input TensorDict into the latent representation.
Source code in rl4co/models/zoo/nargnn/encoder.py
173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 |
|
NARGNNNodeEncoder
¶
NARGNNNodeEncoder(
embed_dim: int = 64,
env_name: str = "tsp",
init_embedding: Optional[Module] = None,
edge_embedding: Optional[Module] = None,
graph_network: Optional[Module] = None,
heatmap_generator: Optional[Module] = None,
num_layers_heatmap_generator: int = 5,
num_layers_graph_encoder: int = 15,
act_fn="silu",
agg_fn="mean",
linear_bias: bool = True,
k_sparse: Optional[int] = None,
)
Bases: NARGNNEncoder
In this case, we just use the node embeddings from the graph without transforming them into a heatmap.
Methods:
-
forward
–Forward pass of the encoder.
Source code in rl4co/models/zoo/nargnn/encoder.py
119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 |
|
forward
¶
forward(td: TensorDict)
Forward pass of the encoder. Transform the input TensorDict into the latent representation.
Source code in rl4co/models/zoo/nargnn/encoder.py
199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 |
|