Test Model on VRPLib¶
In this notebook, we will test the trained model's performance on the VRPLib benchmark. We will use the trained model from the previous notebook.
VRPLIB is a collection of instances related to the CVRP, which is a classic optimization challenge in the field of logistics and transportation.
Before we start¶
To use the VRPLib, we strongly recomment to use the Python vrplib
tool:
VRPLib is a Python package for working with Vehicle Routing Problem (VRP) instances. This tool can help us easily load the VRPLib instances and visualize the results.
Installation¶
Uncomment the following line to install the package from PyPI. Remember to choose a GPU runtime for faster training!
Note: You may need to restart the runtime in Colab after this
In [ ]:
Copied!
# !pip install rl4co[graph] # include torch-geometric
## NOTE: to install latest version from Github (may be unstable) install from source instead:
# !pip install git+https://github.com/ai4co/rl4co.git
# !pip install rl4co[graph] # include torch-geometric
## NOTE: to install latest version from Github (may be unstable) install from source instead:
# !pip install git+https://github.com/ai4co/rl4co.git
In [ ]:
Copied!
# Install the `tsplib95` package
# !pip install vrplib
# Install the `tsplib95` package
# !pip install vrplib
Imports¶
In [8]:
Copied!
%load_ext autoreload
%autoreload 2
import os
import re
import torch
import vrplib
from rl4co.envs import TSPEnv, CVRPEnv
from rl4co.models.zoo.am import AttentionModel
from rl4co.utils.trainer import RL4COTrainer
from rl4co.utils.decoding import get_log_likelihood
from rl4co.models.zoo import EAS, EASLay, EASEmb, ActiveSearch
from tqdm import tqdm
from math import ceil
from einops import repeat
%load_ext autoreload
%autoreload 2
import os
import re
import torch
import vrplib
from rl4co.envs import TSPEnv, CVRPEnv
from rl4co.models.zoo.am import AttentionModel
from rl4co.utils.trainer import RL4COTrainer
from rl4co.utils.decoding import get_log_likelihood
from rl4co.models.zoo import EAS, EASLay, EASEmb, ActiveSearch
from tqdm import tqdm
from math import ceil
from einops import repeat
The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload
Load a trained model¶
In [15]:
Copied!
# Load from checkpoint; alternatively, simply instantiate a new model
# Note the model is trained for CVRP problem
checkpoint_path = "../cvrp-20.ckpt" # modify the path to your checkpoint file
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# load checkpoint
# checkpoint = torch.load(checkpoint_path)
lit_model = AttentionModel.load_from_checkpoint(checkpoint_path, load_baseline=False)
policy, env = lit_model.policy, lit_model.env
policy = policy.to(device)
# Load from checkpoint; alternatively, simply instantiate a new model
# Note the model is trained for CVRP problem
checkpoint_path = "../cvrp-20.ckpt" # modify the path to your checkpoint file
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# load checkpoint
# checkpoint = torch.load(checkpoint_path)
lit_model = AttentionModel.load_from_checkpoint(checkpoint_path, load_baseline=False)
policy, env = lit_model.policy, lit_model.env
policy = policy.to(device)
/home/cbhua/miniconda3/envs/rl4co-user/lib/python3.10/site-packages/lightning/pytorch/utilities/parsing.py:198: Attribute 'env' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['env'])`. /home/cbhua/miniconda3/envs/rl4co-user/lib/python3.10/site-packages/lightning/pytorch/utilities/parsing.py:198: Attribute 'policy' is an instance of `nn.Module` and is already saved during checkpointing. It is recommended to ignore them using `self.save_hyperparameters(ignore=['policy'])`. /home/cbhua/miniconda3/envs/rl4co-user/lib/python3.10/site-packages/lightning/pytorch/core/saving.py:177: Found keys that are not in the model state dict but in the checkpoint: ['baseline.baseline.model.encoder.init_embedding.init_embed.weight', 'baseline.baseline.model.encoder.init_embedding.init_embed.bias', 'baseline.baseline.model.encoder.init_embedding.init_embed_depot.weight', 'baseline.baseline.model.encoder.init_embedding.init_embed_depot.bias', 'baseline.baseline.model.encoder.net.layers.0.0.module.Wqkv.weight', 'baseline.baseline.model.encoder.net.layers.0.0.module.Wqkv.bias', 'baseline.baseline.model.encoder.net.layers.0.0.module.out_proj.weight', 'baseline.baseline.model.encoder.net.layers.0.0.module.out_proj.bias', 'baseline.baseline.model.encoder.net.layers.0.1.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.0.1.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.0.1.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.0.1.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.0.1.normalizer.num_batches_tracked', 'baseline.baseline.model.encoder.net.layers.0.2.module.0.weight', 'baseline.baseline.model.encoder.net.layers.0.2.module.0.bias', 'baseline.baseline.model.encoder.net.layers.0.2.module.2.weight', 'baseline.baseline.model.encoder.net.layers.0.2.module.2.bias', 'baseline.baseline.model.encoder.net.layers.0.3.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.0.3.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.0.3.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.0.3.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.0.3.normalizer.num_batches_tracked', 'baseline.baseline.model.encoder.net.layers.1.0.module.Wqkv.weight', 'baseline.baseline.model.encoder.net.layers.1.0.module.Wqkv.bias', 'baseline.baseline.model.encoder.net.layers.1.0.module.out_proj.weight', 'baseline.baseline.model.encoder.net.layers.1.0.module.out_proj.bias', 'baseline.baseline.model.encoder.net.layers.1.1.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.1.1.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.1.1.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.1.1.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.1.1.normalizer.num_batches_tracked', 'baseline.baseline.model.encoder.net.layers.1.2.module.0.weight', 'baseline.baseline.model.encoder.net.layers.1.2.module.0.bias', 'baseline.baseline.model.encoder.net.layers.1.2.module.2.weight', 'baseline.baseline.model.encoder.net.layers.1.2.module.2.bias', 'baseline.baseline.model.encoder.net.layers.1.3.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.1.3.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.1.3.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.1.3.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.1.3.normalizer.num_batches_tracked', 'baseline.baseline.model.encoder.net.layers.2.0.module.Wqkv.weight', 'baseline.baseline.model.encoder.net.layers.2.0.module.Wqkv.bias', 'baseline.baseline.model.encoder.net.layers.2.0.module.out_proj.weight', 'baseline.baseline.model.encoder.net.layers.2.0.module.out_proj.bias', 'baseline.baseline.model.encoder.net.layers.2.1.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.2.1.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.2.1.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.2.1.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.2.1.normalizer.num_batches_tracked', 'baseline.baseline.model.encoder.net.layers.2.2.module.0.weight', 'baseline.baseline.model.encoder.net.layers.2.2.module.0.bias', 'baseline.baseline.model.encoder.net.layers.2.2.module.2.weight', 'baseline.baseline.model.encoder.net.layers.2.2.module.2.bias', 'baseline.baseline.model.encoder.net.layers.2.3.normalizer.weight', 'baseline.baseline.model.encoder.net.layers.2.3.normalizer.bias', 'baseline.baseline.model.encoder.net.layers.2.3.normalizer.running_mean', 'baseline.baseline.model.encoder.net.layers.2.3.normalizer.running_var', 'baseline.baseline.model.encoder.net.layers.2.3.normalizer.num_batches_tracked', 'baseline.baseline.model.decoder.context_embedding.project_context.weight', 'baseline.baseline.model.decoder.project_node_embeddings.weight', 'baseline.baseline.model.decoder.project_fixed_context.weight', 'baseline.baseline.model.decoder.logit_attention.project_out.weight']
Download vrp problems¶
In [11]:
Copied!
problem_names = vrplib.list_names(low=50, high=200, vrp_type='cvrp')
instances = [] # Collect Set A, B, E, F, M datasets
for name in problem_names:
if 'A' in name:
instances.append(name)
elif 'B' in name:
instances.append(name)
elif 'E' in name:
instances.append(name)
elif 'F' in name:
instances.append(name)
elif 'M' in name and 'CMT' not in name:
instances.append(name)
# Modify the path you want to save
# Note: we don't have to create this folder in advance
path_to_save = './vrplib/'
try:
os.makedirs(path_to_save)
for instance in tqdm(instances):
vrplib.download_instance(instance, path_to_save)
vrplib.download_solution(instance, path_to_save)
except: # already exist
pass
problem_names = vrplib.list_names(low=50, high=200, vrp_type='cvrp')
instances = [] # Collect Set A, B, E, F, M datasets
for name in problem_names:
if 'A' in name:
instances.append(name)
elif 'B' in name:
instances.append(name)
elif 'E' in name:
instances.append(name)
elif 'F' in name:
instances.append(name)
elif 'M' in name and 'CMT' not in name:
instances.append(name)
# Modify the path you want to save
# Note: we don't have to create this folder in advance
path_to_save = './vrplib/'
try:
os.makedirs(path_to_save)
for instance in tqdm(instances):
vrplib.download_instance(instance, path_to_save)
vrplib.download_solution(instance, path_to_save)
except: # already exist
pass
0%| | 0/37 [00:00<?, ?it/s]
100%|██████████| 37/37 [00:42<00:00, 1.16s/it]
In [12]:
Copied!
# Utils function
def normalize_coord(coord:torch.Tensor) -> torch.Tensor:
x, y = coord[:, 0], coord[:, 1]
x_min, x_max = x.min(), x.max()
y_min, y_max = y.min(), y.max()
x_scaled = (x - x_min) / (x_max - x_min)
y_scaled = (y - y_min) / (y_max - y_min)
coord_scaled = torch.stack([x_scaled, y_scaled], dim=1)
return coord_scaled
# Utils function
def normalize_coord(coord:torch.Tensor) -> torch.Tensor:
x, y = coord[:, 0], coord[:, 1]
x_min, x_max = x.min(), x.max()
y_min, y_max = y.min(), y.max()
x_scaled = (x - x_min) / (x_max - x_min)
y_scaled = (y - y_min) / (y_max - y_min)
coord_scaled = torch.stack([x_scaled, y_scaled], dim=1)
return coord_scaled
Test the greedy¶
In [18]:
Copied!
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Get the solution from the policy
with torch.no_grad():
out = policy(td.clone(), decode_type='greedy', return_actions=True)
# Calculate the cost on the original scale
td['locs'] = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
neg_reward = env.get_reward(td, out['actions'])
cost = ceil(-1 * neg_reward[0].item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Get the solution from the policy
with torch.no_grad():
out = policy(td.clone(), decode_type='greedy', return_actions=True)
# Calculate the cost on the original scale
td['locs'] = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
neg_reward = env.get_reward(td, out['actions'])
cost = ceil(-1 * neg_reward[0].item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
Problem: A-n53-k7 Cost: 1371 Optimal Cost: 1010 Gap: 35.74% Problem: A-n54-k7 Cost: 1426 Optimal Cost: 1167 Gap: 22.19% Problem: A-n55-k9 Cost: 1333 Optimal Cost: 1073 Gap: 24.23% Problem: A-n60-k9 Cost: 1728 Optimal Cost: 1354 Gap: 27.62% Problem: A-n61-k9 Cost: 1297 Optimal Cost: 1034 Gap: 25.44% Problem: A-n62-k8 Cost: 1818 Optimal Cost: 1288 Gap: 41.15% Problem: A-n63-k9 Cost: 2166 Optimal Cost: 1616 Gap: 34.03% Problem: A-n63-k10 Cost: 1698 Optimal Cost: 1314 Gap: 29.22% Problem: A-n64-k9 Cost: 1805 Optimal Cost: 1401 Gap: 28.84% Problem: A-n65-k9 Cost: 1592 Optimal Cost: 1174 Gap: 35.60% Problem: A-n69-k9 Cost: 1641 Optimal Cost: 1159 Gap: 41.59% Problem: A-n80-k10 Cost: 2230 Optimal Cost: 1763 Gap: 26.49% Problem: B-n51-k7 Cost: 1270 Optimal Cost: 1032 Gap: 23.06% Problem: B-n52-k7 Cost: 994 Optimal Cost: 747 Gap: 33.07% Problem: B-n56-k7 Cost: 931 Optimal Cost: 707 Gap: 31.68% Problem: B-n57-k7 Cost: 1422 Optimal Cost: 1153 Gap: 23.33% Problem: B-n57-k9 Cost: 1889 Optimal Cost: 1598 Gap: 18.21% Problem: B-n63-k10 Cost: 1807 Optimal Cost: 1496 Gap: 20.79% Problem: B-n64-k9 Cost: 1150 Optimal Cost: 861 Gap: 33.57% Problem: B-n66-k9 Cost: 1746 Optimal Cost: 1316 Gap: 32.67% Problem: B-n67-k10 Cost: 1368 Optimal Cost: 1032 Gap: 32.56% Problem: B-n68-k9 Cost: 1737 Optimal Cost: 1272 Gap: 36.56% Problem: B-n78-k10 Cost: 1706 Optimal Cost: 1221 Gap: 39.72% Problem: E-n51-k5 Cost: 690 Optimal Cost: 521 Gap: 32.44% Problem: E-n76-k7 Cost: 1019 Optimal Cost: 682 Gap: 49.41% Problem: E-n76-k8 Cost: 1031 Optimal Cost: 735 Gap: 40.27% Problem: E-n76-k10 Cost: 1156 Optimal Cost: 830 Gap: 39.28% Problem: E-n76-k14 Cost: 1335 Optimal Cost: 1021 Gap: 30.75% Problem: E-n101-k8 Cost: 1265 Optimal Cost: 815 Gap: 55.21% Problem: E-n101-k14 Cost: 1567 Optimal Cost: 1067 Gap: 46.86% Problem: F-n72-k4 Cost: 425 Optimal Cost: 237 Gap: 79.32% Problem: F-n135-k7 Cost: 4219 Optimal Cost: 1162 Gap: 263.08% Problem: M-n101-k10 Cost: 1388 Optimal Cost: 820 Gap: 69.27% Problem: M-n121-k7 Cost: 1746 Optimal Cost: 1034 Gap: 68.86% Problem: M-n151-k12 Cost: 1906 Optimal Cost: 1015 Gap: 87.78% Problem: M-n200-k16 Cost: 2509 Optimal Cost: 1274 Gap: 96.94% Problem: M-n200-k17 Cost: 2339 Optimal Cost: 1275 Gap: 83.45%
Test the Augmentation¶
In [20]:
Copied!
# Import augmented utils
from rl4co.data.transforms import (
StateAugmentation as SymmetricStateAugmentation)
from rl4co.utils.ops import batchify, unbatchify
num_augment = 100
augmentation = SymmetricStateAugmentation(num_augment=num_augment)
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Augmentation
td = augmentation(td)
# Get the solution from the policy
with torch.no_grad():
out = policy(
td.clone(), decode_type='greedy', num_starts=0, return_actions=True
)
# Calculate the cost on the original scale
coords_repeat = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
td['locs'] = batchify(coords_repeat, num_augment)
reward = env.get_reward(td, out['actions'])
reward = unbatchify(reward, num_augment)
cost = ceil(-1 * torch.max(reward).item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
# Import augmented utils
from rl4co.data.transforms import (
StateAugmentation as SymmetricStateAugmentation)
from rl4co.utils.ops import batchify, unbatchify
num_augment = 100
augmentation = SymmetricStateAugmentation(num_augment=num_augment)
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Augmentation
td = augmentation(td)
# Get the solution from the policy
with torch.no_grad():
out = policy(
td.clone(), decode_type='greedy', num_starts=0, return_actions=True
)
# Calculate the cost on the original scale
coords_repeat = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
td['locs'] = batchify(coords_repeat, num_augment)
reward = env.get_reward(td, out['actions'])
reward = unbatchify(reward, num_augment)
cost = ceil(-1 * torch.max(reward).item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
Problem: A-n53-k7 Cost: 1123 Optimal Cost: 1010 Gap: 11.19% Problem: A-n54-k7 Cost: 1305 Optimal Cost: 1167 Gap: 11.83% Problem: A-n55-k9 Cost: 1199 Optimal Cost: 1073 Gap: 11.74% Problem: A-n60-k9 Cost: 1534 Optimal Cost: 1354 Gap: 13.29% Problem: A-n61-k9 Cost: 1187 Optimal Cost: 1034 Gap: 14.80% Problem: A-n62-k8 Cost: 1474 Optimal Cost: 1288 Gap: 14.44% Problem: A-n63-k9 Cost: 1820 Optimal Cost: 1616 Gap: 12.62% Problem: A-n63-k10 Cost: 1505 Optimal Cost: 1314 Gap: 14.54% Problem: A-n64-k9 Cost: 1582 Optimal Cost: 1401 Gap: 12.92% Problem: A-n65-k9 Cost: 1332 Optimal Cost: 1174 Gap: 13.46% Problem: A-n69-k9 Cost: 1305 Optimal Cost: 1159 Gap: 12.60% Problem: A-n80-k10 Cost: 2044 Optimal Cost: 1763 Gap: 15.94% Problem: B-n51-k7 Cost: 1073 Optimal Cost: 1032 Gap: 3.97% Problem: B-n52-k7 Cost: 815 Optimal Cost: 747 Gap: 9.10% Problem: B-n56-k7 Cost: 792 Optimal Cost: 707 Gap: 12.02% Problem: B-n57-k7 Cost: 1219 Optimal Cost: 1153 Gap: 5.72% Problem: B-n57-k9 Cost: 1744 Optimal Cost: 1598 Gap: 9.14% Problem: B-n63-k10 Cost: 1611 Optimal Cost: 1496 Gap: 7.69% Problem: B-n64-k9 Cost: 931 Optimal Cost: 861 Gap: 8.13% Problem: B-n66-k9 Cost: 1427 Optimal Cost: 1316 Gap: 8.43% Problem: B-n67-k10 Cost: 1122 Optimal Cost: 1032 Gap: 8.72% Problem: B-n68-k9 Cost: 1382 Optimal Cost: 1272 Gap: 8.65% Problem: B-n78-k10 Cost: 1437 Optimal Cost: 1221 Gap: 17.69% Problem: E-n51-k5 Cost: 606 Optimal Cost: 521 Gap: 16.31% Problem: E-n76-k7 Cost: 816 Optimal Cost: 682 Gap: 19.65% Problem: E-n76-k8 Cost: 892 Optimal Cost: 735 Gap: 21.36% Problem: E-n76-k10 Cost: 943 Optimal Cost: 830 Gap: 13.61% Problem: E-n76-k14 Cost: 1160 Optimal Cost: 1021 Gap: 13.61% Problem: E-n101-k8 Cost: 1042 Optimal Cost: 815 Gap: 27.85% Problem: E-n101-k14 Cost: 1302 Optimal Cost: 1067 Gap: 22.02% Problem: F-n72-k4 Cost: 286 Optimal Cost: 237 Gap: 20.68% Problem: F-n135-k7 Cost: 1570 Optimal Cost: 1162 Gap: 35.11% Problem: M-n101-k10 Cost: 1037 Optimal Cost: 820 Gap: 26.46% Problem: M-n121-k7 Cost: 1283 Optimal Cost: 1034 Gap: 24.08% Problem: M-n151-k12 Cost: 1407 Optimal Cost: 1015 Gap: 38.62% Problem: M-n200-k16 Cost: 1811 Optimal Cost: 1274 Gap: 42.15% Problem: M-n200-k17 Cost: 1812 Optimal Cost: 1275 Gap: 42.12%
Test the Sampling¶
In [21]:
Copied!
# Parameters for sampling
num_samples = 100
softmax_temp = 0.05
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Sampling
td = batchify(td, num_samples)
# Get the solution from the policy
with torch.no_grad():
out = policy(
td.clone(), decode_type='sampling', num_starts=0, return_actions=True
)
# Calculate the cost on the original scale
coords_repeat = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
td['locs'] = batchify(coords_repeat, num_samples)
reward = env.get_reward(td, out['actions'])
reward = unbatchify(reward, num_samples)
cost = ceil(-1 * torch.max(reward).item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
# Parameters for sampling
num_samples = 100
softmax_temp = 0.05
for instance in instances:
problem = vrplib.read_instance(os.path.join(path_to_save, instance+'.vrp'))
coords = torch.tensor(problem['node_coord']).float()
coords_norm = normalize_coord(coords)
demand = torch.tensor(problem['demand'][1:]).float()
capacity = problem['capacity']
n = coords.shape[0]
# Prepare the tensordict
batch_size = 2
td = env.reset(batch_size=(batch_size,)).to(device)
td['locs'] = repeat(coords_norm, 'n d -> b n d', b=batch_size, d=2)
td['demand'] = repeat(demand, 'n -> b n', b=batch_size) / capacity
td['visited'] = torch.zeros((batch_size, 1, n), dtype=torch.uint8)
action_mask = torch.ones(batch_size, n, dtype=torch.bool)
action_mask[:, 0] = False
td['action_mask'] = action_mask
# Sampling
td = batchify(td, num_samples)
# Get the solution from the policy
with torch.no_grad():
out = policy(
td.clone(), decode_type='sampling', num_starts=0, return_actions=True
)
# Calculate the cost on the original scale
coords_repeat = repeat(coords, 'n d -> b n d', b=batch_size, d=2)
td['locs'] = batchify(coords_repeat, num_samples)
reward = env.get_reward(td, out['actions'])
reward = unbatchify(reward, num_samples)
cost = ceil(-1 * torch.max(reward).item())
# Load the optimal cost
solution = vrplib.read_solution(os.path.join(path_to_save, instance+'.sol'))
optimal_cost = solution['cost']
# Calculate the gap and print
gap = (cost - optimal_cost) / optimal_cost
print(f'Problem: {instance:<15} Cost: {cost:<10} Optimal Cost: {optimal_cost:<10}\t Gap: {gap:.2%}')
Problem: A-n53-k7 Cost: 1191 Optimal Cost: 1010 Gap: 17.92% Problem: A-n54-k7 Cost: 1328 Optimal Cost: 1167 Gap: 13.80% Problem: A-n55-k9 Cost: 1286 Optimal Cost: 1073 Gap: 19.85% Problem: A-n60-k9 Cost: 1631 Optimal Cost: 1354 Gap: 20.46% Problem: A-n61-k9 Cost: 1230 Optimal Cost: 1034 Gap: 18.96% Problem: A-n62-k8 Cost: 1505 Optimal Cost: 1288 Gap: 16.85% Problem: A-n63-k9 Cost: 1840 Optimal Cost: 1616 Gap: 13.86% Problem: A-n63-k10 Cost: 1590 Optimal Cost: 1314 Gap: 21.00% Problem: A-n64-k9 Cost: 1643 Optimal Cost: 1401 Gap: 17.27% Problem: A-n65-k9 Cost: 1381 Optimal Cost: 1174 Gap: 17.63% Problem: A-n69-k9 Cost: 1451 Optimal Cost: 1159 Gap: 25.19% Problem: A-n80-k10 Cost: 2170 Optimal Cost: 1763 Gap: 23.09% Problem: B-n51-k7 Cost: 1187 Optimal Cost: 1032 Gap: 15.02% Problem: B-n52-k7 Cost: 884 Optimal Cost: 747 Gap: 18.34% Problem: B-n56-k7 Cost: 853 Optimal Cost: 707 Gap: 20.65% Problem: B-n57-k7 Cost: 1314 Optimal Cost: 1153 Gap: 13.96% Problem: B-n57-k9 Cost: 1744 Optimal Cost: 1598 Gap: 9.14% Problem: B-n63-k10 Cost: 1698 Optimal Cost: 1496 Gap: 13.50% Problem: B-n64-k9 Cost: 1045 Optimal Cost: 861 Gap: 21.37% Problem: B-n66-k9 Cost: 1506 Optimal Cost: 1316 Gap: 14.44% Problem: B-n67-k10 Cost: 1254 Optimal Cost: 1032 Gap: 21.51% Problem: B-n68-k9 Cost: 1510 Optimal Cost: 1272 Gap: 18.71% Problem: B-n78-k10 Cost: 1514 Optimal Cost: 1221 Gap: 24.00% Problem: E-n51-k5 Cost: 613 Optimal Cost: 521 Gap: 17.66% Problem: E-n76-k7 Cost: 882 Optimal Cost: 682 Gap: 29.33% Problem: E-n76-k8 Cost: 952 Optimal Cost: 735 Gap: 29.52% Problem: E-n76-k10 Cost: 1015 Optimal Cost: 830 Gap: 22.29% Problem: E-n76-k14 Cost: 1185 Optimal Cost: 1021 Gap: 16.06% Problem: E-n101-k8 Cost: 1189 Optimal Cost: 815 Gap: 45.89% Problem: E-n101-k14 Cost: 1420 Optimal Cost: 1067 Gap: 33.08% Problem: F-n72-k4 Cost: 344 Optimal Cost: 237 Gap: 45.15% Problem: F-n135-k7 Cost: 3130 Optimal Cost: 1162 Gap: 169.36% Problem: M-n101-k10 Cost: 1221 Optimal Cost: 820 Gap: 48.90% Problem: M-n121-k7 Cost: 1538 Optimal Cost: 1034 Gap: 48.74% Problem: M-n151-k12 Cost: 1688 Optimal Cost: 1015 Gap: 66.31% Problem: M-n200-k16 Cost: 2252 Optimal Cost: 1274 Gap: 76.77% Problem: M-n200-k17 Cost: 2260 Optimal Cost: 1275 Gap: 77.25%