Installation¶
Uncomment the following lines to install the packages from PyPI. Remember to choose a GPU runtime for faster training!
Note: you may need to restart the runtime in Colab after installing.
In [ ]:
# !pip install rl4co[graph] # include torch-geometric
# !pip install vrplib # for reading instance files
## NOTE: to install the latest version from GitHub (may be unstable), install from source instead:
# !pip install git+https://github.com/ai4co/rl4co.git
In [2]:
# Install the `vrplib` package
# !pip install vrplib
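If the installation succeeded, a quick import check should work (a minimal sketch; the `__version__` attribute is assumed to exist in the installed package):
# Sketch: verify that the freshly installed package is importable
import rl4co
print(rl4co.__version__)  # assumed attribute; prints the installed version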
Imports¶
In [3]:
%load_ext autoreload
%autoreload 2
import os
import torch
import vrplib
from tensordict import TensorDict
from rl4co.envs import CVRPEnv
from rl4co.models.zoo.am import AttentionModelPolicy
from rl4co.models.rl import REINFORCE
from rl4co.utils.trainer import RL4COTrainer
from tqdm import tqdm
In [4]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# RL4CO env based on TorchRL
env = CVRPEnv(generator_params={'num_loc': 50})

# Policy: neural network, in this case with an encoder-decoder architecture
policy = AttentionModelPolicy(env_name=env.name).to(device)

# RL Model: REINFORCE with a greedy rollout baseline
model = REINFORCE(env,
                  policy,
                  baseline="rollout",
                  batch_size=512,
                  train_data_size=100_000,
                  val_data_size=10_000,
                  optimizer_kwargs={"lr": 1e-4},
                  )
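Before moving to real VRPLib instances, we can sanity-check the environment on randomly generated data. A minimal sketch, assuming the reset/generator API shown above (the exact keys and shapes may vary across rl4co versions):
# Sketch: generate a tiny random batch to inspect what the environment produces
td_sample = env.reset(batch_size=[2])
print(td_sample)                # TensorDict with keys such as "locs", "demand", "action_mask", ...
print(td_sample["locs"].shape)  # e.g. [2, 51, 2] if the depot is concatenated to the 50 customers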
Download VRPLib instances¶
In [5]:
problem_names = vrplib.list_names(low=50, high=100, vrp_type='cvrp')

# Collect the Set A, B, E, F, and M instances
instances = []
for name in problem_names:
    if 'A' in name:
        instances.append(name)
    elif 'B' in name:
        instances.append(name)
    elif 'E' in name:
        instances.append(name)
    elif 'F' in name:
        instances.append(name)
    elif 'M' in name and 'CMT' not in name:
        instances.append(name)

# Modify the path where you want to save the instances
# Note: we don't have to create this folder in advance
path_to_save = './vrplib/'

try:
    os.makedirs(path_to_save)
    for instance in tqdm(instances):
        vrplib.download_instance(instance, path_to_save)
        vrplib.download_solution(instance, path_to_save)
except:  # files already exist
    pass
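Optionally, we can verify the download: each instance should come with a `.vrp` file and a matching `.sol` file (a quick check; the listed names depend on which instances were fetched):
# Sketch: peek at a few of the downloaded files
print(sorted(os.listdir(path_to_save))[:6])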
In [6]:
# Utility functions: normalize the coordinates of the VRPLib instances
# and convert them to a TensorDict that the RL4CO environment understands
def normalize_coord(coord: torch.Tensor) -> torch.Tensor:
    x, y = coord[:, 0], coord[:, 1]
    x_min, x_max = x.min(), x.max()
    y_min, y_max = y.min(), y.max()
    x_scaled = (x - x_min) / (x_max - x_min)
    y_scaled = (y - y_min) / (y_max - y_min)
    coord_scaled = torch.stack([x_scaled, y_scaled], dim=1)
    return coord_scaled

def vrplib_to_td(problem, normalize=True):
    coords = torch.tensor(problem['node_coord']).float()
    coords_norm = normalize_coord(coords) if normalize else coords
    demand = torch.tensor(problem['demand'][1:]).float()  # skip the depot's (zero) demand
    capacity = problem['capacity']
    td = TensorDict({
        'depot': coords_norm[0, :],
        'locs': coords_norm[1:, :],
        'demand': demand / capacity,  # demand normalized by vehicle capacity
        'capacity': capacity,         # original capacity, not needed for inference
    })
    td = td[None]  # add a batch dimension, in this case just 1
    return td
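As a quick check, we can convert one of the downloaded instances and inspect the resulting TensorDict (a minimal sketch; it assumes A-n53-k7 was among the downloaded instances):
# Sketch: convert a single VRPLib instance and inspect the resulting TensorDict
problem = vrplib.read_instance(os.path.join(path_to_save, 'A-n53-k7.vrp'))
td_example = vrplib_to_td(problem)
print(td_example["locs"].shape)    # expected [1, 52, 2]: 52 customer coordinates, depot stored separately
print(td_example["demand"].shape)  # expected [1, 52]: demands normalized by vehicle capacity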
Test the untrained model¶
In [8]:
tds, actions = [], []
for instance in instances:
    # Inference on the normalized instance
    problem = vrplib.read_instance(os.path.join(path_to_save, instance + '.vrp'))
    td_reset = env.reset(vrplib_to_td(problem).to(device))
    with torch.inference_mode():
        out = policy(td_reset.clone(), env, decode_type="sampling", num_samples=128, select_best=True)

    # Compute the unnormalized cost on the original coordinates
    unnormalized_td = env.reset(vrplib_to_td(problem, normalize=False).to(device))
    cost = -env.get_reward(unnormalized_td, out["actions"]).int().item()

    # Load the best-known solution (BKS) cost
    solution = vrplib.read_solution(os.path.join(path_to_save, instance + '.sol'))
    optimal_cost = solution['cost']

    tds.append(td_reset)
    actions.append(out["actions"])

    # Calculate the gap and print
    gap = (cost - optimal_cost) / optimal_cost
    print(f'Problem: {instance:<15} Cost: {cost:<8} BKS: {optimal_cost:<8}\t Gap: {gap:.2%}')
Problem: A-n53-k7        Cost: 2777     BKS: 1010      Gap: 174.95%
Problem: A-n54-k7        Cost: 3130     BKS: 1167      Gap: 168.21%
Problem: A-n55-k9        Cost: 2812     BKS: 1073      Gap: 162.07%
Problem: A-n60-k9        Cost: 3151     BKS: 1354      Gap: 132.72%
Problem: A-n61-k9        Cost: 3060     BKS: 1034      Gap: 195.94%
Problem: A-n62-k8        Cost: 3483     BKS: 1288      Gap: 170.42%
Problem: A-n63-k9        Cost: 3736     BKS: 1616      Gap: 131.19%
Problem: A-n63-k10       Cost: 3110     BKS: 1314      Gap: 136.68%
Problem: A-n64-k9        Cost: 3721     BKS: 1401      Gap: 165.60%
Problem: A-n65-k9        Cost: 3548     BKS: 1174      Gap: 202.21%
Problem: A-n69-k9        Cost: 3600     BKS: 1159      Gap: 210.61%
Problem: A-n80-k10       Cost: 4776     BKS: 1763      Gap: 170.90%
Problem: B-n51-k7        Cost: 3286     BKS: 1032      Gap: 218.41%
Problem: B-n52-k7        Cost: 2852     BKS: 747       Gap: 281.79%
Problem: B-n56-k7        Cost: 2762     BKS: 707       Gap: 290.66%
Problem: B-n57-k7        Cost: 3553     BKS: 1153      Gap: 208.15%
Problem: B-n57-k9        Cost: 3622     BKS: 1598      Gap: 126.66%
Problem: B-n63-k10       Cost: 3426     BKS: 1496      Gap: 129.01%
Problem: B-n64-k9        Cost: 2804     BKS: 861       Gap: 225.67%
Problem: B-n66-k9        Cost: 3273     BKS: 1316      Gap: 148.71%
Problem: B-n67-k10       Cost: 2949     BKS: 1032      Gap: 185.76%
Problem: B-n68-k9        Cost: 3992     BKS: 1272      Gap: 213.84%
Problem: B-n78-k10       Cost: 4367     BKS: 1221      Gap: 257.66%
Problem: E-n51-k5        Cost: 1615     BKS: 521       Gap: 209.98%
Problem: E-n76-k7        Cost: 2396     BKS: 682       Gap: 251.32%
Problem: E-n76-k8        Cost: 2402     BKS: 735       Gap: 226.80%
Problem: E-n76-k10       Cost: 2393     BKS: 830       Gap: 188.31%
Problem: E-n76-k14       Cost: 2520     BKS: 1021      Gap: 146.82%
Problem: E-n101-k8       Cost: 3507     BKS: 815       Gap: 330.31%
Problem: E-n101-k14      Cost: 3550     BKS: 1067      Gap: 232.71%
Problem: F-n72-k4        Cost: 1274     BKS: 237       Gap: 437.55%
Problem: M-n101-k10      Cost: 4036     BKS: 820       Gap: 392.20%
In [ ]:
# Plot some instances
env.render(tds[0], actions[0].cpu())
env.render(tds[-2], actions[-2].cpu())
env.render(tds[-1], actions[-1].cpu())
Train¶
We will train for a few steps just to show the effect of training the model. Alternatively, we can load a pretrained checkpoint, e.g. with:
model = AttentionModel.load_from_checkpoint(checkpoint_path, load_baseline=False)
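For reference, a minimal sketch of the checkpoint route (the import location follows the policy import above; `checkpoint_path` is a hypothetical placeholder):
# Sketch: load a pretrained model instead of training from scratch
from rl4co.models.zoo.am import AttentionModel
checkpoint_path = "./checkpoints/am-cvrp50.ckpt"  # hypothetical path; point it to a real .ckpt file
model = AttentionModel.load_from_checkpoint(checkpoint_path, load_baseline=False)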
In [10]:
trainer = RL4COTrainer(
    max_epochs=3,
    accelerator="gpu",
    devices=1,
    logger=None,
)

trainer.fit(model)
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
val_file not set. Generating dataset instead
test_file not set. Generating dataset instead
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1]

  | Name     | Type                 | Params
--------------------------------------------------
0 | env      | CVRPEnv              | 0
1 | policy   | AttentionModelPolicy | 694 K
2 | baseline | WarmupBaseline       | 694 K
--------------------------------------------------
1.4 M     Trainable params
0         Non-trainable params
1.4 M     Total params
5.553     Total estimated model params size (MB)
`Trainer.fit` stopped: `max_epochs=3` reached.
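After training, the model can be saved for later reuse; a minimal sketch using the underlying Lightning trainer (the file name is arbitrary):
# Sketch: save the trained model so it can later be reloaded with load_from_checkpoint
trainer.save_checkpoint("am-cvrp50-3epochs.ckpt")  # arbitrary file name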
Test the trained model¶
In [11]:
policy = model.policy.to(device).eval()  # trained policy

tds, actions = [], []
for instance in instances:
    # Inference on the normalized instance
    problem = vrplib.read_instance(os.path.join(path_to_save, instance + '.vrp'))
    td_reset = env.reset(vrplib_to_td(problem).to(device))
    with torch.inference_mode():
        out = policy(td_reset.clone(), env, decode_type="sampling", num_samples=128, select_best=True)

    # Compute the unnormalized cost on the original coordinates
    unnormalized_td = env.reset(vrplib_to_td(problem, normalize=False).to(device))
    cost = -env.get_reward(unnormalized_td, out["actions"]).int().item()

    # Load the best-known solution (BKS) cost
    solution = vrplib.read_solution(os.path.join(path_to_save, instance + '.sol'))
    optimal_cost = solution['cost']

    tds.append(td_reset)
    actions.append(out["actions"])

    # Calculate the gap and print
    gap = (cost - optimal_cost) / optimal_cost
    print(f'Problem: {instance:<15} Cost: {cost:<8} BKS: {optimal_cost:<8}\t Gap: {gap:.2%}')
Problem: A-n53-k7        Cost: 1180     BKS: 1010      Gap: 16.83%
Problem: A-n54-k7        Cost: 1256     BKS: 1167      Gap: 7.63%
Problem: A-n55-k9        Cost: 1195     BKS: 1073      Gap: 11.37%
Problem: A-n60-k9        Cost: 1502     BKS: 1354      Gap: 10.93%
Problem: A-n61-k9        Cost: 1223     BKS: 1034      Gap: 18.28%
Problem: A-n62-k8        Cost: 1491     BKS: 1288      Gap: 15.76%
Problem: A-n63-k9        Cost: 1792     BKS: 1616      Gap: 10.89%
Problem: A-n63-k10       Cost: 1459     BKS: 1314      Gap: 11.04%
Problem: A-n64-k9        Cost: 1537     BKS: 1401      Gap: 9.71%
Problem: A-n65-k9        Cost: 1355     BKS: 1174      Gap: 15.42%
Problem: A-n69-k9        Cost: 1317     BKS: 1159      Gap: 13.63%
Problem: A-n80-k10       Cost: 2009     BKS: 1763      Gap: 13.95%
Problem: B-n51-k7        Cost: 1182     BKS: 1032      Gap: 14.53%
Problem: B-n52-k7        Cost: 863      BKS: 747       Gap: 15.53%
Problem: B-n56-k7        Cost: 889      BKS: 707       Gap: 25.74%
Problem: B-n57-k7        Cost: 1323     BKS: 1153      Gap: 14.74%
Problem: B-n57-k9        Cost: 1772     BKS: 1598      Gap: 10.89%
Problem: B-n63-k10       Cost: 1671     BKS: 1496      Gap: 11.70%
Problem: B-n64-k9        Cost: 1040     BKS: 861       Gap: 20.79%
Problem: B-n66-k9        Cost: 1466     BKS: 1316      Gap: 11.40%
Problem: B-n67-k10       Cost: 1201     BKS: 1032      Gap: 16.38%
Problem: B-n68-k9        Cost: 1413     BKS: 1272      Gap: 11.08%
Problem: B-n78-k10       Cost: 1529     BKS: 1221      Gap: 25.23%
Problem: E-n51-k5        Cost: 630      BKS: 521       Gap: 20.92%
Problem: E-n76-k7        Cost: 844      BKS: 682       Gap: 23.75%
Problem: E-n76-k8        Cost: 862      BKS: 735       Gap: 17.28%
Problem: E-n76-k10       Cost: 975      BKS: 830       Gap: 17.47%
Problem: E-n76-k14       Cost: 1153     BKS: 1021      Gap: 12.93%
Problem: E-n101-k8       Cost: 1070     BKS: 815       Gap: 31.29%
Problem: E-n101-k14      Cost: 1303     BKS: 1067      Gap: 22.12%
Problem: F-n72-k4        Cost: 312      BKS: 237       Gap: 31.65%
Problem: M-n101-k10      Cost: 1134     BKS: 820       Gap: 38.29%
In [12]:
# Plot some instances
env.render(tds[0], actions[0].cpu())
env.render(tds[-2], actions[-2].cpu())
env.render(tds[-1], actions[-1].cpu())
Great! We can see that the performance vastly improved even with just a few minutes of training.
There are several ways to improve the model's performance further, such as:
- Training for more steps
- Using a different model architecture
- Using a different training algorithm
- Using different hyperparameters
- Using a different Generator (see the sketch below)
- ... and many more!
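For instance, scaling to larger instances mostly amounts to changing the generator configuration and training for longer. A minimal sketch reusing the APIs from this notebook (the values are illustrative, not tuned):
# Sketch: train on larger instances for more epochs (illustrative, untuned values)
env_large = CVRPEnv(generator_params={'num_loc': 100})
policy_large = AttentionModelPolicy(env_name=env_large.name).to(device)
model_large = REINFORCE(env_large,
                        policy_large,
                        baseline="rollout",
                        batch_size=512,
                        train_data_size=100_000,
                        val_data_size=10_000,
                        optimizer_kwargs={"lr": 1e-4},
                        )
trainer_large = RL4COTrainer(max_epochs=50, accelerator="gpu", devices=1, logger=None)
trainer_large.fit(model_large)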