A2C
A2C(
env: RL4COEnvBase,
policy: Module,
critic: CriticNetwork = None,
critic_kwargs: dict = {},
actor_optimizer_kwargs: dict = {"lr": 0.0001},
critic_optimizer_kwargs: dict = None,
**kwargs
)
Bases: REINFORCE
Advantage Actor Critic (A2C) algorithm. A2C is a variant of REINFORCE where a baseline is provided by a critic network. Here we additionally support different optimizers for the actor and the critic.
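As a rough illustration of the idea that the critic provides the baseline, the two loss terms can be sketched in plain Python. This is a minimal sketch, not the library's implementation: the function name `a2c_losses` and its arguments are hypothetical, rewards are assumed to be maximized, and the critic is assumed to be trained with a mean squared error against the observed reward.

```python
def a2c_losses(rewards, log_likelihoods, values):
    """Hypothetical helper: REINFORCE loss with a critic baseline.

    rewards:          observed reward per instance
    log_likelihoods:  log-probability of the sampled solution per instance
    values:           critic prediction (baseline) per instance
    """
    n = len(rewards)
    # Advantage: how much better the reward was than the critic expected.
    # In a real autograd setup the baseline is treated as a constant here.
    advantages = [r - v for r, v in zip(rewards, values)]
    # Actor (policy) loss: minimize the negative advantage-weighted log-likelihood.
    actor_loss = -sum(a * lp for a, lp in zip(advantages, log_likelihoods)) / n
    # Critic loss (assumed MSE): regress the critic's value onto the reward.
    critic_loss = sum((v - r) ** 2 for v, r in zip(values, rewards)) / n
    return actor_loss, critic_loss


actor_loss, critic_loss = a2c_losses(
    rewards=[1.0, 3.0], log_likelihoods=[-1.0, -2.0], values=[2.0, 2.0]
)
```

With a constant baseline of zero, the actor term reduces to the plain REINFORCE loss.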
Parameters:
- env (RL4COEnvBase) – Environment to use for the algorithm
- policy (Module) – Policy to use for the algorithm
- critic (CriticNetwork, default: None) – Critic network to use for the algorithm
- critic_kwargs (dict, default: {}) – Keyword arguments to pass to the critic network
- actor_optimizer_kwargs (dict, default: {'lr': 0.0001}) – Keyword arguments for the policy (=actor) optimizer
- critic_optimizer_kwargs (dict, default: None) – Keyword arguments for the critic optimizer. If None, use the same as actor_optimizer_kwargs
- **kwargs – Keyword arguments passed to the superclass
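The fallback described for critic_optimizer_kwargs can be sketched as follows. The helper `resolve_optimizer_kwargs` is hypothetical and only mirrors the documented behavior (default actor learning rate of 0.0001, critic settings falling back to the actor settings when None); how the class resolves these internally is not shown on this page.

```python
def resolve_optimizer_kwargs(actor_optimizer_kwargs=None, critic_optimizer_kwargs=None):
    """Hypothetical helper mirroring the documented defaults."""
    # Documented default for the actor optimizer.
    if actor_optimizer_kwargs is None:
        actor_optimizer_kwargs = {"lr": 1e-4}
    # Documented fallback: the critic reuses the actor settings when None.
    if critic_optimizer_kwargs is None:
        critic_optimizer_kwargs = actor_optimizer_kwargs
    # Return copies so later changes to one dict do not affect the other.
    return dict(actor_optimizer_kwargs), dict(critic_optimizer_kwargs)


# Same settings for both networks.
shared_actor, shared_critic = resolve_optimizer_kwargs()
# A larger learning rate for the critic only.
actor_kw, critic_kw = resolve_optimizer_kwargs(critic_optimizer_kwargs={"lr": 1e-3})
```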
Methods:
- configure_optimizers – Configure the optimizers for the policy and the critic network (=baseline)
Source code in rl4co/models/rl/a2c/a2c.py
configure_optimizers
configure_optimizers()
Configure the optimizers for the policy and the critic network (=baseline)
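One common way to give the policy and the critic different optimizer settings is to use per-parameter-group options, which PyTorch optimizers accept as a list of dicts. The sketch below only builds such a list in plain Python; `build_param_groups` and the placeholder parameter names are hypothetical, and whether this method is implemented this way is an assumption rather than something this page states.

```python
def build_param_groups(actor_params, critic_params,
                       actor_optimizer_kwargs, critic_optimizer_kwargs):
    """Hypothetical helper: one parameter group per network.

    Each group is a dict with a "params" entry plus group-specific
    options (e.g. "lr"), the format accepted by torch.optim optimizers.
    """
    return [
        {"params": list(actor_params), **actor_optimizer_kwargs},
        {"params": list(critic_params), **critic_optimizer_kwargs},
    ]


# Placeholder strings stand in for real parameter tensors.
groups = build_param_groups(
    actor_params=["w_actor"],
    critic_params=["w_critic"],
    actor_optimizer_kwargs={"lr": 1e-4},
    critic_optimizer_kwargs={"lr": 1e-3},
)
# With real parameters, a list like this could be passed to an optimizer,
# for example: torch.optim.Adam(groups)
```

A single optimizer with two groups keeps one optimization step per batch while still allowing, for instance, a larger learning rate for the critic.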
Source code in rl4co/models/rl/a2c/a2c.py