Creating Your Environment
In the previous part of the tutorial, Keyboard Demo, we ran the script manual_novelty_test1.py to try out the keyboard agent in the NovelGym environment. That script creates the environment from the polycraft_gym_main.yaml config file, as do manual_sanity_checker.py, which loads a trained model and shows which action the model selects, and train.py, which is used for training.
In this part of the tutorial, we examine the keys of the config file that can be modified to create a simple custom environment without writing any additional code. Examples are provided. Later parts of the tutorial cover creating entities, such as objects and actions, from scratch in Examples of Objects & Actions, and creating spaces in Defining Spaces. The section Implementing Novelties explains how anything can become a novelty to the agent.
Layout
map_size
Width and height in cells of the gridworld navigated by the agent.
map_size: [16, 16]
rooms
Coordinates of the upper-left and lower-right corners of each room. Where rooms overlap on a row or column, a wall with a door is created.
rooms:
  '1':
    start: [0, 0]
    end: [10, 10]
  '2':
    start: [10, 0]
    end: [15, 15]
Objects
object_types
Source modules, break cost, and collect cost of the object types in the game.
object_types:
  tree_tap:
    module: gym_novel_gridworlds2.contrib.polycraft.objects.TreeTap
    collect_cost: 50000
objects
Quantity and location of the objects initially placed in the environment. Setting the chunked key to True places all objects of the same type next to each other.
objects:
  oak_log:
    quantity: 5
    room: 2
    chunked: 'False'
Entities
entities
The agent key takes the source of behaviour for the agent: KeyboardAgent, RandomAgent, or a more complex setting for an RL agent such as the one in config/polycraft_gym_rl.yaml. See Combining Planning & RL Agents for more detail on integrating intelligent agents.
The entity key gives the source code of the entity, and id is a unique identifier of the entity that is used in actions such as approach_entity_<id>.
The action_set key assigns to the entity one of the action sets specified under the key action_sets. Multiple entities can share the same action set.
The room key is the room the entity is placed in at the start of the game. Analogously, the inventory key specifies what the entity holds at the start of the game; the inventory can change throughout the game.
In the case of intelligent agents, max_step_cost specifies the maximum cost that the agent can incur at any step.
entities:
  main_1:
    agent: gym_novel_gridworlds2.agents.KeyboardAgent
    name: entity.polycraft.Player.name
    type: agent
    entity: gym_novel_gridworlds2.contrib.polycraft.objects.PolycraftEntity
    action_set: main
    inventory:
      iron_pickaxe: 1
      tree_tap: 1
    id: 0
    room: 2
    max_step_cost: 100000
trades
The input and output of a trade, and the id of the trader with whom the trade can be executed.
trades:
  block_of_titanium_1:
    input:
      block_of_platinum: 1
    output:
      block_of_titanium: 1
    trader:
      - 103
auto_pickup_agents
List of ids of those entities that are to automatically collect all objects around them at each time step.
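As a minimal sketch, assuming the key takes a plain YAML list of integer ids as the description suggests, the following would make the entity with id 0 (main_1 in the entities example) pick up surrounding objects automatically. The exact value format is an assumption and should be checked against polycraft_gym_main.yaml.

```yaml
# Assumed format: a YAML list of entity ids.
auto_pickup_agents:
  - 0   # id of main_1 from the entities example
```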
Actions
actions
Source modules and step cost of the actions available in the environment. For actions that involve interacting with other agents, the entity_id must be provided. Compound actions include break_<object>, approach_<object/entity>, interact_<entity>, select_<object>, craft_<object>, and trade_<object>. Note nop_placeholder, a placeholder for a novelty action.
actions:
  break_block:
    module: gym_novel_gridworlds2.contrib.polycraft.actions.Break
    step_cost: 3600
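An action that involves another entity might look like the sketch below. The entity_id key comes from the description above, and 103 is the trader id used in the trades example. The action name, the Interact module path (written by analogy with the Break module), and the step_cost value are assumptions for illustration; check them against polycraft_gym_main.yaml before use.

```yaml
actions:
  # Hypothetical entry: name, module path and cost are assumed, not confirmed.
  interact_103:
    module: gym_novel_gridworlds2.contrib.polycraft.actions.Interact
    entity_id: 103      # id of the entity to interact with
    step_cost: 300      # illustrative value
```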
action_sets
Named sets of actions that can be assigned to entities. Several entities can share the same set.
action_sets:
  main:
    - collect
    - break_block
    - approach_oak_log
    - select_oak_log
    - deselect_item
    - craft_stick
    - nop_placeholder1
    - give_up
Goal
recipies
Input, output, and step cost of all the recipes the agent can craft. The base implementation includes the recipe for the pogo_stick, the goal craft of the game.
recipies:
  pogo_stick:
    input:
      - stick
      - block_of_titanium
      - stick
      - diamond
      - '0'
      - '0'
      - '0'
      - rubber
      - '0'
    output:
      pogo_stick: 1
    step_cost: 8400
Training
All of the keys below take integer values.
sleep_time
Time delay after each environment step.
time_limit
Limit on how many steps the agent can take in attempting the goal during training.
seed
Random seed, for reproducibility of the experiment run.
num_episodes
Number of episodes to run.
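Put together, the training keys might be set as in the sketch below. All four values are illustrative and not taken from the shipped config. The unit of sleep_time is not stated here, so 0 is used to mean no delay, and the keys are assumed to sit at the top level of the file like the other keys in this part.

```yaml
# Illustrative values only; all keys take integers.
sleep_time: 0        # no delay after each environment step
time_limit: 200      # maximum steps per attempt at the goal
seed: 23             # fixed seed for a reproducible run
num_episodes: 1000   # number of episodes to run
```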