arg_router
1.4.0
C++ command line argument parsing and routing
In arg_router, core functionality is defined by the parse tree. The tree consists of tree_node instances, which define its structure (i.e. a parse entry point and token grouping), and policies, which define the reusable behaviours of nodes.
The term token refers to a word on the command line: either an option label (e.g. -f, --flag) or the values for those options. The term argument is used exclusively for a token that expects another word to follow it, e.g. arg_t.

Policies are common sets of behaviours that can be re-used between nodes. More importantly for us, a library user can 'build' a custom tree_node by assembling a collection of policies.
Policies come in three basic flavours:
Nodes are implemented by deriving from tree_node and passing one or more policies and/or children to its constructor. Nodes can contain other nodes (i.e. children) and in doing so form a tree structure - our parse tree!
As you may have noticed from the above diagram, tree_node is clever: it works out by itself which constructor arguments are policies and which are nodes, so you can pass them in any order you like.
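The order-independence described above can be illustrated with a small compile-time sketch. This is not arg_router's actual implementation: policy_tag, node_tag, toy_node and the stand-in policy and leaf types are hypothetical names invented here, and the real library may classify its parameters quite differently. The sketch only shows that a variadic constructor can sort its arguments into policies and children by type, whatever order they arrive in.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical marker bases, used only to classify types in this sketch.
struct policy_tag {};
struct node_tag {};

template <typename T>
constexpr bool is_policy_v = std::is_base_of_v<policy_tag, T>;
template <typename T>
constexpr bool is_node_v = std::is_base_of_v<node_tag, T>;

// Stand-in policies and a stand-in leaf node (all hypothetical).
struct name_policy : policy_tag {};
struct description_policy : policy_tag {};
struct leaf_node : node_tag {};

template <typename... Params>
class toy_node : public node_tag {
public:
    // Count each kind of parameter at compile time with fold expressions.
    static constexpr std::size_t policy_count =
        (std::size_t{0} + ... + (is_policy_v<Params> ? 1u : 0u));
    static constexpr std::size_t child_count =
        (std::size_t{0} + ... + (is_node_v<Params> ? 1u : 0u));

    static_assert(policy_count + child_count == sizeof...(Params),
                  "every constructor argument must be a policy or a node");

    // Arguments are stored in the order given; classification is by type.
    constexpr explicit toy_node(Params... params)
        : params_(std::move(params)...) {}

private:
    std::tuple<Params...> params_;
};
```

With class template argument deduction, toy_node{name_policy{}, leaf_node{}, description_policy{}} and toy_node{leaf_node{}, description_policy{}, name_policy{}} both report two policies and one child.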
Although nodes can do a lot of compile-time error checking themselves by analysing their children and policies, they can't check everything - mainly because nodes that need to know about other node types, and vice versa, easily end up in dependency loops (which also leads to horrible spaghetti code).
To prevent this, inter-node validation is done separately by a special policy called policy::validation::validator that can only be derived from by the root node (root_t). As the root is the last to be instantiated (the leaves are created first), it has all the information needed to check the entire tree, and because nothing depends on the validator it can know about all the other nodes without causing dependency loops.
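As a rough illustration of why the root is a convenient place for whole-tree checks, the sketch below has a root template apply a rule set to all of its child types in a static_assert. Everything here is hypothetical (toy_root, toy_validator, the has_name member and the rule itself); it is not how policy::validation::validator is written. It only shows that the root sees every child type, while the children never need to know about each other or about the validator.

```cpp
// Hypothetical leaf nodes; each only describes itself.
struct named_flag {
    static constexpr bool has_name = true;
};
struct unnamed_flag {
    static constexpr bool has_name = false;
};

// Hypothetical validator: a rule set that inspects the whole child list.
// Nothing depends on it, so it can freely know about every node type.
struct toy_validator {
    template <typename... Children>
    static constexpr bool check() {
        // Example rule (an assumption): every child must carry a name.
        return (true && ... && Children::has_name);
    }
};

// Hypothetical root: instantiated last, so all child types are complete.
template <typename Validator, typename... Children>
struct toy_root {
    static_assert(Validator::template check<Children...>(),
                  "every child of the root must be named");
    static constexpr bool validated = true;
};
```

Instantiating toy_root<toy_validator, named_flag, unnamed_flag> would fail to compile with the message above, which is the kind of early feedback the separate validation step is meant to give.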
    ["my_app" "mode1" "-f" "--arg" "42"]
       |
       V
     ______
    | Root | -> [{none, "mode1"}, {none, "-f"}, {none, "--arg"}, {none, "42"}]
    |______|
Parsing starts at the root node, where the tokens from the shell are first converted into an array of parsing::token_type instances, each with a type of parsing::prefix_type::none because we don't know what they are yet. You're probably thinking "can't it be inferred from the prefix?" Sadly not, as the type is also context-dependent: for example, you could have an arg_t that expects a string, and if that string just happened to start with the same prefix as a flag, the prefix would be stripped at this stage.
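The conversion step can be sketched as follows. The prefix_type and token_type definitions below are simplified stand-ins written for this example (the real types live in the parsing namespace and may have different members), and make_tokens is a hypothetical helper. As in the diagram, the program name is dropped and every remaining token starts out untyped.

```cpp
#include <string_view>
#include <vector>

// Simplified stand-ins for parsing::prefix_type and parsing::token_type.
enum class prefix_type { none, short_, long_ };

struct token_type {
    prefix_type prefix;
    std::string_view name;
};

// Hypothetical helper: wraps the shell tokens without interpreting them.
std::vector<token_type> make_tokens(int argc, const char* const* argv) {
    std::vector<token_type> tokens;
    // Skip argv[0], the program name.
    for (int i = 1; i < argc; ++i) {
        // The prefix is left as none; only a node can decide, in context,
        // whether "-f" is a flag label or a plain string value.
        tokens.push_back({prefix_type::none, argv[i]});
    }
    return tokens;
}
```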
     ______
    | Root |
    |______|       _______
        |-------> | Child |  std::optional<parsing::parse_target> pre_parse(
                  |_______|      parsing::pre_parse_data<Validator, HasTarget> pre_parse_data,
                                 const Parents&... parents) const
The root then iterates through its children, calling pre_parse(..) on each until a valid parsing::parse_target is returned. A parsing::parse_target is a function object that executes the appropriate node's parse(..) method with the now fully processed parsing::token_type instances.
How a child node decides whether it is the correct target, and how it processes the tokens, depends entirely on the node and its policies. For example, an arg_t checks that there are enough unprocessed tokens for the number of value tokens it expects (1 in the case of arg_t, via a policy::min_max_count), and then checks that the label token matches the arg_t short or long name. There could be other steps if there are policies that handle the pre-parse phase.
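The pre-parse hand-off described above might look like the following in miniature. This is a heavily simplified sketch under stated assumptions: toy_arg, find_target and the parse_target alias are hypothetical, the real pre_parse(..) takes a parsing::pre_parse_data and the parent nodes rather than a token vector, and the real checks are carried out by policies rather than hard-coded.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for parsing::parse_target: a callable that
// performs the node's parse step when invoked.
using parse_target = std::function<std::string()>;

struct toy_arg {
    std::string_view long_name;   // e.g. "--arg"
    std::string_view short_name;  // e.g. "-a"

    std::optional<parse_target> pre_parse(
        const std::vector<std::string_view>& tokens) const {
        // An arg needs its label plus exactly one value token.
        if (tokens.size() < 2) {
            return std::nullopt;
        }
        // The label must match either the long or the short name.
        if (tokens[0] != long_name && tokens[0] != short_name) {
            return std::nullopt;
        }
        // Capture the value so the target can be run later.
        std::string value{tokens[1]};
        return parse_target{[value] { return value; }};
    }
};

// The root's loop: ask each child in turn until one claims the tokens.
std::optional<parse_target> find_target(
    const std::vector<toy_arg>& children,
    const std::vector<std::string_view>& tokens) {
    for (const auto& child : children) {
        if (auto target = child.pre_parse(tokens)) {
            return target;
        }
    }
    return std::nullopt;
}
```

Returning std::nullopt is how a child says "not mine" in this sketch, which lets the root move on to the next child without any error handling.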
Mode-like types (e.g. mode_t) are a complication here because they need to collect all the valid parsing::parse_target instances of their non-mode children, but ultimately they follow the same pattern. You'll notice that a mode still uses the same interface for the pre-parse method, so how does it return multiple parsing::parse_target instances? Because you can add sub-targets to a target! The top-level target is for the mode node itself, and all the sub-targets are for its children.
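One way to picture a target that carries sub-targets is the small recursive structure below. toy_target, add_sub_target and the execution order (children first, then the mode) are assumptions made for this sketch; the source only states that the top-level target belongs to the mode and the sub-targets belong to its children.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a parse target that can own sub-targets.
struct toy_target {
    std::string node_name;
    std::function<void()> run;            // work done for this node
    std::vector<toy_target> sub_targets;  // one per matched child

    void add_sub_target(toy_target child) {
        sub_targets.push_back(std::move(child));
    }

    // Invoking the top-level target runs the children, then the node.
    // This ordering is an assumption for illustration only.
    void operator()() const {
        for (const auto& sub : sub_targets) {
            sub();
        }
        if (run) {
            run();
        }
    }
};
```

Because the mode still hands back a single object, the root does not need a special case for modes: it invokes whatever target it receives.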
     ______
    | Root | ------> (*returned_parse_target)()
    |______|
Once the valid parsing::parse_target is returned to the root, the root executes it like any other function object. Note that there are various scenarios along this path where nodes and policies throw an exception; we'll ignore those here for brevity.
Taking our arg_t example again, the value_type parse(parsing::parse_target target, const Parents&... parents) const method takes the value token and converts it to the expected type. If the node has policies that support the validation phase, then the parsed value is validated by them. If the node has a routing policy attached (only for top-level nodes), then the validated result value is dispatched to it for handling by the library user - otherwise it is returned to the caller (usually for collection by a mode).
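The convert, validate, then route-or-return sequence can be sketched with a single hard-coded node. toy_int_arg and its members are hypothetical, the range check stands in for whatever validation policies a real node carries, and returning std::optional<int> is a simplification chosen for this example rather than the library's value_type handling.

```cpp
#include <charconv>
#include <functional>
#include <optional>
#include <stdexcept>
#include <string_view>
#include <system_error>

// Hypothetical arg-like node that parses one integer value token.
struct toy_int_arg {
    int min_value;
    int max_value;
    std::function<void(int)> router;  // empty unless top-level

    std::optional<int> parse(std::string_view value_token) const {
        // 1. Convert the token to the expected type.
        int value = 0;
        const char* first = value_token.data();
        const char* last = first + value_token.size();
        const auto [ptr, ec] = std::from_chars(first, last, value);
        if (ec != std::errc{} || ptr != last) {
            throw std::invalid_argument{"value is not an integer"};
        }
        // 2. Validation phase (stand-in for validation policies).
        if (value < min_value || value > max_value) {
            throw std::out_of_range{"value outside permitted range"};
        }
        // 3. Route to the user's handler if there is one...
        if (router) {
            router(value);
            return std::nullopt;
        }
        // ...otherwise hand the value back to the caller (e.g. a mode).
        return value;
    }
};
```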
As you have probably gathered from the above, the pre-processing and parsing within a node are split into 'phases', and each policy implements only the phases it is concerned with.
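A minimal sketch of phase dispatch, assuming a node simply asks each of its policies whether it implements a given phase and calls only those that do. The detection trait, the validation_phase(int) signature, phased_node and both policies are hypothetical and chosen for brevity; arg_router's real phase methods have their own names and signatures.

```cpp
#include <stdexcept>
#include <type_traits>
#include <utility>

// Detects whether a policy has a (hypothetical) validation_phase(int).
template <typename P, typename = void>
struct has_validation_phase : std::false_type {};

template <typename P>
struct has_validation_phase<
    P,
    std::void_t<decltype(std::declval<const P&>().validation_phase(0))>>
    : std::true_type {};

// A policy that takes part in the validation phase.
struct range_policy {
    int min;
    int max;
    void validation_phase(int value) const {
        if (value < min || value > max) {
            throw std::out_of_range{"value outside permitted range"};
        }
    }
};

// A policy that implements no phases at all.
struct description_policy {
    const char* text;
};

template <typename... Policies>
struct phased_node : Policies... {
    // Calls the phase on one policy, if that policy implements it.
    template <typename P>
    void run_validation([[maybe_unused]] int value) const {
        if constexpr (has_validation_phase<P>::value) {
            static_cast<const P&>(*this).validation_phase(value);
        }
    }

    // Runs the validation phase across every policy in order.
    int validate(int value) const {
        (run_validation<Policies>(value), ...);
        return value;
    }
};
```

The same detection pattern could be repeated for each phase, so a policy opts into a phase just by providing the matching method.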