<?xml version="1.0"?>
<doc>
<assembly>
<name>AForge.MachineLearning</name>
</assembly>
<members>
<member name="T:AForge.MachineLearning.BoltzmannExploration">
<summary>
Boltzmann distribution exploration policy.
</summary>
<remarks><para>The class implements an exploration policy based on the Boltzmann distribution.
According to the policy, action <b>a</b> at state <b>s</b> is selected with the following probability:</para>
<code lang="none">
                  exp( Q( s, a ) / t )
p( s, a ) = -----------------------------
             SUM( exp( Q( s, b ) / t ) )
              b
</code>
<para>where <b>Q(s, a)</b> is the estimate (usefulness) of action <b>a</b> at state <b>s</b> and
<b>t</b> is the <see cref="P:AForge.MachineLearning.BoltzmannExploration.Temperature"/>.</para>
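<para>A minimal usage sketch (the estimate values below are purely illustrative; the selected action
is assumed to be returned as an integer index into the estimates array):</para>
<code>
// create Boltzmann exploration policy with temperature 0.5
BoltzmannExploration exploration = new BoltzmannExploration( 0.5 );
// usefulness estimates of 4 possible actions in the current state
double[] actionEstimates = new double[] { 0.5, 1.0, 0.2, 0.8 };
// choose an action - actions with higher estimates are more likely to be
// selected; lowering the temperature makes the choice greedier
int action = exploration.ChooseAction( actionEstimates );
</code>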
</remarks>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
</member>
<member name="P:AForge.MachineLearning.BoltzmannExploration.Temperature">
<summary>
Temperature parameter of the Boltzmann distribution, >0.
</summary>
<remarks><para>The property sets the balance between exploration and greedy actions:
the lower the temperature, the greedier the policy; the higher the temperature, the more
random (exploratory) the action selection.</para></remarks>
</member>
<member name="M:AForge.MachineLearning.BoltzmannExploration.#ctor(System.Double)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.BoltzmannExploration"/> class.
</summary>
<param name="temperature">Termperature parameter of Boltzmann distribution.</param>
</member>
<member name="M:AForge.MachineLearning.BoltzmannExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of values that express the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.EpsilonGreedyExploration">
<summary>
Epsilon greedy exploration policy.
</summary>
<remarks><para>The class implements the epsilon greedy exploration policy. According to the policy,
the best action is chosen with probability <b>1-epsilon</b>. Otherwise, with probability
<b>epsilon</b>, a random action other than the best one is chosen.</para>
<para>The epsilon value is also known as the exploration rate.</para>
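<para>A minimal usage sketch (the estimate values below are purely illustrative):</para>
<code>
// create epsilon greedy policy with 10% exploration rate
EpsilonGreedyExploration exploration = new EpsilonGreedyExploration( 0.1 );
// usefulness estimates of possible actions in the current state
double[] actionEstimates = new double[] { 0.5, 1.0, 0.2 };
// with 90% probability the best action (index 1) is chosen,
// with 10% probability one of the remaining actions is chosen at random
int action = exploration.ChooseAction( actionEstimates );
</code>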
</remarks>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
</member>
<member name="P:AForge.MachineLearning.EpsilonGreedyExploration.Epsilon">
<summary>
Epsilon value (exploration rate), [0, 1].
</summary>
<remarks><para>The value determines the amount of exploration driven by the policy.
If the value is high, then the policy leans towards exploration - choosing a random
action other than the best one. If the value is low, then the policy is more
greedy - choosing the best action found so far.
</para></remarks>
</member>
<member name="M:AForge.MachineLearning.EpsilonGreedyExploration.#ctor(System.Double)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/> class.
</summary>
<param name="epsilon">Epsilon value (exploration rate).</param>
</member>
<member name="M:AForge.MachineLearning.EpsilonGreedyExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of values that express the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.IExplorationPolicy">
<summary>
Exploration policy interface.
</summary>
<remarks>The interface describes exploration policies, which are used in Reinforcement
Learning to explore state space.</remarks>
</member>
<member name="M:AForge.MachineLearning.IExplorationPolicy.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of values that express the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.RouletteWheelExploration">
<summary>
Roulette wheel exploration policy.
</summary>
<remarks><para>The class implements the roulette wheel exploration policy. According to the policy,
action <b>a</b> at state <b>s</b> is selected with the following probability:</para>
<code lang="none">
                 Q( s, a )
p( s, a ) = ------------------
             SUM( Q( s, b ) )
              b
</code>
<para>where <b>Q(s, a)</b> is the estimate (usefulness) of action <b>a</b> at state <b>s</b>.</para>
<para><note>The exploration policy may be applied only when action estimates (usefulness)
are represented by positive values greater than zero.</note></para>
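<para>A minimal usage sketch (the estimate values below are purely illustrative and must all be positive):</para>
<code>
// create roulette wheel exploration policy
RouletteWheelExploration exploration = new RouletteWheelExploration( );
// positive usefulness estimates of possible actions in the current state
double[] actionEstimates = new double[] { 1.0, 3.0, 6.0 };
// the last action is chosen with probability 6 / ( 1 + 3 + 6 ) = 0.6
int action = exploration.ChooseAction( actionEstimates );
</code>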
</remarks>
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
</member>
<member name="M:AForge.MachineLearning.RouletteWheelExploration.#ctor">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.RouletteWheelExploration"/> class.
</summary>
</member>
<member name="M:AForge.MachineLearning.RouletteWheelExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of values that express the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.TabuSearchExploration">
<summary>
Tabu search exploration policy.
</summary>
<remarks>The class implements a simple tabu search exploration policy,
which allows marking certain actions as tabu for a specified number of
iterations. The actual exploration and selection among non-tabu actions
is done by the <see cref="P:AForge.MachineLearning.TabuSearchExploration.BasePolicy">base exploration policy</see>.
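<para>A minimal usage sketch (the estimate values and tabu time below are purely illustrative):</para>
<code>
// tabu search policy over 4 actions, with epsilon greedy policy as the base one
TabuSearchExploration exploration = new TabuSearchExploration( 4, new EpsilonGreedyExploration( 0.1 ) );
// set action 2 as tabu for 5 iterations
exploration.SetTabuAction( 2, 5 );
// the action is chosen by the base policy from non-tabu actions only
int action = exploration.ChooseAction( new double[] { 0.5, 1.0, 2.0, 0.7 } );
// clear the tabu list, making all actions allowed again
exploration.ResetTabuList( );
</code>
</remarks>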
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
</member>
<member name="P:AForge.MachineLearning.TabuSearchExploration.BasePolicy">
<summary>
Base exploration policy.
</summary>
<remarks>The base exploration policy is the policy, which is used
to choose among non-tabu actions.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.#ctor(System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.TabuSearchExploration"/> class.
</summary>
<param name="actions">Total actions count.</param>
<param name="basePolicy">Base exploration policy.</param>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of values that express the usefulness of the action
(expected summary reward, discounted reward, etc.). The action is chosen from
non-tabu actions only.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.ResetTabuList">
<summary>
Reset tabu list.
</summary>
<remarks>Clears the tabu list, making all actions allowed.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.SetTabuAction(System.Int32,System.Int32)">
<summary>
Set tabu action.
</summary>
<param name="action">Action to set tabu for.</param>
<param name="tabuTime">Tabu time in iterations.</param>
</member>
<member name="T:AForge.MachineLearning.QLearning">
<summary>
Q-Learning algorithm.
</summary>
<remarks>The class provides an implementation of the Q-Learning algorithm, known as
off-policy Temporal Difference control.
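<para>A minimal learning-loop sketch. The <b>world</b> object, its <b>Step</b> method, the
<b>IsTerminal</b> check and <b>initialState</b> are hypothetical placeholders for an
application-specific environment, and the learning rate and discount factor properties are
assumed to be writable:</para>
<code>
// 256 possible states, 4 possible actions, epsilon greedy exploration
QLearning qLearning = new QLearning( 256, 4, new EpsilonGreedyExploration( 0.1 ) );
qLearning.LearningRate   = 0.5;
qLearning.DiscountFactor = 0.9;

int state = initialState;            // hypothetical initial state

while ( !IsTerminal( state ) )       // hypothetical terminal-state check
{
    // get an action for the current state according to the exploration policy
    int action = qLearning.GetAction( state );
    // apply the action to the hypothetical environment
    double reward;
    int nextState = world.Step( state, action, out reward );
    // update Q-function's value for the state-action pair just taken
    qLearning.UpdateState( state, action, reward, nextState );
    state = nextState;
}
</code>
</remarks>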
<seealso cref="T:AForge.MachineLearning.Sarsa"/>
</member>
<member name="P:AForge.MachineLearning.QLearning.StatesCount">
<summary>
Amount of possible states.
</summary>
</member>
<member name="P:AForge.MachineLearning.QLearning.ActionsCount">
<summary>
Amount of possible actions.
</summary>
</member>
<member name="P:AForge.MachineLearning.QLearning.ExplorationPolicy">
<summary>
Exploration policy.
</summary>
<remarks>Policy, which is used to select actions.</remarks>
</member>
<member name="P:AForge.MachineLearning.QLearning.LearningRate">
<summary>
Learning rate, [0, 1].
</summary>
<remarks>The value determines the size of the updates the Q-function receives
during learning. The greater the value, the larger each update; the lower the
value, the smaller each update.</remarks>
</member>
<member name="P:AForge.MachineLearning.QLearning.DiscountFactor">
<summary>
Discount factor, [0, 1].
</summary>
<remarks>Discount factor for the expected summary reward. The value serves as
a multiplier for the expected future reward: if it is set to 1, then the expected
summary reward is not discounted at all; the smaller the value, the less of the
expected future reward is used when updating actions' estimates.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.QLearning"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<remarks>Action estimates are randomized when this constructor is used.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy,System.Boolean)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.QLearning"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<param name="randomize">Randomize action estimates or not.</param>
<remarks>The <b>randomize</b> parameter specifies whether initial action estimates should be randomized
with small values or not. Randomization of action values may be useful when greedy exploration
policies are used: in this case it ensures that the same actions are not always chosen.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.GetAction(System.Int32)">
<summary>
Get next action from the specified state.
</summary>
<param name="state">Current state to get an action for.</param>
<returns>Returns the action for the state.</returns>
<remarks>The method returns an action according to the current
<see cref="P:AForge.MachineLearning.QLearning.ExplorationPolicy">exploration policy</see>.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.UpdateState(System.Int32,System.Int32,System.Double,System.Int32)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Previous state.</param>
<param name="action">Action, which leads from previous to the next state.</param>
<param name="reward">Reward value, received by taking specified action from previous state.</param>
<param name="nextState">Next state.</param>
</member>
<member name="T:AForge.MachineLearning.Sarsa">
<summary>
Sarsa learning algorithm.
</summary>
<remarks>The class provides an implementation of the Sarsa algorithm, known as
on-policy Temporal Difference control.
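<para>A minimal learning-loop sketch. The <b>world</b> object, its <b>Step</b> method, the
<b>IsTerminal</b> check and <b>initialState</b> are hypothetical placeholders for an
application-specific environment:</para>
<code>
// 256 possible states, 4 possible actions, Boltzmann exploration
Sarsa sarsa = new Sarsa( 256, 4, new BoltzmannExploration( 0.3 ) );

int state  = initialState;                // hypothetical initial state
int action = sarsa.GetAction( state );

for ( ; ; )
{
    // apply the action to the hypothetical environment
    double reward;
    int nextState = world.Step( state, action, out reward );

    if ( IsTerminal( nextState ) )        // hypothetical terminal-state check
    {
        // terminal state - no next action is available
        sarsa.UpdateState( state, action, reward );
        break;
    }

    // on-policy: the next action is chosen before the update is made
    int nextAction = sarsa.GetAction( nextState );
    sarsa.UpdateState( state, action, reward, nextState, nextAction );

    state  = nextState;
    action = nextAction;
}
</code>
</remarks>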
<seealso cref="T:AForge.MachineLearning.QLearning"/>
</member>
<member name="P:AForge.MachineLearning.Sarsa.StatesCount">
<summary>
Amount of possible states.
</summary>
</member>
<member name="P:AForge.MachineLearning.Sarsa.ActionsCount">
<summary>
Amount of possible actions.
</summary>
</member>
<member name="P:AForge.MachineLearning.Sarsa.ExplorationPolicy">
<summary>
Exploration policy.
</summary>
<remarks>Policy, which is used to select actions.</remarks>
</member>
<member name="P:AForge.MachineLearning.Sarsa.LearningRate">
<summary>
Learning rate, [0, 1].
</summary>
<remarks>The value determines the size of the updates the Q-function receives
during learning. The greater the value, the larger each update; the lower the
value, the smaller each update.</remarks>
</member>
<member name="P:AForge.MachineLearning.Sarsa.DiscountFactor">
<summary>
Discount factor, [0, 1].
</summary>
<remarks>Discount factor for the expected summary reward. The value serves as
a multiplier for the expected future reward: if it is set to 1, then the expected
summary reward is not discounted at all; the smaller the value, the less of the
expected future reward is used when updating actions' estimates.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.Sarsa"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<remarks>Action estimates are randomized when this constructor is used.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy,System.Boolean)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.Sarsa"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<param name="randomize">Randomize action estimates or not.</param>
<remarks>The <b>randomize</b> parameter specifies whether initial action estimates should be randomized
with small values or not. Randomization of action values may be useful when greedy exploration
policies are used: in this case it ensures that the same actions are not always chosen.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.GetAction(System.Int32)">
<summary>
Get next action from the specified state.
</summary>
<param name="state">Current state to get an action for.</param>
<returns>Returns the action for the state.</returns>
<remarks>The method returns an action according to the current
<see cref="P:AForge.MachineLearning.Sarsa.ExplorationPolicy">exploration policy</see>.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.UpdateState(System.Int32,System.Int32,System.Double,System.Int32,System.Int32)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Curren state.</param>
<param name="previousAction">Action, which lead from previous to the next state.</param>
<param name="reward">Reward value, received by taking specified action from previous state.</param>
<param name="nextState">Next state.</param>
<param name="nextAction">Next action.</param>
<remarks>Updates the Q-function's value for the previous state-action pair when
the next state is non-terminal.
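<para>For reference, the standard on-policy TD (Sarsa) update, which this method is expected to
apply, has the following form (using the parameter names together with the learning rate and
discount factor properties):</para>
<code lang="none">
Q( previousState, previousAction ) := ( 1 - learningRate ) * Q( previousState, previousAction ) +
                                      learningRate * ( reward + discountFactor * Q( nextState, nextAction ) )
</code>
</remarks>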
</member>
<member name="M:AForge.MachineLearning.Sarsa.UpdateState(System.Int32,System.Int32,System.Double)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Curren state.</param>
<param name="previousAction">Action, which lead from previous to the next state.</param>
<param name="reward">Reward value, received by taking specified action from previous state.</param>
<remarks>Updates the Q-function's value for the previous state-action pair when
the next state is terminal.
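<para>For reference, since a terminal next state carries no future estimate, the standard update
reduces to (using the parameter names together with the learning rate property):</para>
<code lang="none">
Q( previousState, previousAction ) := ( 1 - learningRate ) * Q( previousState, previousAction ) +
                                      learningRate * reward
</code>
</remarks>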
</member>
</members>
</doc>