<?xml version="1.0"?>
<doc>
<assembly>
<name>AForge.MachineLearning</name>
</assembly>
<members>
<member name="T:AForge.MachineLearning.BoltzmannExploration">
<summary>
Boltzmann distribution exploration policy.
</summary>
<remarks><para>The class implements an exploration policy based on the Boltzmann distribution.
According to the policy, action <b>a</b> at state <b>s</b> is selected with the following probability:</para>
<code lang="none">
                exp( Q( s, a ) / t )
p( s, a ) = -----------------------------
             SUM( exp( Q( s, b ) / t ) )
              b
</code>
<para>where <b>Q(s, a)</b> is the estimation (usefulness) of action <b>a</b> at state <b>s</b> and
<b>t</b> is <see cref="P:AForge.MachineLearning.BoltzmannExploration.Temperature"/>.</para>
</remarks>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
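<example><para>A minimal usage sketch (the temperature of 0.5 and the estimate values below are illustrative assumptions only):</para>
<code>
// create the policy with temperature 0.5 (illustrative value)
BoltzmannExploration policy = new BoltzmannExploration( 0.5 );
// usefulness estimates of 4 actions in the current state (made-up values)
double[] actionEstimates = new double[] { 0.1, 0.7, 0.2, 0.5 };
// choose an action - actions with higher estimates are selected with higher probability
int action = policy.ChooseAction( actionEstimates );
</code></example>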
</member>
<member name="P:AForge.MachineLearning.BoltzmannExploration.Temperature">
<summary>
Temperature parameter of the Boltzmann distribution, >0.
</summary>
<remarks><para>The property sets the balance between exploration and greedy actions.
If the temperature is low, then the policy tends to be more greedy.</para></remarks>
</member>
<member name="M:AForge.MachineLearning.BoltzmannExploration.#ctor(System.Double)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.BoltzmannExploration"/> class.
</summary>
<param name="temperature">Temperature parameter of the Boltzmann distribution.</param>
</member>
<member name="M:AForge.MachineLearning.BoltzmannExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of value expressing the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.EpsilonGreedyExploration">
<summary>
Epsilon greedy exploration policy.
</summary>
<remarks><para>The class implements the epsilon greedy exploration policy. According to the policy,
the best action is chosen with probability <b>1-epsilon</b>. Otherwise,
with probability <b>epsilon</b>, any other action, except the best one, is
chosen randomly.</para>
<para>The epsilon value is also known as the exploration rate.</para>
</remarks>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
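<example><para>A minimal usage sketch (the exploration rate of 0.1 and the estimate values below are illustrative assumptions only):</para>
<code>
// create the policy with exploration rate 0.1 (illustrative value)
EpsilonGreedyExploration policy = new EpsilonGreedyExploration( 0.1 );
// usefulness estimates of 4 actions in the current state (made-up values)
double[] actionEstimates = new double[] { 0.1, 0.7, 0.2, 0.5 };
// with probability 0.9 the best action (index 1) is returned,
// otherwise one of the remaining actions is picked at random
int action = policy.ChooseAction( actionEstimates );
</code></example>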
</member>
<member name="P:AForge.MachineLearning.EpsilonGreedyExploration.Epsilon">
<summary>
Epsilon value (exploration rate), [0, 1].
</summary>
<remarks><para>The value determines the amount of exploration driven by the policy.
If the value is high, then the policy leans more towards exploration - choosing a random
action, which excludes the best one. If the value is low, then the policy is more
greedy - choosing the best action found so far.
</para></remarks>
</member>
<member name="M:AForge.MachineLearning.EpsilonGreedyExploration.#ctor(System.Double)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/> class.
</summary>
<param name="epsilon">Epsilon value (exploration rate).</param>
</member>
<member name="M:AForge.MachineLearning.EpsilonGreedyExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of value expressing the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.IExplorationPolicy">
<summary>
Exploration policy interface.
</summary>
<remarks>The interface describes exploration policies, which are used in Reinforcement
Learning to explore the state space.</remarks>
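<example><para>A sketch of a custom policy implementing the interface (assuming the method returns the index of the selected action, as the built-in policies do; the class name GreedyExploration is hypothetical):</para>
<code>
// purely greedy policy: always picks the action with the highest estimate
public class GreedyExploration : IExplorationPolicy
{
    public int ChooseAction( double[] actionEstimates )
    {
        int best = 0;
        for ( int i = 1; i &lt; actionEstimates.Length; i++ )
        {
            if ( actionEstimates[i] > actionEstimates[best] )
                best = i;
        }
        return best;
    }
}
</code></example>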
</member>
<member name="M:AForge.MachineLearning.IExplorationPolicy.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of value expressing the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.RouletteWheelExploration">
<summary>
Roulette wheel exploration policy.
</summary>
<remarks><para>The class implements the roulette wheel exploration policy. According to the policy,
action <b>a</b> at state <b>s</b> is selected with the following probability:</para>
<code lang="none">
                Q( s, a )
p( s, a ) = ------------------
             SUM( Q( s, b ) )
              b
</code>
<para>where <b>Q(s, a)</b> is the estimation (usefulness) of action <b>a</b> at state <b>s</b>.</para>
<para><note>The exploration policy may be applied only in cases when action estimates (usefulness)
are represented by positive values greater than 0.</note></para>
</remarks>
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.TabuSearchExploration"/>
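<example><para>A minimal usage sketch (the positive estimate values below are made-up data; they must all be greater than 0 for this policy):</para>
<code>
// create roulette wheel policy
RouletteWheelExploration policy = new RouletteWheelExploration( );
// positive usefulness estimates of 4 actions (made-up values)
double[] actionEstimates = new double[] { 1.0, 4.0, 2.0, 3.0 };
// the sum of estimates is 10, so action 1 is chosen with
// probability 4/10, action 3 with probability 3/10, etc.
int action = policy.ChooseAction( actionEstimates );
</code></example>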
</member>
<member name="M:AForge.MachineLearning.RouletteWheelExploration.#ctor">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.RouletteWheelExploration"/> class.
</summary>
</member>
<member name="M:AForge.MachineLearning.RouletteWheelExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of value expressing the usefulness of the action
(expected summary reward, discounted reward, etc.).</remarks>
</member>
<member name="T:AForge.MachineLearning.TabuSearchExploration">
<summary>
Tabu search exploration policy.
</summary>
<remarks>The class implements a simple tabu search exploration policy,
allowing certain actions to be set as tabu for a specified number of
iterations. The actual exploration and choosing from non-tabu actions
is done by the <see cref="P:AForge.MachineLearning.TabuSearchExploration.BasePolicy">base exploration policy</see>.</remarks>
<seealso cref="T:AForge.MachineLearning.BoltzmannExploration"/>
<seealso cref="T:AForge.MachineLearning.EpsilonGreedyExploration"/>
<seealso cref="T:AForge.MachineLearning.RouletteWheelExploration"/>
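<example><para>A minimal usage sketch (the epsilon value, the number of actions, the tabu time and the estimate values below are illustrative assumptions only):</para>
<code>
// epsilon greedy policy used as the base policy (illustrative epsilon)
EpsilonGreedyExploration basePolicy = new EpsilonGreedyExploration( 0.1 );
// tabu search policy over 4 actions, delegating to the base policy
TabuSearchExploration policy = new TabuSearchExploration( 4, basePolicy );
// forbid action 2 for the next 10 iterations (illustrative tabu time)
policy.SetTabuAction( 2, 10 );
// the chosen action is taken from non-tabu actions only
int action = policy.ChooseAction( new double[] { 0.1, 0.7, 0.2, 0.5 } );
// clear all tabu marks, making every action allowed again
policy.ResetTabuList( );
</code></example>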
</member>
<member name="P:AForge.MachineLearning.TabuSearchExploration.BasePolicy">
<summary>
Base exploration policy.
</summary>
<remarks>The base exploration policy is the policy which is used
to choose among non-tabu actions.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.#ctor(System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.TabuSearchExploration"/> class.
</summary>
<param name="actions">Total actions count.</param>
<param name="basePolicy">Base exploration policy.</param>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.ChooseAction(System.Double[])">
<summary>
Choose an action.
</summary>
<param name="actionEstimates">Action estimates.</param>
<returns>Returns selected action.</returns>
<remarks>The method chooses an action depending on the provided estimates. The
estimates can be any sort of value expressing the usefulness of the action
(expected summary reward, discounted reward, etc.). The action is chosen from
non-tabu actions only.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.ResetTabuList">
<summary>
Reset tabu list.
</summary>
<remarks>Clears the tabu list, making all actions allowed.</remarks>
</member>
<member name="M:AForge.MachineLearning.TabuSearchExploration.SetTabuAction(System.Int32,System.Int32)">
<summary>
Set tabu action.
</summary>
<param name="action">Action to set tabu for.</param>
<param name="tabuTime">Tabu time in iterations.</param>
</member>
<member name="T:AForge.MachineLearning.QLearning">
<summary>
QLearning learning algorithm.
</summary>
<remarks>The class provides an implementation of the Q-Learning algorithm, known as
off-policy Temporal Difference control.</remarks>
<seealso cref="T:AForge.MachineLearning.Sarsa"/>
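<example><para>A sketch of a typical learning loop (the state/action counts, the epsilon value, the learning parameters and the environment helpers GetInitialState, IsTerminal, GetNextState and GetReward are hypothetical placeholders for user code):</para>
<code>
// learner for 256 states and 4 actions with epsilon greedy exploration (illustrative values)
QLearning qLearning = new QLearning( 256, 4, new EpsilonGreedyExploration( 0.1 ) );
qLearning.LearningRate   = 0.5; // illustrative value
qLearning.DiscountFactor = 0.9; // illustrative value

int state = GetInitialState( );            // hypothetical environment helper
while ( !IsTerminal( state ) )             // hypothetical environment helper
{
    // get action for the current state according to the exploration policy
    int action = qLearning.GetAction( state );
    // apply the action to the environment (hypothetical helpers)
    int    nextState = GetNextState( state, action );
    double reward    = GetReward( state, action );
    // update Q-function's value for the previous state-action pair
    qLearning.UpdateState( state, action, reward, nextState );
    state = nextState;
}
</code></example>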
</member>
<member name="P:AForge.MachineLearning.QLearning.StatesCount">
<summary>
Amount of possible states.
</summary>
</member>
<member name="P:AForge.MachineLearning.QLearning.ActionsCount">
<summary>
Amount of possible actions.
</summary>
</member>
<member name="P:AForge.MachineLearning.QLearning.ExplorationPolicy">
<summary>
Exploration policy.
</summary>
<remarks>The policy which is used to select actions.</remarks>
</member>
<member name="P:AForge.MachineLearning.QLearning.LearningRate">
<summary>
Learning rate, [0, 1].
</summary>
<remarks>The value determines the size of the updates the Q-function receives
during learning. The greater the value, the larger each update of the function.
The lower the value, the smaller each update.</remarks>
</member>
<member name="P:AForge.MachineLearning.QLearning.DiscountFactor">
<summary>
Discount factor, [0, 1].
</summary>
<remarks>Discount factor for the expected summary reward. The value serves as a
multiplier for the expected reward. If the value is set to 1,
then the expected summary reward is not discounted. The smaller the value, the
smaller the portion of the expected reward that is used for updating the actions'
estimates.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.QLearning"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<remarks>Action estimates are randomized when this constructor
is used.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy,System.Boolean)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.QLearning"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<param name="randomize">Randomize action estimates or not.</param>
<remarks>The <b>randomize</b> parameter specifies whether initial action estimates should be randomized
with small values or not. Randomization of action values may be useful when greedy exploration
policies are used. In this case randomization ensures that the same actions are not always chosen.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.GetAction(System.Int32)">
<summary>
Get next action from the specified state.
</summary>
<param name="state">Current state to get an action for.</param>
<returns>Returns the action for the state.</returns>
<remarks>The method returns an action according to the current
<see cref="P:AForge.MachineLearning.QLearning.ExplorationPolicy">exploration policy</see>.</remarks>
</member>
<member name="M:AForge.MachineLearning.QLearning.UpdateState(System.Int32,System.Int32,System.Double,System.Int32)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Previous state.</param>
<param name="action">Action, which leads from the previous state to the next state.</param>
<param name="reward">Reward value, received by taking the specified action from the previous state.</param>
<param name="nextState">Next state.</param>
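<example><para>For reference, the standard off-policy Q-Learning update for a state-action pair has the following form (a sketch in the notation of the exploration policy formulas above, not a literal transcription of the implementation):</para>
<code lang="none">
Q( s, a ) = Q( s, a ) + learningRate * ( reward + discountFactor * MAX( Q( s', b ) ) - Q( s, a ) )
                                                                    b
</code>
<para>where <b>s</b> is the previous state, <b>a</b> is the taken action and <b>s'</b> is the next state.</para></example>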
</member>
<member name="T:AForge.MachineLearning.Sarsa">
<summary>
Sarsa learning algorithm.
</summary>
<remarks>The class provides an implementation of the Sarsa algorithm, known as
on-policy Temporal Difference control.</remarks>
<seealso cref="T:AForge.MachineLearning.QLearning"/>
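<example><para>A sketch of a typical learning loop (the state/action counts, the epsilon value, the learning parameters and the environment helpers GetInitialState, IsTerminal, GetNextState and GetReward are hypothetical placeholders for user code; unlike Q-Learning, the next action is chosen before the update):</para>
<code>
// learner for 256 states and 4 actions with epsilon greedy exploration (illustrative values)
Sarsa sarsa = new Sarsa( 256, 4, new EpsilonGreedyExploration( 0.1 ) );
sarsa.LearningRate   = 0.5; // illustrative value
sarsa.DiscountFactor = 0.9; // illustrative value

int state  = GetInitialState( );        // hypothetical environment helper
int action = sarsa.GetAction( state );
while ( true )
{
    // apply the action to the environment (hypothetical helpers)
    int    nextState = GetNextState( state, action );
    double reward    = GetReward( state, action );
    if ( IsTerminal( nextState ) )      // hypothetical environment helper
    {
        // terminal state - update using the immediate reward only
        sarsa.UpdateState( state, action, reward );
        break;
    }
    // choose the next action first, then update (on-policy)
    int nextAction = sarsa.GetAction( nextState );
    sarsa.UpdateState( state, action, reward, nextState, nextAction );
    state  = nextState;
    action = nextAction;
}
</code></example>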
</member>
<member name="P:AForge.MachineLearning.Sarsa.StatesCount">
<summary>
Amount of possible states.
</summary>
</member>
<member name="P:AForge.MachineLearning.Sarsa.ActionsCount">
<summary>
Amount of possible actions.
</summary>
</member>
<member name="P:AForge.MachineLearning.Sarsa.ExplorationPolicy">
<summary>
Exploration policy.
</summary>
<remarks>The policy which is used to select actions.</remarks>
</member>
<member name="P:AForge.MachineLearning.Sarsa.LearningRate">
<summary>
Learning rate, [0, 1].
</summary>
<remarks>The value determines the size of the updates the Q-function receives
during learning. The greater the value, the larger each update of the function.
The lower the value, the smaller each update.</remarks>
</member>
<member name="P:AForge.MachineLearning.Sarsa.DiscountFactor">
<summary>
Discount factor, [0, 1].
</summary>
<remarks>Discount factor for the expected summary reward. The value serves as a
multiplier for the expected reward. If the value is set to 1,
then the expected summary reward is not discounted. The smaller the value, the
smaller the portion of the expected reward that is used for updating the actions'
estimates.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.Sarsa"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<remarks>Action estimates are randomized when this constructor
is used.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.#ctor(System.Int32,System.Int32,AForge.MachineLearning.IExplorationPolicy,System.Boolean)">
<summary>
Initializes a new instance of the <see cref="T:AForge.MachineLearning.Sarsa"/> class.
</summary>
<param name="states">Amount of possible states.</param>
<param name="actions">Amount of possible actions.</param>
<param name="explorationPolicy">Exploration policy.</param>
<param name="randomize">Randomize action estimates or not.</param>
<remarks>The <b>randomize</b> parameter specifies whether initial action estimates should be randomized
with small values or not. Randomization of action values may be useful when greedy exploration
policies are used. In this case randomization ensures that the same actions are not always chosen.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.GetAction(System.Int32)">
<summary>
Get next action from the specified state.
</summary>
<param name="state">Current state to get an action for.</param>
<returns>Returns the action for the state.</returns>
<remarks>The method returns an action according to the current
<see cref="P:AForge.MachineLearning.Sarsa.ExplorationPolicy">exploration policy</see>.</remarks>
</member>
<member name="M:AForge.MachineLearning.Sarsa.UpdateState(System.Int32,System.Int32,System.Double,System.Int32,System.Int32)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Previous state.</param>
<param name="previousAction">Action, which leads from the previous state to the next state.</param>
<param name="reward">Reward value, received by taking the specified action from the previous state.</param>
<param name="nextState">Next state.</param>
<param name="nextAction">Next action.</param>
<remarks>Updates Q-function's value for the previous state-action pair in
the case when the next state is non-terminal.</remarks>
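<example><para>For reference, the standard on-policy Sarsa update for the non-terminal case has the following form (a sketch in the notation of the exploration policy formulas above, not a literal transcription of the implementation):</para>
<code lang="none">
Q( s, a ) = Q( s, a ) + learningRate * ( reward + discountFactor * Q( s', a' ) - Q( s, a ) )
</code>
<para>where <b>s</b>, <b>a</b> are the previous state and action and <b>s'</b>, <b>a'</b> are the next state and action.</para></example>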
</member>
<member name="M:AForge.MachineLearning.Sarsa.UpdateState(System.Int32,System.Int32,System.Double)">
<summary>
Update Q-function's value for the previous state-action pair.
</summary>
<param name="previousState">Previous state.</param>
<param name="previousAction">Action, which leads from the previous state to the next state.</param>
<param name="reward">Reward value, received by taking the specified action from the previous state.</param>
<remarks>Updates Q-function's value for the previous state-action pair in
the case when the next state is terminal.</remarks>
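<example><para>For reference, the corresponding update for the terminal case has the following form (a sketch in the same notation; there is no next state, so only the immediate reward is used):</para>
<code lang="none">
Q( s, a ) = Q( s, a ) + learningRate * ( reward - Q( s, a ) )
</code></example>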
</member>
</members>
</doc>