1. Q-learning learns an action-value representation (the Q function) instead of learning utilities of states.
2. Q-values are related to utility values: the utility of a state is the value of its best action, U(s) = max_a Q(s, a).
3. The Q function plays the role of a utility function, but an agent that learns Q needs no model of the environment, either for learning or for action selection. Q-learning is therefore model-free: like temporal-difference learning, it does not need to know the transition model and responds only to observed changes. Deciding how to handle the things that do not change is the frame problem.
4. The agent function (an evaluation function) is represented directly by the Q function, instead of by a model plus a utility function.
5. Traditional AI takes a knowledge-based approach: it builds a representation of some aspect of the environment in which the agent is situated.
6. As the environment becomes more complex, the advantages of a knowledge-based approach become more apparent. Knowledge-based model representations have met with more success than model-free Q-learning methods in such environments.
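The model-free update described in points 1-3 can be sketched in a few lines of tabular Q-learning. The three-state chain MDP below is a hypothetical toy environment invented for illustration (it is not from the notes above); the point is that `q_update` uses only the observed transition (s, a, r, s'), never a transition model, and the utility of a state is recovered afterwards as U(s) = max_a Q(s, a).

```python
import random
from collections import defaultdict

# Hypothetical toy MDP: states 0, 1, 2 on a chain; actions 0 = left, 1 = right.
# Reaching state 2 ends the episode with reward +1; all other rewards are 0.
def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(2, state + 1)
    reward = 1.0 if next_state == 2 else 0.0
    return next_state, reward, next_state == 2

ALPHA, GAMMA, EPSILON, ACTIONS = 0.1, 0.9, 0.1, (0, 1)
Q = defaultdict(float)  # Q[(state, action)], initialised to 0

def q_update(s, a, r, s_next, done):
    # Model-free TD update: needs only the observed change, not T(s, a, s').
    target = r if done else r + GAMMA * max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

random.seed(0)
for _ in range(500):  # training episodes
    s, done = 0, False
    while not done:
        # Epsilon-greedy action selection straight from Q -- no model needed.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s_next, r, done = step(s, a)
        q_update(s, a, r, s_next, done)
        s = s_next

# Utilities recovered from Q, per point 2: U(s) = max_a Q(s, a).
U = {s: max(Q[(s, a)] for a in ACTIONS) for s in (0, 1)}
print(U)
```

Note that nothing in the loop consults `step`'s internals; the agent treats the environment as a black box, which is exactly what "model-free" means here.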