Machines learning to play games by watching humans might sound like the plot of a science fiction novel, but that is exactly what researchers at OpenAI, a non-profit San Francisco-based artificial intelligence research company backed by Elon Musk, Peter Thiel, and other tech luminaries, and Google subsidiary DeepMind claim to have accomplished.
The research was submitted to the Neural Information Processing Systems (NeurIPS) conference, which is scheduled to take place in Canada in the first week of December.
“To solve complex real-world problems with reinforcement learning, we cannot rely on manually specified reward functions,” the team wrote. “Instead, we can have humans communicate an objective to the agent directly.”
The technique, which has been referred to in prior research as “inverse reinforcement learning,” also holds promise for tasks that involve poorly defined objectives, which tend to trip up artificially intelligent systems.
The game-playing AI agents the researchers created did not merely mimic human behavior. If they had, they would not have been particularly scalable, because they would have required a human expert to teach them how to perform specific tasks and would never have been able to achieve “significantly” better performance than those experts.
The model consists of two parts: a deep Q-learning network, which DeepMind used in prior research to achieve superhuman performance in Atari 2600 games, and a reward model, a convolutional neural network trained on labels supplied by an annotator, either a human or a synthetic system, during task training.
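To make the two-part design concrete, here is a minimal Python sketch of the idea, not the researchers' implementation. It rests on several assumptions that are not in the article: a tiny six-state "chain world" stands in for an Atari game, a linear model stands in for the convolutional reward network, tabular Q-learning stands in for the deep Q-network, and the annotator's labels are assumed to be pairwise preferences fitted with a Bradley-Terry (logistic) loss. Every function and variable name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 6          # hypothetical chain world: states 0..5, goal at the right end
ACTIONS = (-1, +1)    # action 0 moves left, action 1 moves right


def true_reward(state):
    """Hidden objective, known only to the annotator (an assumption of this toy)."""
    return 1.0 if state == N_STATES - 1 else 0.0


def features(state):
    """One-hot state features; a linear stand-in for a CNN over game frames."""
    x = np.zeros(N_STATES)
    x[state] = 1.0
    return x


def synthetic_annotator(s_a, s_b):
    """Return 1.0 if state a is preferred, 0.0 if b is, 0.5 if indifferent."""
    r_a, r_b = true_reward(s_a), true_reward(s_b)
    return 0.5 if r_a == r_b else float(r_a > r_b)


def train_reward_model(n_pairs=2000, lr=0.5):
    """Fit linear reward weights from pairwise preference labels."""
    w = np.zeros(N_STATES)
    for _ in range(n_pairs):
        s_a, s_b = rng.integers(N_STATES, size=2)
        label = synthetic_annotator(s_a, s_b)
        diff = features(s_a) - features(s_b)
        p_a = 1.0 / (1.0 + np.exp(-(w @ diff)))  # model's P(a preferred over b)
        w += lr * (label - p_a) * diff           # gradient step on cross-entropy
    return w


def step(state, action_idx):
    """Move along the chain, clipping at both ends."""
    return int(np.clip(state + ACTIONS[action_idx], 0, N_STATES - 1))


def q_learning(reward_weights, episodes=300, gamma=0.9, alpha=0.5, eps=0.2):
    """Tabular Q-learning that only ever sees the learned reward."""
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(episodes):
        state = 0
        for _ in range(20):
            if rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))   # explore
            else:
                a = int(np.argmax(q[state]))          # exploit
            nxt = step(state, a)
            r = reward_weights @ features(nxt)        # predicted, not true, reward
            q[state, a] += alpha * (r + gamma * q[nxt].max() - q[state, a])
            state = nxt
    return q


reward_weights = train_reward_model()
q_table = q_learning(reward_weights)
greedy_policy = q_table.argmax(axis=1)  # action index chosen in each state
```

The point of the sketch is that the agent never reads `true_reward` directly. It only sees the reward model's predictions, which were themselves learned from the annotator's labels, so the objective reaches the agent through feedback rather than through a hand-written reward function.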