Hugo (@robonaissance)

Christopher Watkins was at Philips Research Labs in Surrey, working on expert systems. He had left Cambridge two years earlier without finishing his PhD. He had no paper to present at the workshop, and no specific agenda.

During one of the talks at UC Irvine, he was bored. He interrupted the speaker to ask whether anyone in the room had done any work on animal learning. The speaker said, without pausing, that it had all been done already, and continued his presentation.

In the coffee break that followed, Richard Sutton came over and introduced himself. He had liked the question. He gave Watkins reprints of his recent papers, including a 1983 paper, co-authored with Andrew Barto, on a neuronlike system that learned to balance a pole on a cart.

Watkins read the paper on the flight back to London. And then again. And then again. Two years later, the algorithm he developed had a one-letter name. The letter was Q.

Robonaissance

The Journey of RL, Part 2: The Value Hypothesis

Jun 9

4:56 PM
