So his approach was to take a list of 50 key deep reinforcement learning papers, read one or two a day, and pick a handful to actually reproduce.
He spent a bunch of time coding in Python and TensorFlow, sometimes 12 hours a day, trying to debug and tune things until they were actually working.
writing TensorFlow, starting a virtual machine with a GPU, etc.).
focusing on well-known and established ML algorithms is probably better for your learning.
What you do need is to get your hands dirty implementing and debugging ML algorithms, and to build evidence for job interviews that you have some experience doing this.
The most straightforward way to gain this experience is to choose a subfield of ML relevant to a lab you’re interested in. Then read a few dozen of the subfield’s key papers and reimplement a few of the foundational algorithms that the papers are based on or reference most frequently.
@@ part2
They got through about 20-30 of those papers, spending maybe 1.5 hours independently reading and half an hour discussing each paper.
they implemented a handful of the key algorithms in TensorFlow:
They spent 2-10 days on each algorithm (in parallel as experiments ran), depending on how in-depth they wanted to go.
Once the algorithm was partially working, they would attain higher performance by looking for remaining bugs, both by reviewing the code carefully and by collecting metrics such as average policy entropy to perform sanity checks,
finally, when they wanted to match the performance of Baselines, they scrutinized the Baselines implementations for small but important details, such as exactly how to preprocess and normalize observations.
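To make the two debugging aids mentioned above more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not a reproduction of the Baselines code or of what they actually wrote: the names `policy_entropy`, `mean_policy_entropy`, and `RunningNormalizer` are hypothetical, the policy is assumed to be a categorical distribution over discrete actions, and observations are assumed to be scalars normalized with a running mean and variance.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of one categorical action distribution."""
    # Zero-probability actions contribute nothing, so skip them to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def mean_policy_entropy(batch_probs):
    """Average entropy over a batch of per-state action distributions."""
    return sum(policy_entropy(p) for p in batch_probs) / len(batch_probs)


class RunningNormalizer:
    """Hypothetical helper: running mean/variance of scalar observations.

    Uses Welford's online update. This is one common normalization scheme,
    assumed here for illustration only.
    """

    def __init__(self, eps=1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def normalize(self, x):
        # Population variance; fall back to 1.0 before any data is seen.
        var = self.m2 / self.count if self.count > 0 else 1.0
        return (x - self.mean) / math.sqrt(var + self.eps)


# A uniform policy over 4 actions has the maximum possible entropy, log(4).
uniform_entropy = policy_entropy([0.25, 0.25, 0.25, 0.25])

norm = RunningNormalizer()
for obs in (1.0, 2.0, 3.0):
    norm.update(obs)
centered = norm.normalize(2.0)  # the running mean is 2.0, so this is 0.0
```

Logged over training, an average entropy that drops to zero almost immediately, or that never moves from its maximum, would be a hint that something is wrong before any hyperparameter tuning is attempted.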
going from math in a paper to running code.
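As a small illustration of that step, consider the discounted return used throughout reinforcement learning, defined recursively as G_t = r_t + gamma * G_{t+1}. This formula is chosen here as an example and is not taken from the papers discussed; the function name `discounted_returns` and the default `gamma` are assumptions. A minimal sketch of turning the recursion into code:

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0  # G_{t+1}; zero past the end of the episode
    # Walk backwards so each G_t can reuse the already computed G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


example = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
# example == [1.75, 1.5, 1.0]
```

Even in a case this small, the code has to settle details the equation leaves implicit, such as the value of the return after the final step and the direction of the loop.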