Doesn't seem to converge on the Pong env. #1

tomchen1000 · 2018-05-10T17:28:35Z

I ran it for more than 18000 episodes on the Pong environment. The score doesn't converge to a high score.

tdavchev · 2018-05-14T16:39:48Z

Hi, thanks, I have noticed similar behaviour before. It would be great if you find a fix. Otherwise, I will try to fix it at the first chance.

Wongcheukwai · 2018-09-03T09:49:09Z

Have you found a fix yet?

GoingMyWay · 2019-10-21T01:11:18Z

@yadrimz @tomchen1000

In main.py, I found

    tot_reward = tf.Variable(0.)
    tf.summary.scalar("DOCA/Total Reward", tot_reward)
    cum_reward = tf.Variable(0.)
    tf.summary.scalar("DOCA/Cummulative Reward", tot_reward)

I think tf.summary.scalar("DOCA/Cummulative Reward", tot_reward) should be changed to

    tf.summary.scalar("DOCA/Cummulative Reward", cum_reward)
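
To illustrate what the mis-wired summary does, here is a minimal sketch in plain Python, not TensorFlow. The `Var` class, `make_summaries`, and `log_episodes` are hypothetical stand-ins, and the update rule (tot_reward holds the latest episode reward, cum_reward holds a running sum) is an assumption, since the thread does not show how main.py updates these variables.

```python
# Hypothetical stand-in for the summary wiring quoted from main.py.
# This is NOT TensorFlow code; it only mimics which variable each tag reads.

class Var:
    """Mutable scalar, playing the role of tf.Variable(0.)."""
    def __init__(self, value=0.0):
        self.value = value


def make_summaries(fixed):
    """Map each summary tag to the variable it reads from."""
    tot_reward = Var()
    cum_reward = Var()
    summaries = {
        "DOCA/Total Reward": tot_reward,
        # Quoted code passes tot_reward here; the proposed fix passes cum_reward.
        "DOCA/Cummulative Reward": cum_reward if fixed else tot_reward,
    }
    return tot_reward, cum_reward, summaries


def log_episodes(episode_rewards, fixed):
    """Record both tags after every episode (assumed update rule)."""
    tot_reward, cum_reward, summaries = make_summaries(fixed)
    history = []
    for reward in episode_rewards:
        tot_reward.value = reward      # assumption: latest episode reward
        cum_reward.value += reward     # assumption: running sum over episodes
        history.append({tag: var.value for tag, var in summaries.items()})
    return history


rewards = [-21.0, -20.0, -19.0]
buggy = log_episodes(rewards, fixed=False)
fixed = log_episodes(rewards, fixed=True)
print(buggy[-1])  # both tags show the same value
print(fixed[-1])  # the cumulative tag now shows the running sum
```

With the quoted wiring, both tags log the same number, so the "Cummulative Reward" curve just mirrors "Total Reward". A summary op only records values for TensorBoard, so this change would be expected to correct the logged curve rather than the training behaviour.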

aayushee · 2022-03-05T02:56:43Z

Hello, I tried to run the model with different environments, but it is not converging at all. I made the above change to the cumulative reward summary, but that doesn't help.
Any idea which environment this model works on, and for how many episodes?

tdavchev added the bug label May 14, 2018
