Doesn't seem to converge on the Pong env. #1

Open
tomchen1000 opened this issue May 10, 2018 · 4 comments

tomchen1000 commented May 10, 2018

I ran it for more than 18,000 episodes on the Pong environment, but the score doesn't converge to a high value.

@tdavchev tdavchev added the bug label May 14, 2018
@tdavchev
Owner

Hi, thanks. I have noticed similar behaviour before. It would be great if you could find a fix. Otherwise, I will try to fix it at the first chance I get.

@Wongcheukwai

Have you found a fix yet?

GoingMyWay commented Oct 21, 2019

@yadrimz @tomchen1000

In main.py, I found

    tot_reward = tf.Variable(0.)
    tf.summary.scalar("DOCA/Total Reward", tot_reward)
    cum_reward = tf.Variable(0.)
    tf.summary.scalar("DOCA/Cummulative Reward", tot_reward)

I think the second summary is passed the wrong variable. tf.summary.scalar("DOCA/Cummulative Reward", tot_reward) should be changed to

    tf.summary.scalar("DOCA/Cummulative Reward", cum_reward)
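
To make the effect of that one-argument change concrete, here is a minimal sketch in plain Python with no TensorFlow dependency. It is not the code from main.py: the helpers `build_summaries` and `write_summaries` are hypothetical stand-ins for the summary wiring, and the reward numbers are made up. It only illustrates that, before the change, both tags report the same variable. Since a scalar summary only records a value for display, a change like this should affect what is plotted rather than how the agent trains.

```python
# Hypothetical stand-in for the two scalar summaries quoted above.
# Each tag maps to the name of the variable it reads at logging time.

def build_summaries(fixed):
    """Return {tag: variable name}, wired as before (fixed=False) or after the change."""
    return {
        "DOCA/Total Reward": "tot_reward",
        "DOCA/Cummulative Reward": "cum_reward" if fixed else "tot_reward",
    }


def write_summaries(summaries, values):
    """Look up the current value behind every tag, like a single logging step."""
    return {tag: values[var] for tag, var in summaries.items()}


# Made-up values, for illustration only.
values = {"tot_reward": -21.0, "cum_reward": -380.0}

before = write_summaries(build_summaries(fixed=False), values)
after = write_summaries(build_summaries(fixed=True), values)

print(before)  # both tags report tot_reward
print(after)   # each tag reports its own variable
```

If the two curves in TensorBoard look identical before the change, that would be consistent with this wiring mistake, but it says nothing on its own about why the score fails to improve.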

aayushee commented Mar 5, 2022

Hello, I tried to run the model with different environments, but it is not converging at all. I made the cumulative reward change above, but that doesn't help.
Any idea which environment this model works on, and for how many episodes?
