Questions tagged [reward]

Use this tag in the context of reward functions for machine learning and especially reinforcement learning.

66 questions
1
vote
1 answer

Can contextual bandit rewards be changed over time?

I am working on implementing a contextual bandit with Vowpal Wabbit for dynamic pricing where arms represent price margins. The cost/reward is determined by taking price – expected cost. Cost is not known initially so it is a prediction and has the…
aab
  • 11
  • 2
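
As a rough illustration of the setup this question describes, the sketch below computes the reward as price minus expected cost and writes it as a line in Vowpal Wabbit's contextual bandit text format (`action:cost:probability | features`). VW minimizes cost, so the reward is negated. The helper names, feature names, and numbers are hypothetical and not taken from the question.

```python
def margin_reward(price: float, expected_cost: float) -> float:
    """Reward as described in the excerpt: price minus expected cost."""
    return price - expected_cost


def to_vw_cb_line(action: int, reward: float, probability: float, features: dict) -> str:
    """Format one logged interaction as a VW contextual bandit text line.

    VW learns from costs, so the reward is negated. Actions are 1-indexed.
    """
    cost = -reward
    feats = " ".join(f"{name}:{value}" for name, value in features.items())
    return f"{action}:{cost}:{probability} | {feats}"


# Hypothetical example: arm 2 (a price margin) was chosen with probability 0.25.
line = to_vw_cb_line(
    action=2,
    reward=margin_reward(price=12.0, expected_cost=9.5),
    probability=0.25,
    features={"demand": 0.8, "weekday": 3},
)
print(line)
```

Because the cost here is only a prediction, the same helper could be called again once the actual cost is known. Whether relabeling already logged examples is appropriate is the subject of the question itself.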
1
vote
2 answers

How to prevent my reward sum received during evaluation runs repeating in intervals when using RLlib?

I am using Ray 1.3.0 (for RLlib) with a combination of SUMO version 1.9.2 for the simulation of a multi-agent scenario. I have configured RLlib to use a single PPO network that is commonly updated/used by all N agents. My evaluation settings look…
hridayns
  • 697
  • 8
  • 16
1
vote
0 answers

Understanding the reward functionality in Reinforcement learning (Atari Breakout)

I'm trying to understand the reward functionality in Atari Breakout as implemented by DeepMind. I'm a little confused about the reward. They represent every state using four frames and depending on that the reward for every action will be received…
jon
  • 11
  • 4
1
vote
0 answers

Is the reward related to previous state or next state?

In the reinforcement learning framework, I am a little bit confused about the reward and how it is related to states. For example, in Q-learning, we have the following formula for updating the Q table: that means that the reward is obtained from…
MadMage
  • 186
  • 1
  • 7
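
The formula is cut off in the excerpt above. For orientation, here is a minimal sketch of the standard tabular Q-learning update, in which the reward `reward` is the one observed after taking `action` in `state`, on the transition to `next_state`. The function name, table layout, and default values of `alpha` and `gamma` are assumptions made for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step in place and return the table.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q


# Hypothetical table with 3 states and 2 actions, initialised to zero.
q = [[0.0, 0.0] for _ in range(3)]
q_update(q, state=0, action=1, reward=1.0, next_state=2)
print(q[0][1])
```

In this form the reward belongs to the transition (state, action, next state) rather than to either state alone.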
1
vote
0 answers

What is the best way to deal with an imbalanced sample database with rewards?

I am looking for a way to train a DNNClassifier (4 classes, 20 numeric features) from an imbalanced data file of rewarded samples. Each class represents a game action, and the reward is the action's score. The features are the given observations. So it looks like Q-learning…
GerardL
  • 81
  • 7
1
vote
1 answer

How to train a bad reward with a classifying Neural Net?

I am trying to train a Neural Net on playing Tic Tac Toe via Reinforcement Learning with Keras, Python. Currently the Net gets an Input of the current board: array([0,1,0,-1,0,1,0,0,0]), where 1 = X, -1 = O, and 0 = an empty field. If the Net won a game it…
nailuj05
  • 37
  • 7
1
vote
0 answers

Why do my rewards converge but still show a lot of variation?

I am training a reinforcement learning agent on an episodic task of fixed episode length. I am tracking the training process by plotting the cumulative rewards over an episode. I am using tensorboard for plotting the rewards. I have trained my agent…
1
vote
1 answer

Custom environment Gym for step function processing with DDPG Agent

I'm new to reinforcement learning, and I would like to process audio signals using this technique. I built a basic step function that I wish to flatten, to get hands-on experience with OpenAI Gym and reinforcement learning in general. To do so, I am using the…
Post. T.
  • 79
  • 5
1
vote
1 answer

Discounted rewards in basic reinforcement learning

I'm wondering how discounting rewards for reinforcement learning actually works. I believe the idea is that rewards later in an episode get weighted heavier than early rewards. That makes perfect sense to me. I'm having a hard time understanding how…
Perks
  • 11
  • 2
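
To make the mechanics behind this question concrete, here is a minimal sketch of how discounted returns are commonly computed for one episode. With a discount factor `gamma` below 1, a reward that arrives k steps after a given time step is scaled by `gamma ** k`, so rewards further in the future contribute less to that step's return. The function name and the sample rewards are hypothetical.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each return reuses the one computed for the next step.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical three-step episode with a reward of 1.0 at every step.
returns = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
print(returns)
```

Iterating from the last step to the first keeps the computation linear in the episode length.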
1
vote
1 answer

Keras Reinforcement Learning: How to pass reward to the model

import numpy as np import gym from gym import wrappers # added from keras.models import Sequential from keras.layers import Dense, Activation, Flatten from keras.optimizers import Adam from rl.agents.dqn import DQNAgent from rl.policy import…
leppy
  • 49
  • 4
1
vote
0 answers

Canvas problems. Not able to reproduce design

I need to build a canvas animation that matches a required design. I have spent almost 3 days on it but have not been able to produce anything like the design. Here is the REQUESTED design! And here is what I've got for now: current implementation, which is definitely not what was requested from…
MR.QUESTION
  • 359
  • 2
  • 9
1
vote
1 answer

WebView remote site and reward videos

I have a simple game developed in PHP. I have loaded the remote site in an Android WebView. When the user clicks the FREE life button on my remote PHP site, I want to start a reward video in my Android app. But how can I…
Sam
  • 2,972
  • 6
  • 34
  • 62
1
vote
0 answers

Rewarded Videos - Time Left Counter

I would like to ask a question about rewarded videos in Android. I have set the rewarded videos to show once per hour, which means every user can watch one video per hour. My question is, what is the USER? What is Google actually tracking? Is it…
1
vote
1 answer

Android app coding error

I downloaded some source code and found errors I cannot fix on my own. They are listed below. MainActivity.java package com.droidoxy.pocket; I also got an error in import android.transition.*; private void checkReadPhoneStatePermission() { if…
user7197369
1
vote
2 answers

How do I implement AdMob rewarded ads in Unity?

using UnityEngine; using System.Collections; using GoogleMobileAds; using GoogleMobileAds.Api; using UnityEngine.Advertisements; public class GameAdvertising : MonoBehaviour { public RewardBasedVideoAd rewardBasedVideo; bool hasPlayed; …
Physix
  • 27
  • 4
  • 9