Highest Voted 'reward' Questions

1

vote

1 answer

Can contextual bandit rewards be changed over time?

I am working on implementing a contextual bandit with Vowpal Wabbit for dynamic pricing where arms represent price margins. The cost/reward is determined by taking price – expected cost. Cost is not known initially so it is a prediction and has the…

asked Dec 29 '21 at 16:40

aab

11
2

1

vote

2 answers

How to prevent my reward sum received during evaluation runs repeating in intervals when using RLlib?

I am using Ray 1.3.0 (for RLlib) with a combination of SUMO version 1.9.2 for the simulation of a multi-agent scenario. I have configured RLlib to use a single PPO network that is commonly updated/used by all N agents. My evaluation settings look…

reinforcement-learning ray multi-agent reward rllib

asked Jun 21 '21 at 15:08

hridayns

697
8
16

1

vote

0 answers

Understanding the reward functionality in Reinforcement learning (Atari Breakout)

I'm trying to understand the reward functionality in Atari Breakout as implemented by DeepMind. I'm a little confused about the reward. They represent every state using four frames and depending on that the reward for every action will be received…

reinforcement-learning dqn reward

asked Mar 04 '21 at 14:34

jon

11
4

1

vote

0 answers

Is the reward related to previous state or next state?

In the reinforcement learning framework, I am a little bit confused about the reward and how it is related to states. For example, in Q-learning, we have the following formula for updating the Q table: that means that the reward is obtained from…

reinforcement-learning q-learning reward

asked Jan 03 '21 at 16:46

MadMage

186
1
7
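
For the Q-learning question above, a small sketch may help. In the standard tabular update, the reward `r` is the one observed on the transition from state `s` to the next state `s_next` after taking action `a`. The helper name `q_update`, the table shape, and the values of `alpha` and `gamma` are assumptions chosen for illustration; none of them come from the question.

```python
import numpy as np

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    r is the reward received for taking action a in state s and
    landing in s_next (hypothetical helper, for illustration only).
    """
    td_target = r + gamma * np.max(q[s_next])  # bootstrap from the next state
    td_error = td_target - q[s, a]
    q[s, a] += alpha * td_error
    return td_error

# Toy table: 3 states, 2 actions, all values start at zero (assumed sizes).
q = np.zeros((3, 2))
td_error = q_update(q, s=0, a=1, r=1.0, s_next=2)
```

Under this convention the reward belongs to the transition as a whole: it is stored alongside `(s, a, s_next)` and used to update the value of the state-action pair that was just left.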

1

vote

0 answers

What is the best way to deal with an imbalanced sample database with rewards?

I am looking for a way to train a DNNClassifier (4 classes, 20 numeric features) from a data file of imbalanced, rewarded samples. Each class represents a game action, and the reward is the action's score. The features are the given observations. So it looks like Q-learning…

tensorflow weighted q-learning reward

asked Jan 23 '20 at 16:34

GerardL

81
7

1

vote

1 answer

How to train a bad reward with a classifying Neural Net?

I am trying to train a Neural Net on playing Tic Tac Toe via Reinforcement Learning with Keras, Python. Currently the Net gets an Input of the current board: array([0,1,0,-1,0,1,0,0,0]), where 1 = X, -1 = O, and 0 = an empty field. If the Net won a game it…

python keras reinforcement-learning reward

asked Jan 04 '20 at 15:13

nailuj05

37
7

1

vote

0 answers

Why are my rewards converging but still showing a lot of variation?

I am training a reinforcement learning agent on an episodic task of fixed episode length. I am tracking the training process by plotting the cumulative rewards over an episode. I am using tensorboard for plotting the rewards. I have trained my agent…

artificial-intelligence reinforcement-learning convergence reward

asked Nov 29 '19 at 10:30

chink

1,505
3
28
70

1

vote

1 answer

Custom environment Gym for step function processing with DDPG Agent

I'm new to reinforcement learning, and I would like to process audio signals using this technique. I built a basic step function that I wish to flatten to get my hands on OpenAI Gym and reinforcement learning in general. To do so, I am using the…

reinforcement-learning openai-gym reward

asked Jul 08 '19 at 08:32

Post. T.

79
5

1

vote

1 answer

Discounted rewards in basic reinforcement learning

I'm wondering how discounting rewards for reinforcement learning actually works. I believe the idea is that rewards later in an episode get weighted heavier than early rewards. That makes perfect sense to me. I'm having a hard time understanding how…

python reinforcement-learning reward

asked Apr 21 '19 at 01:12

Perks

11
2
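
As a rough illustration for the discounting question above: in the standard formulation, a reward received k steps in the future is multiplied by gamma^k, so with gamma below 1 rewards further from the current step contribute less to that step's return. The function name `discounted_returns` and the sample values are assumptions made for this sketch, not taken from the question.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Three steps with reward 1.0 each and an assumed gamma of 0.5.
example = discounted_returns([1.0, 1.0, 1.0], gamma=0.5)
print(example)  # [1.75, 1.5, 1.0]
```

The backward pass keeps the computation linear in the episode length, which is why many policy-gradient examples compute returns this way after an episode ends.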

1

vote

1 answer

Keras Reinforcement Learning: How to pass reward to the model

import numpy as np import gym from gym import wrappers # added from keras.models import Sequential from keras.layers import Dense, Activation, Flatten from keras.optimizers import Adam from rl.agents.dqn import DQNAgent from rl.policy import…

keras reinforcement-learning reward keras-rl

asked Jun 12 '18 at 05:22

leppy

49
4

1

vote

0 answers

Canvas problems. Not able to reproduce design

I need to build a canvas animation that matches the required design. I have spent almost 3 days on it, but I have not been able to produce anything like the design. Here is the REQUESTED design. And here is what I've got so far: the current implementation, which is definitely not what was requested from…

javascript canvas reward

asked Jan 25 '18 at 18:09

MR.QUESTION

359
2
9

1

vote

1 answer

WebView remote site and reward videos

I have a simple game developed in PHP. I have loaded the remote site in an Android WebView. When the user clicks a FREE life button on my remote PHP site, I want to start a reward video in my Android app. But how can I…

android admob android-webview reward

asked Jan 23 '18 at 14:17

Sam

2,972
6
34
62

1

vote

0 answers

Rewarded Videos - Time Left Counter

I would like to ask a question about rewarded videos in Android. I have set the rewarded videos to show once per hour, which means every user can watch one video per hour. My question is, what is the USER? What is Google actually tracking? Is it…

android video time reward

asked Nov 10 '17 at 18:04

Filjan Kishija

11
2

1

vote

1 answer

Android app coding error

I downloaded some source code and saw some errors I cannot fix on my own; they are listed below. MainActivity.java: package com.droidoxy.pocket; I also got an error in import android.transition.*; private void checkReadPhoneStatePermission() { if…

java pocket reward

asked Nov 23 '16 at 00:48

user7197369

1

vote

2 answers

How do I implement AdMob rewarded ads in Unity?

using UnityEngine; using System.Collections; using GoogleMobileAds; using GoogleMobileAds.Api; using UnityEngine.Advertisements; public class GameAdvertising : MonoBehaviour { public RewardBasedVideoAd rewardBasedVideo; bool hasPlayed; …

android unity-game-engine admob ads reward

asked May 16 '16 at 21:16

Physix

27
4
9

Questions tagged [reward]