I'm a novice experimenting with machine learning. I saw this repo https://github.com/jbp261/Optimal-Classification-Model-of-BLE-RSSI-Dataset and wanted to replicate a similar experiment.

I have 2 receivers and want to classify which one a given set of RSSI values is closest to. I captured some training data and defined area 0 (near beacon 1) and area 1 (near beacon 2).

I built a model with Keras (I also tried a RandomForest, which works fine), but even when evaluating on the same training data, where the reported accuracy is 0.8, I get about 50% wrong predictions.

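import pandas as pd
import tensorflow as tf
from tensorflow import feature_column
from tensorflow.keras import layers

batch_size = 100

# read the input samples and separate the inputs from the outputs
# (raw string so that "\a" in the path is not treated as an escape sequence)
dataframe = pd.read_csv(r"C:\aaa\Log.csv")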
labels = dataframe.pop('result')

#creating the dataset from the data
ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
ds = ds.batch(batch_size)

feature_columns = []
headers = dataframe.columns.tolist()

# numeric cols
for header in headers:
  temp = feature_column.numeric_column(header)
  #feature_columns.append(feature_column.bucketized_column(temp, boundaries=[-70, -60, -50, -40 , -30])) tried also this
  feature_columns.append(temp)

feature_layer = tf.keras.layers.DenseFeatures(feature_columns)

model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dense(2, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(ds, epochs=20)


test_ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
test_ds = test_ds.batch(batch_size)

loss, accuracy = model.evaluate(test_ds)
print("Accuracy", accuracy)

2 Answers


In model.fit(), add some validation data. The simplest way is validation_split=0.5, or whatever fraction you want to hold out. This separates part of your data from the training data and uses it only at the end of each epoch, to see how the network performs on data it has never seen before. You will then see loss and accuracy alongside the validation loss and validation accuracy. The latter two better reflect how the model will perform in actual use.

Once you start watching the validation metrics, you can see whether you're overfitting, and whether changes you make to the network are actually helping.
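As a rough illustration of what holding out validation data amounts to, here is a minimal plain-Python sketch. It assumes the behaviour Keras documents for validation_split, namely that the last fraction of the samples is held out before any shuffling. The helper name `holdout_split` and the RSSI readings are made up for this example. Note also that, as far as I know, validation_split is not accepted when the input to fit() is a tf.data.Dataset, as in the question, so there the split would have to be done on the data before building the dataset.

```python
def holdout_split(features, labels, val_fraction=0.5):
    """Hold out the last `val_fraction` of the samples for validation.

    Hypothetical helper for illustration; it is not part of Keras.
    """
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    # Index of the first validation sample.
    split_at = int(len(features) * (1.0 - val_fraction))
    train = (features[:split_at], labels[:split_at])
    val = (features[split_at:], labels[split_at:])
    return train, val


# Made-up RSSI readings (receiver 1, receiver 2) and area labels, for illustration only.
rssi = [(-40, -70), (-45, -68), (-72, -41), (-69, -39), (-42, -66), (-71, -44)]
area = [0, 0, 1, 1, 0, 1]

(train_x, train_y), (val_x, val_y) = holdout_split(rssi, area, val_fraction=0.5)
print(len(train_x), "training samples,", len(val_x), "validation samples")
```

Because the held-out part is simply the tail of the data, a file that is sorted by label would give a validation set containing only one class, so shuffling the rows before splitting is usually worth doing.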

TheLoneDeranger

I think you want two outputs that are regression values.

In that case, try relu as the output activation and mean_squared_error as the loss.

model = tf.keras.Sequential([
  feature_layer,
  layers.Dense(128, activation='relu'),
  layers.Dense(128, activation='relu'),
  layers.Dense(2, activation='relu')
])

model.compile(optimizer='adam',
              loss='mean_squared_error',
              metrics=['accuracy'])
yaho cho
  • It's not really working well. I got 0.45 accuracy, while with a sigmoid output, sparse_categorical_crossentropy, and the sgd optimizer I get 0.85 – Kondor Kondorowski May 23 '19 at 07:26
  • @KondorKondorowski Yes. I had no information about the `Rssi` values you described as the output, so I am not sure whether my understanding was correct. Anyway, if the output is 0 or 1, it is not a regression problem. `sigmoid` and `sparse_categorical_crossentropy` are correct. Is there more information beyond the link you mentioned? – yaho cho May 23 '19 at 08:07
  • @KondorKondorowski I opened the CSV file. Is `Location` the label? Is it the output? – yaho cho May 23 '19 at 08:12
  • Yes, Location is the output. In my case I have only 2 possible values (layers.Dense(2)), but this number might vary depending on how many distinct location values there are – Kondor Kondorowski May 23 '19 at 08:38
  • @KondorKondorowski `Location` is just one field. Do you mean the 2 possible values are the input? – yaho cho May 23 '19 at 08:48
  • No, I mean that it can take only two possible values, which I mapped to 0 or 1, while in that CSV it has many more (O02, P01, P02 ...) – Kondor Kondorowski May 23 '19 at 09:26
  • @KondorKondorowski OK. But I still don't understand how you are going to map the `(O02 , P01, P02 ... )` values to `0 or 1`. Anyway, `sigmoid and crossentropy` can be a correct activation and loss function if your output is `0 or 1`. I assume you set `1` for a labeled location and `0` for an unlabeled location. I am going to try to get a higher accuracy and evaluation score than you got with `random forest`. I will get back to you later. – yaho cho May 23 '19 at 09:56
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/193831/discussion-between-kondor-kondorowski-and-yaho-cho). – Kondor Kondorowski May 23 '19 at 12:34
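
To make the loss discussion in the comments above a little more concrete: with integer labels 0 or 1 and a two-unit output layer, sparse categorical cross-entropy is the negative log of the probability the model assigns to the true class, and the predicted class is the index of the largest output. The plain-Python sketch below assumes the outputs are already class probabilities that sum to 1 (for example from a softmax). The output values are made up, and both function names are hypothetical helpers, not Keras calls.

```python
import math


def sparse_categorical_crossentropy(probs, label):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(probs[label])


def predicted_class(probs):
    """Index of the largest output, i.e. the argmax over classes."""
    return max(range(len(probs)), key=lambda i: probs[i])


# Made-up model outputs for three samples (probability of area 0, area 1).
outputs = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4]]
labels = [0, 1, 1]  # integer class labels, as in the 'result' column

losses = [sparse_categorical_crossentropy(p, y) for p, y in zip(outputs, labels)]
preds = [predicted_class(p) for p in outputs]
accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)

print("per-sample losses:", [round(l, 4) for l in losses])
print("predicted classes:", preds)
print("accuracy:", round(accuracy, 4))
```

Computing accuracy by hand like this, from the argmax of the model's predictions, is one way to check whether the accuracy reported during training measures the same thing as the per-sample predictions being inspected.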