1) Too many linearly dependent attributes is not good, as they may introduce noise that drowns out the informative attributes. If your samples are something like IR spectra of a gas at different temperatures, it is better to use PCA (or some other dimensionality reduction algorithm) to reduce the data to only the most informative components; see the sketch below.
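A minimal sketch with base R's prcomp(), using the iris measurements as stand-in data (with real spectra you would pass your spectral matrix instead); the 95% variance cutoff is just an illustrative choice:

# numeric attributes as a matrix (stand-in for a spectral matrix)
x <- as.matrix(iris[, 1:4])

# prcomp() centers the data; scale. = TRUE also standardizes each attribute
pca <- prcomp(x, center = TRUE, scale. = TRUE)

# proportion of variance explained by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# keep only the components needed to explain, say, 95% of the variance
k <- which(cumsum(var_explained) >= 0.95)[1]
x_reduced <- pca$x[, 1:k, drop = FALSE]
dim(x_reduced)  # n rows, k informative components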
2) The choice of activation function depends on the structure of the NN as well as on its task. ReLU, for example, is very "trendy" now. See the code below, which classifies the iris data set with the keras library; the layers use different activation functions.
library(keras)

# shuffle the rows of the iris data set
train <- iris[sample(nrow(iris)), ]
y <- train[, "Species"]
x <- train[, 1:4]

# min-max scale every attribute to [0, 1]
x <- as.matrix(apply(x, 2, function(x) (x - min(x)) / (max(x) - min(x))))

# one-hot encode the three species (as.integer() on a factor
# returns the level codes 1..3, so no relabeling is needed)
y <- to_categorical(as.integer(y) - 1, num_classes = 3)

model <- keras_model_sequential()

# add layers and activation functions:
# ReLU in the hidden layers, softmax in the output layer
model %>%
  layer_dense(input_shape = ncol(x), units = 10, activation = "relu") %>%
  layer_dense(units = 10, activation = "relu") %>%
  layer_dense(units = 3, activation = "softmax")

model %>%
  compile(
    loss = "categorical_crossentropy",
    optimizer = "adagrad",
    metrics = "accuracy"
  )

fit <- model %>%
  fit(
    x = x,
    y = y,
    shuffle = TRUE,
    batch_size = 5,
    validation_split = 0.3,
    epochs = 150
  )
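As a quick follow-up (assuming the code above has run), you can inspect the training history and score the model:

# training/validation curves for loss and accuracy
plot(fit)

# loss and accuracy on the full data set
model %>% evaluate(x, y)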