I am working through a Python book, but using Julia instead in order to learn the language, and I have come upon another area where I am not quite clear. Things worked on the simpler exercises, but when I start tossing more complex matrices at it, it falls apart.
include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")
coords, color = spiral_data(100, 3)
dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)
forward(dense1, coords)
println("Forward 1 layer")
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
println("Forward 2 layer")
activated_output2 = softmax_activation(dense2.output)
println("\n", activated_output2)
I get a matrix of the right shape back:
julia> activated_output2
300×3 Matrix{Float64}:
 0.00333346  0.00333337  0.00333335
 0.00333345  0.00333337  0.00333335
 0.00333345  0.00333336  0.00333335
 0.00333344  0.00333336  0.00333335
 0.00333343  0.00333336  0.00333334
 ⋮
 0.00333311  0.00333321  0.00333322
but the book has
>>>
[[0.33333 0.3333 0.3333]
...
Seems I am two orders of magnitude lower than the book, even when using FluxML's softmax function.
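One thing I notice: my 0.00333… is almost exactly 1/300, and the matrix has 300 rows, while the book's 0.333… is 1/3 for 3 classes. A quick sketch (mine, not from the book) of how the normalization axis produces exactly those two values for near-uniform scores:

x  = zeros(300, 3)                     # stand-in for dense2.output, which is close to zero
ex = exp.(x)                           # all ones

per_row    = ex ./ sum(ex; dims = 2)   # every entry 1/3   ≈ 0.333…   (normalize over the 3 classes)
per_column = ex ./ sum(ex; dims = 1)   # every entry 1/300 ≈ 0.00333… (normalize over the 300 samples)

So a softmax that normalizes down the wrong axis would give exactly my numbers.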
EDIT:
I thought maybe my ReLU activation code was causing the discrepancy, so I tried switching to the FluxML NNlib version, but I get the same activated_output2, with 0.0033333 instead of 0.333333. I will keep checking other parts, like my forward function.
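Roughly what I ran for the NNlib comparison (paraphrased from memory; relu and softmax here are NNlib's, not my own):

using NNlib   # FluxML's activation functions

forward(dense1, coords)
activated_output = relu.(dense1.output)      # elementwise ReLU via broadcasting
forward(dense2, activated_output)
activated_output2 = softmax(dense2.output)   # note: NNlib's softmax normalizes along dims = 1 by default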
EDIT2:
Adding my LayerDense implementation for completeness:
# see https://github.com/FluxML/Flux.jl/blob/b78a27b01c9629099adb059a98657b995760b617/src/layers/basic.jl#L71-L111
mutable struct LayerDense
    weights::Matrix{Float64}
    biases::Matrix{Float64}
    num_inputs::Integer
    num_neurons::Integer
    output::Matrix{Float64}   # left unassigned until forward is called

    LayerDense(num_inputs::Integer, num_neurons::Integer) =
        new(0.01 * randn(num_inputs, num_neurons),   # small random weights, num_inputs × num_neurons
            zeros(1, num_neurons),                   # one bias row, broadcast across samples
            num_inputs,
            num_neurons)
end

function forward(layer::LayerDense, inputs::Matrix{Float64})
    layer.output = inputs * layer.weights .+ layer.biases
end
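A quick shape check of the layer on its own (throwaway values, just to show the data flow):

layer = LayerDense(2, 3)
forward(layer, randn(5, 2))   # 5 samples with 2 features each
size(layer.output)            # (5, 3): one row of 3 neuron outputs per sample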
EDIT3:
Since using the library activations did not change anything, I started inspecting my spiral_data implementation. It seems within reason:
Python
import numpy as np
import nnfs
from nnfs.datasets import spiral_data
nnfs.init()
X, y = spiral_data(samples=100, classes=3)
print(X[:4])  # just check the first few rows
>>>
[[0. 0. ]
[0.00299556 0.00964661]
[0.01288097 0.01556285]
[0.02997479 0.0044481 ]]
JuliaLang
include("activation_function_exercise/spiral_data.jl")
coords, color = spiral_data(100, 3)
julia> coords
300×2 Matrix{Float64}:
  0.0         0.0
 -0.00133462  0.0100125
  0.00346739  0.0199022
 -0.00126302  0.0302767
  0.00184948  0.0403617
  0.0113095   0.0492225
  0.0397276   0.0457691
  0.0144484   0.0692151
  0.0181726   0.0787382
  0.0320308   0.0850793
  ⋮
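In case it helps, this is roughly what my spiral_data does (a sketch of a direct port of the nnfs version; my actual file may differ slightly):

# `samples` points per class on interleaved spirals; rows are samples,
# matching the 300×2 matrix above.
function spiral_data(samples::Integer, classes::Integer)
    X = zeros(samples * classes, 2)
    y = zeros(Int, samples * classes)
    for class_number in 0:(classes - 1)
        ix = (samples * class_number + 1):(samples * (class_number + 1))
        r = range(0.0, 1.0; length = samples)                    # radius grows outward
        t = range(class_number * 4.0, (class_number + 1) * 4.0;  # angle band per class
                  length = samples) .+ 0.2 .* randn(samples)     # plus some noise
        X[ix, 1] = r .* sin.(t .* 2.5)
        X[ix, 2] = r .* cos.(t .* 2.5)
        y[ix] .= class_number                                    # class label per sample
    end
    return X, y
end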