Hyperparameter | Applied Value |
---|---|
Hidden units | 64 |
Hidden layers | 2 |
Optimization algorithm | Stochastic Gradient Descent: |
 | (1) \(\hat{g} = \frac{1}{m}\nabla_\theta \displaystyle \sum_{i=1}^{m} L(x^{(i)}; \theta; y^{(i)})\) |
 | (2) \(\theta = \theta - \alpha \hat{g}\) |
 | Learning rate (\(\alpha\)): 0.01 |
 | Mini-batch size (\(m\)): 8 |
 | Loss function (\(L\)): \(\text{Mean Squared Error} = (y - \hat{y})^2\), averaged over the mini-batch in (1) |
Activation functions (g) | Hidden layers: \(\text{ReLU}(z) = \max(0, z)\) |
 | Output layer: \(\sigma(z) = \frac{1}{1 + e^{-z}}\) |
Epochs | Number of full training passes over the data set: 30 |
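The configuration above can be sketched end to end: two ReLU hidden layers of 64 units, a sigmoid output, and mini-batch SGD with \(\alpha = 0.01\), \(m = 8\), MSE loss, and 30 epochs. This is a minimal NumPy sketch, not the authors' implementation; the synthetic data, input dimensionality, and weight initialization are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary-labelled data (assumed; the paper's dataset
# and input dimensionality are not specified in the table).
n_features = 10
X = rng.normal(size=(200, n_features))
y = (X[:, 0] + X[:, 1] > 0).astype(float).reshape(-1, 1)

# Two hidden layers of 64 units each, as in the table.
sizes = [n_features, 64, 64, 1]
W = [rng.normal(scale=0.1, size=(a, c)) for a, c in zip(sizes[:-1], sizes[1:])]
b = [np.zeros((1, s)) for s in sizes[1:]]

relu = lambda z: np.maximum(0.0, z)          # hidden-layer activation
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z)) # output-layer activation

def forward(x):
    """Forward pass; returns pre-activations and activations for backprop."""
    a, zs, acts = x, [], [x]
    for i in range(len(W)):
        z = a @ W[i] + b[i]
        a = sigmoid(z) if i == len(W) - 1 else relu(z)
        zs.append(z)
        acts.append(a)
    return zs, acts

alpha, m, epochs = 0.01, 8, 30  # learning rate, mini-batch size, epochs

for epoch in range(epochs):
    perm = rng.permutation(len(X))
    for start in range(0, len(X), m):
        idx = perm[start:start + m]
        xb, yb = X[idx], y[idx]
        zs, acts = forward(xb)
        y_hat = acts[-1]
        # (1) gradient estimate of the mini-batch MSE, via backprop:
        # dL/dy_hat = 2(y_hat - y)/batch, times the sigmoid derivative
        delta = (2.0 / len(xb)) * (y_hat - yb) * y_hat * (1.0 - y_hat)
        for i in reversed(range(len(W))):
            gW = acts[i].T @ delta
            gb = delta.sum(axis=0, keepdims=True)
            if i > 0:
                # propagate through the pre-update weights and ReLU derivative
                delta = (delta @ W[i].T) * (zs[i - 1] > 0)
            # (2) SGD update: theta <- theta - alpha * g_hat
            W[i] -= alpha * gW
            b[i] -= alpha * gb

zs, acts = forward(X)
mse = float(np.mean((y - acts[-1]) ** 2))
print(mse)
```

Note that the gradient is propagated through each layer's weights before that layer is updated, so the update in step (2) always uses the gradient estimate from step (1) computed at the current parameters.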