Using ATHENA
This page details how to set up and initialise a network model within a Fortran program using the ATHENA library.
As discussed elsewhere, the ATHENA library can be imported using
use athena
Now, we have access to the network model, data augmentation/normalisation procedures, optimisers, and metrics.
To initialise a model, we need to declare a network_type derived-type variable. From here onwards, we will use network as the name of the model variable. The declaration requires the following statement:
type(network_type) :: network
To add layers to the network, we need to call the following procedure
call network%add(LAYER_TYPE)
where LAYER_TYPE is the chosen type of the layer. The list of available layer derived types can be found here.
Here is an example of setting up a simple three-layer dense (fully connected) neural network: an input layer, a hidden layer with 10 neurons, and an output layer with 2 neurons.
call network%add(full_layer_type(num_outputs=10))
call network%add(full_layer_type(num_outputs=2))
Note that we do not need to define the first (input) layer explicitly, as its number of neurons simply equals the number of input features.
Although we have finished adding layers to the network, before compiling it we must define the optimiser. The following variable declaration must be included at the start of the program or procedure:
type(optimiser_type) :: optimiser
We then need to set the learning rate parameter within the optimiser.
optimiser % learning_rate = 0.01
A more detailed description of the optimiser derived type and its parameters can be found here.
We now have the minimum needed to compile our network.
call network%compile(optimiser=optimiser, loss_method="categorical_crossentropy", metrics="loss")
With the model compiled, we can now train it on dataset features, TRAIN_DATA, and dataset labels, TRAIN_LABELS. To call training, we use the following procedure:
call network%train(TRAIN_DATA, TRAIN_LABELS, NUM_EPOCHS, BATCH_SIZE)
Here, NUM_EPOCHS and BATCH_SIZE are integers defining the number of epochs and batch size for learning on the training dataset, respectively.
Models can be trained and tested on integer and float datasets, of any rank/dimension.
The weights and biases (and other necessary features) can be printed to a file using the following procedure:
call network%print(file="network.out")
Model testing is very similar to model training. We instead test on dataset features TEST_DATA and dataset labels TEST_LABELS.
call network%test(TEST_DATA, TEST_LABELS)
Having followed this guide, one should be able to set up, train, print, and test a network model using the ATHENA library.
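The steps above can be combined into one minimal program sketch. This is not taken verbatim from the library's examples: the dataset arrays (train_x, train_y), their shapes, and the epoch/batch values are hypothetical placeholders, and the calls simply mirror those shown in this guide, so consult the library documentation for the exact interfaces.

```fortran
program athena_example
  use athena
  implicit none

  type(network_type) :: network
  type(optimiser_type) :: optimiser

  ! hypothetical placeholder dataset: 100 samples, 4 features, 2 classes
  real, dimension(100,4) :: train_x
  real, dimension(100,2) :: train_y

  ! fill the placeholder data (random features, all samples labelled class 1)
  call random_number(train_x)
  train_y = 0.0
  train_y(:,1) = 1.0

  ! build the network: a hidden layer (10 neurons) and an output layer (2 neurons)
  call network%add(full_layer_type(num_outputs=10))
  call network%add(full_layer_type(num_outputs=2))

  ! set up the optimiser and compile the network
  optimiser%learning_rate = 0.01
  call network%compile(optimiser=optimiser, &
       loss_method="categorical_crossentropy", metrics="loss")

  ! train for 10 epochs with a batch size of 20 (values chosen for illustration)
  call network%train(train_x, train_y, 10, 20)

  ! save the trained weights and biases to file
  call network%print(file="network.out")

end program athena_example
```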
There exists a set of procedures available for data preprocessing/preparation in the ATHENA library. These include:
shuffle is used to randomly shuffle the input (and output) datasets to remove any bias associated with initial dataset ordering. An example of using it is:
call shuffle(data, [labels,] dim, seed)
data and labels are the features and labels of the input dataset. dim is the sample/record dimension (i.e. the dimension along which shuffling is performed), and seed defines the seed for the random number generator. labels is optional. Further details on the procedure can be found here.
split is used to split a dataset into separate training and testing datasets. An example of using it is:
call split(data, labels, train_data, test_data, train_labels, test_labels, dim, train_size, shuffle)
train_ and test_ refer to the split datasets of data and labels, respectively. train_size is the fractional value of the dataset that should populate the training set. dim is the sample/record dimension, and shuffle is an optional boolean stating whether to shuffle the dataset before splitting. If shuffle is true, then it effectively runs call shuffle() first. Further details on the procedure can be found here.
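As a brief sketch, here is a call that sends 80% of samples to the training set, shuffling first. The array declarations and the use of keyword arguments are assumptions for illustration; the argument names simply mirror the signature above.

```fortran
! hypothetical 2D dataset: 100 samples along dimension 1, 4 features
real, dimension(100,4) :: data
real, dimension(100)   :: labels
real, allocatable, dimension(:,:) :: train_data, test_data
real, allocatable, dimension(:)   :: train_labels, test_labels

! 80/20 train/test split along the sample dimension, shuffled beforehand
call split(data, labels, train_data, test_data, train_labels, test_labels, &
     dim=1, train_size=0.8, shuffle=.true.)
```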
pad_data is used for padding n-dimensional image data (where n=1..5). Further details on the procedure can be found here.
Current normalisation methods are available only for 1D data. However, they can still be applied to higher-dimensional data, as Fortran will reshape the input data to a 1D shape.
linear_renormalise can be used to scale features to a specific range (usually set to min=0 and max=1). An example of this procedure is:
call linear_renormalise(data, min, max)
renormalise_norm can be used to scale the norm of the dataset. An example of this procedure is:
call renormalise_norm(data, norm, mirror)
mirror is a boolean indicating whether to first centre the dataset about 0.
renormalise_sum can be used to scale the sum of the dataset. An example of this procedure is:
call renormalise_sum(data, norm, mirror, magnitude)
norm is the desired value of the sum of the dataset. magnitude is a boolean indicating whether to sum the magnitudes (absolute values) or the signed values of the data.
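Putting the normalisation procedures together, here is a short sketch. The array declaration is a hypothetical placeholder, and passing the arguments by keyword is an assumption; the names follow the signatures shown above.

```fortran
! hypothetical 1D feature array
real, dimension(100) :: data
call random_number(data)

! scale features into the range [0, 1]
call linear_renormalise(data, min=0.0, max=1.0)

! alternatively, scale so that the sum of the magnitudes equals 1,
! without first centring the data about 0
call renormalise_sum(data, norm=1.0, mirror=.false., magnitude=.true.)
```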