
Using ATHENA

Ned Taylor edited this page Mar 11, 2024 · 5 revisions

This page details how to set up and initialise a network model within a Fortran program using the ATHENA library.

As discussed elsewhere, the ATHENA library can be imported using

use athena

Now, we have access to the network model, data augmentation/normalisation procedures, optimisers, and metrics.

Model initialising

To initialise a model, we need to declare a network_type derived type variable. From here onwards, we will use network as the name for the model variable. Our variable declarations need the following statement:

type(network_type) :: network

To add layers to the network, we need to call the following procedure

call network%add(LAYER_TYPE)

where LAYER_TYPE is the chosen type of the layer. The list of available layer derived types can be found here.

Here is an example of setting up a simple three-layer dense (fully connected) neural network: an implicit input layer, a hidden layer with 10 neurons, and an output layer with 2 neurons.

call network%add(full_layer_type(num_outputs=10))
call network%add(full_layer_type(num_outputs=2))

Note, we don't need to define the first (input) layer as its neurons are simply equal to each of the input features.

Although we have finished adding layers to the network, before we can compile it we must define the optimiser. The following variable declaration must be included at the start of the program or procedure:

type(optimiser_type) :: optimiser

We then need to initialise the learning rate parameter within the optimiser.

optimiser%learning_rate = 0.01

A more detailed description of the optimiser derived type and its parameters can be found here.

We now have the minimum needed to compile our network.

call network%compile(optimiser=optimiser, loss_method="categorical_crossentropy", metrics="loss")
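Putting the steps above together, a minimal compile-ready setup might look like the following. This is a sketch based only on the calls shown on this page; the layer sizes and learning rate are illustrative.

```fortran
program setup_network
  use athena
  implicit none

  type(network_type)   :: network
  type(optimiser_type) :: optimiser

  ! hidden layer with 10 neurons, output layer with 2 neurons
  ! (the input layer is implicit and taken from the input features)
  call network%add(full_layer_type(num_outputs=10))
  call network%add(full_layer_type(num_outputs=2))

  ! set the learning rate before compiling
  optimiser%learning_rate = 0.01

  call network%compile(optimiser=optimiser, &
       loss_method="categorical_crossentropy", metrics="loss")

end program setup_network
```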

Model training

With the model compiled, we can now train it on dataset features, TRAIN_DATA, and dataset labels, TRAIN_LABELS. To call training, we use the following procedure:

call network%train(TRAIN_DATA, TRAIN_LABELS, NUM_EPOCHS, BATCH_SIZE)

Here, NUM_EPOCHS and BATCH_SIZE are integers defining the number of epochs and batch size for learning on the training dataset, respectively.

Models can be trained and tested on integer and float datasets, of any rank/dimension.
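As an illustrative sketch (the array shapes, kinds, and hyperparameter values here are hypothetical; only the train interface itself is taken from above), training on a rank-2 feature array might look like:

```fortran
  ! declarations (placed with the other declarations at the top
  ! of the program or procedure)
  real,    allocatable :: train_data(:,:)   ! e.g. (num_features, num_samples)
  integer, allocatable :: train_labels(:)   ! one label per sample
  integer :: num_epochs, batch_size

  num_epochs = 20
  batch_size = 32

  ! ... allocate and fill train_data and train_labels ...

  call network%train(train_data, train_labels, num_epochs, batch_size)
```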

Model printing

The weights and biases (and other necessary features) can be printed to a file using the following procedure:

call network%print(file="network.out")

Model testing

Model testing is very similar to model training. We instead test on dataset features TEST_DATA and dataset labels TEST_LABELS.

call network%test(TEST_DATA, TEST_LABELS)

Having followed this guide, one should be able to set up, train and print, and test a network model using the ATHENA library.

Data preprocessing

A set of procedures is available for data preprocessing/preparation in the ATHENA library. These include:

Data shuffling

shuffle is used to randomly shuffle the input (and output) datasets to remove any bias associated with initial dataset ordering. An example of using it is:

call shuffle(data, [labels,] dim, seed)

data and labels are the features and labels of the input dataset. dim is the sample/record dimension (the dimension along which the data are shuffled), and seed defines the seed for the random number generator. labels is optional. Further details on the procedure can be found here.
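For example, shuffling a rank-2 dataset along its sample dimension might look like the following sketch (shapes and the seed value are illustrative, and we assume the dummy-argument names match those used above):

```fortran
  real,    allocatable :: data(:,:)   ! shape (num_features, num_samples)
  integer, allocatable :: labels(:)   ! one label per sample

  ! shuffle the samples (dimension 2) and their labels together,
  ! with a fixed seed for reproducibility
  call shuffle(data, labels, dim=2, seed=1)
```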

Data splitting

split is used to split a dataset into separate training and testing datasets. An example of using it is:

call split(data, labels, train_data, test_data, train_labels, test_labels, dim, train_size, shuffle)

train_ and test_ refer to the split datasets of data and labels, respectively. train_size is the fractional value of the dataset that should populate the training set. dim is the sample/record dimension, and shuffle is an optional boolean stating whether to shuffle the dataset before splitting. If shuffle is true, then it effectively runs call shuffle() first. Further details on the procedure can be found here.
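An illustrative sketch of an 80/20 split along the sample dimension, again assuming the dummy-argument names match those used above:

```fortran
  real,    allocatable :: data(:,:), train_data(:,:), test_data(:,:)
  integer, allocatable :: labels(:), train_labels(:), test_labels(:)

  ! keep 80% of the samples (dimension 2) for training,
  ! shuffling the dataset before splitting
  call split(data, labels, train_data, test_data, &
       train_labels, test_labels, dim=2, train_size=0.8, shuffle=.true.)
```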

Data padding

pad_data is used for padding n-dimensional image data (where n=1..5). Further details on the procedure can be found here.

Data normalisation

Current normalisation methods are available only for 1D data. However, these can still be applied to higher-dimensional data, as Fortran will reshape the input data to fit a 1D shape.

Min-max scaling (linear normalisation)

linear_renormalise can be used to scale features to a specific range (usually set to min=0 and max=1). An example of this procedure is:

call linear_renormalise(data, min, max)
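As a sketch (assuming the dummy-argument names match those shown above), scaling a feature array into the range [0, 1]:

```fortran
  real, allocatable :: data(:)

  ! rescale all values linearly into the range [0, 1]
  call linear_renormalise(data, min=0.0, max=1.0)
```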

Standardisation

renormalise_norm can be used to scale the norm of the dataset. An example of this procedure is:

call renormalise_norm(data, norm, mirror)

mirror is a boolean stating whether to first centre the dataset about 0.

renormalise_sum can be used to scale the sum of the dataset. An example of this procedure is:

call renormalise_sum(data, norm, mirror, magnitude)

norm is the desired value of the sum of the dataset. magnitude is a boolean stating whether to sum the magnitudes or the exact values of the data.
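As a sketch (assuming the dummy-argument names match those shown above), rescaling a dataset so its values sum to 1:

```fortran
  real, allocatable :: data(:)

  ! rescale so the magnitudes of the values sum to 1,
  ! without first centring the data about zero
  call renormalise_sum(data, norm=1.0, mirror=.false., magnitude=.true.)
```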
