ISIS Neutron and Muon Source Data Journal

This is a page describing data taken during an experiment at the ISIS Neutron and Muon Source. Information about the ISIS Neutron and Muon Source can be found at https://www.isis.stfc.ac.uk.


NEUTRON-INDUCED EFFECTS AND FAULT TOLERANCE IN MODERN PARALLEL AND HETEROGENEOUS ARCHITECTURES

Abstract: We evaluate the resilience of modern parallel devices for high-performance computing applications, namely Graphics Processing Units (GPUs), and of heterogeneous Systems on Chip, namely Accelerated Processing Units (APUs). The error rate of today's supercomputers can be extremely high. As our previous experiments at ISIS showed, the error rate of TITAN, a supercomputer composed of 18,000 GPUs, can reach one error every 10 minutes; TITAN personnel confirmed this value. We were also able to match experimental data gathered at ISIS with TITAN field data, based on more than 1,400 million GB of data and 500 million GPU node hours of operation. Finally, we will study the neutron sensitivity of modern Systems on Chip that embed an ARM core and FPGA programmable logic. We will exploit the FPGA programmability to implement a general-purpose mitigation scheme for the ARM processor.

Principal Investigator: Dr Paolo Rech
Experimenter: Professor Luigi Carro
Experimenter: Mr Lucas Tambara
Experimenter: Dr Luca Sterpone

DOI: 10.5286/ISIS.E.RB1510031

ISIS Experiment Number: RB1510031

Part DOI                  Instrument   Public release date   Download Link
10.5286/ISIS.E.61786330   VESUVIO      24 July 2018          Download
10.5286/ISIS.E.61010509   VESUVIO      27 July 2018          Download

Publisher: STFC ISIS Neutron and Muon Source

Data format: RAW/Nexus

Data Citation

The recommended format for citing this dataset in a research publication is:
[author], [date], [title], [publisher], [doi]

For example:
Dr Paolo Rech et al; (2015): NEUTRON-INDUCED EFFECTS AND FAULT TOLERANCE IN MODERN PARALLEL AND HETEROGENEOUS ARCHITECTURES, STFC ISIS Neutron and Muon Source, https://doi.org/10.5286/ISIS.E.RB1510031
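
The citation pattern above can be assembled programmatically. The following is a minimal sketch in Python; the function name format_citation and its parameter names are hypothetical and are not part of any ISIS tooling, and the punctuation is assumed from the example citation above rather than from a formal specification.

```python
def format_citation(author, year, title, publisher, doi):
    """Join the fields in the order author, date, title, publisher, DOI.

    The separators ("; (year): ", ", ") are an assumption copied from the
    example citation on this page, not an official style definition.
    """
    return f"{author}; ({year}): {title}, {publisher}, https://doi.org/{doi}"


# Field values taken from this page's own record.
citation = format_citation(
    author="Dr Paolo Rech et al",
    year=2015,
    title=(
        "NEUTRON-INDUCED EFFECTS AND FAULT TOLERANCE IN MODERN "
        "PARALLEL AND HETEROGENEOUS ARCHITECTURES"
    ),
    publisher="STFC ISIS Neutron and Muon Source",
    doi="10.5286/ISIS.E.RB1510031",
)
print(citation)
```

Passing the DOI without the resolver prefix keeps the identifier reusable elsewhere; the https://doi.org/ prefix is added only when the citation string is built.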

Data is released under the CC-BY-4.0 license.


