Resources and Support for Users of the Supercomputing Wales services
Guidance for Completing the Project Application form on MySCW
Project Title: A descriptive title that aims to cover the aims and objectives of the project and the work that will be undertaken as part of the project.
Project Description: A meaningful description of the scientific goals and objectives of the project, and the methods that will be used to pursue them. Here are some example descriptions to demonstrate the level of detail desired:
Example 1
Title: A DSGE model of shadow banking in the US
Description: Our aim is to bring data to bear on the important policy issues identified here through the means of an estimated and tested DSGE model of the US economy. We use the method of Indirect Inference in order to evaluate whether the model can fit the data. We do this by comparing the performance of the auxiliary model estimated on simulated data with the auxiliary model estimated on the actual data. If our structural model is correct then it should be able to produce simulations with time series properties that statistically match those of the actual data. Generally, we bootstrap 1000 times, and in order to estimate the model we use a Simulated Annealing algorithm to find the minimum-value Wald statistic for the model. This gives us a set of parameters that produces simulations that are closest to the data.
Example 2
Title: Development of a Deep Learning Model to Integrate Diagnostics for Noninvasive Detection of Fatty Liver Disease
Description: We will combine those biomarkers with ultrasound imaging and the clinical and laboratory data to develop a deep learning model to predict NASH, and will compare the performance of our predictive model with CK18 and FIB4, two widely used blood-test-based prediction rules (Aim 1). We will further validate our predictive model in an independent cohort of NAFLD subjects for NAFLD risk stratification (Aim 2). Last but not least, we will build an open-access, centralized information system that allows individual clinicians to easily harness our model for remote diagnosis.
Example 3
Title: Mapping human brain myeloarchitecture by MR imaging
Description: On a 7-Tesla MR scanner, human brain 3D quantitative mapping can be performed at 0.4 mm isotropic resolution, providing unprecedented in-vivo information on myelination and delivering new insight into the mechanisms of neurodevelopment, neural plasticity and neurodegeneration.
For a standard-size brain, about 10 million pixels must be processed by fitting the repeated signal to a non-linear model based on the underlying physics. Python is used to process the data due to its flexibility and the existence of consolidated medical imaging libraries. On a single core, the processing can easily take days.
The aim of the project is to speed up the post-processing of the acquired data. By using the Ray package (Anyscale), the processing, occurring independently over the 10 million pixels, can be distributed over multiple CPUs, and the “promises” from every CPU can eventually be collected to form the final map. With a 12-core, 2.8 GHz Intel Xeon X5660, distributed processing takes 6 hours.
Legacy HPC Wales ID (Project legacy ID from HPC Wales): Although now unlikely after two years of the operational service of Hawk and Sunbird, if you are migrating a project from the HPC Wales legacy HPC system, please enter the project’s ID code e.g. HPCW0337, SAM0012.
Legacy ARCCA ID (Project legacy ID ARCCA): Although unlikely some 10 months after closure of ARCCA’s legacy “Raven” HPC system, if you are migrating a project from Raven, please enter the project’s ID code e.g. PR340.
Owning institution project reference: If your project is externally funded (see below), please enter the reference number provided by the sponsors, e.g., EPSRC (EP/S016376/1), BBSRC (BB/T006188/1)
Department: Your School, Institute and/or Department. If the project is under the auspices of an Institute, please also specify your School.
Principal Investigator’s name, position and email: By default, we request that Principal Investigators for Projects on the system be permanent staff members who are ultimately responsible for the research being undertaken. If you are a PhD student, Research Associate or Research Assistant, this would naturally be your supervisor or line manager. As the person requesting the project, you will be listed as Technical Lead: the person we will contact in the first instance on all matters related to this project, and who will receive project membership requests and can approve or deny them.
Funding source: If the research study, of which this project is part, has received funding (internal or external), please let us know by selecting the appropriate category. Current options:
BBSRC
EPSRC
MRC
NERC
AHRC
STFC
Leverhulme
EU / H2020
Ser Cymru II
WEFO
Internally Funded
Other (please specify)
None
N/A
Confidential
Start date: This is the date when you expect to start using the system. If there is likely to be a delay in starting the project, please let us know here.
End date: Project end date. The project can be suspended from this date, preventing you from submitting further jobs to the system. Please ensure that your data has been removed from the system ahead of this date. If you need to extend the duration of your project, please get in touch with us. Please note that any data left on the system after the project’s end date will be deleted. We will attempt to contact the project Technical Lead and Principal Investigator, but this is not guaranteed, so please ensure your data is properly backed up in a suitable location off the system.
Software Requirements: Enter details of any software you require to be installed on the system to be able to use it for your research. Examples include compilers (Intel and GNU C, C++, and Fortran compilers are available already), interpreters (Python and R are available already), open-source libraries, and commercial software (e.g. Stata, Matlab, Molpro). In addition, if you require a specific version of the software then please specify this.
Gateway Requirements: If you have any special requirements on accessing or transferring data to Hawk from an external network, please let us know in this field.
Training Requirements: Do you require training in order to make use of the system (e.g. Linux, Bash, General HPC, containers)? If so, write details of this here.
On-boarding Requirements: Do you require any assistance in adapting your software to be able to make use of the system? Would a virtual session explaining the use of the Scheduler be helpful? If so, write details of these here.
CPU time allocation in hours: How many “core-hours” do you anticipate this project needing? For example, if you have a program that only uses one CPU core on your laptop, runs for 48 hours, and needs to be run 1,000 times, then this number would be 48,000; if instead it used four cores on your laptop for the same amount of time, it would be 192,000. Thus one useful way to quantify the overall project requirements is to first define the requirements of a typical job, then estimate the likely number of jobs to be run over the lifetime of the project. We realise that providing even a ballpark estimate can be tough given the dynamics of any research project, but the figure does not have to be precise; a rough estimate will be fine.
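The core-hours arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `core_hours` and its parameter names are hypothetical and are not part of MySCW or any Supercomputing Wales tool.

```python
def core_hours(cores_per_job, hours_per_job, number_of_jobs):
    """Estimate total core-hours for a project.

    core-hours = cores used by one job
                 x wall-clock hours for one job
                 x number of jobs over the project lifetime
    """
    return cores_per_job * hours_per_job * number_of_jobs


# The two cases from the guidance text: a 48-hour job run 1,000 times.
single_core_estimate = core_hours(cores_per_job=1, hours_per_job=48, number_of_jobs=1000)
four_core_estimate = core_hours(cores_per_job=4, hours_per_job=48, number_of_jobs=1000)

print(single_core_estimate)  # 48000
print(four_core_estimate)    # 192000
```

If the project mixes several kinds of job, one option is to compute an estimate for each kind and add them together.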
RAM allocation in GBytes: How much memory will your program need at once per node? If you don’t know this number, but know that the program runs on your computer, then enter the amount of RAM on your computer.
Home storage in GBytes: Home storage is long-term file storage used for data you need to keep on the system for months at a time. This is subject to a quota. Enter the amount of storage you need for the longer term. Default storage quota: 50 GBytes on Hawk and 100 GBytes on Sunbird. Default file limit: 100 thousand files on Hawk and Sunbird. Please note that there are no backups available.
Scratch storage in GBytes: Scratch storage is short-term, high-performance parallel file storage used for intermediate data you need temporarily but will either delete or move off the system once it is no longer needed. Default storage quota: 3 TBytes on Hawk and 20 TBytes on Sunbird. Default file limit: 3 million files on Hawk and 10 million files on Sunbird. Please note that when the scratch filesystem approaches full capacity, old unused files will be deleted. Please also note that there are no backups available.
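If you already have a comparable dataset on your own computer, one rough way to estimate storage needs is to measure its total size and file count, since both are subject to limits. The following Python sketch uses only the standard library; the function name `directory_usage` is hypothetical, and the conversion assumes 1 GByte = 10^9 bytes, which may differ from how the system's quotas are measured.

```python
import os


def directory_usage(root):
    """Return (total_bytes, file_count) for regular files under root.

    Symbolic links are skipped, and files that cannot be read
    (for example due to permissions) are ignored.
    """
    total_bytes = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                continue
            file_count += 1
    return total_bytes, file_count


if __name__ == "__main__":
    # "." is a placeholder; point this at the directory holding your data.
    size_in_bytes, number_of_files = directory_usage(".")
    # Assumption: 1 GByte = 10**9 bytes.
    print(f"{size_in_bytes / 10**9:.2f} GBytes in {number_of_files} files")
```

The result only describes the data as it exists today, so allow headroom for outputs that your jobs will generate on the system.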
Upload Supporting Documents: Any other documentation relevant to the project application. This could include relevant publications from previous work, a case for allocations greater than the defaults (storage, etc.), or detailed software requirements.