If you want to compile GROMACS to run on a GPU instance, please first read these instructions on how to compile GROMACS on an AMI without CUDA. These instructions then explain how to install the CUDA toolkit and compile GROMACS against it.
The first few steps are loosely based on , except that rather than download the NVIDIA driver, we shall download the CUDA toolkit, since this includes an NVIDIA driver. First we need to make sure the kernel development package matching the running kernel is installed, since the driver installer needs it to build its kernel module:
sudo yum install "kernel-devel-$(uname -r)"
sudo reboot
It is safest to do a quick reboot here. Once the instance is back up, and assuming you are in your HOME directory, move into your packages folder.
cd packages/
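After the reboot it may be worth confirming that the installed kernel headers still match the running kernel, since a mismatch is a common reason for the driver build to fail. The following is only a sketch: the function names kernel_devel_pkg and check_kernel_devel are made up for illustration, and it assumes an RPM-based AMI where rpm is available.

```shell
# Build the name of the kernel-devel package that matches the running kernel.
kernel_devel_pkg() {
    printf 'kernel-devel-%s\n' "$(uname -r)"
}

# Report whether that package is installed; rpm -q exits non-zero when it is not.
# (Hypothetical helper, assumes an RPM-based system.)
check_kernel_devel() {
    if rpm -q "$(kernel_devel_pkg)" >/dev/null 2>&1; then
        echo "headers match running kernel $(uname -r)"
    else
        echo "missing $(kernel_devel_pkg); rerun the yum install step" >&2
        return 1
    fi
}
```

Running check_kernel_devel before starting the CUDA installer should print a single confirmation line if everything lines up.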
Then download the CUDA toolkit (version 7.5 at present) and run the installer.
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda_7.5.18_linux.run
sudo /bin/bash cuda_7.5.18_linux.run
The installer asks you to accept the license and then asks a series of questions. I answer Yes to everything except installing the CUDA samples. Now add the following to the end of your ~/.bash_profile using a text editor:
export PATH="/usr/local/cuda-7.5/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-7.5/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
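Once you have logged in again (or sourced ~/.bash_profile), a quick check that both directories are actually visible can save a confusing cmake failure later. This is a minimal sketch; path_has_dir and check_cuda_env are hypothetical helper names, and the paths are the CUDA 7.5 defaults used above.

```shell
# Return success if directory $2 is one of the colon-separated entries in $1.
path_has_dir() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: report whether the CUDA 7.5 directories are on the search paths.
check_cuda_env() {
    if path_has_dir "$PATH" /usr/local/cuda-7.5/bin; then
        echo "PATH ok"
    else
        echo "PATH is missing /usr/local/cuda-7.5/bin"
    fi
    if path_has_dir "${LD_LIBRARY_PATH:-}" /usr/local/cuda-7.5/lib64; then
        echo "LD_LIBRARY_PATH ok"
    else
        echo "LD_LIBRARY_PATH is missing /usr/local/cuda-7.5/lib64"
    fi
}
```

If both lines report ok, running nvcc --version should then show the 7.5 release of the compiler.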
Now we can build GROMACS against the CUDA toolkit. I’m assuming you’ve already downloaded a version of GROMACS and probably installed a non-CUDA version (so you’ll already have one build directory). Let’s make another build directory inside the GROMACS source directory. You can call it what you want, but a consistent naming scheme is helpful. The -j 4 flag assumes you have four cores to compile on; this will depend on the EC2 instance you have deployed. The more cores, the faster the build, but compiling GROMACS takes minutes, not hours.
mkdir build-gcc48-cuda75
cd build-gcc48-cuda75
cmake .. -DGMX_BUILD_OWN_FFTW=ON -DCMAKE_INSTALL_PREFIX=/usr/local/gromacs/5.0.7-cuda/ -DGMX_GPU=ON -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda
make -j 4
sudo make install
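Rather than hard-coding the number of cores, the job count can be taken from the instance itself. A small sketch, assuming the GNU coreutils nproc command is present (the build_jobs name is made up here, and the fallback of 4 simply mirrors the value used above):

```shell
# Print a job count for make: the number of available cores, or 4 as a fallback.
build_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        echo 4   # fallback matches the value used in the text
    fi
}

# Usage, from inside the build directory:
#   make -j "$(build_jobs)"
```

This keeps the same commands usable if you later move to a larger or smaller EC2 instance.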
To load all the GROMACS tools into your $PATH, run this command and you are done!
source /usr/local/gromacs/5.0.7-cuda/bin/GMXRC
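If you keep both a non-CUDA and a CUDA build installed, it can be handy to switch between them by install prefix. The function below is only an illustration (load_gromacs is a hypothetical name); it sources the GMXRC file under whichever prefix you pass in, and complains if there is none.

```shell
# Source the GMXRC file found under the install prefix given as $1.
# Returns non-zero if the prefix does not contain bin/GMXRC.
load_gromacs() {
    gmxrc="$1/bin/GMXRC"
    if [ -f "$gmxrc" ]; then
        . "$gmxrc"
    else
        echo "no GMXRC found under $1" >&2
        return 1
    fi
}

# Usage with the prefix chosen above:
#   load_gromacs /usr/local/gromacs/5.0.7-cuda
```

Afterwards, command -v gmx should print a path inside the prefix you loaded.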
If you run this mdrun binary on a GPU instance, it should automatically detect the GPU and run on it, assuming your MDP file options support this. If it does, you will see the following go by in the log file as GROMACS starts up:
1 GPU detected:
  #0: NVIDIA GRID K520, compute cap.: 3.0, ECC:  no, stat: compatible
1 GPU auto-selected for this run.
Mapping of GPU to the 1 PP rank in this node: #0
Will do PME sum in reciprocal space for electrostatic interactions.
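Since these lines scroll past quickly, you can also search the log afterwards. The sketch below assumes the log has the default name md.log (mdrun uses a different name if you pass -g or -deffnm) and that the wording matches the output shown above; the plural form "GPUs" is allowed for as an assumption, and the function names are made up.

```shell
# Return success if the mdrun log at $1 reports at least one detected GPU.
gpu_detected() {
    grep -Eq '^ *[1-9][0-9]* GPUs? detected' "$1"
}

# Print the log lines that describe GPU detection and selection.
gpu_summary() {
    grep -E 'GPUs? (detected|auto-selected)|compute cap\.' "$1"
}

# Usage:
#   gpu_detected md.log && gpu_summary md.log
```

If gpu_detected fails, the run most likely fell back to the CPUs only, and the start of the log usually says why.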
Depending on the size of your system and the force field you are using, you should get a speedup of at least a factor of two, and realistically three, by using a GPU in combination with the CPUs. For example, see these benchmarks.