Merge branch 'master' of https://github.com/anders-dc/sphere - sphere - GPU-ba… | |
git clone git://src.adamsgaard.dk/sphere | |
--- | |
commit e7349ff3e5028a04e59d49a52bcc3b5b36d1aa11 | |
parent 0531e489b13785c47cc32ec30d5f2852f02b88b1 | |
Author: Anders Damsgaard <[email protected]> | |
Date: Tue, 25 Sep 2012 09:17:40 +0200 | |
Merge branch 'master' of https://github.com/anders-dc/sphere | |
Diffstat: | |
A doc/Makefile | 9 +++++++++ | |
M doc/sphere-doc.pdf | 0 | |
M doc/sphere-doc.tex | 142 ++++++++++++++++-------------… | |
3 files changed, 82 insertions(+), 69 deletions(-) | |
--- | |
diff --git a/doc/Makefile b/doc/Makefile | |
@@ -0,0 +1,9 @@ | |
+INTERPRETER=pdflatex | |
+SOURCE=sphere-doc.tex | |
+PDF=$(SOURCE:.tex=.pdf) | |
+ | |
+$(PDF): $(SOURCE) | |
+ $(INTERPRETER) $< $@ | |
+ | |
+clean: | |
+ $(RM) *.{out,toc,log,aux} | |
diff --git a/doc/sphere-doc.pdf b/doc/sphere-doc.pdf | |
Binary files differ. | |
diff --git a/doc/sphere-doc.tex b/doc/sphere-doc.tex | |
@@ -88,9 +88,9 @@ | |
\begin{document} | |
\title{Users guide to \texttt{SPHERE}:\\ GPU based discrete element method sof… | |
\author{Anders Damsgaard Christensen\\ | |
- \url{[email protected]}\\ | |
+ \url{[email protected]}\\ | |
\url{http://cs.au.dk/~adc/}} | |
- \date{Last revision: \today \\[5mm] Version \textbf{0.1} \\ | |
+ \date{Last revision: \today \\[5mm] Version \textbf{0.2} \\ | |
%\begin{center} \includegraphics[scale=0.12]{FigCover} \end{center} | |
} | |
\maketitle | |
@@ -98,12 +98,14 @@ | |
\thispagestyle{empty} | |
\begin{abstract} | |
-\noindent This document is the official documentation for the \texttt{SPHERE} … | |
+\noindent This document is the official documentation for the \texttt{SPHERE} … | |
-\texttt{SPHERE} is developed by Anders Damsgaard Christensen under supervision… | |
+\texttt{SPHERE} is developed by Anders Damsgaard Christensen under supervision… | |
\end{abstract} | |
\vfill | |
+\marginpar{Todo: Change cover image} | |
+ | |
\begin{figure}[htb] | |
\begin{center} | |
\includegraphics[width=0.9\textwidth]{quiver3.eps} | |
@@ -119,6 +121,7 @@ | |
\textbf{Date} & \textbf{Doc. version} & \textbf{\texttt{SPHERE} version} & \te… | |
\hline | |
2010-12-06 & 0.1 & Beta 0.03 & Initial draft \\ | |
+2012-09-13 & 0.2 & Beta 0.25 & Updated for Python API and major source code … | |
% & & \\ | |
% & & \\ | |
\hline | |
@@ -134,7 +137,7 @@ | |
\newpage | |
\section{Introduction} | |
-The \texttt{SPHERE}-software is used for three-dimensional discrete element me… | |
+The \texttt{SPHERE}-software is used for three-dimensional discrete element me… | |
The ultimate aim of the \texttt{SPHERE} software is to simulate soft-bedded su… | |
\begin{itemize} | |
@@ -143,89 +146,102 @@ The ultimate aim of the \texttt{SPHERE} software is to … | |
\item A CUDA-enabled GPU with compute capability 1.1 or greater\footnote{See… | |
\item The CUDA Developer Drivers and the CUDA Toolkit\footnote{Obtainable fr… | |
\end{itemize} | |
-For simulation setup and data handling, a {\sc Matlab} installation of a recen… | |
+For simulation setup and data handling, a Python distribution of a recent vers… | |
\section{Discrete element method theory} | |
\label{sec:DEMtheory} | |
-The discrete element method (or distinct element method) was initially formula… | |
+The discrete element method (or distinct element method) was initially formula… | |
+\marginpar{Todo: Expand this section; contact models, etc.} | |
\section{\texttt{SPHERE} source code structure} | |
\label{sec:spheresrcstructure} | |
-The source code is located in the \texttt{sphere/src/} folder, and named \text… | |
+The source code is located in the \texttt{sphere/src/} folder. After compiling… | |
\begin{enumerate} | |
- \item Setup of particle assemblage, physical properties and conditions… | |
- \item Execution of \texttt{SPHERE} software, which simulates the parti… | |
- \item Inspection, analysis, interpretation and visualization of \textt… | |
+ \item Setup of particle assemblage, physical properties and conditions… | |
+ \item Execution of \texttt{SPHERE} software, which simulates the parti… | |
+ \item Inspection, analysis, interpretation and visualization of \textt… | |
\end{enumerate} | |
\subsection{The \texttt{SPHERE} algorithm} | |
\label{subsec:spherealgo} | |
-The \texttt{SPHERE}-binary is launched from the system terminal by passing the… | |
+The \texttt{SPHERE}-binary is launched from the system terminal by passing the… | |
\begin{enumerate} | |
- \item System check, including search for NVIDIA CUDA compatible devices. | |
- | |
- \item Initial data import from binary {\sc Matlab} file. | |
+ \item System check, including search for NVIDIA CUDA compatible devices (\te… | |
- \item Allocation of memory for all host variables (particles, grid, etc.). | |
+ \item Initial data import from binary input file (\texttt{main.cpp}). | |
- \item Continued import from binary {\sc Matlab} file. | |
+ \item Allocation of memory for all host variables (particles, grid, walls, e… | |
- \item Memory allocation of device memory. | |
+ \item Continued import from binary input file (\texttt{main.cpp}). | |
+ | |
+ \item Control handed to GPU-specific function \texttt{gpuMain(\ldots)} (\tex… | |
- \item Transfer of data from host to device variables. | |
+ \item Memory allocation of device memory (\texttt{device.cu}). | |
- \item Initialization of CUDPP radix sort configuration. | |
+ \item Transfer of data from host to device variables (\texttt{device.cu}). | |
- \item OpenGL initialization. | |
+ \item Initialization of Thrust\footnote{\url{https://code.google.com/p/thrus… | |
- \item Status and data written to \verb"<simulation_ID>.status.dat" and \verb… | |
+ \item Calculation of GPU workload configuration (thread and block layout) (\… | |
+ | |
+ \item Status and data written to \verb"<simulation_ID>.status.dat" and \verb… | |
- \item Main loop (while \texttt{time.current <= time.total}): | |
+ \item Main loop (while \texttt{time.current <= time.total}) (functions calle… | |
\begin{enumerate} | |
\item \label{loopstart}CUDA thread synchronization point. | |
- \item \texttt{calcHash<<<,>>>(\ldots)}: Particle-grid hash value calculati… | |
+ \item \texttt{calcParticleCellID<<<,>>>(\ldots)}: Particle-grid hash value… | |
\item CUDA thread synchronization point. | |
- \item \texttt{cudppSort(\ldots):} CUDPP radix sort of particle-grid hash a… | |
+ \item \texttt{thrust::sort\_by\_key(\ldots)}: Thrust radix sort of particl… | |
- \item \texttt{cudaMemset(\ldots):} Writing zero value (\texttt{0xffffffff}… | |
+ \item \texttt{cudaMemset(\ldots)}: Writing zero value (\texttt{0xffffffff}… | |
- \item \texttt{reorderParticles<<<,>>>(\ldots):} Reordering of particle arr… | |
+ \item \texttt{reorderArrays<<<,>>>(\ldots)}: Reordering of particle arrays… | |
\item CUDA thread synchronization point. | |
+ | |
+ \item Optional: \texttt{topology<<<,>>>(\ldots)}: If particle contact hist… | |
- \item \texttt{cudaBindTexture(\ldots):} Binding of textures (position, lin… | |
+ \item CUDA thread synchronization point. | |
- \item \texttt{interact<<<,>>>(\ldots)}: For each particle: Search of conta… | |
+ \item \texttt{interact<<<,>>>(\ldots)}: For each particle: Search of conta… | |
\item CUDA thread synchronization point. | |
- \item \texttt{integrate<<<,>>>(\ldots)}: Updating of spatial degrees of fr… | |
- | |
- \item \texttt{cudaUnbindTexture(\ldots):} Unbinding of textures. | |
+ \item \texttt{integrate<<<,>>>(\ldots)}: Updating of spatial degrees of fr… | |
+ | |
+ \item CUDA thread synchronization point. | |
+ | |
+ \item \texttt{summation<<<,>>>(\ldots)}: Particle contributions to the net… | |
+ | |
+ \item CUDA thread synchronization point. | |
+ | |
+ \item \texttt{integrateWalls<<<,>>>(\ldots)}: Updating of spatial degrees … | |
- \item Update of timers and loop-related counters (e.g. \texttt{time.curren… | |
+ \item Update of timers and loop-related counters (e.g. \texttt{time.curren… | |
\item If file output interval is reached: | |
\begin{enumerate} | |
- \item Optional write of data to output binary (\verb"<simulation_ID>.o… | |
- \item Update of \verb"<simulation_ID>.status#.bin". | |
+ \item Optional write of data to output binary (\verb"<simulation_ID>.o… | |
+ \item Update of \verb"<simulation_ID>.status#.bin" (\texttt{device.cu}… | |
\end{enumerate} | |
\item Return to point \ref{loopstart}, unless \texttt{time.current >= ti… | |
\end{enumerate} | |
- \item \label{loopend}Liberation of device and host memory. | |
+\item \label{loopend}Liberation of device memory (\texttt{device.cu}). | |
+ | |
+\item Control returned to \texttt{main(\ldots)}, liberation of host memory (\t… | |
- \item End of program. | |
+ \item End of program, return status equal to zero (0) if no problems where e… | |
\end{enumerate} | |
@@ -233,13 +249,17 @@ The \texttt{SPHERE}-binary is launched from the system t… | |
The length of the computational time steps (\texttt{time.dt}) is calculated vi… | |
\begin{equation} | |
\label{eq:dt} | |
- \Delta t = 0.5 \times \mathrm{min} \left( \sqrt{\frac{\rho R^2}{K}} \r… | |
+\Delta t = 0.17 \min \left( m/\max(k_n,k_t) \right) | |
\end{equation} | |
-where $\rho$ is the particle material density, $R$ is particle radius, and $K$… | |
+where $m$ is the particle mass, and $k$ are the elastic stiffnesses. This equa… | |
\subsubsection{Host and device memory types} | |
\label{subsubsec:memorytypes} | |
-A full, listed description of the \texttt{SPHERE} source code variables can be… | |
+A full, listed description of the \texttt{SPHERE} source code variables can be… | |
+ | |
+The floating point precision operating internally in \texttt{SPHERE} is define… | |
+ | |
+Three-dimensional variables (e.g. spatial vectors in $E^3$) are in global memo… | |
\begin{figure}[htbp] | |
\label{fig:memory} | |
@@ -348,49 +368,33 @@ A full, listed description of the \texttt{SPHERE} source… | |
\paragraph{Host memory} is the main random-access computer memory (RAM), i.e. … | |
-\paragraph{Device memory} is the main, global device memory. It resides off-ch… | |
- | |
- | |
-\paragraph{Constant memory} | |
- | |
- | |
-\paragraph{Textures} | |
- | |
- | |
- | |
- | |
- | |
-\subsection{The main loop} | |
-\label{subsec:mainloop} | |
-The \texttt{SPHERE} software calculates particle movement and rotation based o… | |
- | |
- | |
+\paragraph{Device memory} is the main, global device memory. It resides off-ch… | |
+\marginpar{Todo: Expand section on device memory types} | |
+\paragraph{Constant memory} values cannot be changed after they are set, and a… | |
+%\subsection{The main loop} | |
+%\label{subsec:mainloop} | |
+%The \texttt{SPHERE} software calculates particle movement and rotation based … | |
\subsection{Performance} | |
+\marginpar{Todo: insert graph of performance vs. np and performance vs. $\Delt… | |
\subsubsection{Particles and computational time} | |
-With an increasing amount of particles, obviously more calculations have to be… | |
- | |
-The size of the computational timestep length is fixed at a sufficiently low v… | |
- | |
-\subsubsection{Parallel computing} | |
-The \texttt{DISC} code is heavily parallized, i.e. it does carries out multipl… | |
- | |
-\subsubsection{Compute profiler results} | |
- | |
\subsection{Compilation} | |
\label{subsec:compilation} | |
-\texttt{SPHERE} is supplied with a makefile which helps the compilation proces… | |
+An important note is that the \texttt{C} examples of the NVIDIA CUDA SDK shoul… | |
+ | |
+\texttt{SPHERE} is supplied with several Makefiles, which automate the compila… | |
+ | |
-\section{Model setup} | |
+\section{Python API: Model setup} | |
\label{sec:ModelSetup} | |
In {\sc Matlab}, enter the \texttt{mfiles/}-directory as the current folder (e… | |
@@ -458,7 +462,7 @@ In {\sc Matlab}, the state of the calculations can be chec… | |
The basics of the DEM algorithm, used in \texttt{SPHERE}, is described in sect… | |
-\section{Data analysis in {\sc Matlab}} | |
+\section{Python API: Data analysis} | |
\label{sec:DataAnalysis} | |
A number of preconfigured visualization methods are featured in \texttt{show.m… | |
\begin{lstlisting} |