Archive for category How-To and Software

Oriented Particles in 2D - an example.

We have been asked quite a lot about the Oriented Particles approach to physically based simulation, where particles besides position and velocity also have an ellipsoid shape, an orientation and an angular velocity. So we decided to do a small c++ demo in 2D of how it can be done in practice.

Basically what the demo show is a somewhat simplified version of our Oriented Particles Christmas Card. The demo is a standard GLUT/OpenGL application but it can also be cross-compiled to javascript/WebGL using emscripten, the result of which can be seen here. In the following we will assume that the reader has read the original paper and thus we will just give a brief introduction to the demo code.

In the demo we want to be able to toss this guy around:

Also we want to make it look like the guy has bones in his body. For this purpose we have drawn a "skeleton" and used OpenCV to identify individual bones and fit ellipses to these. The result can be seen below with the ellipses shown in red. Our little OpenCV program outputs code directly which has been inserted into the example code.

 

Because of some technicalities with emscripten we wanted to avoid loading files from the example, so the shaders have been inlined as strings in c-headers. Also the image of the guy has been saved to a c-header using GIMP and included directly in the example. Source code for the example can be found here.

The class PositionBasedDynamics encapsulates a basic generalized Position Based Dynamics simulation loop, and two constraints, StayAboveLineConstraint and GeneralizedShapeMatchingConstraint, have also been included in the project. In main.cpp particles are attached to nearby particles using implicit (generalized) shape matching constraints in the function CreateObject and the object is inserted to the physics system in the function UploadOrientedParticlesObjectToPhysicsSystem.

For visualization a grid of vertices is created covering the area of the simulated particles. The vertices are triangulated and supplied with texture coordinated and used for rendering. As the particles each have an orientation they each constitute a two-dimensional coordinate system and can thus be used for skinning the grid mesh. For this purpose a parametrisation of vertices is made in the function GenerateGrid2D. Each vertex can be skinned from up to 3 particles and the parametrisation information is stored as 3D texture coordinates where the integer part of each component describes which particle to skin from and the fractional part describes the weighting to use for this particle. Skinning matrices are computed and uploaded to a float RGBA texture and the actual skinning is done in the vertex shader.

To get some motion in the system the little guys bag is moved around the scene. The example is not optimal - for instance the grid mesh could be based on index buffers which would save some calculations. Another problem with the 2D example is that for shape matching involving only 2 particles we sometimes find an optimal rotation that mirrors the particle. A quick fix to remedy this would be to check for mirroring matrices in the 2D matrix class. Also as you might know there are some issues with floating point textures in WebGL - especially when we want to access the texture from vertex shader. So to improve compatability for WebGL/GLES it might be a good idea to do the skinning on the CPU instead of in the vertex program - this was the solution we used when porting the system to iOS.

Hope you'll have fun with the example!

 

 

WebGL Tutorial: Optimizing Data Transfer for WebGL Applications

Introduction

WebGL is a technology that has pushed the limits of content that can be published on the web. For example, Fig. 1 shows that with a modern graphics card, WebGL allows us to render very complex scenes with hundreds of thousands of polygons directly in a browser window. One challenge is now that large amounts of geometric data must be transferred over a network connection prior to rendering of such complex scenes.

With current broadband connections users expect smooth browsing experience where web pages load almost immediately. If the loading time exceeds a few seconds the audience often lose interest and proceed to another website [1]. This timeframe turns out to be hard to reach for some WebGL applications.

Figure 1: The Stanford dragon consists of 871414 triangles. Models with this polygon count are straightforwardly rendered in real time in WebGL. A big challenge, however, is how to load the massive amounts of vertex data for such detailed models. For example, the model shown in the figure uses 60 megabytes of vertex position and normal data.

As shown in Fig. 1, the size of raw vertex data can easily be on the order of tens of megabytes. With a reasonably fast connection with a 10 Mbit/s bandwidth, it takes around one second to load each megabyte of load vertex data. On the other hand, javascript engines used in modern browsers can process data at a much higher rate, and, hence, we can obtain faster loading times if we somehow can decrease the amount of data transferred over network at the expense of some postprocessing at the client side.

In this tutorial we will go through various techniques that can be used to optimize the loading time for large amounts of vertex data. Furthermore, we will demonstrate how data can be cached so that it does not need to be reloaded between browser sessions. We assume that the reader has basic knowledge about OpenGL vertexbuffers, JavaScript, and Ajax. We will extensively make use of features that are still W3C working drafts the so code may not run on older browsers and functionality may change in the future. We have tested all our code samples with Google Chrome 20 and Mozilla Firefox 16.
Read the rest of this entry »

Paper Accepted For HPG 2011

DragonEar

We have just had our paper on real-time subsurface scattering accepted for High-Performance Graphics 2011. The conference takes place August 5-7 in Vancouver. Here you can find a preprint of the paper which is titled SSLPV: Subsurface Light Propagation Volumes. Also we have provided a small demo along with the GLSL shader code.

Steam shader effect

About a month ago we wrote about our shader-based heat shimmer effect. Another effect we developed for the Danfoss-experience project is illustrated in the video below. It tries to reproduce the effect of a window or mirror steaming up entirely as a shader.

Our method uses a gradient image G which dictates the overall "growth" of the steam. A variable t is animated over time from 0 to 2 and the amount of steam s for a given position \mathbf{x} is then defined by a clamped linear interpolation over some range r:

s = \mathrm{clamp}\left(\frac{t - G(\mathbf{x})}{r}, 0, 1\right)

To make the growth more interesting, we modify G with a noise texture. The color of the steam c_s is determined from a lookup into a texture-rendered version of the scene which has been blurred by a Gaussian filter. To this value, we add some white noise to give the steam some texture. Finally, we add a value which can be tweaked to give the steam the desired amount of lightness.
The pixel color is then given by the unmodified color c interpolated with the steam color:

\mathrm{lerp}(c, c_s, s)

The final touch to our effect, is the ability to interact with the steam by dragging the mouse around like a finger on the steamed up window. This is implemented by adding a mask texture and modifying the amount of steam by taking the minimum value of the mask and s. The mask texture is updated by blending black lines into it where the mouse has been when it is being dragged around. Furthermore, a dark grey color is constantly blended into the entire mask in order to make the older strokes steam up again.

"Particle Director" Unity Extension Released

Our first Unity extension - Particle Director - just went live on the Unity Asset Store today. It is a tool for specifying custom particle velocities for particle systems. With it you are able to specify motion of the particles by placing control vectors, solids, sources, sinks, etc.

Check it out here.

Unity Custom Particle System Demo

We have been working on a customizable, artist-friendly way of specifying particle velocities for particle systems in Unity. The built-in particle system animator only allows for a very limited range of motions, and it would be really hard to make the particles flow around obstacles or create vortices.

But before we dive into the details of our custom particle editor, why don't you have a look at the result in the demo below? (We also demonstrate a Unity implementation of the shimmer shader effect from the previous post as well as a shader which creates foggy mirrors)

The screenshot below shows our editor in action. You are able to specify motion of the particles in a 2D plane by placing control vectors, solids, sources, sinks, etc. Velocities are then computed for the entire plane by running a fluid solver for a certain amount of iterations using your inputs as boundary conditions. This only takes a few seconds and ensures that you get a nice and smooth velocity field in the entire plane. In the scene view, control vectors, solids, sources and sinks are displayed in order to help with aligning the motion with the scene geometry.



We have chosen to create a 2D editor instead of a full 3D editor since it greatly simplifies the user interface as well as reducing the computational load of the fluid solver. Our system still allows you to create 3D motions by extrapolating the 2D motion into space using a linear fall-off.

But more importantly, the system allows for using a linear combination of several velocity fields. Therefore, to specify a 3D motion you create several 2D motions and position them differently in your scene. The below screenshots show 5 planes combined to create a swirling, tornado-like motion. The highlighted plane in the left image specifies the horizontal motion, while the the highlighted plane in the right image and the 3 remaining planes contain sources and dictate an upwards motion.

The custom particle system fully integrates with the built-in particle emitter, animator and renderer meaning that all other aspects - save for the motion of the particles - are handled in the usual way. Other highlights include the ability to:

  • Emit particles from your custom sources rather than the single particle emitter
  • Add vortices to the velocity field
  • Change the weight of each velocity field separately at runtime, effectively turning on and off various motions

We plan on putting this custom particle system and editor on the Unity Asset Store in the near future.

Heat shimmer shader effect

As part of the Danfoss-experience project we have been working for the last couple of months on many interesting effects for illustrating the application of Danfoss products digitally. One of the more successful effects that we have produced is a heat shimmer effect made entirely as a shader.

To make this effect work we have applied an interesting texture advection shader technique used by Valve for water flow in Left for Dead 2. A presentation of the technique and its improvements for portal 2 was given at Siggraph 2010.

The method combines a few neat shader tricks. Lets say we have a surface texture T and velocities V for the surface. If we advect the texture by the velocities in V, we can make the texture flow according to the velocity field. But rather quickly the displacement of the texture makes it look stretched and wrong. We can remove the distortion by reverting to the original texture after every S seconds, but at the cost of a massive sudden change. We can hide the sudden revert back to the original texture by linearly interpolating between a second texture advection of the same texture. If we offset the advection by half the advection interval we can always show a not too distorted texture advection while reverting the advection of the hidden texture.

What we have added to the method is a layer on top of the advection method. With the Valve texture advection method we create a directional flow of a distortion texture (we use a velocity field with the same speed in the entire domain). The distortion is then used when doing texture lookup in the texture-rendered scene. In this way we can add time-dependent distortion with a directional flow to any surface. A video of the effect on a static background image is shown below.

It would be fairly easy to add depth dependent distortion to the method by using a depth buffer, but for better directability we chose to add a scaling mask instead.

Noe's tutorial on deforming 3D geometry using RBFs

 In the following I will present a method for deforming three dimensional geometry using a technique relying on radial basis functions (RBFs). These are mathematical functions that take a real number as input argument and return a real number. RBFs can be used for creating a smooth interpolation between values known only at a discrete set of positions. The term radial is used because the input argument given is typically computed as the distance between a fixed position in 3D space and another position at which we would like to evaluate a certain quantity.

The tutorial will give a short introduction to the linear algebra involved. However the source code contains a working implementation of the technique which may be used as a black box. First an example of what we achieve in the end:

YouTube Preview Image

Source code with Visual Studio 2005 solution can be found here. The code should also compile on other platforms.

Interpolation by radial basis functions

Assume that the value of a scalar valued function F : \mathbb{R}^3 \rightarrow \mathbb{R} is known in M distinct discrete points \mathbf{x}_i  in three dimensional space. Then RBFs provide a means for creating a smooth interpolation function of F in the whole domain of \mathbb{R}^3. This function is written as a sum of M evaluations of a radial basis function g(r_i) : \mathbb{R} \rightarrow \mathbb{R} where r_i is the distance between the point \mathbf{x} = (x, y, z) to be evaluated and \mathbf{x}_i:

F(\mathbf{x}) = \sum_{i=1}^M a_i g(||\mathbf{x} - \mathbf{x}_i||) + c_0 + c_1 x + c_2 y + c_3 z,\ \ \ \mathbf{x} = (x,y,z) \ \ \ \mathbf{(1)}

 Here a_i are scalar coefficients and the last four terms constitute a first degree polynomial with coefficients c_0 to c_3. These terms describe an affine transformation which cannot be realised by the radial basis functions alone. From the M known function values F( x_i, y_i, z_i ) = F_i we can assemble a system of M+4 linear equations:  \mathbf{G} \mathbf{A} = \mathbf{F}
 where \mathbf{F} = (F_1, F_2, \ldots, F_M, 0, 0, 0, 0), \mathbf{A} = (a_1, a_2, \ldots, a_M, c_0, c_1, c_2, c_3) and \mathbf{G} is an (M+4) \times (M+4) matrix :

 \mathbf{G} = \left[<br />
\begin{array}{cccccccccc}<br />
g_{11} & g_{12} & \bullet & \bullet & \bullet & g_{1M} & 1 & x_1 & y_1 & z_1\\<br />
g_{21} & g_{22} & \bullet & \bullet & \bullet & g_{2M} & 1 & x_2 & y_2 & z_2\\<br />
\bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet\\<br />
\bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet\\<br />
\bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet & \bullet\\<br />
g_{M1} & g_{M2} & \bullet & \bullet & \bullet & g_{MM} & 1 & x_M & y_M & z_M\\<br />
1 & 1 & \bullet & \bullet & \bullet & 1 & 0 & 0 & 0 & 0\\<br />
x_1 & x_2 & \bullet & \bullet & \bullet & x_M & 0 & 0 & 0 & 0\\<br />
y_1 & y_2 & \bullet & \bullet & \bullet & y_M & 0 & 0 & 0 & 0\\<br />
z_1 & z_2 & \bullet & \bullet & \bullet & z_M & 0 & 0 & 0 & 0<br />
\end{array}<br />
\right]

 Here g_{ij} = g(|| \mathbf{x}_i - \mathbf{x}_j ||). A number of choices for g will result in a unique solution of the system. In this tutorial we use the shifted log function:

 g(t) = \sqrt{log(t^2 + k^2)}, \ \ \ k^2\geq 1


 with k = 1. Solving the equation system for \mathbf{A} gives us the coefficients to use  in equation \textbf{(1)} when interpolating between known values.

 

Interpolating displacements

How can RBF's be used for deforming geometry? Well assume that the deformation is known for M 3D positions \mathbf{x}_i and that this information is represented by a vector describing 3D displacement \mathbf{u}_i of the geometry that was positioned at \mathbf{x}_i in the original, undeformed state. You can think of the \mathbf{x}_i positions as control points that have been moved to positions \mathbf{x}_i+\mathbf{u}_i. The RBF interpolation method can now be used for interpolating these displacements to other positions. 

Using the notation \mathbf{x}_i = (x_i, y_i, z_i) and \mathbf{u}_i = (u^x_i, u^y_i, u^z_i)  three linear systems are set up as above letting the displacements u be the quantity we called a in the previous section:

\mathbf{G} \mathbf{A}_x = (u^x_1, u^x_2, \ldots, u^x_M, 0, 0, 0, 0)^T


\mathbf{G} \mathbf{A}_y = (u^y_1, u^y_2, \ldots, u^y_M, 0, 0, 0, 0)^T


\mathbf{G} \mathbf{A}_z = (u^z_1, u^z_2, \ldots, u^z_M, 0, 0, 0, 0)^T

where \mathbf{G} is assembled as described above. Solving for \mathbf{A}_x, \mathbf{A}_y, and \mathbf{A}_z involves a single matrix inversion and three matrix-vector multiplications and gives us the coefficients for interpolating displacements in all three directions by the expression \mathbf{(1)}

 

The source code

In the source code accompanying this tutorial you will find the class RBFInterpolator which has an interface like this:

class RBFInterpolator
{
public:
	RBFInterpolator();
	~RBFInterpolator();
 
	//create an interpolation function f that obeys F_i = f(x_i, y_i, z_i)
	RBFInterpolator(vector x, vector y, vector z, vector F);
 
	//specify new function values F_i while keeping the same
	void UpdateFunctionValues(vector F);
 
	//evaluate the interpolation function f at the 3D position (x,y,z)
	real interpolate(float x, float y, float z);
 
private:
            ...
};

This class implements the interpolation method described above using the newmat matrix library. It is quite easy to use: just fill stl::vectors with the x_i, y_i and z_i components of the positions where the value F is known and another stl::vector with the F_i values. Then pass these vectors to the RBFInterpolator constructor, and it will be ready to interpolate. The F value at any position is then evaluated by calling the 'interpolate' function. If some of the F_i values change at any time, the interpolator can be quickly updated using the 'UpdateFunctionValues' method.

We want to deform a triangle surface mesh. These are stored in a class TriangleMesh, and loaded from OBJ files.
In the source code the allocation of stl::vectors discribing the control points and the initialisation of RBFInterpolators looks like this:

void loadMeshAndSetupControlPoints()
{
	// open an OBJ file to deform
	string sourceOBJ = "test_dragon.obj";
	undeformedMesh = new TriangleMesh(sourceOBJ);
	deformedMesh = new TriangleMesh(sourceOBJ);
 
	// we want 11 control points which we place at different vertex positions
	const int numControlPoints = 11;
 
	const int verticesPerControlPoint = ((int)undeformedMesh->getParticles().size())/numControlPoints;
 
	for (int i = 0; i<numControlPoints; i++)
	{
		Vector3 pos = undeformedMesh->getParticles()[i*verticesPerControlPoint].getPos();
		controlPointPosX.push_back(pos[0]);
		controlPointPosY.push_back(pos[1]);
		controlPointPosZ.push_back(pos[2]);
	}
 
	// allocate vectors for storing displacements
	for (unsigned int i = 0; i<controlPointPosX.size();  i++)
	{
		controlPointDisplacementX.push_back(0.0f);
		controlPointDisplacementY.push_back(0.0f);
		controlPointDisplacementZ.push_back(0.0f);
	}
 
	// initialize interpolation functions
	rbfX = RBFInterpolator(controlPointPosX, controlPointPosY, 
                                           controlPointPosZ, controlPointDisplacementX );
	rbfY = RBFInterpolator(controlPointPosX, controlPointPosY, 
                                           controlPointPosZ, controlPointDisplacementY );
	rbfZ = RBFInterpolator(controlPointPosX, controlPointPosY, 
                                           controlPointPosZ, controlPointDisplacementZ );
}

Now all displacements are set to zero vectors - not terribly exciting! To make it a bit more fun we can vary the displacements with time:

	// move control points
	for (unsigned int i = 0; i<controlPointPosX.size(); i++ )
	{
		controlPointDisplacementX[i] = displacementMagnitude*cosf(time+i*timeOffset);
		controlPointDisplacementY[i] = displacementMagnitude*sinf(2.0f*(time+i*timeOffset));
		controlPointDisplacementZ[i] = displacementMagnitude*sinf(4.0f*(time+i*timeOffset));
	}
 
	// update the control points based on the new control point positions
	rbfX.UpdateFunctionValues(controlPointDisplacementX);
	rbfY.UpdateFunctionValues(controlPointDisplacementY);
	rbfZ.UpdateFunctionValues(controlPointDisplacementZ);
 
	// deform the object to render
	deformObject(deformedMesh, undeformedMesh);

Here the function 'deformObject' looks like this:

// Code for deforming the mesh 'initialObject' based on the current interpolation functions (global variables). 
// The deformed vertex positions will be stored in the mesh 'res'
// The triangle connectivity is assumed to be already correct in 'res'  
void deformObject(TriangleMesh* res, TriangleMesh* initialObject)
{
	for (unsigned int i = 0; i < res->getParticles().size(); i++)
	{
		Vector3 oldpos = initialObject->getParticles()[i].getPos();
 
		Vector3 newpos;
		newpos[0] = oldpos[0] + rbfX.interpolate(oldpos[0], oldpos[1], oldpos[2]);
		newpos[1] = oldpos[1] + rbfY.interpolate(oldpos[0], oldpos[1], oldpos[2]);
		newpos[2] = oldpos[2] + rbfZ.interpolate(oldpos[0], oldpos[1], oldpos[2]);
 
		res->getParticles()[i].setPos(newpos);
	}
}

That's it!!! Now I encourage you to download the source code and play with it. Perhaps you can experiment with other radial basis functions? Or make the dragon crawl like a caterpillar? If you code something interesting based on this tutorial send a link to me and we will link to it from this page :-)

Karsten Noe

I got a mail from Woo Won Kim from Yonsei University in South Korea who has made a head modeling program that can generate 3D human heads from two pictures of the person using code from this RBF tutorial. Check out a video of this here.

Tags: , , , , , ,

Real time finite element modelling using CUDA

Here is a little movie showing real time simulation of non-linear elastic material properties using the Total Lagrangian Explicit Dynamic FEM. Three different sets of material parameters were used. Our implementation is done in CUDA. Thanks to Brian Bunch Christensen and Jens Rimestad for cooperation on the implementation.

YouTube Preview Image

The source code is under the LGPL licence and can be found here. Please acknowledge if you use it for your research. Thanks to Movania Muhammad Mobeen for tidying up the project so it should compile out of the box.

Dirk Fortmeier made this project work in linux using CMAKE. You can find the code for this at https://github.com/fortmeier/CUDA-FE-With-Haptics which also includes haptics interaction.

How we used the model in previous research can be read in the following paper:

Solid mesh registration for radiotherapy treatment planning.
K. Ø. Noe, T.S. Sørensen.
2010 International Symposium on Biomedical Simulation (ISBMS).
Lecture Notes in Computer Science, 2010; 5958: 59-70.
[abstract]

Some videos showing our use can be found here.

Tags: , , , , ,

Triers CUDA ray tracing tutorial

Welcome to my little cuda ray tracing tutorial, and first  a warning: Ray tracing is both fun and contagious, there is a fair chance that you will end up coding different variants of your ray tracer just to see those beautiful images. So if you value your spare time stop reading :-)

I am not an ray-tracing expert, and i also value my spare time. But i really like to synthisize images on my computer, and i cannot explain why, other than it is really cool what a few lines of code can produce. But in ray tracing, it often takes a long time to create a decent result. There is of course a lot you can do to speed up thing:

  1. first buy a faster computer, there more cores the better.
  2. implement an acceleration structure.
  3. dive into the world of optimizations, simd stuff, cache coherency ...
  4. And if you are really talented you might end up doing some hairy math, that will sent your ray tracer into warp speed.

Or if you are like me, use the graphics card to speed up things a bit. This little tutorial will not state that the GPU thing is the only way to go or try to preach something. But it will hopefully help you to get started using CUDA to accelerate ray tracing. I have been writing a lot of shaders for other applications so it was very natural for me to experiment with a GPU implementation. So the natural choice would be to use Nvidia's CG or glsl. But i decided to try Cuda, because i wanted to learn what kind of beast it is and second it should be easier to code. For most parts cuda is a quite pleasant experience, for instance you can make emulation debug and print out values!! But again some times it crashes for real due to some invalid pointer, leaving you with a smoldering Vista-XP blue screen of death.

But let get started. This tutorial will focus on the CUDA stuff like how do you upload some triangles to your graphics card and how to implement a very simple ray tracing algorithm that runs on the GPU. It will be assumed you know Open GL, and some theory about ray tracing.

Anyways just to initialize some ray tracing knowledge pointer in your brain here is the ray tracing 101 (if you already is familiar with this GOTO 10):

To create a ray traced image like this:

bunnyboy

  1. for each pixel create a ray.
  2. follow the ray into the scene and find the closest object that is intersected.
  3. if the object is made of glass, mirror  or something similar find the reflected/refracted ray and trace on.
  4. else computer the color of the pixel based on light sources and material information.

label : 10 // you made it this far: 100 xp award
Ok lets code:

The code has been written with one main design criteria. Keep it pretty simple! and dont include a bunch of unnecessary stuff. By my opinion a lot of tutorials i have seen, forces people to include complete frameworks and i think it makes the code less comprehensive. So the code is only one  .cpp (main.cpp) and one .cu (raytracer.cu) source file.

The ray tracer will be very simple, and it will be left as an exercise to include more advanced features like refraction, acceleration structures and cool materials. I will focus on sending a ray into a loaded triangulated model, computing the first hit position and do simple light calculations.

YouTube Preview Image

Above is a youtube video of the tutorial. Compared with the first image there is of course a lot of things missing, but hey its a start :-)

To start we need to load some geometry and we use the .obj format to load the 3d model shown in the video above. After loading the geometry we need to send the triangle information to the graphic-card memory before we can start tracing rays. To do this we use a CUDA 1D texture that is declared in the top of ray-tracer.cu like this:

texture'&lt;'float4, 1, cudaReadModeElementType'&gt;' triangle_texture;

The CUDA texture memory provides a default caching so it is a bit faster than using global memory. The triangles is simple packed into the 1D texture in a way that each float4 contains the either the first triangle-vertex or two triangle edges.

Like this: ( float4(vertex.x,vertex.y,vertex.z, 0), float4 (egde1.x,egde1.y,egde1.z,0),float4 (egde2.x,egde2.y,egde2.z,0)) for each triangle.

This information is copied into a cuda device pointer and bound to the declared texture. The texture lookup is set to POINT mode to avoid linear interpolation.

The code for binding the device pointer to the 1D texture

void bindTriangles(float *dev_triangle_p, unsigned int number_of_triangles)
{
triangle_texture.normalized = false;                      // access with normalized texture coordinates
triangle_texture.filterMode = cudaFilterModePoint;        // Point mode, so no
triangle_texture.addressMode[0] = cudaAddressModeWrap;    // wrap texture coordinates
 
size_t size = sizeof(float4)*number_of_triangles*3;
cudaChannelFormatDesc channelDesc = cudaCreateChannelDesc&lt;float4&gt;();
cudaBindTexture(0,triangle_texture,dev_triangle_p,channelDesc,size);
}

After uploading and binding triangle information, we need a way to communicate between openGL and CUDA, because we use openGL to visualize the ray tracing result. Fortunately this is easily done by using a PBO (pixel buffer object), that serves as communication between the two. If you have the CUDA SDK this is described in the openGL PostProcess example.

Now the fun part begins... ray tracing the triangles. So the first problem is spawning the rays. Here we create a virtual camera and by setting a few parameters rotation, distance to origo and elevation, we can calculate the a vector basis for the camera. Given this basis we can compute the ray start position and direction. The camera calculation is done in the updateCamera() function.

Now we have a ray(a start position and a direction) for each pixel, and the we would like to know the closest triangles hit if any. But before that we do a little optimization that is equivilant to frustum culling. By calculating a axis aligned bounding box that encapsulates the entire scene, we first check wether the ray intersect this rectangle. If not we don't have to worry and just set the pixel color to the background value.

The ray tracing is all done in the CUDA kernel found in raytracer.cu file

__global__ void raytrace(...)

Next stop ray triangle intersection. To compute the ray triangle intersection is a science by itself because this is where all the action is going on. So a slight optimization will really show off. For our tutorial we use the method developed by Thomas Muller and Ben Trumbore, which is very fast and easy to implement in CUDA.

To find the closest triangle hit point we run linear through all the triangles in our scene and compute the ray triangle intersection value. To get the smallest intersection value we store the smallest t value yet found in a hit-record. Below is the code that calculates the ray triangle intersection values and finds the closest.

for(int i = 0; i &lt; number_of_triangles; i++)
 {
 float4 v0 = tex1Dfetch(triangle_texture,i*3);
 float4 e1 = tex1Dfetch(triangle_texture,i*3+1);
 float4 e2 = tex1Dfetch(triangle_texture,i*3+2);
 
 float t = RayTriangleIntersection(r, make_float3(v0.x,v0.y,v0.z),make_float3(e1.x,e1.y,e1.z), make_float3(e2.x,e2.y,e2.z));
 
 if(t &lt; hit_r.t &amp;&amp; t &gt; 0.001)
 {
 hit_r.t = t; // found a closer hit! store it.
 hit_r.hit_index = i;
 }

If we hit something we calculate a face normal using a cross product, and given a light position and color we do a Phong lighting computation.

Shadows and reflections.

By using ray tracing it is very easy to calculate shadows and reflections, so this should of course be included in the tutorial. The way we determine if a pixel is in the shadow is done by creating a new ray starting from the triangle hit-point and pointing towards the light source. by intersecting all triangles like we did before we can determine if there is triangles blocking the path to the light. If yes we multiply the color by some value < 1.0 to darken the color.

Reflection is also simple to do, in a CPU ray tracer you could create a recursive call, but here we just use a while loop to follow the ray through the scene. We employ a little hack to create a reflective object in the center of the scene. This is achieved by storing a extra value in the triangle float4 values, because only the first three is used we can store extra information in the last component.

Suggested improvement.

As stated before there is a some of improvements that need to be done, before this little ray tracer can produce nice pictures. Here is some suggestions that you might want to play with.

  1. Refraction, transparent materials is the core of ray tracing, impress your mom by rendering household equipment!
  2. Anti-aliasing, one ray pr pixel is not enough to produces decent images, experiment with random and jittered sampling.
  3. Acceleration structures, there is a lot of speed to be found by implementing some sort of tree structure (KD trees, Bounding volumes, octtrees)
  4. More advanced materials.
  5. write to me and i will add your improvement to the list

I really hope that you enjoy this little tutorial to start doing some ray tracing, and email me anytime if you have any ideas that might improve this tutorial or questions.

peter.trier {at} alexandra(.)dk

By the way thanks for hanging on to the end 1000 xp bonus!

Source code download

Tutorial source and windows executable download cuda 2.3

Tutorial source and windows executable download_cuda_2.1

Links:

Cool ray tracing forum

CUDA development zone

http://www.cs.virginia.edu/~gfx/Courses/2003/ImageSynthesis/papers/Acceleration/Fast%20MinimumStorage%20RayTriangle%20Intersection.pdf