gpu_vector_kernel.cu File Reference

Kernel functions for vector operations.

#include "toeplitz.h"


Functions

__device__ T norm (unsigned int n, const T *x, T *swork)
__device__ T dot (unsigned int n, const T *x, const T *y, T *swork)
__device__ T dot_reverse_y (unsigned int n, const T *x, const T *y, T *swork)
__device__ void axpy (unsigned int n, T a, const T *x, T *y)
__device__ void axpy_reverse_x (unsigned int n, T a, const T *x, T *y, T *swork)
__device__ void axpxb_reverse_x (unsigned int n, T a, const T *x, T *y, T b, T *swork)


Detailed Description

Kernel functions for vector operations.

Author:
Leandro Graciá Gil, leagragi@inf.upv.es
Date:
18/11/08

Definition in file gpu_vector_kernel.cu.


Function Documentation

__device__ void axpxb_reverse_x ( unsigned int  n,
T  a,
const T *  x,
T *  y,
T  b,
T *  swork 
)

Performs the operation y = b * (a * reverse(x) + x).

Parameters:
n Size of x vector.
a Scalar factor.
x Input vector.
y Output vector. Can be x.
b Result scaling value.
swork Shared memory workspace. Requires 6 * blockSize - 1 floats.

Definition at line 380 of file gpu_vector_kernel.cu.

__device__ void axpy ( unsigned int  n,
T  a,
const T *  x,
T *  y 
)

Performs the operation y = a * x + y.

Parameters:
n Size of x and y vectors.
a Scalar factor.
x First vector.
y Second vector, modified with result.

Definition at line 245 of file gpu_vector_kernel.cu.

__device__ void axpy_reverse_x ( unsigned int  n,
T  a,
const T *  x,
T *  y,
T *  swork 
)

Performs y = a * reverse(x) + y operation.

Parameters:
n Size of x and y vectors.
a Scalar factor.
x First vector.
y Second vector, modified with result. Cannot be x.
swork Shared memory workspace. Requires 5 * blockSize - 1 floats.

Definition at line 278 of file gpu_vector_kernel.cu.

__device__ T dot ( unsigned int  n,
const T *  x,
const T *  y,
T *  swork 
)

Calculates the dot product of x and y vectors.

Parameters:
n Size of x and y vectors.
x First vector.
y Second vector.
swork Shared memory workspace. Requires blockSize floats.
Returns:
Thread 0 of the block returns the dot product result.

Definition at line 110 of file gpu_vector_kernel.cu.
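
The workspace requirement of blockSize elements and the "thread 0 returns the result" convention are consistent with a common block-reduction pattern: one partial sum per thread, followed by a pairwise tree reduction. The host-side C++ emulation below sketches that pattern. Whether gpu_vector_kernel.cu uses exactly this scheme is an assumption. The name dot_block_emulated and the block_size parameter are hypothetical, and block_size is assumed to be a power of two and at least 1.

```cpp
#include <cstddef>
#include <vector>

// Host-side emulation of a common block-reduction pattern (an assumption,
// not necessarily what the device code does). block_size plays the role of
// blockDim.x and must be a power of two >= 1 for this simple tree reduction.
template <typename T>
T dot_block_emulated(std::size_t n, const T* x, const T* y,
                     std::size_t block_size) {
    // One partial sum per emulated thread, like a blockSize-element swork.
    std::vector<T> swork(block_size, T(0));

    // Phase 1: emulated thread t accumulates elements t, t + block_size, ...
    for (std::size_t t = 0; t < block_size; ++t) {
        for (std::size_t i = t; i < n; i += block_size) {
            swork[t] += x[i] * y[i];
        }
    }

    // Phase 2: pairwise tree reduction. On a GPU, each step would be
    // separated by a block-wide barrier.
    for (std::size_t stride = block_size / 2; stride > 0; stride /= 2) {
        for (std::size_t t = 0; t < stride; ++t) {
            swork[t] += swork[t + stride];
        }
    }
    return swork[0];  // the value that thread 0 would return
}
```

With x = {1, 2, 3, 4, 5}, y = {6, 7, 8, 9, 10} and block_size = 4, the emulation returns 130.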

__device__ T dot_reverse_y ( unsigned int  n,
const T *  x,
const T *  y,
T *  swork 
)

Calculates the dot product of x and reverse(y) vectors.

Parameters:
n Size of x and y vectors.
x First vector.
y Second vector.
swork Shared memory workspace. Requires 4 * blockSize - 1 floats.
Returns:
Thread 0 of the block returns the dot product result.

Definition at line 163 of file gpu_vector_kernel.cu.

__device__ T norm ( unsigned int  n,
const T *  x,
T *  swork 
)

Calculates the 2-norm (length) of a given vector. Should be faster than sqrt(dot(x, x)), since it reads each component only once.

Parameters:
n Size of x vector.
x Input vector.
swork Shared memory workspace. Requires blockSize floats.
Returns:
Thread 0 of the block returns the norm result.

Definition at line 40 of file gpu_vector_kernel.cu.


Generated on Sun Dec 14 14:21:11 2008 for Multi-GPU symmetric Toeplitz Eigenvalue Extractor by  doxygen 1.5.6