I am using PyCUDA and would like to know if there is an equivalent of the function cudaMemcpyToSymbol.
I would like to copy a constant from the host to the device, as below:
    import pycuda.driver as cuda
    import pycuda.autoinit
    from pycuda.compiler import SourceModule
    import numpy
    from sys import path
    from struct import *
    from gpustruct import GPUStruct

    if __name__ == '__main__':
        # list devices
        ndevices = cuda.Device.count()
        print '{} devices found'.format(ndevices)
        for i in xrange(ndevices):
            print '  ', cuda.Device(i).name()

        # compile device.cu
        mod = SourceModule('''
        __device__ __constant__ int constd;
        struct results { float *a; float *b; float *c; };
        struct fin { float *n; };
        __global__ void test(results *src, fin *dest){
            int i = blockIdx.x * blockDim.x + threadIdx.x;
            src->c[i] = src->a[i] + src->b[i] + dest->n[i] + constd;
        }''',
        nvcc='/opt/cuda65/bin/nvcc',
        )
        kern = mod.get_function("test")
        constante = 5
        src_gpu = GPUStruct([(numpy.int32, '*a', numpy.ones(10, dtype=numpy.int32)),
                             (numpy.int32, '*b', numpy.ones(10, dtype=numpy.int32)),
                             (numpy.int32, '*c', numpy.zeros(10, dtype=numpy.int32))])
        test_gpu = GPUStruct([(numpy.int32, '*n', numpy.array(10*[5], dtype=numpy.int32))])
        # something like this: cudaMemcpyToSymbol(constd, &constante, sizeof(int));
        src_gpu.copy_to_gpu()
        test_gpu.copy_to_gpu()
        kern(src_gpu.get_ptr(), test_gpu.get_ptr(), block=(10,1,1), grid=(1,1))
        src_gpu.copy_from_gpu()
        print(src_gpu)
The PyCUDA implementation directly follows the CUDA driver API, so you can use any driver API code you can find as a model. There are two things required to make this work:
- Use the module function module.get_global() to retrieve the device pointer of the symbol within the compiled source module. It returns a (device_ptr, size_in_bytes) tuple.
- Use driver.memcpy_htod to copy values to that pointer. Note that the PyCUDA APIs require objects that support the Python buffer protocol. In practice this means you should be using numpy.ndarray or similar on the Python side.
This is what cudaMemcpyToSymbol does under the hood.
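The two steps above could be put together roughly as follows. This is only a sketch, not code from the question: the helper name copy_to_symbol, the demo function and the add_const kernel are made up for illustration, and the copy function is passed in as an argument purely so the helper can be exercised without a GPU. In real use you would pass pycuda.driver.memcpy_htod, as demo() does. It assumes the module was compiled with SourceModule's default extern "C" wrapping, so the symbol name is not mangled.

```python
def copy_to_symbol(module, symbol_name, host_buffer, memcpy_htod):
    """Copy host_buffer into a __constant__/__device__ symbol of a compiled module.

    module       -- object with get_global(name) -> (device_ptr, size_in_bytes),
                    e.g. a pycuda.compiler.SourceModule
    host_buffer  -- any object supporting the buffer protocol (numpy.ndarray, bytes, ...)
    memcpy_htod  -- host-to-device copy function, e.g. pycuda.driver.memcpy_htod
    """
    # Step 1: look up the device address and size of the symbol.
    dev_ptr, size_in_bytes = module.get_global(symbol_name)

    # Guard against writing past the end of the symbol.
    nbytes = memoryview(host_buffer).nbytes
    if nbytes != size_in_bytes:
        raise ValueError(
            "symbol %r is %d bytes, host buffer is %d bytes"
            % (symbol_name, size_in_bytes, nbytes)
        )

    # Step 2: plain host-to-device copy to that address.
    memcpy_htod(dev_ptr, host_buffer)
    return nbytes


def demo():
    """Hypothetical end-to-end use; needs a CUDA GPU, PyCUDA and NumPy."""
    import numpy
    import pycuda.autoinit  # noqa: F401  (creates a context on import)
    import pycuda.driver as cuda
    from pycuda.compiler import SourceModule

    mod = SourceModule("""
    __constant__ int constd;
    __global__ void add_const(int *out)
    {
        out[threadIdx.x] = threadIdx.x + constd;
    }
    """)

    # A one-element int32 array matches the size of `int constd`.
    constante = numpy.array([5], dtype=numpy.int32)
    copy_to_symbol(mod, "constd", constante, cuda.memcpy_htod)

    out = numpy.zeros(10, dtype=numpy.int32)
    mod.get_function("add_const")(cuda.Out(out), block=(10, 1, 1), grid=(1, 1))
    return out
```

The size check is an extra precaution rather than something PyCUDA requires. It catches the common mistake of passing a Python int or an array with the wrong dtype instead of a buffer whose size matches the symbol.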