python - Is the Maxwell architecture supported in NumbaPro?
I want to execute a CUDA kernel in Python using the NumbaPro API. I have this code:
import math
import numpy
from numbapro import jit, cuda, int32, float32
from matplotlib import pyplot
from timeit import default_timer as timer   # assumed source of the timer() used below

@cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
def calculate_velocity_field(x, y, u_source, v_source, x_source, y_source, strength_source, n):
    start = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
    end = n
    stride = cuda.gridDim.x * cuda.blockDim.x
    for i in range(start, end, stride):
        u_source[i] = strength_source/(2*math.pi) * (x[i]-x_source)/((x[i]-x_source)**2 + (y[i]-y_source)**2)
        v_source[i] = strength_source/(2*math.pi) * (y[i]-y_source)/((x[i]-x_source)**2 + (y[i]-y_source)**2)

def main():
    n = 200                                  # number of points in each direction
    x_start, x_end = -4.0, 4.0               # boundaries in the x-direction
    y_start, y_end = -2.0, 2.0               # boundaries in the y-direction
    x = numpy.linspace(x_start, x_end, n)    # creates a 1D array of x-coordinates
    y = numpy.linspace(y_start, y_end, n)    # creates a 1D array of y-coordinates
    x, y = numpy.meshgrid(x, y)              # generates the mesh grid

    strength_source = 5.0                    # source strength
    x_source, y_source = -1.0, 0.0           # location of the source

    start = timer()

    # calculate grid dimensions
    blocksize = 1024
    gridsize = int(math.ceil(float(n)/blocksize))

    # transfer memory to the device
    x_d = cuda.to_device(x)
    y_d = cuda.to_device(y)
    u_source_d = cuda.device_array_like(x)
    v_source_d = cuda.device_array_like(y)

    # launch the kernel
    calculate_velocity_field[gridsize, blocksize](x_d, y_d, u_source_d, v_source_d,
                                                  x_source, y_source, strength_source, n)

    # transfer memory back to the host
    u_source = numpy.empty_like(x)
    v_source = numpy.empty_like(y)
    u_source_d.to_host(u_source)
    v_source_d.to_host(v_source)

    elapsed_time = timer() - start
    print("Exec time on the GPU: %f s" % elapsed_time)

if __name__ == "__main__":
    main()
It is giving me this error:
NvvmError                                 Traceback (most recent call last)
<ipython-input-17-85e4a6e56a14> in <module>()
----> 1 @cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
      2 def calculate_velocity_field(x, y, u_source, v_source, x_source, y_source, strength_source, n):
      3     start = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
      4     end = n
      5     stride = cuda.gridDim.x * cuda.blockDim.x

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/decorators.py in kernel_jit(func)
     89         # force compilation for the current context
     90         if bind:
---> 91             kernel.bind()
     92
     93         return kernel

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in bind(self)
    319         Force binding to current CUDA context
    320         """
--> 321         self._func.get()
    322
    323     @property

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
    254         cufunc = self.cache.get(device.id)
    255         if cufunc is None:
--> 256             ptx = self.ptx.get()
    257
    258             # link

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
    226             arch = nvvm.get_arch_option(*cc)
    227             ptx = nvvm.llvm_to_ptx(self.llvmir, opt=3, arch=arch,
--> 228                                    **self._extra_options)
    229             self.cache[cc] = ptx
    230             if config.dump_assembly:

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in llvm_to_ptx(llvmir, **opts)
    420     cu.add_module(llvmir.encode('utf8'))
    421     cu.add_module(libdevice.get())
--> 422     ptx = cu.compile(**opts)
    423     return ptx
    424

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in compile(self, **options)
    211                          for x in opts])
    212         err = self.driver.nvvmCompileProgram(self._handle, len(opts), c_opts)
--> 213         self._try_error(err, 'Failed to compile\n')
    214
    215         # result

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in _try_error(self, err, msg)
    229
    230     def _try_error(self, err, msg):
--> 231         self.driver.check_error(err, "%s\n%s" % (msg, self.get_log()))
    232
    233     def get_log(self):

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in check_error(self, error, msg, exit)
    118                 sys.exit(1)
    119         else:
--> 120             raise exc
    121
    122

NvvmError: Failed to compile
libnvvm : error: -arch=compute_52 is an unsupported option
NVVM_ERROR_INVALID_OPTION
I tried the NumbaPro examples and the same error occurs. I don't know whether it's a NumbaPro bug, whether NumbaPro doesn't support compute capability 5.2, or whether it's a problem with NVIDIA's NVVM... Any suggestions?
In theory it should be supported, so I don't know what is happening.
I'm using Linux with CUDA 7.0 and driver version 346.29.
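For what it's worth, the compute capability the driver reports and the -arch string Numba derives from it can be inspected directly. This is a quick sketch, not from the original post, built on the numba.cuda calls that appear in the traceback above; the exact API may differ slightly in the NumbaPro-era packages:

from numba import cuda
from numba.cuda.cudadrv import nvvm

# Compute capability reported by the CUDA driver for the current device
cc = cuda.get_current_device().compute_capability
print("device compute capability:", cc)            # expected (5, 2) on a GTX 970

# The -arch option Numba hands to libNVVM for that device
# (this is the nvvm.get_arch_option call visible in the traceback)
print("arch option:", nvvm.get_arch_option(*cc))   # e.g. 'compute_52'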
I finally found a solution.
- Solution 1:
conda update cudatoolkit
Fetching package metadata: ....
# All requested packages already installed.
# packages in environment at ~/.anaconda3:
#
cudatoolkit               6.0                      p0
It looks to me like updating the cudatoolkit package this way does not update it to CUDA 7.0. Instead, a second solution can be used:
- Solution 2:
conda install -c numba cudatoolkit
Fetching package metadata: ......
Solving package specifications: .
Package plan for installation in environment ~/.anaconda3:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cudatoolkit-7.0            |                1       190.8 MB

The following packages will be UPDATED:

    cudatoolkit: 6.0-p0 --> 7.0-1

Proceed ([y]/n)? y
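The check_cuda() calls in the listings below are NumbaPro's built-in diagnostic. As far as I know it is exposed at the top level of the numbapro package, so the same report can be produced outside IPython with something like this (a minimal sketch under that assumption):

import numbapro

# Runs NumbaPro's library and hardware detection report; returns True if everything passes
numbapro.check_cuda()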
Before:
In [4]: check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
        located at ~/.anaconda3/lib/libcublas.so.6.0.37
        trying to open library...       ok
Finding cusparse
        located at ~/.anaconda3/lib/libcusparse.so.6.0.37
        trying to open library...       ok
Finding cufft
        located at ~/.anaconda3/lib/libcufft.so.6.0.37
        trying to open library...       ok
Finding curand
        located at ~/.anaconda3/lib/libcurand.so.6.0.37
        trying to open library...       ok
Finding nvvm
        located at ~/.anaconda3/lib/libnvvm.so.2.0.0
        trying to open library...       ok
        finding libdevice for compute_20...     ok
        finding libdevice for compute_30...     ok
        finding libdevice for compute_35...     ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0         b'GeForce GTX 970'                              [SUPPORTED]
                      compute capability: 5.2
                           pci device id: 0
                              pci bus id: 7
Summary:
        1/1 devices are supported
PASSED
Out[4]: True
After:
In [6]: check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
        located at ~/.anaconda3/lib/libcublas.so.7.0.28
        trying to open library...       ok
Finding cusparse
        located at ~/.anaconda3/lib/libcusparse.so.7.0.28
        trying to open library...       ok
Finding cufft
        located at ~/.anaconda3/lib/libcufft.so.7.0.35
        trying to open library...       ok
Finding curand
        located at ~/.anaconda3/lib/libcurand.so.7.0.28
        trying to open library...       ok
Finding nvvm
        located at ~/.anaconda3/lib/libnvvm.so.3.0.0
        trying to open library...       ok
        finding libdevice for compute_20...     ok
        finding libdevice for compute_30...     ok
        finding libdevice for compute_35...     ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0         b'GeForce GTX 970'                              [SUPPORTED]
                      compute capability: 5.2
                           pci device id: 0
                              pci bus id: 7
Summary:
        1/1 devices are supported
PASSED
Out[6]: True
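As a final sanity check, any trivial kernel should now compile for compute_52 without raising the NvvmError above. The kernel and names below are just for illustration (a sketch assuming the standard Numba CUDA API of that era, including cuda.grid and copy_to_host):

import numpy as np
from numbapro import cuda

# If libNVVM now accepts -arch=compute_52, this compiles and runs cleanly
@cuda.jit('void(float32[:])')
def double_all(a):
    i = cuda.grid(1)           # absolute thread index
    if i < a.shape[0]:
        a[i] *= 2.0

a = np.arange(16, dtype=np.float32)
d_a = cuda.to_device(a)
double_all[1, 32](d_a)
print(d_a.copy_to_host())      # expect [0, 2, 4, ...]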