python - Is Maxwell architecture supported in Numbapro? -


I want to execute a CUDA kernel in Python using the NumbaPro API. I have this code:

import math
import numpy
from timeit import default_timer as timer
from numbapro import jit, cuda, int32, float32
from matplotlib import pyplot

@cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
def calculate_velocity_field(x, y, u_source, v_source, x_source, y_source, strength_source, n):
    # grid-stride loop: each thread starts at its global index and jumps by the total thread count
    start  = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
    end    = n
    stride = cuda.gridDim.x * cuda.blockDim.x
    for i in range(start, end, stride):
        u_source[i] = strength_source/(2*math.pi) * (x[i]-x_source)/((x[i]-x_source)**2 + (y[i]-y_source)**2)
        v_source[i] = strength_source/(2*math.pi) * (y[i]-y_source)/((x[i]-x_source)**2 + (y[i]-y_source)**2)


def main():
    n = 200                                  # number of points in each direction
    x_start, x_end = -4.0, 4.0               # boundaries in x-direction
    y_start, y_end = -2.0, 2.0               # boundaries in y-direction
    x = numpy.linspace(x_start, x_end, n)    # creates 1D array of x-coordinates
    y = numpy.linspace(y_start, y_end, n)    # creates 1D array of y-coordinates

    x, y = numpy.meshgrid(x, y)              # generates the mesh grid

    # flatten to 1D float32 arrays so the data matches the kernel signature
    x = x.ravel().astype(numpy.float32)
    y = y.ravel().astype(numpy.float32)
    n_points = x.size

    strength_source = 5.0                    # source strength
    x_source, y_source = -1.0, 0.0           # location of the source

    start = timer()

    # calculate grid dimensions
    blocksize = 1024
    gridsize  = int(math.ceil(float(n_points)/blocksize))

    # transfer memory to the device
    x_d        = cuda.to_device(x)
    y_d        = cuda.to_device(y)
    u_source_d = cuda.device_array_like(x)
    v_source_d = cuda.device_array_like(y)

    # launch the kernel
    calculate_velocity_field[gridsize, blocksize](x_d, y_d, u_source_d, v_source_d, x_source, y_source, strength_source, n_points)

    # transfer memory back to the host
    u_source = numpy.empty_like(x)
    v_source = numpy.empty_like(y)
    u_source_d.copy_to_host(u_source)
    v_source_d.copy_to_host(v_source)

    elapsed_time = timer() - start
    print("exec time gpu %f s" % elapsed_time)

if __name__ == "__main__":
    main()
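For reference, the launch arithmetic and the grid-stride loop used in the kernel can be checked on the CPU without any GPU. This is only a sketch in plain Python: `launch_config`, `indices_for_thread` and `covered_indices` are hypothetical helper names, not part of NumbaPro, and the loops merely emulate what each CUDA thread would do.

```python
import math


def launch_config(n, blocksize=1024):
    """Return (gridsize, blocksize) with enough threads to cover n items."""
    gridsize = int(math.ceil(float(n) / blocksize))
    return gridsize, blocksize


def indices_for_thread(block_idx, thread_idx, gridsize, blocksize, n):
    """Indices a single emulated thread visits in a grid-stride loop."""
    start = block_idx * blocksize + thread_idx
    stride = gridsize * blocksize
    return list(range(start, n, stride))


def covered_indices(n, gridsize, blocksize):
    """Collect every index visited by every emulated thread, sorted."""
    visited = []
    for block_idx in range(gridsize):
        for thread_idx in range(blocksize):
            visited.extend(
                indices_for_thread(block_idx, thread_idx, gridsize, blocksize, n)
            )
    return sorted(visited)


if __name__ == "__main__":
    n = 200 * 200  # number of points in a 200 x 200 mesh
    gridsize, blocksize = launch_config(n)
    print("gridsize=%d blocksize=%d" % (gridsize, blocksize))
    # every index should be visited exactly once
    print(covered_indices(n, gridsize, blocksize) == list(range(n)))
```

Because of the stride, the loop still visits every element even when fewer threads than elements are launched; each thread then simply handles more than one index.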

It gives me this error:

NvvmError                                 Traceback (most recent call last)
<ipython-input-17-85e4a6e56a14> in <module>()
----> 1 @cuda.jit('void(float32[:], float32[:], float32[:], float32[:], float32, float32, float32, int32)')
      2 def calculate_velocity_field(x, y, u_source, v_source, x_source, y_source, strength_source, n):
      3     start  = cuda.blockIdx.x * cuda.blockDim.x + cuda.threadIdx.x
      4     end    = n
      5     stride = cuda.gridDim.x * cuda.blockDim.x

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/decorators.py in kernel_jit(func)
     89             # force compilation for the current context
     90             if bind:
---> 91                 kernel.bind()
     92
     93             return kernel

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in bind(self)
    319         Force binding to current CUDA context
    320         """
--> 321         self._func.get()
    322
    323     @property

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
    254         cufunc = self.cache.get(device.id)
    255         if cufunc is None:
--> 256             ptx = self.ptx.get()
    257
    258             # link

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/compiler.py in get(self)
    226             arch = nvvm.get_arch_option(*cc)
    227             ptx = nvvm.llvm_to_ptx(self.llvmir, opt=3, arch=arch,
--> 228                                    **self._extra_options)
    229             self.cache[cc] = ptx
    230             if config.DUMP_ASSEMBLY:

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in llvm_to_ptx(llvmir, **opts)
    420     cu.add_module(llvmir.encode('utf8'))
    421     cu.add_module(libdevice.get())
--> 422     ptx = cu.compile(**opts)
    423     return ptx
    424

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in compile(self, **options)
    211                                           for x in opts])
    212         err = self.driver.nvvmCompileProgram(self._handle, len(opts), c_opts)
--> 213         self._try_error(err, 'Failed to compile\n')
    214
    215         # result

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in _try_error(self, err, msg)
    229
    230     def _try_error(self, err, msg):
--> 231         self.driver.check_error(err, "%s\n%s" % (msg, self.get_log()))
    232
    233     def get_log(self):

~/.anaconda3/lib/python3.4/site-packages/numba/cuda/cudadrv/nvvm.py in check_error(self, error, msg, exit)
    118                 sys.exit(1)
    119             else:
--> 120                 raise exc
    121
    122

NvvmError: Failed to compile

libnvvm : error: -arch=compute_52 is an unsupported option
NVVM_ERROR_INVALID_OPTION
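The informative part of the traceback is its last line, where libNVVM rejects the `-arch=compute_52` option. As a rough sketch, the rejected compute capability can be pulled out of such a message with a regular expression. The function name `unsupported_arch` and the pattern are assumptions made for illustration, based only on the message shown in this post; other libNVVM versions may word the error differently.

```python
import re

# Pattern is an assumption derived from the single error message in this post.
ARCH_PATTERN = re.compile(
    r"-arch=compute_(\d+)(\d)\b.*unsupported option", re.IGNORECASE
)


def unsupported_arch(message):
    """Return (major, minor) of the rejected -arch option, or None if absent."""
    match = ARCH_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    log = "libnvvm : error: -arch=compute_52 unsupported option"
    print(unsupported_arch(log))
```

A result of `(5, 2)` matches the compute capability of the GTX 970, which suggests the compiler library, rather than the kernel code, is what lacks support for the card.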

I tried the NumbaPro examples and the same error occurs. I don't know whether it's a bug in NumbaPro (no support for compute capability 5.2) or a problem with NVIDIA's NVVM. Any suggestions?

In theory it should be supported, but I don't know what is happening.

I'm using Linux with CUDA 7.0 and driver version 346.29.

I finally found a solution here.

  • Solution 1:

conda update cudatoolkit

Fetching package metadata: ....
# All requested packages already installed.
# packages in environment at ~/.anaconda3:
#
cudatoolkit               6.0                          p0

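It looks like updating the CUDA toolkit this way doesn't upgrade it to CUDA 7.0: conda reports that cudatoolkit 6.0 is already installed, and that is the version whose libNVVM rejects -arch=compute_52. A second solution does work:

  • Solution 2: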

conda install -c numba cudatoolkit

Fetching package metadata: ......
Solving package specifications: .
Package plan for installation in environment ~/.anaconda3:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    cudatoolkit-7.0            |                1       190.8 MB

The following packages will be UPDATED:

    cudatoolkit: 6.0-p0 --> 7.0-1

Proceed ([y]/n)? y

Before:

In [4]: check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
    located at ~/.anaconda3/lib/libcublas.so.6.0.37
    trying to open library...   ok
Finding cusparse
    located at ~/.anaconda3/lib/libcusparse.so.6.0.37
    trying to open library...   ok
Finding cufft
    located at ~/.anaconda3/lib/libcufft.so.6.0.37
    trying to open library...   ok
Finding curand
    located at ~/.anaconda3/lib/libcurand.so.6.0.37
    trying to open library...   ok
Finding nvvm
    located at ~/.anaconda3/lib/libnvvm.so.2.0.0
    trying to open library...   ok
    finding libdevice for compute_20... ok
    finding libdevice for compute_30... ok
    finding libdevice for compute_35... ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0      b'GeForce GTX 970'                              [SUPPORTED]
                      compute capability: 5.2
                           pci device id: 0
                              pci bus id: 7
Summary:
    1/1 devices are supported
PASSED
Out[4]: True

After:

In [6]: check_cuda()
------------------------------libraries detection-------------------------------
Finding cublas
    located at ~/.anaconda3/lib/libcublas.so.7.0.28
    trying to open library...   ok
Finding cusparse
    located at ~/.anaconda3/lib/libcusparse.so.7.0.28
    trying to open library...   ok
Finding cufft
    located at ~/.anaconda3/lib/libcufft.so.7.0.35
    trying to open library...   ok
Finding curand
    located at ~/.anaconda3/lib/libcurand.so.7.0.28
    trying to open library...   ok
Finding nvvm
    located at ~/.anaconda3/lib/libnvvm.so.3.0.0
    trying to open library...   ok
    finding libdevice for compute_20... ok
    finding libdevice for compute_30... ok
    finding libdevice for compute_35... ok
-------------------------------hardware detection-------------------------------
Found 1 CUDA devices
id 0      b'GeForce GTX 970'                              [SUPPORTED]
                      compute capability: 5.2
                           pci device id: 0
                              pci bus id: 7
Summary:
    1/1 devices are supported
PASSED
Out[6]: True
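Since both reports say the device is supported, the only visible difference is in the library versions. As a small sketch, the two reports can be compared programmatically by extracting the shared-library versions from their text. `library_versions` and `changed_libraries` are hypothetical helpers, and the regular expression assumes library paths of the form `lib<name>.so.<version>` as they appear in the reports here.

```python
import re

# Matches e.g. "libcublas.so.6.0.37" and captures ("cublas", "6.0.37").
LIB_PATTERN = re.compile(r"lib(\w+)\.so\.(\d+(?:\.\d+)*)")


def library_versions(report):
    """Map library name to version string for every lib<name>.so.<version> found."""
    return {name: version for name, version in LIB_PATTERN.findall(report)}


def changed_libraries(before, after):
    """Return {name: (old_version, new_version)} for libraries whose version differs."""
    old = library_versions(before)
    new = library_versions(after)
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }


if __name__ == "__main__":
    before = """
    ~/.anaconda3/lib/libcublas.so.6.0.37
    ~/.anaconda3/lib/libnvvm.so.2.0.0
    """
    after = """
    ~/.anaconda3/lib/libcublas.so.7.0.28
    ~/.anaconda3/lib/libnvvm.so.3.0.0
    """
    for name, (old, new) in sorted(changed_libraries(before, after).items()):
        print("%s: %s -> %s" % (name, old, new))
```

In the reports above, the relevant change is libnvvm moving from 2.0.0 to 3.0.0, the library that rejected the `compute_52` option before the update.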
