Incorrect CUDA inference results #13
Comments
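Both the CUDA and the Metal adaptations are stuck on one particular operator. Once that is sorted out, I will comment on this issue and publish a release. |
Just checking in: is there any progress on this? |
I've been a bit busy lately 😂. I'll try to fix it over the National Day holiday. |
Comparing against a --no-gpu run, I have confirmed three places with problems so far: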
|
I did the comparison by uncommenting the code below and printing the values inside the tensors:
|
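The snippet referred to above is not preserved in this thread. As a rough sketch of the same kind of check, assuming the values printed by the --no-gpu run and by the GPU run have been collected into two flat lists of floats, something like the following can locate where the two diverge. The helper name `compare_tensors` and the tolerance `atol` are hypothetical and not part of the project.

```python
import math


def compare_tensors(cpu, gpu, atol=1e-3):
    """Compare two flat lists of floats dumped from a CPU run and a GPU run.

    Hypothetical helper for illustration only. Reports the largest absolute
    difference, the first index that is off by more than atol (or is NaN on
    the GPU side), and how many GPU values are NaN.
    """
    if len(cpu) != len(gpu):
        raise ValueError("tensor dumps have different lengths")

    max_abs_diff = 0.0
    first_bad_index = None
    nan_count = 0

    for i, (a, b) in enumerate(zip(cpu, gpu)):
        if math.isnan(b):
            # A NaN on the GPU side is always a mismatch.
            nan_count += 1
            if first_bad_index is None:
                first_bad_index = i
            continue
        diff = abs(a - b)
        max_abs_diff = max(max_abs_diff, diff)
        if diff > atol and first_bad_index is None:
            first_bad_index = i

    return {
        "max_abs_diff": max_abs_diff,
        "first_bad_index": first_bad_index,
        "nan_count": nan_count,
    }
```

Running this on the output of each operator in turn narrows the problem down to the first operator whose CPU and GPU results disagree.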
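+1 on the third point, it is really nasty. How about writing it by hand? |
I hadn't noticed that norm has a scale, thanks for pointing that out. I'll take another look at im2col and keep investigating. Let's first see whether upstream can fix it, and if that doesn't work out, find a way to write it by hand. |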
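To illustrate why a missing scale changes the output, here is a minimal sketch that assumes the norm in question is a standard layer normalization followed by an elementwise learned weight. That assumption is not confirmed by the thread, and the names `layer_norm`, `layer_norm_scaled`, `weight`, and `eps` are illustrative rather than the project's.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and unit variance (no scale applied)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(v - mean) * inv_std for v in x]


def layer_norm_scaled(x, weight, bias=None, eps=1e-5):
    """Layer norm followed by the elementwise scale (and optional bias).

    Skipping the multiplication by weight gives the plain layer_norm output,
    which differs whenever the weight is not all ones.
    """
    y = layer_norm(x, eps)
    if bias is None:
        bias = [0.0] * len(x)
    return [yi * w + b for yi, w, b in zip(y, weight, bias)]
```

If one backend applies the scale and the other does not, every later layer sees differently scaled activations, so the mismatch shows up from this operator onward.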
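I've opened a PR for the first point. Both the commented-out version and the uncommented version run, so feel free to test both. |
Was it fixed today? After updating today, the results look normal. How was it solved? |
I moved the im2col operator onto the CPU on its own. This loses some speed compared with running everything on the GPU, but im2col now works. However, my own test on a V100 did not pass 😂: softmax on CUDA produced NaN, so I have not published a release. |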
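For reference, im2col unrolls each convolution window into one row so that the convolution becomes a matrix multiplication. The sketch below is a simplified 1D version in plain Python, useful only as a CPU reference to check another implementation against. The function name `im2col_1d` and its row layout are assumptions for illustration and do not match any particular library's memory layout.

```python
def im2col_1d(x, kernel, stride=1, pad=0):
    """Unroll a 1D multi-channel signal into convolution windows.

    x is a list of channels, each a list of equal length L.
    Returns one row per output position; each row holds the kernel-sized
    window of every channel, concatenated (length = channels * kernel).
    Positions that fall into the padding are filled with 0.0.
    """
    length = len(x[0])
    out_len = (length + 2 * pad - kernel) // stride + 1
    cols = []
    for o in range(out_len):
        row = []
        for channel in x:
            for k in range(kernel):
                i = o * stride + k - pad
                row.append(channel[i] if 0 <= i < length else 0.0)
        cols.append(row)
    return cols
```

Multiplying each row by the flattened kernel weights then gives the convolution output at that position.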
CUDA 12.0 tests fine, while the latest CUDA 12.6 produces NaN. Waiting for an upstream fix. |
Audio source: examples/zh.mp3 from the iic/SenseVoiceSmall model.
Converted to WAV with ffmpeg using -ar 16000.
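The conversion step can be sketched as follows. Only the -ar 16000 option comes from the description above; the output file name zh.wav and the helper names `build_ffmpeg_cmd` and `convert` are assumptions for illustration.

```python
import subprocess


def build_ffmpeg_cmd(src, dst, sample_rate=16000):
    """Build the ffmpeg argument list that resamples src to sample_rate Hz."""
    return ["ffmpeg", "-i", src, "-ar", str(sample_rate), dst]


def convert(src, dst, sample_rate=16000):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate), check=True)


# Example (requires ffmpeg on PATH; zh.wav is an assumed output name):
# convert("examples/zh.mp3", "zh.wav")
```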
CPU inference result:
GPU (4080 Super) inference result:
nvidia-smi output: