yolov8_xj3_deploy

Training YOLOv8 and deploying it on the Horizon Sunrise X3 Pi (旭日x3派)

https://github.com/hzbupahaozi/yolov8_xj3_deploy

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (3.9%) to scientific vocabulary
Last synced: 10 months ago

Repository

Training YOLOv8 and deploying it on the Horizon Sunrise X3 Pi (旭日x3派)

Basic Info
  • Host: GitHub
  • Owner: Hzbupahaozi
  • License: agpl-3.0
  • Language: Python
  • Default Branch: main
  • Size: 4.94 MB
Statistics
  • Stars: 20
  • Watchers: 1
  • Forks: 2
  • Open Issues: 4
  • Releases: 0
Created about 3 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License Citation

README.md

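Deploying YOLOv8 on the Horizon Sunrise X3 Pi board

Contents:

  1. Training a modified YOLOv8 on your own dataset
  2. Converting the PyTorch model to ONNX
  3. Converting the ONNX model to BIN
  4. Real-time detection on the Sunrise X3 Pi

With roughly half the parameters of yolov5s, yolov8n matches its accuracy, and it also outperforms yolov6 and yolov7.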

get started

```sh
git clone https://github.com/Hzbupahaozi/yolov8_xj3_deploy.git
cd yolov8_xj3_deploy
pip install -r requirements.txt
python setup.py install
```

train

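The highlight of the Sunrise X3 Pi development board is its BPU, which provides 5 TOPS of int8 compute.\
The BPU's efficiency comes from a network structure it is built around, variable group convolution (VarGNet); see https://arxiv.org/abs/1907.05653 for details.\
The original YOLOv8 network structure is therefore modified, replacing yolov8.yaml with x3pimodelconfig/yolov8-vargnetct.yaml.\
Once a dataset in YOLO format is ready, training can start:

```sh
yolo task=detect mode=train model=./x3pi_config/yolov8n-vargnetct.yaml data=./mydata.yaml batch=32 epochs=80 imgsz=640 workers=8 device=0
```

The training results are shown in the figure below:

x3true_yolov8n_80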
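As a rough illustration of why a group convolution with a fixed number of channels per group (the idea described in the VarGNet paper linked above) keeps a layer light, the sketch below counts convolution weights. The function names and the channel sizes are illustrative assumptions; they are not taken from yolov8n-vargnetct.yaml.

```python
def conv_weight_count(c_in: int, c_out: int, kernel: int, groups: int = 1) -> int:
    """Number of weights in a 2D convolution layer (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return (c_in // groups) * c_out * kernel * kernel


def variable_group_weight_count(c_in: int, c_out: int, kernel: int, group_size: int) -> int:
    """Fix the channels per group; the number of groups then varies with c_in."""
    if c_in % group_size:
        raise ValueError("c_in must be divisible by group_size")
    return conv_weight_count(c_in, c_out, kernel, groups=c_in // group_size)


# Hypothetical 3x3 layer with 64 input and 64 output channels.
dense = conv_weight_count(64, 64, 3)
vargroup = variable_group_weight_count(64, 64, 3, group_size=8)
print(dense, vargroup, dense // vargroup)
```

With 8 channels per group, each output channel reads 8 input channels instead of 64, so the weight count drops by the same factor of 8.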

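Converting the PyTorch model to ONNX

The following is quoted from 三木's blog:\
The model exported by the original YOLOv8 repository puts the decoding of the predicted features inside the model, which causes two problems:

1. A detection model does not need to compute and regress every box. Filtering by confidence discards more than 90% of the boxes, and none of those need any further computation, so I added an argument, x3pi=True, that removes the decoding and related operations.
2. The BPU cannot accelerate the original model's decoding operations well, so they might as well all be handled on the CPU.

Export command:

```sh
yolo export detect model=best.pt format=onnx x3pi=True
```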
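The "filter by confidence first, decode afterwards" idea from the quote can be sketched in plain Python as follows. This is only an illustration: the per-anchor list layout, the function names, and the 0.25 threshold are assumptions, since the exact output layout of the x3pi=True export is not shown here. The box decoding follows YOLOv8's distribution-based regression (16 bins per box side).

```python
import math

REG_MAX = 16  # number of distribution bins per box side in YOLOv8


def dfl_expectation(logits):
    """Softmax over the bins, then the expected bin index (a distance in grid units)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    return sum(i * e for i, e in enumerate(exps)) / sum(exps)


def decode_candidates(cls_logits, box_logits, centers, stride, conf_thres=0.25):
    """Decode only the anchors whose best class score exceeds conf_thres.

    cls_logits: per-anchor lists of class logits (assumed layout)
    box_logits: per-anchor lists of 4 * REG_MAX logits (assumed layout)
    centers:    per-anchor (cx, cy) centers in grid units
    """
    # sigmoid(x) > t is equivalent to x > log(t / (1 - t)),
    # so rejected anchors cost one comparison and no exp().
    logit_thres = math.log(conf_thres / (1.0 - conf_thres))
    detections = []
    for cls, box, (cx, cy) in zip(cls_logits, box_logits, centers):
        best = max(range(len(cls)), key=cls.__getitem__)
        if cls[best] <= logit_thres:
            continue  # the vast majority of anchors stop here
        l, t, r, b = (dfl_expectation(box[k * REG_MAX:(k + 1) * REG_MAX]) for k in range(4))
        score = 1.0 / (1.0 + math.exp(-cls[best]))
        detections.append(((cx - l) * stride, (cy - t) * stride,
                           (cx + r) * stride, (cy + b) * stride, score, best))
    return detections


# Two synthetic anchors: the first is background, the second is a confident class-1 hit
# whose four side distributions all peak at bin 2.
side = [-20.0] * REG_MAX
side[2] = 20.0
dets = decode_candidates(
    cls_logits=[[-5.0, -6.0], [-4.0, 2.0]],
    box_logits=[[0.0] * (4 * REG_MAX), side * 4],
    centers=[(0.5, 0.5), (3.5, 3.5)],
    stride=8,
)
print(dets)
```

Only the second anchor reaches the softmax and box arithmetic, which is the saving the quote describes when most anchors fall below the threshold.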

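Converting ONNX to BIN

This step uses Docker with the Horizon OpenExplorer (天工开物) development kit mounted into the container.\
Tutorial reference: BPU部署走出新手村\
Note that you need to download the latest 2.5.2 container and the matching 2.5.2 OE documentation. The tutorial above uses a version that is too old (which cost me quite a while with no progress).

```sh
docker run -it --rm -v "D:\docker\horizon_xj3_open_explorer_v2.5.2-py38_20230331":/open_explorer -v "D:\docker\BPUCodes":/data/horizon_x3/codes openexplorer/ai_toolchain_ubuntu_20_xj3_cpu:v2.5.2

cd /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/
mkdir ./yolov8/
cd ./yolov8/
```

To keep the model lightweight, the conversion to BIN quantizes the model parameters to int8 through PTQ. This requires a calibration dataset, which is prepared by the script make_calib.py.\
Then copy the relevant files (the ONNX model, the conversion config my_config.yaml, and the calibration dataset folder) into the container:

```sh
docker cp <local file path> <container>:/open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/yolov8/
```

The container can be looked up with:

```sh
docker ps
```

Then the conversion can start:

```sh
# check
hb_mapper checker --model-type onnx \
  --model best.onnx \
  --march bernoulli2

# export
hb_mapper makertbin --config my_config.yaml --model-type onnx
```

This prints a long log: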

```

Node ON Subgraph Type Cosine Similarity Threshold In/Out DataType

HZ_PREPROCESS_FOR_images BPU id(0) HzSQuantizedPreprocess 0.999949 127.000000 int8/int8
Conv_0 BPU id(0) HzSQuantizedConv 0.999869 1.502627 int8/int8
Conv_2 BPU id(0) HzSQuantizedConv 0.999687 17.772688 int8/int8
Conv_4 BPU id(0) HzSQuantizedConv 0.999460 16.475630 int8/int8
Split_6 BPU id(0) Split int8/int8
Conv_7 BPU id(0) HzSQuantizedConv 0.999311 14.938033 int8/int8
Conv_9 BPU id(0) HzSQuantizedConv 0.999322 16.493052 int8/int8
Conv_11 BPU id(0) HzSQuantizedConv 0.999112 19.128515 int8/int8
Conv_13 BPU id(0) HzSQuantizedConv 0.999251 18.809891 int8/int8
UNIT_CONV_FOR_Add_15 BPU id(0) HzSQuantizedConv 0.999442 14.938033 int8/int8
...CONV_FOR_onnx::Concat_444_0.14999_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.24_0.14999_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_16 BPU id(0) Concat 0.999392 14.938033 int8/int8
Conv_17 BPU id(0) HzSQuantizedConv 0.999088 19.048702 int8/int8
Conv_19 BPU id(0) HzSQuantizedConv 0.999166 11.896363 int8/int8
Conv_21 BPU id(0) HzSQuantizedConv 0.999127 13.429939 int8/int8
Split_23 BPU id(0) Split int8/int8
Conv_24 BPU id(0) HzSQuantizedConv 0.996399 13.270452 int8/int8
Conv_26 BPU id(0) HzSQuantizedConv 0.996832 9.714583 int8/int8
Conv_28 BPU id(0) HzSQuantizedConv 0.995260 6.936894 int8/int8
Conv_30 BPU id(0) HzSQuantizedConv 0.996167 5.460835 int8/int8
UNIT_CONV_FOR_Add_32 BPU id(0) HzSQuantizedConv 0.998495 13.270452 int8/int8
Conv_33 BPU id(0) HzSQuantizedConv 0.996459 8.126689 int8/int8
Conv_35 BPU id(0) HzSQuantizedConv 0.996397 5.057761 int8/int8
Conv_37 BPU id(0) HzSQuantizedConv 0.995583 4.751214 int8/int8
Conv_39 BPU id(0) HzSQuantizedConv 0.995153 6.242482 int8/int8
UNIT_CONV_FOR_Add_41 BPU id(0) HzSQuantizedConv 0.998127 8.126689 int8/int8
...CONV_FOR_onnx::Concat_469_0.09842_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.88_0.09842_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.124_0.09842_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_42 BPU id(0) Concat 0.998541 13.270452 int8/int8
Conv_43 BPU id(0) HzSQuantizedConv 0.997992 12.499084 int8/int8
Conv_45 BPU id(0) HzSQuantizedConv 0.999056 6.938571 int8/int8
Conv_47 BPU id(0) HzSQuantizedConv 0.999112 6.857578 int8/int8
Split_49 BPU id(0) Split int8/int8
Conv_50 BPU id(0) HzSQuantizedConv 0.998487 6.594507 int8/int8
Conv_52 BPU id(0) HzSQuantizedConv 0.998260 6.571420 int8/int8
Conv_54 BPU id(0) HzSQuantizedConv 0.997727 4.126614 int8/int8
Conv_56 BPU id(0) HzSQuantizedConv 0.996096 4.693301 int8/int8
UNIT_CONV_FOR_Add_58 BPU id(0) HzSQuantizedConv 0.998729 6.594507 int8/int8
Conv_59 BPU id(0) HzSQuantizedConv 0.997990 6.840492 int8/int8
Conv_61 BPU id(0) HzSQuantizedConv 0.997279 4.396142 int8/int8
Conv_63 BPU id(0) HzSQuantizedConv 0.996712 3.522115 int8/int8
Conv_65 BPU id(0) HzSQuantizedConv 0.995933 4.622054 int8/int8
UNIT_CONV_FOR_Add_67 BPU id(0) HzSQuantizedConv 0.998451 6.840492 int8/int8
...CONV_FOR_onnx::Concat_507_0.05330_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.188_0.05330_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.224_0.05330_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_68 BPU id(0) Concat 0.998793 6.594507 int8/int8
Conv_69 BPU id(0) HzSQuantizedConv 0.998581 6.768975 int8/int8
Conv_71 BPU id(0) HzSQuantizedConv 0.999188 4.436905 int8/int8
Conv_73 BPU id(0) HzSQuantizedConv 0.999031 4.389203 int8/int8
Split_75 BPU id(0) Split int8/int8
Conv_76 BPU id(0) HzSQuantizedConv 0.998980 4.278706 int8/int8
Conv_78 BPU id(0) HzSQuantizedConv 0.998558 5.475333 int8/int8
Conv_80 BPU id(0) HzSQuantizedConv 0.998531 3.166869 int8/int8
Conv_82 BPU id(0) HzSQuantizedConv 0.998872 4.244014 int8/int8
UNIT_CONV_FOR_Add_84 BPU id(0) HzSQuantizedConv 0.999180 4.278706 int8/int8
...CONV_FOR_onnx::Concat_545_0.03857_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.288_0.03857_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_85 BPU id(0) Concat 0.999096 4.278706 int8/int8
Conv_86 BPU id(0) HzSQuantizedConv 0.998955 4.898155 int8/int8
Conv_88 BPU id(0) HzSQuantizedConv 0.998856 4.396799 int8/int8
MaxPool_90 BPU id(0) HzQuantizedMaxPool 0.999222 6.915278 int8/int8
MaxPool_91 BPU id(0) HzQuantizedMaxPool 0.999414 6.915278 int8/int8
MaxPool_92 BPU id(0) HzQuantizedMaxPool 0.999525 6.915278 int8/int8
Concat_93 BPU id(0) Concat 0.999353 6.915278 int8/int8
Conv_94 BPU id(0) HzSQuantizedConv 0.999698 6.915278 int8/int8
ConvTranspose_96 BPU id(0) HzSQuantizedConvTranspose 0.999779 3.459090 int8/int8
UNIT_CONV_FOR_onnx::Conv_538_0.04263_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_98 BPU id(0) Concat 0.999320 5.414181 int8/int8
Conv_99 BPU id(0) HzSQuantizedConv 0.999117 5.414181 int8/int8
Split_101 BPU id(0) Split int8/int8
Conv_102 BPU id(0) HzSQuantizedConv 0.998763 5.000547 int8/int8
Conv_104 BPU id(0) HzSQuantizedConv 0.998287 5.897664 int8/int8
Conv_106 BPU id(0) HzSQuantizedConv 0.997791 4.177810 int8/int8
Conv_108 BPU id(0) HzSQuantizedConv 0.997375 4.131814 int8/int8
...CONV_FOR_onnx::Concat_580_0.03904_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.372_0.03904_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_110 BPU id(0) Concat 0.998700 5.000547 int8/int8
Conv_111 BPU id(0) HzSQuantizedConv 0.998120 4.958530 int8/int8
ConvTranspose_113 BPU id(0) HzSQuantizedConvTranspose 0.998221 5.232376 int8/int8
UNIT_CONV_FOR_onnx::Conv_500_0.05410_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_115 BPU id(0) Concat 0.998037 6.871005 int8/int8
Conv_116 BPU id(0) HzSQuantizedConv 0.997310 6.871005 int8/int8
Split_118 BPU id(0) Split int8/int8
Conv_119 BPU id(0) HzSQuantizedConv 0.996089 5.632505 int8/int8
Conv_121 BPU id(0) HzSQuantizedConv 0.995166 5.500845 int8/int8
Conv_123 BPU id(0) HzSQuantizedConv 0.993983 6.010270 int8/int8
Conv_125 BPU id(0) HzSQuantizedConv 0.993902 7.229157 int8/int8
...CONV_FOR_onnx::Concat_604_0.04773_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.436_0.04773_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_127 BPU id(0) Concat 0.996177 5.632505 int8/int8
Conv_128 BPU id(0) HzSQuantizedConv 0.994840 6.062278 int8/int8
Conv_130 BPU id(0) HzSQuantizedConv 0.996329 7.191900 int8/int8
...R_onnx::ConvTranspose_597_0.05140_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_132 BPU id(0) Concat 0.997258 6.528210 int8/int8
Conv_133 BPU id(0) HzSQuantizedConv 0.996429 6.528210 int8/int8
Split_135 BPU id(0) Split int8/int8
Conv_136 BPU id(0) HzSQuantizedConv 0.994795 6.808836 int8/int8
Conv_138 BPU id(0) HzSQuantizedConv 0.994417 6.371107 int8/int8
Conv_140 BPU id(0) HzSQuantizedConv 0.993589 6.077781 int8/int8
Conv_142 BPU id(0) HzSQuantizedConv 0.993910 6.707007 int8/int8
...CONV_FOR_onnx::Concat_629_0.05270_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.504_0.05270_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_144 BPU id(0) Concat 0.995724 6.808836 int8/int8
Conv_145 BPU id(0) HzSQuantizedConv 0.995444 6.693432 int8/int8
Conv_147 BPU id(0) HzSQuantizedConv 0.993256 6.435788 int8/int8
...R_onnx::ConvTranspose_573_0.03540_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_149 BPU id(0) Concat 0.998485 4.496263 int8/int8
Conv_150 BPU id(0) HzSQuantizedConv 0.997724 4.496263 int8/int8
Split_152 BPU id(0) Split int8/int8
Conv_153 BPU id(0) HzSQuantizedConv 0.997050 4.260736 int8/int8
Conv_155 BPU id(0) HzSQuantizedConv 0.996641 5.090493 int8/int8
Conv_157 BPU id(0) HzSQuantizedConv 0.994862 3.914743 int8/int8
Conv_159 BPU id(0) HzSQuantizedConv 0.992631 4.832358 int8/int8
...CONV_FOR_onnx::Concat_654_0.03318_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
UNIT_CONV_FOR_input.572_0.03318_TO_FUSE_SCALE BPU id(0) HzSQuantizedConv int8/int8
Concat_161 BPU id(0) Concat 0.995999 4.260736 int8/int8
Conv_162 BPU id(0) HzSQuantizedConv 0.995250 4.214143 int8/int8
Conv_164 BPU id(0) HzSQuantizedConv 0.995858 7.191900 int8/int8
Mul_166 BPU id(0) HzLut 0.996182 7.299980 int8/int8
Conv_167 BPU id(0) HzSQuantizedConv 0.997211 7.295052 int8/int8
Mul_169 BPU id(0) HzLut 0.997212 7.081635 int8/int8
Conv_170 BPU id(0) HzSQuantizedConv 0.998338 7.075689 int8/int32
Conv_172 BPU id(0) HzSQuantizedConv 0.995082 7.191900 int8/int8
Mul_174 BPU id(0) HzLut 0.993972 6.291241 int8/int8
Conv_175 BPU id(0) HzSQuantizedConv 0.994323 6.279607 int8/int8
Mul_177 BPU id(0) HzLut 0.994310 10.864396 int8/int8
Conv_178 BPU id(0) HzSQuantizedConv 0.999579 10.864189 int8/int32
Conv_180 BPU id(0) HzSQuantizedConv 0.996984 6.435788 int8/int8
Mul_182 BPU id(0) HzLut 0.996698 6.473208 int8/int8
Conv_183 BPU id(0) HzSQuantizedConv 0.997236 6.463228 int8/int8
Mul_185 BPU id(0) HzLut 0.996391 7.075638 int8/int8
Conv_186 BPU id(0) HzSQuantizedConv 0.998851 7.069662 int8/int32
Conv_188 BPU id(0) HzSQuantizedConv 0.995746 6.435788 int8/int8
Mul_190 BPU id(0) HzLut 0.995045 5.152664 int8/int8
Conv_191 BPU id(0) HzSQuantizedConv 0.995736 5.123033 int8/int8
Mul_193 BPU id(0) HzLut 0.995775 6.424686 int8/int8
Conv_194 BPU id(0) HzSQuantizedConv 0.999827 6.414289 int8/int32
Conv_196 BPU id(0) HzSQuantizedConv 0.994093 3.806277 int8/int8
Mul_198 BPU id(0) HzLut 0.993954 4.152104 int8/int8
Conv_199 BPU id(0) HzSQuantizedConv 0.994331 4.087798 int8/int8
Mul_201 BPU id(0) HzLut 0.994222 4.594001 int8/int8
Conv_202 BPU id(0) HzSQuantizedConv 0.999736 4.548010 int8/int32
Conv_204 BPU id(0) HzSQuantizedConv 0.998809 3.806277 int8/int8
Mul_206 BPU id(0) HzLut 0.998360 3.759366 int8/int8
Conv_207 BPU id(0) HzSQuantizedConv 0.999285 3.673772 int8/int8
Mul_209 BPU id(0) HzLut 0.999141 4.446995 int8/int8
Conv_210 BPU id(0) HzSQuantizedConv 0.999996 4.395507 int8/int32
2023-04-30 12:46:11,222 INFO [Sun Apr 30 12:46:11 2023] End to Horizon NN Model Convert.
2023-04-30 12:46:11,268 INFO start convert to *.bin file....
2023-04-30 12:46:11,299 INFO ONNX model output num : 6
2023-04-30 12:46:11,302 INFO ############# model deps info #############
2023-04-30 12:46:11,303 INFO hb_mapper version : 1.15.5
2023-04-30 12:46:11,303 INFO hbdk version : 3.44.1
2023-04-30 12:46:11,303 INFO hbdk runtime version: 3.15.17.0
2023-04-30 12:46:11,303 INFO horizon_nn version : 0.16.3
2023-04-30 12:46:11,304 INFO ############# model_parameters info #############
2023-04-30 12:46:11,304 INFO onnx_model : /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/myyolov8/best.onnx
2023-04-30 12:46:11,304 INFO BPU march : bernoulli2
2023-04-30 12:46:11,304 INFO layer_out_dump : False
2023-04-30 12:46:11,305 INFO log_level : DEBUG
2023-04-30 12:46:11,305 INFO working dir : /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/myyolov8/modeloutputs
2023-04-30 12:46:11,305 INFO output_model_file_prefix: yolov8_horizon
2023-04-30 12:46:11,305 INFO ############# input_parameters info #############
2023-04-30 12:46:11,305 INFO ------------------------------------------
2023-04-30 12:46:11,306 INFO ---------input info : images ---------
2023-04-30 12:46:11,306 INFO input_name : images
2023-04-30 12:46:11,306 INFO input_type_rt : nv12
2023-04-30 12:46:11,306 INFO input_space&range : regular
2023-04-30 12:46:11,306 INFO input_layout_rt : NHWC
2023-04-30 12:46:11,307 INFO input_type_train : rgb
2023-04-30 12:46:11,307 INFO input_layout_train : NCHW
2023-04-30 12:46:11,307 INFO norm_type : data_scale
2023-04-30 12:46:11,307 INFO input_shape : 1x3x640x640
2023-04-30 12:46:11,307 INFO input_batch : 1
2023-04-30 12:46:11,308 INFO scale_value : 0.003921568627451,
2023-04-30 12:46:11,308 INFO cal_data_dir : /open_explorer/ddk/samples/ai_toolchain/horizon_model_convert_sample/04_detection/myyolov8/calib640f32
2023-04-30 12:46:11,308 INFO cal_data_type : float32
2023-04-30 12:46:11,308 INFO ---------input info : images end -------
2023-04-30 12:46:11,309 INFO ------------------------------------------
2023-04-30 12:46:11,309 INFO ############# calibration_parameters info #############
2023-04-30 12:46:11,309 INFO preprocess_on : False
2023-04-30 12:46:11,309 INFO calibration_type: : default
2023-04-30 12:46:11,309 INFO ############# compiler_parameters info #############
2023-04-30 12:46:11,310 INFO hbdk_pass_through_params: --O3 --core-num 2 --fast
2023-04-30 12:46:11,310 INFO input-source : {'images': 'pyramid', '_default_value': 'ddr'}
2023-04-30 12:46:11,316 INFO Convert to runtime bin file successfully!
2023-04-30 12:46:11,316 INFO End Model Convert
```

As the log shows, every operator runs on the BPU, so inference is accelerated. It also shows that the modifications to the YOLOv8 model work: when I tried the original yolov8n structure, some operators still could not run on the BPU.\
The conversion output is written to modeloutput, and the yolov8_horizon.bin file in it is the model we need.
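Checking a table of this size by eye is tedious, so the two things read off it above (whether every node is on the BPU, and how low the cosine similarity gets) can also be extracted with a small script. This is a minimal sketch that assumes whitespace-separated rows in the column order of the table above; the `NodeRow` and `parse_rows` names are illustrative, and rows for CPU nodes are assumed to use the same layout.

```python
from typing import List, NamedTuple, Optional


class NodeRow(NamedTuple):
    node: str
    device: str
    op_type: str
    cosine: Optional[float]  # Split and fused-scale rows report no similarity
    dtype: str


def parse_rows(text: str) -> List[NodeRow]:
    """Parse 'Node ON Subgraph Type [Cosine Threshold] DataType' rows."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header, timestamps, and anything that is not a node row.
        if len(parts) < 5 or parts[1] not in ("BPU", "CPU"):
            continue
        cosine = float(parts[4]) if len(parts) >= 7 else None
        rows.append(NodeRow(parts[0], parts[1], parts[3], cosine, parts[-1]))
    return rows


# A few rows copied from the conversion log.
SAMPLE = """\
Node ON Subgraph Type Cosine Similarity Threshold In/Out DataType
Conv_157 BPU id(0) HzSQuantizedConv 0.994862 3.914743 int8/int8
Conv_159 BPU id(0) HzSQuantizedConv 0.992631 4.832358 int8/int8
Split_152 BPU id(0) Split int8/int8
Conv_210 BPU id(0) HzSQuantizedConv 0.999996 4.395507 int8/int32
"""

rows = parse_rows(SAMPLE)
all_on_bpu = all(r.device == "BPU" for r in rows)
worst = min((r for r in rows if r.cosine is not None), key=lambda r: r.cosine)
print(all_on_bpu, worst.node, worst.cosine)
```

Feeding the full log text to `parse_rows` instead of `SAMPLE` gives the same two answers for the whole model.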

Final step: running on the board!

First, test how Horizon's preinstalled models yolov5_672x672_nv12.bin and yolov5s_672x672_nv12.bin perform:

```
hrt_model_exec perf \
  --model_file yolov5s_672x672_nv12.bin \
  --model_name="" \
  --core_id=0 \
  --frame_count=500 \
  --perf_time=0 \
  --thread_num=1 \
  --profile_path="."

hrt_model_exec perf \
  --model_file fcos_512x512_nv12.bin \
  --model_name="" \
  --core_id=0 \
  --frame_count=500 \
  --perf_time=0 \
  --thread_num=1 \
  --profile_path="."
```

The results are:

```
Running condition:
  Thread number is: 1
  Frame count is: 500
  Program run time: 36346.591000 ms
Perf result:
  Frame totally latency is: 36325.519531 ms
  Average latency is: 72.651039 ms
  Frame rate is: 13.756448 FPS

Running condition:
  Thread number is: 1
  Frame count is: 500
  Program run time: 36308.297000 ms
Perf result:
  Frame totally latency is: 36286.542969 ms
  Average latency is: 72.573090 ms
  Frame rate is: 13.770957 FPS
```

Then the same test for our model:

```
hrt_model_exec perf \
  --model_file yolov8s-vargnetct-det.bin \
  --model_name="" \
  --core_id=0 \
  --frame_count=500 \
  --perf_time=0 \
  --thread_num=1 \
  --profile_path="."
```

The log is:

```
Running condition:
  Thread number is: 1
  Frame count is: 500
  Program run time: 20303.004000 ms
Perf result:
  Frame totally latency is: 20181.697266 ms
  Average latency is: 40.363396 ms
  Frame rate is: 24.626898 FPS
```

Impressive! The last step is real-time detection with a USB camera, using the script camera.py.\
Lines 72-144 of the script are the key part that parses the model outputs. Because this part runs on the CPU, I mainly used numpy to vectorize the box-regression computation in post-processing, and then used the NMS interface provided by opencv-python to remove redundant duplicate boxes.\
Lines 170-176 first run inference 10 times on a blank image as a warm-up, so that the model runs more stably.\
Lines 199-202 print the time taken by each inference stage, excluding box drawing.\
Running the script prints the following log:

微信图片_20230430171956\
Preprocessing: 7 ms\
Model inference: 44 ms\
Post-processing: 8 ms\
Total: 59 ms
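The figures reported above can be cross-checked with a few lines of arithmetic. The sketch below assumes, because the numbers match, that the perf tool derives the average latency from the total frame latency and the frame rate from the program run time; the function names are illustrative. The end-to-end estimate additionally assumes that the three camera stages run back to back in a single thread.

```python
def average_latency_ms(total_latency_ms: float, frames: int) -> float:
    """Mean per-frame latency from the accumulated frame latency."""
    return total_latency_ms / frames


def frame_rate_fps(program_run_time_ms: float, frames: int) -> float:
    """Throughput over the whole program run, in frames per second."""
    return frames / (program_run_time_ms / 1000.0)


FRAMES = 500

# (program run time in ms, total frame latency in ms) from the perf logs
first_preinstalled = (36346.591, 36325.519531)
our_model = (20303.004, 20181.697266)

base_fps = frame_rate_fps(first_preinstalled[0], FRAMES)
our_fps = frame_rate_fps(our_model[0], FRAMES)
our_latency = average_latency_ms(our_model[1], FRAMES)

# Camera pipeline stages in ms: preprocessing, inference, post-processing
stages = [7, 44, 8]
pipeline_ms = sum(stages)
pipeline_fps = 1000.0 / pipeline_ms

print(f"{base_fps:.2f} FPS -> {our_fps:.2f} FPS ({our_fps / base_fps:.2f}x)")
print(f"average latency {our_latency:.2f} ms, camera pipeline {pipeline_fps:.1f} FPS")
```

Under these assumptions the converted model runs roughly 1.8 times as fast as the first preinstalled model in the perf test, and the camera pipeline as a whole lands at about 17 FPS.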

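All in all this took nearly a month. For a while in the middle I almost gave up on deployment and just ran the model on a PC, but over the last two days I read some more material, put in extra hours, and finally got it done. Thanks to the staff on the Horizon forum, and most of all to 三木, whose triple-Mu/yolov8 this work is based on.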

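A follow-up note: local inference with onnxruntime is now implemented in the inference folder. It includes:

1. Inference with ONNX models that export only the backbone + neck, one trained on the custom dataset and one on the COCO dataset
2. Inference with a normally exported ONNX model

vargnet_backbone+neck ONNX model inference\
origin_backbone+neck ONNX model inference\
As these show, the Sunrise X3 Pi after BPU deployment performs about the same as CPU inference on a laptop.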

Accuracy and speed of the detection algorithms officially implemented by Horizon

image

To Do List

  1. Multithreaded inference in Python
  2. Wrapping Python in C++
  3. C++ inference

Owner

  • Name: hzpahaozi
  • Login: Hzbupahaozi
  • Kind: user
  • Company: scut

Citation (CITATION.cff)

cff-version: 1.2.0
preferred-citation:
  type: software
  message: If you use this software, please cite it as below.
  authors:
  - family-names: Jocher
    given-names: Glenn
    orcid: "https://orcid.org/0000-0001-5950-6979"
  - family-names: Chaurasia
    given-names: Ayush
    orcid: "https://orcid.org/0000-0002-7603-6750"
  - family-names: Qiu
    given-names: Jing
    orcid: "https://orcid.org/0000-0003-3783-7069"
  title: "YOLO by Ultralytics"
  version: 8.0.0
  # doi: 10.5281/zenodo.3908559  # TODO
  date-released: 2023-1-10
  license: AGPL-3.0
  url: "https://github.com/ultralytics/ultralytics"

GitHub Events

Total
  • Issues event: 2
  • Watch event: 5
  • Push event: 1
Last Year
  • Issues event: 2
  • Watch event: 5
  • Push event: 1