Evaluation Setup
Cross-compilation on host, verification and benchmarking on Android mobile devices
Benchmarks:
- MobileKernelBench (C++ Kernel Implementation) [New]
Metrics:
- Compilation Success Rate (CSR)
- Pass Rate (functional correctness)
- Performance Speedup (vs native MNN)
- Statistical methodology: Not explicitly reported in the paper
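The three metrics above can be illustrated with a minimal sketch. The paper's exact definitions are not given in this summary, so the sketch rests on assumptions: every rate uses the full task count as its denominator, and speedup is native latency divided by generated-kernel latency. The `KernelResult` record and its field names are hypothetical and not taken from MobileKernelBench.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class KernelResult:
    """Hypothetical per-task record; field names are illustrative only."""
    compiled: bool                        # did cross-compilation succeed?
    passed: bool                          # did on-device output match the reference?
    generated_ms: Optional[float] = None  # latency of the generated kernel
    native_ms: Optional[float] = None     # latency of the native MNN kernel


def speedup(r: KernelResult) -> Optional[float]:
    """Return native/generated latency, or None if the kernel is not comparable."""
    if not (r.compiled and r.passed):
        return None
    if r.generated_ms is None or r.native_ms is None or r.generated_ms <= 0:
        return None
    return r.native_ms / r.generated_ms


def summarize(results: List[KernelResult]) -> dict:
    """Compute the three rates as percentages of all tasks (assumed denominator)."""
    n = len(results)
    if n == 0:
        raise ValueError("no results to summarize")
    compiled = sum(1 for r in results if r.compiled)
    passed = sum(1 for r in results if r.compiled and r.passed)
    speedups = [speedup(r) for r in results]
    faster = sum(1 for s in speedups if s is not None and s > 1.0)
    return {
        "csr": 100.0 * compiled / n,
        "pass_rate": 100.0 * passed / n,
        "faster_than_native": 100.0 * faster / n,
    }


if __name__ == "__main__":
    sample = [
        KernelResult(True, True, generated_ms=2.0, native_ms=4.0),
        KernelResult(True, True, generated_ms=5.0, native_ms=4.0),
        KernelResult(True, False),
        KernelResult(False, False),
    ]
    print(summarize(sample))
```

Counting a kernel toward the pass rate only if it also compiled keeps the metrics nested (faster implies passed implies compiled). A different denominator, such as compiled kernels only, would change the reported percentages.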
Key Results
| Benchmark | Metric | Baseline (%) | This Paper (%) | Δ (pp) |
| --- | --- | --- | --- | --- |
| MobileKernelBench | Compilation Success Rate | 45.3 | 93.7 | +48.4 |
| MobileKernelBench | Kernels with Speedup > Native | 0.0 | 27.4 | +27.4 |
| MobileKernelBench | Performance Parity Rate | 16.3 | Not reported in the paper | Not reported in the paper |
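The Δ column is a difference in percentage points, not a relative improvement. As a quick check of the arithmetic, the sketch below recomputes the deltas from the reported values and derives the baseline compilation failure rate. The function names are illustrative, and results are rounded to one decimal place to match the table's precision.

```python
def delta_pp(baseline: float, ours: float) -> float:
    """Absolute difference in percentage points, rounded to one decimal."""
    return round(ours - baseline, 1)


def failure_rate(success_rate: float) -> float:
    """Complement of a success rate expressed in percent."""
    return round(100.0 - success_rate, 1)


# Values copied from the results table.
csr_delta = delta_pp(45.3, 93.7)      # Compilation Success Rate
faster_delta = delta_pp(0.0, 27.4)    # Kernels with Speedup > Native
baseline_failures = failure_rate(45.3)

print(csr_delta, faster_delta, baseline_failures)
```

The baseline failure rate computed here is what the ">54%" figure in the takeaways refers to.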
Main Takeaways
- Standard LLMs, and even fine-tuned variants, fail to generate compilable mobile kernels more than 54% of the time (baseline CSR of 45.3%) due to API hallucinations.
- MoKA's agentic approach with repository-aware tools effectively bridges the data scarcity gap, achieving a state-of-the-art compilation success rate of 93.7%.
- Although 27.4% of generated kernels run faster than their native MNN counterparts, the majority still do not outperform the highly optimized native library, indicating room for improvement.