High-performance multimodal inference serving exists for server-class hardware, but there is a significant gap in low-latency *edge* multimodal serving. A developer could focus on porting these server-side serving techniques to run efficiently on mobile hardware or Raspberry Pi-class devices (see the sketch after this listing).
Suggested repo: edge-infer
"Server-grade speed, on your edge device."
Estimated effort: 90h
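
To make the target workload concrete, here is a minimal sketch of the kind of on-device inference path such a project would optimize: running a pre-quantized vision model with ONNX Runtime on a CPU-only device and measuring per-request latency. The model file name, input shape, and thread count are illustrative assumptions, not part of the original idea.

```python
# Sketch only: CPU inference on an edge device (e.g. Raspberry Pi) with a
# hypothetical pre-quantized ONNX vision encoder, timing a single request.
import time
import numpy as np
import onnxruntime as ort

# Cap threads to the small core count typical of edge hardware (assumption).
opts = ort.SessionOptions()
opts.intra_op_num_threads = 4

session = ort.InferenceSession(
    "vision_encoder.int8.onnx",          # placeholder model path
    sess_options=opts,
    providers=["CPUExecutionProvider"],  # no accelerator assumed on-device
)

input_name = session.get_inputs()[0].name
dummy_image = np.random.rand(1, 3, 224, 224).astype(np.float32)

start = time.perf_counter()
outputs = session.run(None, {input_name: dummy_image})
print(f"latency: {(time.perf_counter() - start) * 1000:.1f} ms")
```

The project's value would come from driving that measured latency down on real edge hardware, for example via quantized kernels, operator fusion, or reduced memory traffic, rather than from the serving wrapper itself.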