MCP
Vllm Mlx
OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
Claim this listing
Connect your GitHub to prove you own or maintain this listing. We verify repo access automatically — most publishers are confirmed in seconds.
1Connect GitHub
2Submit your claim
3Auto-verified, or reviewed within 48h