MCP

Vllm Mlx

OpenAI and Anthropic compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.

Claim this listing

Connect your GitHub to prove you own or maintain this listing. We verify repo access automatically — most publishers are confirmed in seconds.

1Connect GitHub

2Submit your claim

3Auto-verified, or reviewed within 48h