Image Segmentation with PaliGemma 2 Mix and MLX

In this post, we explore Google’s PaliGemma 2 mix vision-language model (VLM) and its ability to perform image segmentation. What makes this interesting is that we perform the task using only Apple’s MLX framework and MLX-VLM. This removes the JAX/Flax dependency of Google’s original segmentation script and lets us fully and seamlessly utilise Apple’s unified memory. The Medium post can be found here. A minimal sketch of driving the model is shown below.
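
The sketch assumes a quantised PaliGemma 2 mix checkpoint on the MLX community hub and the current mlx-vlm API; the checkpoint name and the exact `generate` signature are assumptions that may differ across mlx-vlm versions, and decoding the returned `<seg>` codes into pixel masks (the part that replaces the JAX/Flax script) is covered in the full post.

```python
# Minimal sketch, assuming a quantised PaliGemma 2 mix checkpoint and the
# current mlx-vlm API; both the repo id and the signature may differ.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/paligemma2-3b-mix-448-8bit"  # assumed checkpoint name
model, processor = load(model_path)
config = load_config(model_path)

# PaliGemma segmentation is triggered with a plain "segment <object>" prefix.
prompt = apply_chat_template(processor, config, "segment cat", num_images=1)

# The reply is a string of <loc....> box tokens followed by <seg...> mask
# tokens; turning the <seg...> codes into a pixel mask requires the mask
# decoder discussed in the post.
output = generate(model, processor, prompt, ["cat.jpg"], max_tokens=128)
print(output)
```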

Fine-Tuning a Model for Function-Calling with MLX-LM

In this post, we explore the process of fine-tuning a language model for function calling using MLX-LM. Following the Hugging Face Agents course notebook, we walk through the steps from setting up the environment to training the model with LoRA adapters. The goal is to give the model the ability to intelligently plan and generate function calls, making it a versatile tool for interactive applications. The Medium post can be found here. A sketch of the training-data format appears below.
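
As a flavour of the setup, here is a hedged sketch of one training example in the `{"messages": [...]}` chat format that mlx-lm's LoRA trainer accepts; the tool-call markup is an assumption borrowed from function-calling datasets and should match whatever your base model's chat template expects. Training itself then runs through the `mlx_lm.lora` entry point, as walked through in the post.

```python
# Sketch of a single chat-format training example for mlx-lm's LoRA trainer.
# The tool-call markup below is an assumption; adapt it to your model's
# chat template.
import json

example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant with access to tools."},
        {"role": "user", "content": "What is the weather like in Paris?"},
        {
            "role": "assistant",
            "content": '<tool_call>{"name": "get_weather", "arguments": {"city": "Paris"}}</tool_call>',
        },
    ]
}

# mlx-lm expects train.jsonl / valid.jsonl files inside a data directory.
with open("data/train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```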

Qwen2.5-VL with MLX-VLM

In this post, we present a tutorial on using the Qwen2.5-VL model with MLX-VLM for visual understanding tasks. A minimal sketch of getting started is shown below.
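
The sketch below loads a Qwen2.5-VL checkpoint and asks it to describe an image; the quantised repo id is an assumption, and the `generate` signature may vary between mlx-vlm releases.

```python
# Minimal sketch, assuming a quantised Qwen2.5-VL checkpoint from the MLX
# community hub; the repo id and generate signature may differ by version.
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "mlx-community/Qwen2.5-VL-7B-Instruct-4bit"  # assumed checkpoint name
model, processor = load(model_path)
config = load_config(model_path)

# Wrap the question in the model's chat template, declaring one input image.
prompt = apply_chat_template(processor, config, "Describe this image.", num_images=1)
print(generate(model, processor, prompt, ["photo.jpg"], max_tokens=256))
```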

Fine-Tuning LLMs with LoRA and MLX-LM

This blog post is a tutorial on how to fine-tune an LLM with LoRA and the mlx-lm package. The Medium post can be found here and the Substack version here. A sketch of using the trained adapters follows.
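
As a quick preview of the end result, the hedged sketch below loads a base model together with LoRA adapters produced by an `mlx_lm.lora` training run; the base checkpoint name and adapter directory are placeholders for your own run.

```python
# Sketch: load a base model plus trained LoRA adapters and generate text.
# The repo id and adapter directory are placeholders, not fixed values.
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/Mistral-7B-Instruct-v0.3-4bit",  # assumed base checkpoint
    adapter_path="adapters",  # directory written by the LoRA training run
)

print(generate(model, tokenizer, prompt="Explain LoRA in one sentence.", max_tokens=100))
```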
