Thinking Backwards: The "Reversal Blessing" in LLM Multiple-Choice Reasoning

Most modern languages are written from left to right, so we tend to assume that processing information in that same order is the most natural way to handle text written in them. This is especially true for Large Language Models (LLMs), which are typically trained to predict the next word in a sequence and are therefore known as left-to-right (L2R) language models. But what if, for certain tasks, thinking backwards were actually better?
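To make the contrast concrete, here is a toy sketch (not the post’s implementation) of the two scoring directions for a multiple-choice question: an L2R model scores P(answer | question), while a right-to-left (R2L) model, trained on reversed text, would score P(question | answer). `gpt2` stands in for the L2R scorer; no public R2L checkpoint is assumed here.

```python
# Toy sketch: pick the multiple-choice option the model finds most likely.
# gpt2 is only a stand-in for an L2R scorer; an R2L model would instead be
# trained on reversed token sequences (see the comment at the bottom).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sequence_logprob(model, tokenizer, text: str) -> float:
    """Sum of log-probabilities the model assigns to the tokens of `text`."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)  # predicts token t+1
    targets = ids[:, 1:]
    return logprobs.gather(-1, targets.unsqueeze(-1)).sum().item()

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "Which planet is known as the Red Planet?"
options = ["Mars", "Venus", "Jupiter"]

# L2R direction: P(option | question). In practice you would also
# length-normalize, since options can differ in token count.
scores = {o: sequence_logprob(model, tokenizer, f"{question} {o}") for o in options}
print(max(scores, key=scores.get))

# R2L direction: a model trained on reversed text would instead score
# P(question | option) by reading "{option} {question}" back to front.
```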

Image Segmentation with PaliGemma 2 mix, Transformers, Docker, FastAPI, and GitHub Actions

In today’s fast-paced machine learning landscape, deploying AI models is just as important as developing them. In this blog post, we walk through building an image segmentation application powered by Google’s PaliGemma 2 mix model and transformers, containerized with Docker, and served through a FastAPI backend. We also cover a CI/CD pipeline that uses GitHub Actions to automate building the Docker image and pushing it to Docker Hub. Let’s explore the service, why we chose these technologies, and how you can get started and use it yourself!
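As a taste of the serving layer, here is a minimal FastAPI sketch. The `segment_image` helper, endpoint name, and response fields are illustrative placeholders rather than the post’s exact code; the real handler would invoke the transformers-based PaliGemma 2 mix pipeline.

```python
# Minimal serving sketch. `segment_image` is a stub standing in for the
# transformers-based PaliGemma 2 mix inference call.
import io

from fastapi import FastAPI, File, UploadFile
from PIL import Image

app = FastAPI(title="PaliGemma 2 mix segmentation")

def segment_image(image: Image.Image, prompt: str) -> dict:
    # Stubbed so the sketch runs standalone; the real version would run the
    # PaliGemma 2 mix model and decode its segmentation output.
    return {"prompt": prompt, "image_size": image.size, "segments": []}

@app.post("/segment")
async def segment(prompt: str, file: UploadFile = File(...)):
    image = Image.open(io.BytesIO(await file.read())).convert("RGB")
    return segment_image(image, prompt)
```

Run it locally with `uvicorn main:app --host 0.0.0.0 --port 8000`; a typical Dockerfile bakes that same command into the image, and the GitHub Actions workflow rebuilds and pushes the image whenever the repository changes.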

Chat with Qwen3 on your iPhone: A Step-by-Step Guide

Have you ever wanted to run a powerful large language model directly on your iPhone without sending your data to the cloud? Thanks to Apple’s MLX Swift framework, you can now run the remarkably capable Qwen3 models right on your device.

Image Segmentation with PaliGemma 2 Mix and MLX

In this post, we explore Google’s PaliGemma 2 mix vision-language model (VLM) and its ability to perform image segmentation. What’s interesting is that we perform this task using only Apple’s MLX framework and MLX-VLM. This eliminates the dependency on JAX/Flax used in Google’s original segmentation script and allows us to fully and seamlessly utilise Apple’s unified memory. The Medium post can be found here.
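As a flavour of the workflow, here is a minimal sketch using MLX-VLM’s high-level API. The checkpoint name is a hypothetical placeholder, and `generate`’s exact signature varies between mlx-vlm versions.

```python
# Minimal sketch; the checkpoint name is hypothetical and the generate()
# signature may differ across mlx-vlm versions.
from mlx_vlm import load, generate

model, processor = load("mlx-community/paligemma2-10b-mix-448-8bit")

# PaliGemma takes a plain task prefix rather than a chat template:
# "segment <object>" asks the model to emit <locXXXX>/<segXXX> tokens.
prompt = "segment cat\n"
output = generate(model, processor, prompt, image=["cat.jpg"], verbose=False)
print(output)  # location + segmentation tokens, decoded into a mask downstream
```

The raw output is a sequence of location and segmentation tokens; turning them into an actual pixel mask is the decoding step that the post reimplements in MLX rather than JAX/Flax.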

Fine-Tuning a Model for Function-Calling with MLX-LM

In this post, we explore the process of fine-tuning a language model for function-calling using MLX-LM. Following the Hugging Face Agents course notebook, we’ll walk through the steps from setting up the environment to training the model with LoRA adapters. The goal is to equip the model with the ability to intelligently plan and generate function calls, making it a versatile tool for interactive applications. The Medium post can be found here.
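As a sketch of where the fine-tune ends up, here is how the trained LoRA adapters could be loaded back for inference with MLX-LM (training itself can be driven by the `mlx_lm.lora` entry point). The base checkpoint and adapter directory below are hypothetical placeholders.

```python
# Minimal sketch: load a base model plus trained LoRA adapters and ask for
# a function call. Checkpoint and adapter paths are placeholders.
from mlx_lm import load, generate

model, tokenizer = load(
    "mlx-community/gemma-2-2b-it-4bit",  # hypothetical base checkpoint
    adapter_path="adapters",             # directory produced by training
)

prompt = "What is the weather in Paris? Respond with a function call."
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```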
