A Job Postings Tool: A Guide to MLX-LM Server and Tool Use with the OpenAI Client

Building intelligent applications that can interact with real-world data requires more than just Large Language Models (LLMs); it requires the ability to call external functions and tools. Tool calling transforms a conversational LLM into an agent that can execute code, query APIs, and perform tasks. In this blog post, we are going to create a job search assistant using the MLX-LM Server, connect it to the OpenAI client, and utilise the Qwen3-8B model's tool-calling abilities. We are going to build a tool that scrapes job postings from DEV.BG, a popular Bulgarian job board, and provides intelligent responses about available positions.
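
As a preview of the pattern, here is a minimal client-side sketch. It assumes an MLX-LM server started with `mlx_lm.server --model mlx-community/Qwen3-8B-4bit` (which exposes an OpenAI-compatible API on port 8080 by default); the `search_jobs` schema is a hypothetical stand-in for the DEV.BG scraper built in the post.

```python
from openai import OpenAI

# Point the OpenAI client at the local MLX-LM server; no real API key is needed.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Hypothetical tool schema standing in for the DEV.BG scraper.
tools = [{
    "type": "function",
    "function": {
        "name": "search_jobs",
        "description": "Search job postings on DEV.BG by keyword.",
        "parameters": {
            "type": "object",
            "properties": {"keyword": {"type": "string"}},
            "required": ["keyword"],
        },
    },
}]

response = client.chat.completions.create(
    model="mlx-community/Qwen3-8B-4bit",  # assumed model id
    messages=[{"role": "user", "content": "Find Python developer jobs."}],
    tools=tools,
)

# When the model decides to use the tool, the reply arrives as a structured
# tool call rather than plain text; we would then run search_jobs and send
# the result back in a follow-up message.
print(response.choices[0].message.tool_calls)
```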

Thinking Backwards: The "Reversal Blessing" in LLM Multiple-Choice Reasoning

Most modern languages are written from left to right, so we tend to assume that thinking from left to right is the most natural way to process information expressed in them. This is particularly true for Large Language Models (LLMs), which are typically trained to predict the next word in a sequence and are therefore known as left-to-right (L2R) language models. But what if, for certain tasks, thinking backwards could actually be better?
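
To make the distinction concrete, both directions are just chain-rule factorizations of the same sequence probability: an L2R model conditions each token on its prefix, while a right-to-left (R2L) model conditions it on its suffix.

```latex
% Chain-rule factorizations of a sequence x_1, ..., x_T:
% left-to-right (next-token prediction) vs. right-to-left (previous-token prediction).
P_{\mathrm{L2R}}(x_1,\dots,x_T) = \prod_{t=1}^{T} P\!\left(x_t \mid x_{<t}\right),
\qquad
P_{\mathrm{R2L}}(x_1,\dots,x_T) = \prod_{t=1}^{T} P\!\left(x_t \mid x_{>t}\right).
```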

Image Segmentation with PaliGemma 2 mix, Transformers, Docker, FastAPI, and GitHub Actions

In today’s fast-paced machine learning landscape, deploying AI models is just as important as developing them. In this blog post, we are going to walk through building an image segmentation application using Google’s PaliGemma 2 Mix model and the Transformers library, containerized with Docker and served through a FastAPI backend. We are also going to discuss the CI/CD pipeline that uses GitHub Actions to automate building the Docker image and pushing it to Docker Hub. Let’s explore this service, why we chose these technologies, and how you can get started and use the service yourself!
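
As a taste of the serving layer, here is a minimal FastAPI sketch; the route and field names are illustrative, and `run_segmentation` is a hypothetical stub standing in for the PaliGemma 2 Mix inference code covered in the post.

```python
import io

from fastapi import FastAPI, File, Form, UploadFile
from PIL import Image

app = FastAPI()

def run_segmentation(image: Image.Image, prompt: str) -> list:
    """Hypothetical stand-in for the PaliGemma 2 Mix + Transformers
    inference code; it would return mask data for the prompted objects."""
    raise NotImplementedError

@app.post("/segment")
async def segment(prompt: str = Form(...), image: UploadFile = File(...)):
    # Decode the uploaded file into a PIL image for the model.
    pil_image = Image.open(io.BytesIO(await image.read())).convert("RGB")
    return {"prompt": prompt, "masks": run_segmentation(pil_image, prompt)}
```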

Chat with Qwen3 on your iPhone: A Step-by-Step Guide

Have you ever wanted to run a powerful large language model directly on your iPhone without sending your data to the cloud? Thanks to Apple’s MLX Swift framework, you can now run the remarkably capable Qwen3 models right on your iPhone.

Image Segmentation with PaliGemma 2 Mix and MLX

In this post, we are going to explore Google’s PaliGemma 2 mix vision-language model (VLM) and its ability to perform image segmentation. What’s interesting is that we are going to perform this task using only Apple’s MLX framework and MLX-VLM. This eliminates the dependency on JAX/Flax used in Google’s original segmentation script, and allows us to fully and seamlessly utilise Apple’s unified memory.
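
For a sense of how little code this takes, here is a minimal sketch using MLX-VLM; the quantized model id is an assumption, and the `generate` signature follows recent mlx-vlm releases, so check the version you have installed.

```python
from mlx_vlm import load, generate

# Assumed quantized checkpoint id under the mlx-community Hub organisation.
model, processor = load("mlx-community/paligemma2-10b-mix-448-8bit")

# PaliGemma uses a plain "segment <object>" prompt for segmentation.
output = generate(
    model,
    processor,
    "segment cat",
    image="cat.jpg",  # path to a local image (assumed)
    max_tokens=256,
)

# The raw output interleaves <locXXXX> box tokens and <segXXX> mask tokens,
# which the post then decodes into an actual pixel mask.
print(output)
```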
