MLX Community Projects #654
Replies: 16 comments 8 replies
-
Text generation: mlx-tuning-fork
-
Text generation: https://github.com/mzbac/mlx-moe-models
-
An implementation of reinforcement learning algorithms in MLX, based on the implementations from CleanRL. Still a WIP because it's missing benchmarks and some other minor things, but the implementations work correctly.
-
mlx-models. Currently supports vision models by loading/converting from PyTorch checkpoints. Support for text and audio models will be added later.
-
Hi, I would love to add chat-with-mlx. It is a chat UI + RAG implementation on MLX. I will add more features later on (a more advanced RAG pipeline plus multimodal support).
-
I have an example of training a simple language model using BitLinear instead of nn.Linear. It's a port of Karpathy's minGPT to MLX, along with a custom implementation of a BitLinear module: https://github.com/adhulipa/mlx-mingpt I noticed this collection already has the far meatier …
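For readers unfamiliar with the idea, a BitLinear layer replaces full-precision weights with ternary values. Below is a minimal NumPy sketch of a BitNet b1.58-style forward pass (absmean ternary weights, 8-bit absmax activations); it is an illustration of the technique, not the code from the repository above.

```python
import numpy as np

def bitlinear_forward(x, W):
    """Sketch of a BitLinear forward pass (BitNet b1.58-style).
    x: (batch, d_in), W: (d_out, d_in). Illustrative only."""
    # Quantize weights to {-1, 0, +1}; gamma is a per-tensor absmean scale.
    gamma = np.abs(W).mean() + 1e-8
    Wq = np.clip(np.round(W / gamma), -1, 1)
    # Quantize activations to 8 bits with a per-row absmax scale.
    s = np.abs(x).max(axis=-1, keepdims=True) / 127.0 + 1e-8
    xq = np.clip(np.round(x / s), -127, 127)
    # Low-precision matmul, then rescale back to real values.
    return (xq @ Wq.T) * (s * gamma)
```

In training, such a layer typically keeps full-precision shadow weights and applies this quantization only in the forward pass (with a straight-through estimator for gradients).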
-
Transformer Lab (https://github.com/transformerlab/transformerlab-app) is an LLM research platform that lets you run, train, perform RAG with, and evaluate LLMs through a GUI.
-
MLX RAG with GGUF models: https://github.com/Jaykef/mlx-rag-gguf The code builds on https://github.com/vegaluisjose/mlx-rag, optimized to support RAG-based inference with .gguf models. I use BAAI/bge-small-en as the embedding model, TinyLlama-1.1B-Chat-v1.0-GGUF as the base model, and a custom vector-database script to index the text in a PDF file. Inference speeds reach ~413 tokens/sec for prompt processing and ~36 tokens/sec for generation on my 8 GB M2 Air.
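The retrieval step in such a pipeline reduces to a nearest-neighbor lookup over chunk embeddings. A minimal NumPy sketch of that lookup (assuming embeddings already computed by a model such as BAAI/bge-small-en; not the repo's actual code):

```python
import numpy as np

def top_k(query_emb, doc_embs, k=2):
    """Rank document chunks by cosine similarity to the query embedding.
    query_emb: (d,), doc_embs: (n, d). Returns indices and scores."""
    q = query_emb / np.linalg.norm(query_emb)
    d = doc_embs / np.linalg.norm(doc_embs, axis=1, keepdims=True)
    sims = d @ q                    # cosine similarity per chunk
    idx = np.argsort(-sims)[:k]     # indices of the k best chunks
    return idx, sims[idx]
```

The retrieved chunks are then concatenated into the prompt before generation.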
-
@Jaykef Very cool, thanks for sharing
-
Vision: MLX3D, a library for deep learning on 3D data using MLX.
-
JSON-schema-constrained decoding (enabling function calling, including an OpenAI-compatible server with tools) using MLX: https://github.com/otriscon/llm-structured-output
-
Hello! For the text-generation part, I'm happy to share that I proposed and contributed the integration of MLX with LibreChat.ai. You can now use your local MLX-powered LLM through a polished interface, privately. Enjoy! :D See danny-avila/LibreChat#2580 If the community proposes API servers that also support multimodality, transcription, or image generation, I will add them to LibreChat ;) It would also be great to have an LLM API supporting a /models endpoint and multiple models simultaneously :D
-
Hello, MLX community! We are happy to share the first strong sub-4-bit LLM model zoo for the MLX community.
The supported model families include Llama 3/2, Phi-3, Mistral, Yi, and Qwen. An MLX-style inference toolkit for local web chat is also included.
We are an active team here, working to support a better low-bit ecosystem on local platforms. Enjoy!
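As background on how such low-bit weights are typically stored, here is a NumPy sketch of generic per-group 4-bit affine quantization (a common scheme; the zoo above may well use a different, stronger recipe):

```python
import numpy as np

def quantize_4bit(w, group_size=32):
    """Per-group 4-bit affine quantization: each group of weights maps
    to integers 0..15 with its own scale and zero point (generic sketch)."""
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / 15.0 + 1e-12   # 16 levels for 4 bits
    q = np.round((g - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    # Reconstruct approximate weights; error is at most scale/2 per entry.
    return q * scale + lo
```

Sub-4-bit schemes push below this by sharing codebooks across groups or using non-uniform grids, which is where most of the difficulty (and quality loss) lives.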
-
mlx_micrograd: an MLX port of Karpathy's micrograd, a tiny scalar-valued autograd engine with a small PyTorch-like neural network library on top.
Installation: pip install mlx_micrograd
Example usage, showing a number of the supported operations:

from mlx_micrograd.engine import Value
a = Value(-4.0)
b = Value(2.0)
c = a + b
d = a * b + b**3
c += c + 1
c += 1 + c + (-a)
d += d * 2 + (b + a).relu()
d += 3 * d + (b - a).relu()
e = c - d
f = e**2
g = f / 2.0
g += 10.0 / f
print(f'{g.data}') # prints array(24.7041, dtype=float32), the outcome of this forward pass
g.backward()
print(f'{a.grad}') # prints array(138.834, dtype=float32), i.e. the numerical value of dg/da
print(f'{b.grad}') # prints array(645.577, dtype=float32), i.e. the numerical value of dg/db
-
This one is a little stale, but I've taken the approach used for adding LoRA to LLMs and applied it to LLaVA in mlx-examples. It can serve as a starting point for fine-tuning VLMs as datasets like https://huggingface.co/datasets/HuggingFaceM4/the_cauldron become more popular.
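The LoRA trick being carried over here is the same regardless of modality: keep the base weight frozen and learn a low-rank update. A minimal NumPy sketch (shapes and names are illustrative, not the mlx-examples code):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16.0):
    """LoRA forward pass: y = x W^T + (alpha / r) * (x A^T) B^T.
    W (d_out, d_in) stays frozen; only the low-rank pair A (r, d_in)
    and B (d_out, r) is trained, so trainable params scale with r."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T
```

With B initialized to zeros (the standard choice), the adapted layer starts out exactly equal to the frozen base layer, so fine-tuning begins from the pretrained behavior.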
-
Let's collect some cool MLX integrations and community-led projects here for visibility!
If you have a project you would like to feature, leave a comment and we will add it.
Text Generation
Vision
Speech and Audio
Multi-modal
Misc
Educational