Benchmarks: Running Large Language Models on a 14-inch MacBook Pro M2 Pro

We tested Llama 3.1, Llama 3.2, Mistral, Gemma 1, Gemma 2, Phi-3, and Qwen 2.5 on a 14-inch MacBook Pro M2 Pro with 48GB of RAM. Our goal was to measure how fast each model runs on this machine and to assess its storytelling ability.

We gave each model the same text prompt: write a 500-word story about how AI took John’s job as a French-to-English translator. We evaluated the results for efficiency, emotional depth, and avoidance of melodrama.

Llama 3.2 emerged as the top performer, generating a highly detailed story with strong character development at 64 tokens per second. Gemma 2 and Qwen 2.5 also impressed, delivering compelling stories at respectable speeds.
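To make the tokens-per-second figure concrete, here is a minimal Python sketch of how such a throughput number can be computed. It is an illustration only: the article does not say which tool or runtime produced the measurements, so the `generate` callable, the function names, and the 1.3 tokens-per-word ratio are all assumptions rather than part of the original test setup.

```python
import time
from typing import Callable, Iterable


def tokens_per_second(token_count: int, elapsed_seconds: float) -> float:
    """Throughput = generated tokens / wall-clock seconds."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return token_count / elapsed_seconds


def measure_throughput(generate: Callable[[str], Iterable[str]], prompt: str) -> float:
    """Time a token-yielding callable and return its tokens per second.

    `generate` is a hypothetical stand-in for whatever runtime streams
    tokens from the model; it is not a real library API.
    """
    start = time.perf_counter()
    count = sum(1 for _ in generate(prompt))  # consume the stream, counting tokens
    elapsed = time.perf_counter() - start
    return tokens_per_second(count, elapsed)


def estimated_seconds(word_count: int, rate: float, tokens_per_word: float = 1.3) -> float:
    """Rough time to generate `word_count` words at `rate` tokens per second.

    The default of 1.3 tokens per word is an assumed rule of thumb for
    English text; real ratios depend on the tokenizer.
    """
    return word_count * tokens_per_word / rate
```

Under that assumed ratio, `estimated_seconds(500, 64)` comes out to roughly ten seconds for a 500-word story at the 64 tokens per second reported for Llama 3.2.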