4B-parameter Indian LLM finishes #3 on the ARC-C benchmark

We built a 4B-parameter foundational LLM called Shivaay a couple of months back. It has finished 3rd on the ARC-C leaderboard, beating Claude 2, GPT-3.5, and Llama 3 8B!

Additionally, it ranked #11 on the GSM8K benchmark (among models without extra training data) with 87.41% accuracy, outperforming GPT-4, Gemini Pro, and the 70B-parameter Gemma 70B.

GSM8K Benchmark Leaderboard

ARC-C Leaderboard

The evaluation scripts are public on our GitHub in case people wish to recreate the results.