Inference tool promises higher performance
AI hardware startup Cerebras has created a new AI inference solution that could potentially rival Nvidia’s GPU offerings for enterprises. The Cerebras Inference tool is based on the company’s Wafer-Scale Engine and promises to deliver staggering performance. According to sources, the tool has achieved speeds of 1,800 tokens per second for Llama 3.1 8B, and […]