Last December, EnterpriseWeb CEO Dave Duggal was in NYC for an event hosted by STAC, the Strategic Technology Analysis Center. STAC is an independent research organization that supports technology discovery and assessment for vertical industries so they can optimize their spend for best outcomes. Through the STAC Benchmark Council, it collaborates with leading industry players to develop benchmarks for objectively evaluating vendor offerings.
Dave was on stage as a panelist for the “Lean AI” session. The interactive discussion, moderated by James Corcoran, Head of AI and Analytics at STAC, also included Nataraj Dasgupta, VP Engineering, Syneos Health; Yi Ding, CEO, AnswerMagic.ai; and Paul Yang, Developer Relations, Runhouse.
Dave challenged the brute-force approach behind conventional Generative AI implementations, as he did in a recent LinkedIn post, “Is Jensen Huang Wrong”. In short, the assumption is that inference accuracy requires more tokens, more tokens incur more latency, and performance then requires more GPUs. It’s a cycle that ties every advance to increasing infrastructure costs. Dave made a strong case for a CPU-first approach to GenAI inference. He described EnterpriseWeb’s low-token, low-latency, energy-efficient methods for optimizing LLM accuracy and eliminating hallucinations so that inferences can be used safely in mission-critical operational processes. Dave elaborates on the topic in this post: Agentic AI vs Agent-based solutions.