LLMs for Engineers

LLMs Know More Than What They Say
... and how that provides winning evals
Aug 15, 2024 • Ruby Pai

Which Llama-2 Inference API should I use?
understanding the complete trade-offs of Llama-2 providers
Oct 31, 2023 • Wenzhe Xue

How do I evaluate LLM coding agents? 🧑‍💻
...aka when can I hire an AI software engineer?
Aug 31, 2023 • Arjun Bansal

Llama-2 and the open source LLM 🌊
Anyone can own and run full stack LLM applications like never before
Aug 3, 2023 • Arjun Bansal

Evaluating LLM Agents and Applications
A lot of AI research such as HELM and BigBench has been devoted to building test suites to evaluate the accuracy of large language models.
Jul 11, 2023 • Arjun Bansal

Evolution of LLM Agents
...and how to avert a crisis on further progress!
Jun 21, 2023 • Arjun Bansal and Niklas Nielsen

© 2025 Arjun Bansal