LLMs for Engineers
Low-Budget Judge for High-End Hallucination Verdicts
… boosting LLM accuracy by >5% amidst label scarcity and budget constraints.
Nov 21
•
Daniel Omeiza
LLMs Know More Than What They Say
... and how that provides winning evals
Aug 15
•
Ruby Pai
Hybrid Evaluation: Scaling human feedback with custom evaluation models
...how to really get model based evals to work for you
Nov 15, 2023
•
Ansup Babu
and
Arjun Bansal
Which Llama-2 Inference API should I use?
understanding the complete trade-offs of Llama-2 providers
Oct 31, 2023
•
Wenzhe Xue
Ready, Set, Test: Building Evaluation into Your LLM Workflow
... with llmeval
Oct 13, 2023
•
Niklas Nielsen
How do I evaluate LLM coding agents? 🧑💻
...aka when can I hire an AI software engineer?
Aug 31, 2023
•
Arjun Bansal
🕵️🗺️ Where do I deploy Llama-2? 🦙🦙
We share the most cost-efficient way to run Llama-2
Aug 22, 2023
•
Arjun Bansal
Llama-2 and the open source LLM 🌊
Anyone can own and run full stack LLM applications like never before
Aug 3, 2023
•
Arjun Bansal
Evaluating LLM Agents and Applications
Much AI research, such as HELM and BIG-bench, has been devoted to building test suites that evaluate the accuracy of large language models.
Jul 11, 2023
•
Arjun Bansal
Evolution of LLM Agents
...and how to avert a crisis on further progress!
Jun 21, 2023
•
Arjun Bansal
and
Niklas Nielsen
3 ways to improve LLM Agent chains with debugging
Tl;dr: Cost, reliability & accuracy
May 3, 2023
•
Arjun Bansal