Persona Leaderboard

Here are the rankings for each candidate based on the median of the Burrows Delta and Andrew Model evaluation metrics across 20 questions.

Description                                     Burrows Delta  Andrew Model
Answers v3 (Miso Studio)                        0.5599         0.8726
Answers from chatgpt web app                    0.2991         0.7669
Output from tmpt.me on 2025-06-05               0.5178         0.7843
CL Gradio System Prompt Example 1               0.3468         0.6802
gpt-3.5-turbo                                   0.5983         0.6947
Basic prompt with gpt4o-mini                    0.6289         0.6496
Basic openai system prompt with gpt-4o          0.6342         0.5844
Prompt generated by openai (from original doc)  0.5432         0.5563

Here's the link to the CL Gradio System Prompt Example 1.
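
For readers unfamiliar with the first metric in the table above, here is a minimal sketch of the textbook form of Burrows' Delta: the mean absolute difference between the z-scored relative frequencies of the most frequent words in two texts. This is an illustration only. This page does not say how its Burrows Delta scores were computed or scaled, or whether higher or lower is better, so the function names, the tokenizer, and the `top_n` parameter below are all assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumed, deliberately simple choice)."""
    return re.findall(r"[a-z']+", text.lower())


def relative_frequencies(text, vocabulary):
    """Relative frequency of each vocabulary word in one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero on an empty text
    return [counts[word] / total for word in vocabulary]


def burrows_delta(texts, index_a, index_b, top_n=50):
    """Textbook Burrows' Delta between texts[index_a] and texts[index_b].

    In this classic form, a lower value means the two texts use the
    corpus's most frequent words more alike.
    """
    # Vocabulary: the top_n most frequent words across the whole corpus.
    corpus_counts = Counter()
    for text in texts:
        corpus_counts.update(tokenize(text))
    vocabulary = [word for word, _ in corpus_counts.most_common(top_n)]

    freqs = [relative_frequencies(text, vocabulary) for text in texts]

    differences = []
    for j in range(len(vocabulary)):
        column = [row[j] for row in freqs]
        mean = sum(column) / len(column)
        # Population standard deviation of this word's frequency.
        std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
        if std == 0:
            continue  # word is used identically everywhere; no signal
        z_a = (freqs[index_a][j] - mean) / std
        z_b = (freqs[index_b][j] - mean) / std
        differences.append(abs(z_a - z_b))

    return sum(differences) / len(differences) if differences else 0.0
```

In a persona setting, `texts` would plausibly hold the reference author's writing alongside each candidate's answers, with the Delta taken between the reference and one candidate at a time.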

Methodology

Each model was asked the same 20 questions and then evaluated based on an average of the Burrows Delta and Andrew Model metrics. The questions are:

idx Question
0 I'm a new engineering leader and my team struggles with meeting deadlines. How can I improve our team's effectiveness and ensure we meet our goals?
1 I'm managing a team that lacks psychological safety, and team members are hesitant to share ideas. What strategies can I use to foster a more open and innovative environment?
2 I'm leading a diverse team, and while diversity is beneficial, we're facing coordination challenges. How can I create an inclusive environment that leverages our diversity effectively?
3 I'm a new engineering leader and my team is delivering a lot of code, but I'm not sure if we're focusing on the right things. How can I ensure that we're being effective and not just efficient?
4 As an engineering leader, I'm trying to balance the productivity of my team with the quality of our output. How can I measure and improve both efficiency and effectiveness without sacrificing one for the other?
5 I'm leading a team that has recently transitioned to using microservices and Kubernetes. While we've improved our deployment speed, user feedback indicates no significant improvements in performance or usability. How can I refocus my team to ensure our technical advancements translate into real user benefits?
6 I'm a new engineering leader and I'm struggling to define what effectiveness means for my team. How can I go about establishing a clear definition that aligns with our organization's goals?
7 I'm an engineering leader with some experience, and I'm facing challenges with empowering my team to take ownership of their work. What strategies can I use to foster autonomy and accountability?
8 I'm an experienced engineering leader, and I'm looking to scale the effectiveness of my team across the organization. What are some advanced strategies for expanding our success patterns to larger teams?
9 I'm a new engineering manager and I'm struggling to provide effective feedback to my team. What strategies can I use to improve my feedback skills?
10 I'm an engineering leader trying to create a more inclusive team environment. What actions can I take to ensure all team members feel valued and supported?
11 I'm leading a large engineering team and want to ensure psychological safety. How can I foster an environment where team members feel safe to express their ideas and concerns?
12 I'm a new engineering leader and I've noticed that one of my team members is always the go-to person for a specific module. How can I ensure that this doesn't become a problem for the team?
13 I'm managing a team where one engineer is trying to work on multiple areas but isn't mastering any. How can I help them focus and develop expertise?
14 As an experienced engineering leader, I'm looking to create a more balanced team dynamic. How can I prevent the formation of knowledge silos and ensure that expertise is distributed across the team?
15 I'm a new engineering manager transitioning from an individual contributor role. I'm struggling with letting go of my technical tasks and focusing on people management. What strategies can help me make this transition effectively?
16 As an engineering manager in a large organization, I'm finding it challenging to navigate complex team dynamics and ensure effective communication. What strategies can I use to improve team collaboration and communication?
17 As an experienced engineering manager, I'm looking to refine my management strategy to better balance innovation and stability. How can I effectively assess and manage calculated risks within my team?
18 I'm a new engineering leader and I'm struggling to understand the difference between leadership and management. How can I effectively balance both roles in my team?
19 As an engineering manager, I find it challenging to align my team with organizational priorities while maintaining their motivation. What strategies can I use to achieve this balance?
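
The aggregation step described above can be sketched as follows. This is a minimal illustration, not the code used to produce the leaderboard: the text refers to both a "median" and an "average", so the aggregator is left as a parameter, and the candidate names, the scores, and the choice to sort by Andrew Model in descending order are all made-up assumptions.

```python
from statistics import mean, median


def rank_candidates(scores, aggregate=median, sort_metric="andrew_model"):
    """Collapse per-question scores into one value per metric, then rank.

    scores: {candidate: {metric_name: [one score per question]}}
    Returns a list of (candidate, {metric_name: aggregated_score}),
    sorted by sort_metric from highest to lowest (assumed direction).
    """
    summary = {
        candidate: {
            metric: aggregate(values) for metric, values in metrics.items()
        }
        for candidate, metrics in scores.items()
    }
    return sorted(
        summary.items(),
        key=lambda item: item[1][sort_metric],
        reverse=True,
    )


# Placeholder data: three questions instead of 20, invented values.
example_scores = {
    "candidate_a": {
        "burrows_delta": [0.1, 0.2, 0.9],
        "andrew_model": [0.5, 0.6, 0.7],
    },
    "candidate_b": {
        "burrows_delta": [0.3, 0.3, 0.3],
        "andrew_model": [0.8, 0.9, 0.4],
    },
}

for name, metrics in rank_candidates(example_scores):
    print(name, metrics)
```

Passing `aggregate=mean` switches to the arithmetic mean. The median is less sensitive to a single outlying question, which matters with only 20 samples per candidate.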

Meta-Evaluation

I used OpenAI to judge the strengths and weaknesses of each approach for each question, and then produced a summary of this evaluation. Here are the results:

basic-gpt3.5

  • Strengths: Clear, structured, and straightforward. Provides practical strategies.
  • Weaknesses: Lacks depth, engagement, and a personal touch. Often feels generic.

cl-gradio-pe1

  • Strengths: Conversational and concise. Emphasizes open communication and trust.
  • Weaknesses: Lacks depth and specific strategies. Often feels simplistic.

chatgpt

  • Strengths: Engaging, conversational, and comprehensive. Provides detailed analysis and actionable strategies. Often includes quotes and structured sections.
  • Weaknesses: Can be lengthy and overwhelming for some readers. Informal tone may not resonate with all audiences.

basic-gpt4o-mini

  • Strengths: Professional, methodical, and well-structured. Covers a wide range of strategies.
  • Weaknesses: Lacks a personal touch and can feel formulaic. Sometimes overly detailed.

openai-generated-prompt

  • Strengths: Friendly, approachable, and balanced. Emphasizes clarity and practical advice.
  • Weaknesses: Lacks depth and specific examples. Can feel generic.

tmpt-me

  • Strengths: Concise and direct. Emphasizes strategic planning and communication.
  • Weaknesses: Lacks depth and specific actionable steps. Feels somewhat generic.

basic-gpt4o

  • Strengths: Comprehensive and structured. Covers a wide range of strategies.
  • Weaknesses: Lengthy and formal. Lacks emotional resonance and personal anecdotes.

Tally of Best Models

  • chatgpt was selected as the best model for the majority of questions due to its engaging style, comprehensive approach, and practical strategies. It was chosen for questions related to team effectiveness, psychological safety, inclusivity, team dynamics, balancing productivity and quality, and more.
  • basic-gpt4o was selected as the best model for questions related to transitioning to management and balancing innovation and stability, due to its comprehensive and structured approach.
  • openai-generated-prompt was chosen for the question about balancing leadership and management roles, due to its conversational tone and structured approach.

Overall, chatgpt emerged as the most frequently selected best model, demonstrating its strength in providing detailed, engaging, and practical responses across a variety of leadership and management topics.