Very impressive! Do you have a benchmark to test the reliability? A paper would be an awesome way to contribute to the science.


I understand, no idea how to do it. I heard about SWE‑Bench‑Lite, which seems to focus on real-world usage. Maybe try to contact “AI Explained” on YT, he’s the best IMO. Your solution might be novel or not, but he might help you figure that out. If it is indeed novel, it might be worth sharing with the larger community. Of course, I totally get that you might not want to do any of that. Thank you for your work!