[D] SWE-bench: Is there any public list of results on this benchmark? | allainews.com

April 10, 2024, 9:48 p.m. | /u/the_snow_princess

Machine Learning www.reddit.com

I have seen that Devin set the record score on SWE-bench, followed by SWE-agent (an open-source take on Devin). I have also seen that Claude 2 scored around 5%. But what about other projects?
Where can I check these results?

And in general, what is your view on the benchmark? I've seen people say that the tasks are very easy (for humans), which of course doesn't mean machines can handle them well.
But anyway, do you …


