moqizhengz 14 hours ago

In conclusion, Google selected 178 relatively easy issues out of their 80K-bug database and found that Gemini 1.5 was fairly good at dealing with machine-detected bugs.

Maybe it's time to build a post-UT automated patch generation CI pipeline?

And I think the other ongoing experiment mentioned in the paper is more interesting: "investigating the ability of an agent to generate bug-reproducing tests".