Three Loops, No Ship
转载声明:本文为技术资讯聚合,来源于 DEV Community。本站保存公开 Feed 中提供的摘要/摘录和原文链接,方便读者发现内容,不声称原创。
I spent three iterations on an auto-fix pipeline that still doesn't work reliably. Here's what I learned. Loop 1 Wrote a background script. Pull tickets from Azure DevOps, run them through a local model, hand to a coding agent, push the result. Poll → triage → fix → push. Worked 40% of the time on trivial tickets. Anything that crossed file boundaries or needed real context — stalled or hallucinated. I shipped it any...
原文摘录
I spent three iterations on an auto-fix pipeline that still doesn't work reliably. Here's what I learned. Loop 1 Wrote a background script. Pull tickets from Azure DevOps, run them through a local model, hand to a coding agent, push the result. Poll → triage → fix → push. Worked 40% of the time on trivial tickets. Anything that crossed file boundaries or needed real context — stalled or hallucinated. I shipped it anyway. That was naive. Loop 2 Made it smarter. Pre-selected relevant files. Broke big tickets into sub
tasks. Turned complex edits into atomic steps with verification between each. Got it to 55% or so. But every fix created two new edge cases. The complexity was compounding faster than the reliability. Loop 3 Went all in. Embeddings for dedup. Multi-repo routing. Auto-revert. A learning loop that fed failures back into future runs. The model server started dying. 890 memory errors in a day. Root cause: two independent consumers hitting the same local model server, each with its own retry loop. When memory filled up,
retries amplified instead of staggering. The system was making itself worse. Fixes were simple in hindsight — stop retrying OOM, serialize access, use the local binary not npx. But the pattern kept repeating: add more to fix the last thing, break something else. Where I'm At The pipeline still only works on easy tickets. Hard ones need a human. After three rounds, the main thing I learned is that local models hit a wall before your ambition does — not in quality, in working memory. And adding features doesn't fix r
eliability gaps. It just moves them around. The 507 retry spiral taught me more than any successful deploy this year. Because it was entirely my fault. Not the model's, not the framework's. I built concurrent consumers with independent retry loops and expected them to coordinate. They didn't. What's Next I'll do a fourth loop. Smaller. A dedicated fast model for cheap work, the big model only for editing. One consumer at a time. Might work. Might be loop 5's prologue. I'm looking for people building similar things.
Local agent pipelines, auto-fix loops, small-model orchestration — the stuff that's not quite working yet but you keep iterating on. No Slack. No Discord. No newsletter. Just people who build this stuff and want to compare notes. What media would you gravitate around? A private GitHub org? A Telegram group? Occasional calls? Reply or find me — curious what works. Failure post, not a success story. If you're building something similar — don't retry OOM, serialize your consumers, and measure what your model server ca
n actually hold.
版权归原作者及原站点所有,如原站点不希望被聚合,请联系本站删除。
来源 Feed:DEV Community