The faster this train runs, the bigger the bang once it hits the wall that isn't going anywhere.
We could have accomplished a lot of good with those resources.
Tbf LLMs are pretty incredible accessibility tools.
Speech to text using whisper is almost perfect.
I once worked with someone who wasn’t fluent in English, but could read it really well. He had whisper running during meetings because he could read at the speed it translated, but couldn’t keep up with our casual speech.
Honestly, we didn’t even notice he was using speech to text to keep up with us for a few weeks. We only noticed because of screen sharing.
I imagine that would be a big social boon for deaf people as well.
All that to say, it’s not like there was _no_ good done with all that money.
> Speech to text using whisper is almost perfect.
This isn't true. On benchmarks, Whisper is not SOTA. It's said to be noise-resistant, but it doesn't compare well with Conformer-based architectures even on LibriSpeech mixed with noise. It's definitely not perfect, and it doesn't work for medical transcription.
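For context, the benchmark comparisons mentioned above are usually reported as word error rate (WER): word-level edit distance between the reference and the hypothesis transcript, divided by the reference length. A minimal sketch of that computation (pure Python, hypothetical transcripts):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: Levenshtein distance over words / reference word count."""
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming table: d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            substitution = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(r)][len(h)] / len(r)

# One substituted word out of three -> WER of 1/3
print(wer("the cat sat", "the bat sat"))
```

Published leaderboard numbers (e.g. on LibriSpeech test-clean/test-other) use exactly this metric, which is why small percentage differences between Whisper and Conformer-based systems are directly comparable.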
Not a shadow of what good could have been done.
Any day now you'll barely notice he isn't even there anymore, which seems to be the end goal.
It's not like this mad rush into la-la-land doesn't have negative consequences for society.
It seems the answer in this article is that reasoning models are expensive and are becoming the norm.
Reasoning/chain of thought seems like diminishing returns to me, and I worry it's a bit of a dead end/local optimum. Reasoning models call the underlying language model tens of times, so they're tens of times less efficient, but the quality is not tens of times better. It also feels finicky to me. The bump from GPT-3 to GPT-4 was a reasonably positive shift across the board. The reasoning models produce answers with a different vibe: maybe better overall, but worse at some tasks and better at others. I can use o1 at no additional cost, so I do use it fairly often, but I often consciously opt for 4o, either because I prefer its results or because the quality boost from o1 isn't worth the wait.
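The efficiency claim above is easy to put in back-of-envelope terms: with per-token pricing, a model that emits a long hidden chain of thought before the answer costs roughly in proportion to the total tokens generated, not just the answer tokens. A sketch with entirely hypothetical token counts:

```python
def relative_cost(answer_tokens: int, cot_tokens: int = 0) -> float:
    """Generation cost relative to a plain model emitting only the answer.

    Assumes cost scales linearly with tokens generated. Numbers are
    illustrative; real pricing also differs per token between models.
    """
    return (answer_tokens + cot_tokens) / answer_tokens

plain = relative_cost(500)               # baseline: 1.0x
reasoning = relative_cost(500, 10_000)   # 10k hidden reasoning tokens -> 21.0x
print(plain, reasoning)
```

Under these made-up numbers the reasoning model is ~20x more expensive per answer, which is the "tens of times less efficient" intuition; whether the quality gain justifies that multiplier is exactly the trade-off being questioned.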
https://archive.ph/ptVBA
Because everyone is too afraid to be the first to throw in the towel
The "throwing in the towel" that you see out of the market is some of the bigger early players agreeing to get essentially acquihired back into Big Tech. The biggest one that comes to mind: Inflection AI.
I would estimate that the people spending the money have not run out yet.
Some of them can't really run out. Monopoly profit margins guarantee them cash flow that they can reinvest into AI.
My guess is that the people with enough money to invest in AI are too busy stroking their... egos to realize that it's a bubble. They're betting on this because they want to replace all of their human workers while keeping all the profits. Just talking about it is enough to give these people a stiffy. And as we all know, horniness clouds our judgement.
Paywalled. I'm assuming it's some variation on the sunk-cost fallacy.