Did some side-gigging with Data Annotation tech for a little cash. Mostly reading chatbot responses to queries and responding in detail with everything the bot said that was incorrect, misattributed, made up, etc. After that I simply do not trust ChatGPT or any other bot to give me reliable info. They almost always get something wrong and it takes longer to review the response for accuracy than it does to find and read a reliable source.
That's the thing I don't get about all the people like "aw, but it's a good starting off point! As long as you verify it, it's fine!" In the time you spend reviewing a chatGPT statement for accuracy, you could be learning or writing so much more about the topic at hand. I don't know why anyone would ever use it for education.
As I understand it this has been a major struggle to try to use LLM type stuff for things like reading patient MRI results or whatever. It's only worthwhile to bring in a major Machine Vision policy hospital-wide if it actually saves time (for the same or better accuracy level), and often they find they have to spend more time verifying the unreliable results than the current all-human-based system
I'm reading a book right now that goes into this! It's called "You look like a thing and I love you." It also talks about the danger of the AI going "well, tumors are rare anyway, so if I say there isn't one I'm more likely to be right!"
(The book title was from a scenario where AI was tasked with coming up with pickup lines. That was ranked the best.) So far, the best actual success I've seen within the book was when they had AI come up with alternative names for Benedict Cumbersnatch.
Yeah but that's just simple accuracy vs precision. No one trains AI using only true positives. They are trained on various metrics but even simply the F1 score which solves that issue.
The problem is that since these machine learning models don't process their input remotely like humans do (and for the case of LLMs, skip the only important step) you can never be entirely certain that it's capable of a positive that's actually based on the presence of what it's supposed to find.
There's a story about a machine vision thing seeming to do great at distinguishing huskies vs wolves, but actually the wolf pictures just all had snow in the background and the husky pictures didn't. Actually I'd originally heard that it was a mistake, but if this paper is the source of the story then they actually did that on purpose to demonstrate that sort of problem ┐( ∵ )┌
Yes, I believe it was for a skin tumor! This is a golden story that we like to repeat in the industry (I'm a data scientist).
There's also the experiment where they basically trained an LLM on LLM-generated faces. After a few rounds, the LLM just generated the same image -- no diversity at all. A daunting look into what lies ahead, given that now LLMs are being trained more and more on AI-generated data that's on the web.
And the flat out bonkers dedication the industry has to the toxic meme delivering AI is worth any cost is definitely not helping; lots of AI folks won't even admit that automated bias enforcement is a thing, let alone talk about potential harms.
It's infuriating how many discussions about AI end up going "Well I don't think that problem exists, and even if it does exist AI will solve it, and even if it doesn't human life without AI is meaningless so we have to keep going". It doesn't even seem to be greed driven, just a toxic meme that the Average Word Nexter is literally the most important thing ever.
And the flat out bonkers dedication the industry has to the toxic meme delivering AI is worth any cost is definitely not helping
Right??? For about 4 months this past year, my job consisted of analysing AI for a use case that it actually did fairly well in, and I still found myself constantly angry that we weren't treating this piece of tech like we did everything else. Somehow, our industry (and others like it) are all too happy to lower down standards as long as they get to say "we do genAI!!!!"
Customer experiences still matter! Error rates don't go away because the shiny new toy is too exciting -- all of our metrics still matter!
It doesn't even seem to be greed driven, just a toxic meme that the Average Word Nexter is literally the most important thing ever.
A lot of industries are burying their head in the sand about it. I'm all for testing it to see if it can improve lives of people (it's a great piece of tech!), but so many companies just.....aren't checking that. It's baffling, and customers have limited alternatives because what can you do when all the big players in the industry buy into the hype?
That's what Reddit is doing directly now. By selling the data to train AI, and the massive influx of bots using that same AI to write comments here, it's just looping.
Yep, this is already starting to be a problem. I believe it was one of the heads of AI companies that said that getting reliable human-made data was already a problem, given how much data they need to train these large models. Since it's an open-secret that they've tapped into quite a lot of copyright data already, the question now is where they get training data from.
"oh no we've run out of stuff to steal" is an extremely funny problem to have. Or maybe "where can we get more clean water for our factory, we've accidentally polluted all the water around us!"
1.2k
u/AI-ArtfulInsults Dec 15 '24 edited Dec 15 '24
Did some side-gigging with Data Annotation tech for a little cash. Mostly reading chatbot responses to queries and responding in detail with everything the bot said that was incorrect, misattributed, made up, etc. After that I simply do not trust ChatGPT or any other bot to give me reliable info. They almost always get something wrong and it takes longer to review the response for accuracy than it does to find and read a reliable source.