• 0 Posts
  • 79 Comments
Joined 1 year ago
Cake day: April 19th, 2024

    1. The inability to objectively measure model usability outside of meme benchmarks, which made it so easy to hype up models, has come back to bite them now that they actually need to prove GPT-5 has the sauce.
    2. Sam got bullied by Reddit into leaving up the old model for a while longer, so it's not like it's a big lift for them to keep it up. I guess part of it was to prove to investors that they have a sufficiently captive audience that they can push through a massive change like this, but if it gets immediately walked back like this, then I really don’t know what the plan is.
    3. https://progress.openai.com/?prompt=5 Their marketing team made this page comparing how models respond to various prompts. Afaict, GPT-5 uses markdown text formatting more often and consumes noticeably more output tokens. Assuming these are desirable traits, this points at them wanting users to pay more. Aside: the page just proves to me that GPT was funniest in 2021 and it's been worse ever since.





  • FredFig@awful.systems to TechTakes@awful.systems: The rise of Whatever
    English · 16 upvotes · 1 month ago

    This caused me to reconsider something. I had kinda assumed that everything sucks because the bar of quality for software is so low, and that’s pulling it down for every other field now that software has proliferated into eating the world.

    But I hadn’t considered that the relationship could work in both directions. Software sucks some of the time, but that doesn’t excuse shit like how Crowdstrike can still be in business, and we should probably look into what’s caused us to develop the attitude of not caring that shit is shit, just because the shit salesmen told us it’ll be less shit in the future.










  • FredFig@awful.systems to TechTakes@awful.systems: eating our own dogshit
    English · 6 upvotes · 3 months ago

    If you’re referring to genetic algorithms, those work by giving the computer some type of target to gun for that’s easy to measure and then letting the computer go loose with randomly changing bits from the original object. I guess in your mind, that’d be randomly evolving the codebase and then submitting the best version.

    There are a lot of problems with the idea of a genetic codebase that I’m sure you can figure out on your own, but I’ll give you one for free: “better code” is a very hard thing for computers to measure.
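
For anyone who hasn't seen one, here's roughly what that loop looks like. It's a toy sketch, not anybody's real system: the bit-string target, the function names (`fitness`, `mutate`, `evolve`) and all the parameters are made up for illustration, and it only does mutation plus selection, with no crossover like a fuller genetic algorithm would have. The thing to notice is that `fitness` is trivially easy to compute here because the target is known in advance.

```python
import random


def fitness(bits, target):
    """Easy-to-measure score: how many positions already match the target."""
    return sum(b == t for b, t in zip(bits, target))


def mutate(bits, rate, rng):
    """Randomly change bits: flip each one independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def evolve(target, pop_size=30, rate=0.05, max_generations=500, seed=0):
    """Mutation-and-selection loop (hypothetical parameters, no crossover)."""
    rng = random.Random(seed)
    best = [rng.randint(0, 1) for _ in target]  # random starting "object"
    for generation in range(max_generations):
        if fitness(best, target) == len(target):
            return best, generation
        # Make randomly mutated copies of the current best candidate.
        candidates = [mutate(best, rate, rng) for _ in range(pop_size)]
        candidates.append(best)  # keep the parent so the score never drops
        # Selection: keep whichever candidate scores highest.
        best = max(candidates, key=lambda c: fitness(c, target))
    return best, max_generations


if __name__ == "__main__":
    target = [1, 0] * 10
    best, generations = evolve(target)
    print(fitness(best, target), "of", len(target), "bits match after", generations, "generations")
```

Swap the bit string for a codebase and the whole thing hinges on what you'd write inside `fitness`, which is exactly the part nobody knows how to score.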