Summary:
In this episode of Sidecar Sync, Amith Nagarajan and Mallory Mejias dive into two groundbreaking releases from OpenAI: the long-awaited GPT-5 and the surprising return to open source with gpt-oss. They unpack the shift toward simplified AI interaction, the implications of automatic model selection, and what GPT-5’s real-world performance means for power users. Then they explore how the new OSS models may radically reshape how associations build and deploy custom AI solutions. Plus: updates from ASAE Annual, thoughts on AI hallucinations, and why association leaders should start thinking more like engineers.
Timestamps:
06:44 - What’s New with GPT-5
08:38 - Model Routing & “Think Harder” Mode
14:00 - Voice Mode Wins, But Is It Smarter?
19:15 - GPT-5 API Pricing & Market Strategy
22:00 - Strategic Takeaways for Associations
28:37 - OpenAI Goes Open Source
35:08 - Open Source Use Cases & Tradeoffs
40:32 - Picking the Right Model for the Job
47:08 - Final Thoughts & Experimenting with AI
🎉 Thank you to our sponsor
📅 Find out more about digitalNow 2025 and register now:
https://digitalnow.sidecar.ai/
🤖 Join the AI Mastermind:
https://sidecar.ai/association-ai-mas...
🔎 Check out Sidecar's AI Learning Hub and get your Association AI Professional (AAiP) certification:
📕 Download ‘Ascend 2nd Edition: Unlocking the Power of AI for Associations’ for FREE
🛠 AI Tools and Resources Mentioned in This Episode:
ChatGPT (GPT-5) ➡ https://chat.openai.com
Claude 4.1 Opus ➡ https://www.anthropic.com
Soundpost ➡ https://www.soundpost.ai
LM Studio ➡ https://lmstudio.ai
Hugging Face ➡ https://huggingface.co
Qwen3 ➡ https://qwenlm.github.io/blog/qwen3/
Kimi K2 by Moonshot AI ➡ https://www.moonshot.cn
OpenRouter ➡ https://openrouter.ai
https://www.linkedin.com/company/sidecar-global
https://twitter.com/sidecarglobal
https://www.youtube.com/@SidecarSync
⚙️ Other Resources from Sidecar:
- Sidecar Blog
- Sidecar Community
- digitalNow Conference
- Upcoming Webinars and Events
- Association AI Mastermind Group
More about Your Hosts:
Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.
📣 Follow Amith on LinkedIn:
https://linkedin.com/amithnagarajan
Mallory Mejias is passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.
📣 Follow Mallory on Linkedin:
https://linkedin.com/mallorymejias
Read the Transcript
🤖 Please note this transcript was generated using (you guessed it) AI, so please excuse any errors 🤖
[00:00:00] Amith: Welcome to the Sidecar Sync Podcast, your home for all things innovation, artificial intelligence, and associations.
[00:00:14] Greetings, everybody, and welcome to the Sidecar Sync, your home for content at the intersection of artificial intelligence and the world of associations. My name is Amith Nagarajan.
[00:00:26] Mallory: My name is Mallory Mejias,
[00:00:28] Amith: and we're your hosts. And today we have some exciting topics, as always, uh, at that intersection of AI, and we're gonna contextualize it with the association side of the intersection.
[00:00:39] So very excited to talk all about that. How you doing today, Mallory?
[00:00:43] Mallory: I'm doing pretty well, Amith, I'm not gonna lie. I was excited for this GPT-5 episode. I had seen the teasers, you know, back in July of this year, 2025, that GPT-5 would be released in August. And so I've been looking forward to this convo.
[00:00:57] How have you been?
[00:00:59] Amith: Been [00:01:00] well, uh, got back to New Orleans, which is great. Uh, kids are back in school, which is extra great. And uh, for those of you outside of the southern part of the United States, you're wondering, well, that's kind of weird that you're back in school so soon. It's only mid-August by the time you guys hear this, but, um, it is pretty normal down here.
[00:01:19] I think most of the South starts around this time of August, but the kids do get out a little bit earlier than on the West Coast and the East Coast. When I was growing up in California, it was always, uh, Labor Day to kind of just after Memorial Day; um, you get out in kind of mid-June typically, but these kids get out in late May, usually.
[00:01:37] But anyway, doing great. Uh, it's always nice to have that happen, because then we can kind of get back into a normal routine. But had a great summer up in Utah. Um, I too have been excited to talk about, uh, all the OpenAI topics from last week, uh, GPT-5, of course, and the open source models, and, uh, share my perspectives.
[00:01:57] I've, uh, done quite a bit of work with both sets of those [00:02:00] models, uh, since they came out, and our teams have been digging in as well. So I have some, uh, live feedback that we can share when we get to the right point in the episode. But, uh, it's just an exciting time in AI. You know, there's so much going on in the world of artificial intelligence.
[00:02:14] I also think that associations are really, really, really paying attention to this stuff. Uh, we're seeing that certainly at Sidecar with the growth of our Association AI Professional certification. We've got hundreds of people that have gone through the program already, a few thousand people that are in progress in the program, associations that are signing up as a team
[00:02:34] all the time, which is super exciting, because that results in everyone on their team having access to a great learning experience, um, about AI, obviously, for the association sector. And then also, uh, we had probably, I don't know, 15, maybe 20 people out in Los Angeles, uh, this week. I didn't go myself 'cause I was in transit back home.
[00:02:53] But ASAE had their annual conference, which we always love to go to. They did a fantastic job as [00:03:00] they always do, uh, and put together a great show for, I believe, about 4,000 association, uh, attendees. Uh, I'm not sure if that's the total number or if that's the number of association people, but somewhere in that neighborhood.
[00:03:12] And, uh, we had a good number of folks across the Blue Cypress family, and the main takeaway I heard is, you know, people are doing stuff. They're out there. They're experimenting, they're deploying AI. There's lots and lots of use cases. So, you know, mid-2025 is very different than mid-2024, which is very, very exciting and rewarding for us to hear here at Sidecar.
[00:03:32] Mallory: Absolutely, man. Thousands in progress on the AAiP. I feel like every time I get on LinkedIn I'm seeing not even just one, but a few posts of "just got my AAiP," "just got my AAiP." So that's been really exciting to see. And Amith, have you heard any specifics from teammates at ASAE Annual who led sessions, about how those went, or, I don't know, the kinds of questions people were asking?
[00:03:53] I know it just ended, I think maybe yesterday, so we're pretty fresh off the event, right? Yeah.
[00:03:58] Amith: That's right. Yeah. Uh, a [00:04:00] number of our folks, I think, flew home yesterday, but, uh, yeah, it's wrapped up. I think everyone's, uh, left or going home. But, uh, really positive feedback in terms of session turnout. I know Erica Wrench from, uh, the Sidecar team, who's our CMO, uh, here at Sidecar,
[00:04:14] uh, she led a session on AI, and, uh, I saw some pictures and it looked like there was literally standing room only. Uh, so that is great. And I know there were quite a number of other sessions and talks about AI throughout. So I think, uh, from that perspective, it certainly was the topic that was abuzz all over LA.
[00:04:33] Mallory: And I saw, uh, Andrew Schwartzkrein of Soundpost. He was playing an instrument at his booth. Did you see that on LinkedIn?
[00:04:40] Amith: I did. I was really happy to see that. Yeah, Andrew is a, uh, really talented musician and, uh, technician. So it's interesting how sometimes you find musicians who are also brilliant software engineers, and, uh, he is both of those things.
[00:04:53] So yeah, he's launched a really cool product, uh, an AI product, uh, that helps accountants ensure that the [00:05:00] numbers in their accounting system, like NetSuite or QuickBooks or Sage, actually tie back to the source systems, which sounds pretty straightforward. You know, you'd think, well, I have Shopify, or I have WooCommerce, or I have, uh, an event registration system like Cvent,
[00:05:16] and the data should, of course, tie to my accounting system. But something so seemingly simple is in fact an extraordinary pain in the butt for most associations, uh, for a lot of reasons. Um, and so he's applied, uh, some good old-fashioned software engineering, plus some really cool uses of AI, to bring that data into the accounting system in a novel way that is really gonna solve this problem.
[00:05:39] So he calls this Soundpost's Bridge product, uh, which is all about, uh, you know, how to make the accounting data rock solid and to simplify that process. So he was, yeah, he was going around the convention center with his violin, getting attention for that. So I think people had some idea what the product was, but, uh, it was pretty cool seeing him do that.
[00:05:59] Mallory: I love [00:06:00] that, man. A musician and a software engineer walk into a bar, and you get Andrew Schwartzkrein of Soundpost. That's right. I love it. We'll link to that too in the show notes, if any of you all have interest in learning more about Soundpost. But today, today's kind of a dedicated OpenAI episode.
[00:06:16] We're gonna be diving into two major releases that represent kind of opposite ends of the AI deployment spectrum. Both dropped in August, this month, 2025. So first, we'll explore GPT-5, OpenAI's flagship model that takes a radically different approach by automatically handling complexity for you. And then we will look at OpenAI's
[00:06:36] surprising return to open source with their gpt-oss models, which can run entirely on your laptop. So first and foremost, GPT-5 was released on August 7th, 2025. Unlike previous iterations that required more careful prompt engineering and model selection, GPT-5 is fundamentally changing how we interact with AI.
[00:06:58] As Ethan Mollick, [00:07:00] AI researcher and professor at the Wharton School, describes it, this is an AI that, quote, "just does stuff." So let's get into some tech specs first. GPT-5 includes four variants: we've got GPT-5, which is the deep reasoning model; GPT-5 Mini, the lightweight and fast option;
[00:07:19] we've got GPT-5 Nano, which is ultra low latency; and GPT-5 Chat, which offers advanced multilingual conversation. The context window, or the amount of text an AI model can remember and process at a given time, supports up to 272,000 tokens for input and 128,000 tokens for output. In practical terms, this means you can feed it the equivalent of a 400-to-500-page document, and it can respond with up to 200 pages of content.
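[Editor's note: as a rough back-of-the-envelope check of those page counts, assuming roughly 1.5 tokens per English word and about 400 words per printed page (our assumptions, not OpenAI's), the stated limits work out to about the figures mentioned above:]

```python
# Rough conversion from token limits to printed-page counts.
# Assumptions (illustrative, not OpenAI's): ~1.5 tokens per English
# word, ~400 words per printed page.

TOKENS_PER_WORD = 1.5
WORDS_PER_PAGE = 400

def tokens_to_pages(tokens: int) -> int:
    """Estimate how many printed pages a token budget covers."""
    words = tokens / TOKENS_PER_WORD
    return round(words / WORDS_PER_PAGE)

print(tokens_to_pages(272_000))  # input window  -> 453 pages
print(tokens_to_pages(128_000))  # output window -> 213 pages
```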
[00:07:49] GPT-5 is also multimodal, so that's integrated text, image, and voice processing in relatively real time, with architecture ready for native video processing as [00:08:00] well. The key innovation here, I would say, is the automatic model selection. So GPT-5 acts as an intelligent router, automatically selecting the appropriate model variant based on task complexity.
[00:08:13] Simple questions get instant responses from lighter models, while complex problems trigger deeper reasoning capabilities. This theoretically removes the burden of model selection from users, though I did read in Ethan Mollick's summary of his experience with GPT-5 that he can't quite figure out how the model decides when to use a more complex model and when to use a smaller one, so that might not be quite as buttoned-up as we think.
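[Editor's note: to make the routing idea concrete, here is our own toy sketch of a complexity-based dispatcher. The scoring heuristic and model names are purely illustrative; OpenAI has not published how GPT-5's real router decides:]

```python
# Toy dispatcher illustrating automatic model routing.
# The heuristic and the tier names are illustrative only; this is
# NOT OpenAI's actual (undisclosed) routing logic.

REASONING_HINTS = ("prove", "analyze", "step by step", "compare", "plan")

def route(prompt: str) -> str:
    """Pick a model tier based on a crude complexity score."""
    lowered = prompt.lower()
    if "think harder" in lowered:          # explicit user escalation
        return "gpt-5-thinking"
    score = len(prompt) // 200             # long prompts look more complex
    score += sum(hint in lowered for hint in REASONING_HINTS)
    if score >= 2:
        return "gpt-5"                     # deep reasoning tier
    if score == 1:
        return "gpt-5-mini"                # lightweight tier
    return "gpt-5-nano"                    # ultra-low-latency tier

print(route("What's the capital of France?"))           # gpt-5-nano
print(route("Compare these two governance models"))     # gpt-5-mini
print(route("Think harder: plan our 2026 AI strategy")) # gpt-5-thinking
```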
[00:08:38] When given a task, GPT-5 goes beyond answering; it anticipates needs and executes. So ask it to create startup ideas, and it might deliver business plans, landing pages, financial projections, and marketing copy without additional prompting. For associations, this proactive approach might transform your project execution.[00:09:00]
[00:09:00] Amith, I know we've got a lot to unpack here with GPT-5. Reception has been somewhat mixed, but before we get into that, I wanna hear what you think is kind of perhaps most impressive about the release of GPT-5.
[00:09:14] Amith: Well, the first thing I wanna say is OpenAI is, by far, the dominant player in terms of volume.
[00:09:20] They have 700 million weekly users. I don't know how that compares to Claude or Gemini or other, uh, tools that are out there, but clearly the largest one out there. They therefore have, uh, the largest challenge, and not just on the AI side. There's the AI side, but there's also the product management side.
[00:09:38] There's the user perspective that comes from having, you know, that diverse, that global of an audience. So you think about it, it's like roughly one out of 10 people on the planet, uh, you know, of all ages, right, are using ChatGPT at least once a week. That's kind of a staggering thing to think about.
[00:09:56] And the speed at which they've gotten to that. They'll probably be at a billion weekly active users. [00:10:00] And I also, at the same time, say that, um, what they've done is, I think, directionally good, probably a little bit abrupt. A lot of people have said that they miss having access to the model selector. So as much as people said, oh, the model selector is such a pain, and most users didn't even bother changing it, and therefore they just got the default, which was previously GPT-4o, uh, which was a very good model,
[00:10:26] um, a lot of people are now saying, well, I really would like to have more control over that. Now, one thing that we have to remember with product feedback, especially from people on Twitter and Reddit and all these other places, is that these are the power users, or the self-anointed, so-called power users.
[00:10:40] And so they probably speak for less than 1%, probably one-tenth of 1%, of the users, and what the average user is experiencing out there with ChatGPT probably won't really be determinable for another month or two, until, you know, the OpenAI folks get significant product feedback, both through usage and perhaps some surveys and research like [00:11:00] that, and I think that will really tell a lot.
[00:11:02] So when you have that wide of a distribution of users, I think directionally what they're doing in making it simpler does make sense. Power users who know what they want know that, oh, I want to invoke a reasoning model, or I want to invoke a research tool, or I want it to do a web search. We've gotten used to that.
[00:11:20] Um, and I do think there's value in having access to those individual tools. Um, but at the same time, my sense of it is that over a period of time, most AI will go this way, because there's this proliferation of models and options and power levels and tools. And the vision they have, which they clearly articulated, was: it's just
[00:11:39] one thing. It's not even a model; it's a system that has multiple models in it, and you don't have to worry about it anymore. You just use the thing and get your work done. And I do think most people will find that very attractive. Now, uh, the flip side of it is they have had some challenges. Uh, part of it is the sheer amount of volume.
[00:11:55] They have people rushing to try it out. Um, it hasn't been quite as [00:12:00] fast as they showed in the demos. Of course, for the demos, you know, I'm sure they allocated a lot of resources. And the more deeply technical people have had some issues with it. But, um, I would say that directionally, I think they're going down the right path over the long run, even though they're getting some bumps and bruises right now from the feedback loop.
[00:12:19] Mallory: Have you tried it out yourself?
[00:12:20] Amith: Yes, I tried it out, uh, pretty much immediately, and, um, I was not particularly impressed with it, but for a different reason. So for me, it wasn't about the automatic model selector and this and that; I think that's great. But for the type of things that I do with it, again, that I would put into this category of more technical or more power-user, uh, realm,
[00:12:40] um, it just wasn't particularly smart. For a lot of the use cases I was throwing at it, it wasn't notably different. Uh, certainly o3 Pro, I think, is a better model than GPT-5, because you don't get access to that level of reasoning by default unless you go to GPT-5 Thinking, and then you have to specifically change [00:13:00] it.
[00:13:00] Um, and then, you know, the regular requests that you put into it, I think it's routing some of those to Mini and some of those to Nano. And those are, you know, going to be kind of very basic responses that you're gonna get from those much, much smaller models. Now, keep in mind, GPT-5 Mini and GPT-5 Nano, I don't know this to be true, but I believe they're probably benchmarking somewhat similar to, like, the first version of GPT-4 or above.
[00:13:24] So it's not like it's a dumb model; it's just not at the level of incredible intellect that, um, has been touted. So I haven't tested a ton as a user in ChatGPT. I'm primarily a Claude user, so ChatGPT to me is more about voice. I am a big ChatGPT user on my mobile phone when I'm taking walks or when I'm driving.
[00:13:45] And I just got done driving for 30 hours back from Utah, and so I had an opportunity to use GPT-5 through voice mode, and it was really good. Um, the new voice capability is significantly better than 4o, so that's definitely worth testing. [00:14:00]
[00:14:00] Mallory: Okay. I haven't tried GPT-5 voice, but I did a little bit of preliminary testing too.
[00:14:06] As you all know, if you've listened to the pod before, I'm also an avid Claude user, so I mainly use ChatGPT for image generation, but I was somewhat impressed. I have read a note online, I think Ethan Mollick said this first, but if you want to trigger that deeper thinking model, you can just put "think harder" kind of over and over in your prompting, and you'll get there.
[00:14:26] So I didn't include that. I had a really simple prompt of, as if we hadn't formed the Sidecar Sync Podcast, like, "Hey, I wanna form an AI podcast for associations. Can you help me think through that?" And I did notice the proactive part. So it, you know, gave me all these ideas, and then at the end said, do you want me to outline the first 10 episodes?
[00:14:42] I said, sure. It outlined the first 10. And it said, do you want me to come up with a play-by-play for the first episode, like a full script? So I appreciated that, but I guess it didn't seem so novel. I haven't tested it out for any blog-writing purposes, so I'm not really sure how the writing compares, [00:15:00] but, ugh, I don't know.
[00:15:01] I'm just such an avid Claude user. It would be hard to pull me away from that.
[00:15:04] Amith: Well, and speaking of Claude for just a second too, the Claude 4 series, uh, which has been out, what, for maybe three months now, um, does also have the same capability, where the reasoning can get invoked by virtue of the nature of the prompt.
[00:15:18] You can ask it to think hard, or apparently there's a keyword that you can use, which is "ultrathink." So when you're talking to Claude, you can say, "ultrathink this," and apparently that puts it into a deeper reasoning mode. I just learned this maybe in the last couple weeks. It's quite funny. But you can see the same thing with Claude,
[00:15:33] and Claude is able to think more, uh, when you tell it to do that. And these are great capabilities. Um, I think this is gonna become pretty standard, where, you know, the model, or the system that houses the model, is kind of aware of the complexity of the prompt and the work involved, and therefore, you know, puts in the necessary amount of resourcing.
[00:15:53] I don't think any of us are gonna be talking about this particular topic in a year, because it'll just be normal that the models [00:16:00] automatically scale up and scale down their compute resources based on the nature of the problems. So, again, this is not a negative point of view. I just think it's exciting for now, and pretty soon it'll be like talking about the fact that you can chat with an AI, which nobody talks about anymore, even though
[00:16:15] a couple years ago that was a pretty big thing.
[00:16:17] Mallory: Pretty big deal. Yep. I mean, do you think that this leap thus far, and mind you, we're recording this on August 13th, so the model was really just released, and I'm sure it's gonna take a few weeks, a few months, for people to really push it to its limits, but do you think that the leap to naming it GPT-5 is warranted?
[00:16:37] Do you feel like thus far it's that impressive?
[00:16:40] Amith: I think as a system, the combination of capabilities definitely warrants a major version release. I think it's fair. Uh, I think that from a developer perspective, it doesn't provide any real significant increase in capability. Um, it is not notably better than the o-series models
[00:16:57] in terms of code generation. I don't [00:17:00] think it's any better than Claude 4.1 Opus in terms of code generation, which is a major use case that we are constantly hammering over here. Um, but for, you know, ordinary day-to-day use, uh, for various business use cases, I think it is, um, going to be a really powerful tool, because of the routing, because of the level of tooling, how they've integrated everything into a single system.
[00:17:23] Um, but, you know, again, is it necessarily a release that's about the technology? A little bit. There's some good things they've done, but more than anything, to me it's about two things. One is simplicity, to make it more accessible to their 700 million, soon to be a billion, people,
[00:17:38] most of whom don't listen to this podcast or others that are digging deep into AI. And, you know, they're using it in a very casual way. So it's really important: if you wanna widely and evenly distribute AI's benefit to as many people as possible, you gotta find a way to make it simple.
[00:17:54] So I admire and applaud them for doing that. Uh, the other part of it is cost. There's two things that are [00:18:00] really important to note. Number one, GPT-5, which is the best model, is available to free users in ChatGPT. So you go from, you know, the majority of those 700 million people having access just to the 4o model previously, which was the only model available in the free tier of ChatGPT, to now having access to GPT-5, which is the latest, most powerful model.
[00:18:20] I don't know if they constrain it in some way in terms of how much thinking time you get or whatever, but they have access to it, and so that's a big deal, because you put hundreds of millions of people onto a far more powerful model. If you compare 4o to 5, it's a pretty big leap. Um, and so that's a big thing, right?
[00:18:37] And so especially as ChatGPT, if their brand continues to be what it is now, they're going to have probably a couple billion users within the next year or two. And so you're talking about more and more of humanity having access to AI for the first time with that level of models. So that is a big thing.
[00:18:54] Of course, the significant benefit accrues to OpenAI if they're able to execute that, because then [00:19:00] they become, you know, even further the leader in the space. The other side of it is API access. So this is something that a lot of business folks haven't really paid much attention to, but the cost of API access for the underlying AI models is worth talking about for just a minute.
[00:19:15] So these models are priced when you access them through the API, which means you as a developer don't wanna use ChatGPT through the window, through the app, through the browser; instead, you're gonna write a program that's gonna talk to the model and have it do something for you, oftentimes through an agentic framework, or perhaps, uh, just writing code that interacts with the model. There's
[00:19:35] tons of use cases for this. And so the models are priced based on what we call millions of input tokens and millions of output tokens, which is what it sounds like. So, um, for every million input tokens versus every million output tokens, usually the output tokens are a little more expensive. Um, but to give you an idea, um, the cost of GPT-5 is way lower than what anyone thought it was gonna be.
[00:19:59] It's [00:20:00] around one-tenth the cost of Claude 4.1 Opus. Um, so that's worth noting, because all of a sudden, GPT-5 is priced, in fact, less expensive than some of their older models that are going out. Now, when AI models come out and there's a new generation, usually a vendor will cut the price of the old model considerably.
[00:20:21] Um, I don't believe that OpenAI did that. Um, in fact, they're just lowering GPT-5 to the point where everybody's like, well, let's just use GPT-5. And GPT-5 Mini is super cheap, and GPT-5 Nano is almost free. It's not free, but it's almost free to use through the API. So that tells you a little bit of what they're trying to do.
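[Editor's note: here is how per-million-token pricing works out in practice. The per-token rates below are illustrative placeholders chosen to show a roughly ten-to-one spread like the one Amith describes; check the vendors' current price lists before budgeting anything:]

```python
# Estimate the dollar cost of one API call under per-million-token
# pricing. The rates below are illustrative placeholders, not quotes.

def call_cost(input_tokens: int, output_tokens: int,
              in_price_per_m: float, out_price_per_m: float) -> float:
    """Cost in dollars for a single request."""
    return (input_tokens / 1_000_000 * in_price_per_m
            + output_tokens / 1_000_000 * out_price_per_m)

# Hypothetical rates: a cheap model at $1.25 in / $10 out per million
# tokens vs. a premium model at $15 in / $75 out per million tokens.
cheap = call_cost(50_000, 5_000, 1.25, 10.00)
premium = call_cost(50_000, 5_000, 15.00, 75.00)
print(f"${cheap:.4f} vs ${premium:.4f}")  # $0.1125 vs $1.1250 -> 10x spread
```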
[00:20:38] So one thing they're trying to do is they wanna be the default model of choice, and they don't want economics to get in the way. The other way to put it is they're trying to use their size to basically clobber everyone else in the space, which is, you know, a pretty well-proven competitive tactic for the leader in a space to try to do.
[00:20:54] So, uh, we'll see how that shakes out. Um, I would say this: their model is still [00:21:00] extremely slow compared to models that are inferencing on platforms like Groq and Cerebras, which are fast inference providers that run on alternate, uh, technology stacks. So on Groq, for example, you can inference models like Qwen 3, uh, which is a very powerful reasoning model that's roughly on par with GPT-4o.
[00:21:19] Um, that model can inference at hundreds and hundreds of tokens per second for very little, like much, much less money than GPT-5. GPT-5 Mini, which is probably comparable, is, uh, one-tenth the speed of Groq's Qwen 3 implementation. So, um, there are still better choices from an API perspective, even though GPT-5 is super cheap,
[00:21:39] uh, partly because of the speed, but partly because the cost, while much lower than Anthropic's, is not really that low. It's like, you can get really, really cheap inference, um, if you want to, through other sources. So I just wanted to mention that, because that is a big, big part of the GPT-5 strategy. Mm-hmm.
[00:21:55] It's basically let's go and own the whole market, is the goal. [00:22:00]
[00:22:00] Mallory: So it sounds like the two takeaways, then, for our association listeners are, one, the simplicity of the GPT-5 system, and then, two, the reduced cost. Is there anything else you think association leaders need to know about this release or need to keep an eye on?
[00:22:14] Amith: I think that people should keep in mind the same things we've talked about in the past: that new releases from vendors like OpenAI, Anthropic, and others, as exciting as they are, don't necessarily change your approach to how to deploy AI as an association. So you still have to think deeply about things like governance.
[00:22:33] Where do you allow your data to go? Just because GPT-5 has all these additional tools for ingesting data and allowing you to do analytics and things like that, do you really wanna do that? Do you really wanna hand your data over to OpenAI, or anyone else for that matter? Or is that something that you want to have a little bit more protection around?
[00:22:51] Um, and the other thing to be asking is, um, what is the best tool to use across the association? Should you necessarily [00:23:00] standardize on a single tool, like OpenAI's ChatGPT or Claude from Anthropic, or should you potentially allow a mixture of tools? Because, like we've talked about in this conversation, um, different users might benefit from different things.
[00:23:11] If you're a big multimedia person, you do a lot of image generation, you wanna interact with the model through voice, ChatGPT still has a pretty significant advantage. I mean, in the area of image generation, Anthropic's Claude, I don't believe, does that at all yet. But, uh, there's other alternatives for that.
[00:23:26] But, you know, having that all in one place is certainly convenient. And voice-mode-wise, I think ChatGPT's mobile app is still, uh, clearly in the lead there. But at the same time, for writing quality and for depth and kind of, you know, uh, conversational interactions, I think Claude has had an advantage and still does.
[00:23:43] Um, it's my belief, and that of many other people in kind of the professional software development world, that Claude Opus 4.1 is still superior to GPT-5 in many, many ways, uh, independent of what the academic benchmarks may say. You know, it's a very, very powerful model for [00:24:00] coding.
[00:24:00] And, you know, if you listened carefully to Anthropic's announcement of Claude 4.1 Opus, which was two days before GPT-5 came out, I think it was on the Monday, um, they said, hey, stay tuned, 'cause we have a lot more coming. And those guys are pretty understated most of the time. So I would keep an eye out on what they're up to.
[00:24:18] So all I'm trying to say is don't let your pendulum swing wildly from one vendor to the next. Have a thought process around what the use cases are in your association, how you're going to extract as much value from this technology as possible, both internally for your staff's processes. Also, how do you enable your association to engage with your members through this technology?
[00:24:39] And so many of those use cases may require improvements to your website, uh, putting in place maybe some custom AI models that, uh, allow your members to interact with you. There's lots of different things you can do. Um, the GPT-5 announcement, as cool as it is, should be thought of really as, like, a milestone on our journey, as opposed to necessarily something that causes you to change [00:25:00] any of that.
[00:25:01] Mallory: That's really helpful. As a kind of fun aside, Amit, I was telling you I went to on a, a girl's trip to New York City. At this point, maybe two weeks ago, and it was really fun and chat, GPT came up a few times in terms of like, oh, let me ask chat, GPT, that, whatever. And I was talking to my friends and I think I sent them a screenshot of something that I asked Claude.
[00:25:21] And it's funny because we're in this bubble and I always assume kind of everybody around me probably knows the same things I do, which is not really a fair assumption. And they said, what is Claude? Like? You don't use chat GPT. And I said, oh. Oh my gosh. I guess you guys don't listen to the Sidecar Sink Podcast, but just a reminder, if you're listening to this, don't assume everybody around you is an avid clawed user too.
[00:25:43] Even just your awareness of that, you know, is important. Keep it in mind.
[00:25:47] Amith: Well, choice leads to at least the opportunity to make different decisions. You may ultimately choose to do the same thing that most people are doing, but it at least creates the path for you to consider alternatives. And I think actually [00:26:00] just the sheer fact that you go through that process makes you smarter about this stuff, because you're having to, you know, put a little more analytical rigor into your decision making.
[00:26:08] Mallory: Mm-hmm. The next part of this episode I'm excited to talk about, because OpenAI is finally open. In a surprising move, perhaps, OpenAI has released two open weight language models, gpt-oss-120b and gpt-oss-20b. Great naming conventions, we love to see. Uh, marking their first significant open source release since GPT-2 in 2019.
[00:26:35] You as associations can run powerful open source AI models entirely on your own hardware. No internet connection, no API costs, no data leaving your servers. Keep in mind, open models aren't new. Of course, we've seen some incredible releases from DeepSeek, Mistral, and the Microsoft Phi family of models, but it's certainly interesting to see OpenAI enter, or technically return to, this open arena.
[00:26:59] So let's [00:27:00] talk tech specs. For the gpt-oss-120b model, it's about 117 billion parameters, and it requires a high-end Nvidia GPU or a powerful Mac to run it. The 20B model is of course lighter weight, and it can run on consumer laptops with about 16 gigabytes or more of memory. As HubSpot's co-founder and CTO Dharmesh Shah excitedly shared on LinkedIn, the entire 120 billion parameter model is just 65 gigabytes, small enough to fit on a $15 USB stick.
[00:27:35] So he's now running it locally on his MacBook Pro, noting how mind-boggling it is that all this capability, writing prose, possessing knowledge across domains, reasoning through complex problems, fits in such a compact set of numbers. In terms of licensing, they're on Apache 2.0, which is free for commercial use, modification, and redistribution.
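A quick back-of-envelope check makes that 65 gigabyte figure concrete (treating a gigabyte as 10^9 bytes, and taking the reported 117 billion parameter count at face value): the weights must be stored at roughly 4.4 bits per parameter, heavily quantized compared to the roughly 234 GB the same model would need in 16-bit floats.

```python
# Back-of-envelope: what storage format does a 65 GB download imply
# for a 117-billion-parameter model? (1 GB taken as 1e9 bytes.)
params = 117e9        # reported parameter count for gpt-oss-120b
size_bytes = 65e9     # reported download size

bits_per_param = size_bytes * 8 / params
print(f"{bits_per_param:.1f} bits per parameter")   # roughly 4.4

# For comparison, the same parameter count in 16-bit floats:
fp16_gb = params * 2 / 1e9
print(f"{fp16_gb:.0f} GB at fp16")                  # roughly 234
```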
[00:27:57] But OpenAI has not released the actual [00:28:00] training data for the models, only the model weights and code. In terms of capabilities, right now they're text only, so no vision and audio, and they support chain of thought reasoning, code execution, and tool use. In terms of performance, we're seeing state-of-the-art on the benchmarks among open models, though interesting to note, hallucination rates are around 49 to 53%, higher than closed models.
[00:28:27] If you want to test this out, you can do an immediate download via Hugging Face and run it completely offline. I would say that's the power of local AI. According to Dharmesh Shah, who I just mentioned from HubSpot, really, this is building on the shoulders of giants. So you as associations can customize powerful AI for your specific needs without relying necessarily on external services.
[00:28:53] Amith, a lot to unpack here. I know you're an avid advocate for open source. Uh, well, let me just hear your [00:29:00] initial take before I go into my list of questions.
[00:29:03] Amith: Well, it's exciting to see OpenAI get back into this game. I think that they've been kind of, uh, giving clues to the market for a number of months that they were heading in this direction, but there wasn't really a definitive, you know, statement about timing or what kind of model they'd be putting out there.
[00:29:18] But a couple things that I think are really important to note: the capabilities of this model in terms of overall performance are very similar to o3-mini and o4-mini, which previously were the state-of-the-art reasoning models. Not quite the top tier, which was o3-pro, but very, very powerful in terms of their overall capabilities.
[00:29:37] The hallucination rate is actually an interesting thing to unpack for a minute, 'cause you think, well, it's hallucinating half the time, that sounds like a pretty unreliable AI. But this AI is not designed to be a knowledgeable AI by itself. So if you go and ask it typical ChatGPT questions, um, you'll be unimpressed, because a good bit of the time it will be wrong.
[00:29:55] This particular set of models is designed for tool calling. [00:30:00] So from an association viewpoint, the question might be, then how does that affect us? The main point I would make isn't so much that you should run out and download these models and use them yourself. Although you can, especially if you have a MacBook Pro like I do. You can actually download either of those models, even run the larger one, and it works quite well on a fairly new MacBook Pro with a good bit of memory.
[00:30:21] Um, the more general and important concept is, I think, twofold. First, the fact that OpenAI is in the open source game tells you a few things. Number one, it tells you how important open source is. Number two, it tells you that they felt they were actually on the wrong side of history in terms of not having open source, open weights models.
[00:30:40] In fact, that's literally what Sam Altman said a few weeks ago: that they're coming down this path because they felt they were on the wrong side of history. 'Cause previously, back in the GPT-3 era, they said that they felt models that were so powerful really couldn't be entrusted in the hands of everyday people, because who knows what they would do with them.
[00:30:59] [00:31:00] Essentially, that was kind of the language used a couple years ago. Um, everyone out there has access to incredibly powerful models. There's the Llama 4 series, which we don't talk about too much. Um, there's the Qwen 3 series, there's DeepSeek, there's Kimi K2 from Moonshot AI, yet another Chinese, uh, lab that released something.
[00:31:20] Mistral just released Mistral Medium 3.1, which is a very powerful model. Uh, so there's a whole bunch of options. And so part of this is just pure market dynamics. If you're not in the open source game, you can't win AI. You have to have a play in the open source game, uh, because open source is going to be really, really good for use cases that don't require the absolute highest end intelligence.
[00:31:44] Um, and in fact, maybe eventually they get there, but I think that's even beside the point. The open source models are somewhere in the 90 to 95% power level of the most powerful frontier AI, and they cost anywhere from 3% to 20% [00:32:00] as much as the frontier models to run. So when something is an order of magnitude cheaper and almost the same in terms of power.
[00:32:09] People are definitely going to be interested in checking that out. So not having a model in that game was a major strategic gap for OpenAI. So that's the number one thing they were filling. Um, there are other American AI companies that are trying to do open source. Obviously there is Meta. They kind of had a whiff earlier this year with the Llama 4 release.
[00:32:29] You haven't heard a ton about it because it's not a great product, and so people aren't really widely adopting it. I suspect they're gonna come back around in the fall sometime and have a Llama 5 that hopefully is dramatically better. Um, but the broader idea that open source is important is really worth taking a moment to think about.
[00:32:47] Open source and open weights basically means that anyone can run these models on any hardware in any country. They can take the model and they can modify it by tuning it. They [00:33:00] can use the model to produce outputs that are then used to train other models, which is an important part of distillation and the whole idea of model compression that's been happening for years.
[00:33:11] A smaller model like the 20B variant that is still extremely powerful, it's a reasoning model with a high level of intellectual capability, it knows how to call tools, it knows how to act as the brain of an agent, is a really exciting option to have available. Now, we've tested OSS 120B and OSS 20B, and they're competent models.
[00:33:29] They're similar to the Qwen models, so Qwen 3, which was released I think in April. It's a model from Alibaba, the Chinese, uh, giant. Um, that Qwen model is very, very good. It's also a reasoning model, and it's been out there for a while. OSS 20B and 120B seem comparable to similar size models or classes of models from Qwen.
[00:33:49] I don't see them as being better, but OpenAI has a horse in the race now. So that's my analysis of it. I'm really excited to see OpenAI in it. Um, but I don't necessarily think there's an [00:34:00] advantage to using OpenAI's stuff over Qwen or some of the other open source options that are out there, at least at the moment.
[00:34:07] Mallory: Mm-hmm. So my biggest question for you, which you kind of answered, was why is OpenAI entering this open arena in general, and why now. And you kind of answered that. You said these open models have perhaps 90 to 95% of the level of power at a lower cost. I'm still struggling to understand, though, like if we looked at GPT-5 nano, which I'm not sure of the exact size of, but I would assume it's quite small.
[00:34:33] Why exactly does having it open source mean that it's a fraction of the cost?
[00:34:38] Amith: Well, so essentially, when you're providing the full model and you're providing inference, so GPT-5 nano or mini, you're still OpenAI providing access to something that only you have access to. So if people want to come through that channel, um, there's an inherent kind of margin accretion that's happening to the provider in that scenario.
[00:34:59] Whereas in the open [00:35:00] source world of, uh, the OSS 20B and 120B models, they're the exact same models. You can get them on Groq, with a Q. You can get them on Cerebras. You can get them through OpenRouter, through which you can get access to dozens of different inference providers. You can inference them on Azure and a whole bunch of other places, right?
[00:35:17] So it becomes more commoditized, and ultimately commoditization of any resource drives cost down. So that gets exciting, because you know what you're gonna get. So you could say, well, yeah, GPT-5 mini is really similar to the Haiku model series from Claude. Um, but it's not exactly the same.
[00:35:35] But OSS 20B, it's gonna be the same everywhere. Assuming that it's not a quantized version, meaning it's the full size model, um, you're gonna get OSS 20B performance exactly the same, whether it's on Groq or Cerebras or Azure, or through any number of other vendors via OpenRouter. So I think there's a lot of opportunity out there to drive costs down, and the further down and the closer we approach zero with a high level of intelligence, the better it is for everyone who's trying [00:36:00] to deploy this stuff.
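To make that "same model everywhere" point concrete, here's a minimal sketch of requesting one open-weights model from different OpenAI-compatible inference providers just by swapping the base URL. The base URLs and the model identifier below are illustrative assumptions; check each provider's documentation for the real values.

```python
# Sketch: the same open-weights model requested from different
# OpenAI-compatible inference providers. Base URLs and model id are
# illustrative; verify them against each provider's docs.
PROVIDERS = {
    "groq":       "https://api.groq.com/openai/v1",
    "cerebras":   "https://api.cerebras.ai/v1",
    "openrouter": "https://openrouter.ai/api/v1",
}

def build_chat_request(provider: str, prompt: str) -> dict:
    """Build a chat-completion request; only the endpoint changes."""
    return {
        "url": PROVIDERS[provider] + "/chat/completions",
        "body": {
            "model": "openai/gpt-oss-20b",  # identical weights everywhere
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# The request body is provider-agnostic; only the URL differs.
a = build_chat_request("groq", "Hello")
b = build_chat_request("cerebras", "Hello")
print(a["body"] == b["body"])  # True
print(a["url"] != b["url"])    # True
```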
[00:36:01] The last thing I would say just really quickly is I do think, um, open source from the country that you reside in is probably worth thinking about in terms of the long term, because these models are increasingly being used for thinking, for decision making, for writing, uh, for providing advice. And so there is something to be thought about in terms of where does the model come from.
[00:36:22] So this isn't about does your data go to China, just to be clear. So if you use the Qwen models from Alibaba and they're running on a US-based cloud, your data is not going to China. That concern is actually kind of silly, and I know that it doesn't sound silly to people who are worried about it, but you can't put a backdoor into the software that's not detected immediately, because there's all sorts of network infrastructure that would detect that.
[00:36:46] And it's open source, so we know exactly what the software is doing in terms of the actual algorithmic approach. The weights, of course, are still somewhat of a mystery to everyone. But we would know if the software was attempting to open a network connection. The [00:37:00] weights can't do that by themselves. They have to be able to call tools in order to do that.
[00:37:03] So there really isn't a concern, if you inference a Chinese model on US hardware, that your data's going back to China. That's not really the legitimate issue. The issue, though, might be that you're worried about, well, what were the values and what was the content used to train the model? And so therefore, when I'm getting advice from a model that was trained in China, did it include Western values?
[00:37:24] Did it include values that I align with? Will it help me make decisions? Will it help me write content? Will it give me advice that I find is good relative to my culture? And so this is where I think there's a lot of opportunity around this idea of taking AI models and making them localized to a country or even a particular region, so that the AI really reflects and honors the value system of that region and the people who live there.
[00:37:52] And that's really important for a consumer application of AI, of course. Um, but I think it's also a geopolitical question as well. That's one of the reasons [00:38:00] you see a lot of people on the political stage making a big, big deal about open source and their country.
[00:38:05] Mallory: I was gonna ask, for our association listeners that are maybe in the process of building out some custom AI solutions, do you recommend, for the power and cost analysis, that they mainly look at open models, uh, to run?
[00:38:20] Or do you think there's value in looking at the closed models via API, I guess, for certain things like decision making?
[00:38:26] Amith: I think there's scenarios where they both make sense. I think by default what most people I know are doing is they're saying, hey, I'm gonna sign up for an OpenAI API account and I'm gonna build my app against their stuff, just because that's what they've heard of and it's the safe bet.
[00:38:41] It's like picking IBM in the 1960s through eighties. You didn't get fired for buying IBM. And so the same thing is kind of where OpenAI is going. So that's great for OpenAI. Um, but I think there's a lot of wasted money, and you get a substandard product a lot of the time. Um, I know some associations that created apps that were based on GPT-3.5 [00:39:00] back when GPT-3.5 was awesome.
[00:39:02] That, by the way, for those that don't recall, was the original model that ChatGPT launched on. So it was a pretty good model. But other models have come along since then that are a lot better. And so my point of view essentially is that you should be using models that suit your use case. And there's a ton of great models available through a lot of different providers that are really fast and secure, and
[00:39:29] ultimately are a better choice for a lot of applications. That being said, having access to this unbelievable frontier of AI through API and paying a little bit extra for it at times could make sense, but think of it as complementary. Um, you can use a variety of different techniques to make it so that your software is agnostic and you can easily switch models and providers.
[00:39:49] That's a really important concept. We've talked about that a little bit on the pod in the past, where you shouldn't directly hard code into whatever you're building that it's only tied [00:40:00] to OpenAI or Anthropic or anybody else for that matter. You should have, uh, essentially a layer of abstraction between the two so that you can more easily switch.
[00:40:07] And there's a number of ways to do that, um, these days very easily.
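A minimal sketch of that abstraction layer, with hypothetical backend names: the application codes against one interface, so swapping OpenAI for an open model (or vice versa) becomes a configuration change rather than a rewrite.

```python
# Sketch of a provider-abstraction layer. Backend classes and the
# question-answering function are hypothetical examples; real
# backends would wrap actual API or local-inference calls.
from typing import Protocol

class ChatModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class OpenAIBackend:
    def complete(self, prompt: str) -> str:
        # Real code would call the OpenAI API here.
        return f"[openai] {prompt}"

class LocalOSSBackend:
    def complete(self, prompt: str) -> str:
        # Real code would call a locally hosted open model here.
        return f"[local-oss] {prompt}"

def answer_member_question(model: ChatModel, question: str) -> str:
    # Application logic never names a specific vendor.
    return model.complete(question)

print(answer_member_question(OpenAIBackend(), "When do dues renew?"))
print(answer_member_question(LocalOSSBackend(), "When do dues renew?"))
```

Switching providers then means passing a different backend object (or reading the choice from config), with no change to the application logic.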
[00:40:11] Mallory: You kind of touched on this, Amith, but you don't sound as concerned about the higher hallucination rates because these aren't meant to be kind of these general knowledge assistants. Is that right? The open models, I mean.
[00:40:24] Amith: Yeah, I'm not concerned about it in the case of the OSS 120B and 20B models, if they're used for what they were built for: they were built to be really good at tool calling. So there's other models that are out there that have much lower hallucination rates, and if you try to get them to invoke tool use, a lot of times they'll just ignore the tool.
[00:40:39] So for example, if I give another model, one with a lower efficacy rate of tool calling, a tool and I say, hey, I've got a tool that you can use to search the web, so whenever a user asks for something, use this tool, always use this tool, that model's adherence to those instructions may be substandard. It may use the [00:41:00] tool sometimes and not use it at other times.
[00:41:01] If it thinks it has the answer, it might ignore the instruction in the system prompt to use the tool. Whereas the OSS 120B and 20B models were really, really dialed in to use tools as their first resort. So they are focused not so much on the knowledge within the model, but rather on using the tools.
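That "tools as first resort" pattern looks roughly like this in an agent loop. The model here is a hand-written stub standing in for a tool-tuned model such as gpt-oss-20b, and the tool and function names are made up for illustration.

```python
# Sketch of a tool-first agent loop. tool_first_model() is a stub
# standing in for a tool-tuned model; web_search() stands in for a
# real search tool.
def web_search(query: str) -> str:
    return f"search results for '{query}'"

TOOLS = {"web_search": web_search}

def tool_first_model(user_msg: str) -> dict:
    # A well-tuned tool-calling model emits a structured tool call
    # instead of answering from its own (possibly wrong) memory.
    return {"tool": "web_search", "args": {"query": user_msg}}

def run_agent(user_msg: str) -> str:
    decision = tool_first_model(user_msg)
    if "tool" in decision:
        # Dispatch the requested tool with the model-supplied args.
        return TOOLS[decision["tool"]](**decision["args"])
    return decision.get("answer", "")

print(run_agent("current GPT-5 API pricing"))
# → search results for 'current GPT-5 API pricing'
```

A model with weaker instruction adherence would sometimes skip the tool call and answer from memory, which is exactly where hallucinations creep back in.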
[00:41:20] So that's why, for agentic use cases, OSS 20B and 120B should be really good. Um, as I said, you know, we find the Qwen 3 series to be excellent at tool calling and general knowledge and, uh, code generation and a variety of other things. So we're pretty big on them at the moment, but, um, we don't get tied to any particular model for very long because there's always something new.
[00:41:41] Um, you know, if you're looking for a world-class code generation model, the Kimi K2 model from Moonshot AI is extraordinary. That's been out for about three weeks now. Um, much better than the OSS models at code generation, but the OSS models are really good for a lot of the basic decision making that agents are doing all the time.
[00:41:59] So very [00:42:00] useful. Very pleased to see OpenAI put them out there. And I think, uh, by virtue of the fact that it's an OpenAI model, it's gonna get a lot of adoption. It'll probably very quickly become the number one model on OpenRouter and on platforms like Groq and Cerebras, just because it's OpenAI's model.
[00:42:14] Mallory: Mm-hmm. Do you think we'll see Anthropic follow suit soon with some open models?
[00:42:20] Amith: Excellent question. I would love to see them do that. I dunno if they will. One really important thing to note is, uh, I think OpenAI's strategy is smart in that they're not trying to give you an older model that's not really that useful anymore and say, hey, we're in the open source game.
[00:42:33] They're actually trying to say, hey, for open source and smaller models, this would be really useful alongside our frontier proprietary models, so build solutions with all of that, right? Whereas Grok with a K, Elon Musk's xAI, they're saying, oh, we're committed to open source too, we're gonna give you the crap that we don't use anymore.
[00:42:52] So basically it's like Grok 3 is gonna get open sourced, but Grok 3 is like completely outta date. So I don't know that there's any value to that. Plus, [00:43:00] the model wasn't built to be open source. So even if you open source it, it's probably too big of a model for most people to run. So I'm not sure that that's the most viable open source strategy, just to check the box.
[00:43:09] So my suspicion would be that if Anthropic enters the game, which I think they should, whether they will or not is a great question, um, but I think there's a good chance they will, I suspect they'd do something similar: that they would have purpose-built models that they maintain in parallel with their frontier proprietary models for the purpose of open source use cases.
[00:43:30] That, you know, really, really makes sense. So, um, I think open source is going to see a lot of interesting evolutions. We gotta remember we live in a global world, and there are players all over the place who have the ability to train these models, and that's gonna get easier over time.
[00:43:45] Sam Altman said that it took a very small amount of resources to train the OSS models, and that would be true for a lot of other players that are out there that know how to do this and have access to the compute and have access to the training data, which is increasingly [00:44:00] easy, because the training data comes from other models more so than anything else at this point.
[00:44:05] So that means that we are going to see an even greater explosion of open models, more choice, more power. And a lot of people think that by the end of the year, Chinese open models will be as powerful as, if not surpassing, the capabilities of frontier proprietary models. So I don't know that I have a bet on that statement, but, uh, a lot of people are saying that. And I mean, if you look at the trend line in terms of what's happening and how the Chinese ecosystem is building off of each other, I could see the argument for that.
[00:44:34] And even if it's not exactly as powerful, it's already extremely useful. And, um, I think we need to pay attention to that.
[00:44:41] Mallory: For everybody that's still here with us at the end of this great OpenAI episode: Amith, what do you think is the one single takeaway?
[00:44:50] Amith: Number one takeaway for me is go play with this stuff.
[00:44:52] You know, you can hear us talk about it, and hopefully that's really helpful. But I think you should go and log into GPT-5 and try it out if you're even a [00:45:00] little bit technically inclined or technically curious. Um, if you have a Mac, download something called LM Studio. Uh, it's a free piece of software that allows you on your Mac to very easily download models and play with them.
[00:45:12] It's got a ChatGPT-like interface. It's slightly more technical looking than that, but, uh, it allows you to do local inference. You can download a whole bunch of different models from different providers and check 'em out, and that's a fun way to kind of get a sense of what's happening. On the PC, I know you can do it as well, but I switched to a Mac a few months ago, so I'm not sure if LM Studio is available for the PC or not. But I would definitely recommend, if you're, you know, a ChatGPT person, go in there and check it out if you haven't been in there in the last few days.
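For the more technically curious: LM Studio can also expose the model you've downloaded through an OpenAI-compatible server on localhost (commonly port 1234, though you should verify that in the app). Here's a stdlib-only sketch of what a request against it looks like; the final call is left commented out since it requires the local server to be running.

```python
import json
import urllib.request

# Sketch: LM Studio can serve downloaded models via an OpenAI-
# compatible local API (port 1234 is the common default; verify in
# the app). The model id is whatever model you have loaded.
payload = {
    "model": "openai/gpt-oss-20b",
    "messages": [
        {"role": "user", "content": "Write a haiku about associations."}
    ],
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
print(req.get_method(), req.full_url)

# Actually sending it requires the local server to be running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```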
[00:45:37] And, um, throw some things at it that you thought AI couldn't solve for you, right? That's what I always suggest when a big new model comes out: think about something that didn't work six months or 12 months ago that you'd like to try again, or give it a bigger task, right? Maybe it's good at writing one article at a time, but to your earlier point, Mallory, could it write like an entire month's worth of articles that are all good?
[00:45:59] Uh, so I think [00:46:00] there's just a lot of things you can do, and the best way to become really competent at AI is to go and play with AI, to use it.
[00:46:07] Mallory: Absolutely everybody. Thank you for tuning into today's episode. We will see you all next week.
[00:46:15] Amith: Thanks for tuning into the Sidecar Sync podcast. If you want to dive deeper into anything mentioned in this episode, please check out the links in our show notes. And if you're looking for more
[00:46:26] in-depth AI education for you, your entire team, or your members, head to sidecar.ai.

August 15, 2025