35 min read
Enterprise Platforms Prepare for AI Agents & Diffusion LLMs Prove Their Production Value | [Sidecar Sync Episode 134]
Mallory Mejias: May 21, 2026
Summary:
In this episode of Sidecar Sync, Amith Nagarajan and Mallory Mejias dig into two major shifts happening beneath the surface of AI: how enterprise software vendors are responding to the rise of AI agents, and why diffusion language models may be moving from research curiosity to real-world infrastructure faster than expected. They unpack Salesforce’s open, agent-friendly “Headless 360” strategy, SAP’s more restrictive API stance, and what these moves mean for associations trying to maintain control over their data. Then, they revisit diffusion LLMs through the lens of Inception Labs’ Mercury 2, exploring why faster, cheaper models could matter for voice agents, enterprise search, taxonomy work, content classification, and the future of model flexibility.
Timestamps:
00:00 - Inside the New AI Learning Hub
07:45 - Salesforce vs. SAP: The Agent Access Showdown
17:27 - Are Association Tech Vendors Locking Down Data?
19:39 - When the Interface Stops Being the Product
22:07 - Why Members May Stop Logging Into Your Portal
29:07 - Diffusion LLMs Six Months Later
38:58 - Where 10x Faster AI Really Matters
43:52 - Building on Bedrock, Not AI Sand
👥Provide comprehensive AI education for your team
https://learn.sidecar.ai/teams
📅 Register for digitalNow 2026:
https://digitalnow.sidecar.ai/digitalnow
🤖 Join the AI Mastermind:
https://sidecar.ai/association-ai-mas...
🎀 Use code AIPOD50 for $50 off your Association AI Professional (AAiP) certification
📕 Download ‘Ascend 3rd Edition: Unlocking the Power of AI for Associations’ for FREE
🛠 AI Tools and Resources Mentioned in This Episode:
Sidecar AI Learning Hub ➔ https://learn.sidecar.ai/
Salesforce Agentforce ➔ https://www.salesforce.com/agentforce/
Cursor ➔ https://cursor.com
Mistral AI ➔ https://mistral.ai
MemberJunction ➔ https://memberjunction.org
Betty ➔ https://www.bettybot.ai
Inception Labs ➔ https://www.inceptionlabs.ai
DeepSeek ➔ https://www.deepseek.com
Kimi ➔ https://www.kimi.com
CrewAI ➔ https://www.crewai.com
SAP ➔ https://www.sap.com/
https://www.linkedin.com/company/sidecar-global
https://twitter.com/sidecarglobal
https://www.youtube.com/@SidecarSync
⚙️ Other Resources from Sidecar:
- Sidecar Blog
- Sidecar Community
- digitalNow Conference
- Upcoming Webinars and Events
- Association AI Mastermind Group
More about Your Hosts:
Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.
📣 Follow Amith on LinkedIn:
https://linkedin.com/amithnagarajan
Mallory Mejias is passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.
📣 Follow Mallory on LinkedIn:
https://linkedin.com/mallorymejias
Read the Transcript
🤖 Please note this transcript was generated using (you guessed it) AI, so please excuse any errors 🤖
[00:00:00:14 - 00:00:09:17]
Amith
Welcome to the Sidecar Sync podcast, your home for all things innovation, artificial intelligence and associations.
[00:00:14:23 - 00:00:27:05]
Amith
Greetings and welcome to the Sidecar Sync, your home for everything that matters at the intersection of artificial intelligence and the world of associations. My name is Amith Nagarajan.
[00:00:27:05 - 00:00:29:04]
Mallory
My name is Mallory Mejias.
[00:00:29:04 - 00:00:30:19]
Amith
And we are your hosts.
[00:00:32:06 - 00:00:40:20]
Amith
We usually have really boring topics, Mallory. It's just the same old stuff every week. Nothing changes much in the world of AI, right? So it's just the same thing again today, right?
[00:00:40:20 - 00:00:55:11]
Mallory
Yep, no exciting topics. Don't worry. Just click off this episode. Just kidding. Stay tuned because we do have a very exciting episode lined up for you. Amith, I like what you did there with the everything that matters. That's kind of a bold statement, but I like to think we do that on the Sidecar Sync.
[00:00:55:11 - 00:00:58:07]
Amith
I think we live up to that a good bit of the time.
[00:00:59:10 - 00:01:05:11]
Amith
We have a lot that matters. Associations have a lot to think about, a lot to do.
[00:01:06:14 - 00:01:27:14]
Amith
They're super excited, I think, for the most part about AI. Some people are understandably still concerned about a lot of things and should be. I think there's just a lot going on. So people have a lot going on in their brains. We try to be that home where you can rely on relevant and useful content that you can put to work in your organization no matter what role you're in.
[00:01:28:23 - 00:01:40:01]
Mallory
I know we recently launched the new version of the AI Learning Hub with, I believe, 70 courses. Amith, have you gotten any direct feedback on that so far? I know it's only been maybe a few weeks, but what have you heard so far?
[00:01:40:01 - 00:03:28:05]
Amith
Yeah, I think it was April 1st that that launch was officially, I think, with 70 courses being launched. I think people might have thought that was some kind of silly April Fools' joke or something like that, but it was real legit. Yeah, 70 courses and growing. It might be higher by now. The feedback I've received is pretty high level, but it's been extremely positive. People love the fact that we have these learning tracks that are specific to different departments. So there's a track for membership and a track for finance and a track for marketing and for education and so forth. And many of those tracks do have overlapping courses. So if you are going through any of those tracks, you will certainly all receive the prompting course, the fundamentals course, things like that that are kind of the foundational layer. And then on top of that, we then take you down a path of learning that's very, very specific to your role. So if you work on credentialing in your organization and that's all you think about all day, it would be great to have a lot of content really focusing on the use cases that would mean the most to you that would have the most impact. And so that's exactly what these tracks do. And so there's a lot of new content, but part of the idea behind 70 courses is we really created a playlist essentially that is really clean and really tailored for each individual in the organization. So super pumped about that and so much more to come. It's a challenge because, you know, Sidecar, we try to punch well above our weight class, so to speak, but we're a fairly small team and we're trying to serve a global market of very diverse associations with different folks in different roles. And so we, of course, have to heavily lean on all of that that we teach and use a ton of AI to produce this content. So I've received really good feedback to answer your question.
[00:03:29:09 - 00:04:11:00]
Mallory
That's great to hear. I remember maybe two years ago at this point being on a call with a vendor in the association space and telling them what we do at Sidecar and providing that AI education for association professionals. And they had said something to me like, well, isn't AI education just AI education? Like what's the difference for association professionals or for not? And sure, I think at its core, I don't know what percent you would say, Amith, but 80 to 90 percent probably of AI education is the same. But I do think it's that 10 percent, 15 percent layer on top of that that's specific to how your organization runs or specific to your role within that organization that makes all the difference in terms of applying the education.
[00:04:11:00 - 00:04:24:23]
Amith
Totally. You know, when you're thinking about like learning a skill, like learning how to drive a car, if you're going to drive a truck or a school bus or a delivery vehicle or a passenger vehicle or maybe a race car,
[00:04:26:04 - 00:04:55:09]
Amith
some similar skills, certainly there's some overlap, but there's a lot of differentiation, a lot of specialization that makes you really good at applying that technology to your specific role. So in a very similar fashion, I think it's important to have industry specific learning in large part, actually, even if the skills were identical, which of course they're not, the last 10, 20 percent really makes a big difference. But even if they were identical, translating the information into language and terminology
[00:04:56:13 - 00:05:26:21]
Amith
and semantics that make sense to people in their industry, in their profession and in their role makes a world of difference. That's pretty much what I just got done saying in terms of the education tracks. There is certainly a lot of new content in Sidecar Learning Hub since April, but there's a lot of content that carried over, that we updated, but really what we've been able to do is repackage it so that it's even more specific, right? So not just association content, which is what it was before. We had, you know, eight or nine courses and they were all basically everybody did the same course.
[00:05:27:23 - 00:05:29:20]
Amith
Sorry, so everyone did the same set of courses.
[00:05:30:22 - 00:07:09:07]
Amith
Now there are 70 courses. I doubt any person will do them all, although that's an open challenge as of right now. If you are in the Sidecar Learning Hub and you can do all 70 of those courses, I will send you some Sidecar swag that you're not expecting. That'll be cool. So show me that you completed all 70, you overachievers, but most people will not do that. Most people will take their track and it will be great for them. But again, it's hyper specialized, right? You're speaking to the education and certification person saying, "Hey, this is how you use this tool for your exact workflow, for your types of things using their terminology." And that's the same thing in your industry or your profession. If you serve architects or landscape designers or engineers or whatever the case may be, it would be great if you could deliver AI education to them that's hyper specific to the profession and even further specialized per role. And in fact, of course, as many of you know, Sidecar actively partners with associations to do exactly that. We build custom tailored Learning Hub content for each of our clients. And that's separate from the Learning Hub we deliver to associations, but it's the same idea just delivered under your brand in your industry. But I think it's the most important thing that we have to do with AI hands down. I mean, we do all sorts of other things with Blue Cypress, but I focus so actively on Sidecar and on content because I believe this is by far the highest leverage thing that everyone can do on the planet to be prepared for all that is to come. All that's already happened obviously, but really what's coming next will make everything that's happened to date with AI seem kind of toy-like.
[00:07:10:11 - 00:07:44:21]
Mallory
And hopefully this episode right here is just one step preparing you all for that crazy AI journey that we're all on. So two stories today, both about how the infrastructure underneath AI is shifting. First a strategic split in enterprise software, Salesforce and SAP made big announcements in the last few weeks, taking opposite bets on how AI agents should interact with their platforms. And then second, a six month update on diffusion language models, the architecture we covered back in November, which has gone from research bet to production deployed enterprise infrastructure faster than expected.
[00:07:45:21 - 00:09:35:22]
Mallory
So let's get into the agent era's two camps, Salesforce and SAP. Back in February on this podcast, we covered HubSpot's monitor, meter and monetize comments and the broader question of whether enterprise software vendors would start charging AI agents for access to customer data. Two months later, that question has hardened into a strategic divide. In the same window, two of the largest enterprise software companies on the planet, Salesforce and SAP, made announcements that take diametrically opposite bets on how agents should interact with enterprise systems. And for any association whose AMS, LMS or financial system runs on top of these platforms, the split sets up a real question about which side your vendors may eventually land on. So on April 15 of this year, Salesforce announced what it's calling Headless 360, I like the name, the most significant architectural change in the company's 27 year history. Co-founder Parker Harris framed the bet with a question he posed publicly in the lead up. Why should you ever log in to Salesforce again? Every capability inside Salesforce is now exposed as an API, an MCP (Model Context Protocol) tool, or a CLI (command line interface) command, with over 100 new tools shipped on launch day. Coding agents like Claude Code and Cursor now have complete live access to a customer's Salesforce data, workflows and business logic, and the platform integrates with models from OpenAI, Anthropic, Google, Meta, Mistral, and so on. The early proof point is a travel platform called Engine that built a customer service agent in only 12 days using Agentforce, and that agent now handles 50% of customer cases autonomously through Slack and API calls so no human ever has to open a CRM tab.
[00:09:36:23 - 00:10:35:06]
Mallory
Now late in April, SAP took the opposite bet kind of quietly. They published a new version of their API policy where one section prohibits API use for interaction or integration with semi-autonomous or generative AI systems that plan, select or execute sequences of API calls outside SAP-endorsed pathways, in plain language basically saying third party AI agents are not permitted to access SAP data through APIs unless SAP explicitly approves the architecture. CEO Christian Klein walked back some of the message on a subsequent investor call, which is kind of similar to what we saw happen with HubSpot, saying SAP wants to keep the platform open and that the intent is to protect performance and IP, but the written policy language is still in force and the asymmetry is what's drawn the criticism. SAP controls which AI systems are permitted, and SAP's own AI product, Joule, of course sits on the permitted side of that line.
[00:10:36:12 - 00:10:42:03]
Mallory
Consultants and analysts have called it lock-in and the German SAP user group raised some formal concerns.
[00:10:43:05 - 00:10:52:20]
Mallory
So Amith, we're revisiting this conversation two months-ish later. In your opinion, do you think there's a right position when we're talking about Salesforce versus SAP?
[00:10:55:00 - 00:10:57:10]
Amith
Well, I think there's lots of opinions.
[00:10:58:20 - 00:12:02:19]
Amith
My particular point of view is that open systems are going to win over time because the power and the capability that you get from AI is too great to try to fight and it certainly seems like what SAP is trying to do is extend the dominance they've had with ERP for decades now in big organizations. They're trying to extend that for a period of time into this agentic AI era by trying to force people down the pathways and it'll actually probably work for a while. There will be lawsuits, there will probably be antitrust, there will be all sorts of things that go on to fight against this and there will be customer complaints but people won't stop using SAP overnight obviously. It's like an AMS but even more involved, right? It organizes basically every aspect of how these large companies operate. They're in a position of power due to high switching costs but they're not going to gain any fans with this strategy that's going to open the door for people to start chipping away at them more aggressively which has of course already happened in all sorts of different ways.
[00:12:04:02 - 00:12:11:11]
Amith
I think it's a really big strategic blunder. It's the same position that HubSpot essentially has taken, although HubSpot has pedaled a little bit softer.
[00:12:12:17 - 00:12:55:05]
Amith
I suspect HubSpot is going to change their mind. They're way too AI forward in their thinking to think that that's actually going to be a winning strategy so I suspect they're going to be changing their tune at some point. But the idea of trying to lock your customers in, saying the API is not available or you have very restrictive rate limits, these are things that are big problems. We've been saying for a while on this podcast, Mallory, you and I have both said this a bunch, that you have to get your data house in order to fully benefit from artificial intelligence. What we mean by that, listeners, is that it ultimately is up to you to have your data available for your agents, your AI agents as well as your human agents, to utilize.
[00:12:56:05 - 00:13:10:07]
Amith
A good example of this is a classical problem people want to solve which is very simple sounding but quite difficult which is, "Mallory, can you please give me a complete picture of this one member right now? Give me all their history, give me everything they've ever done with us."
[00:13:11:11 - 00:14:00:13]
Amith
That one simple question, which one would think would be pretty straightforward to answer, is extraordinarily difficult because you have to pull data from all these different sources, it's in all sorts of different formats and it's very difficult to actually know what's going on. A lot of times because systems have historically been hard to customize or modify, people have all these ancillary data elements for example. I mean the classic one is a literal post-it note on someone's monitor saying something really important about a member and that still does happen, but also just people keeping spreadsheets or Word documents, or they have data in systems but the most important data is put into a text field on a CRM record. They don't have structured fields, they're just putting all this key information in a comments field, something like that. That's extremely common as well.
[00:14:01:13 - 00:14:18:15]
Amith
But the reason I like to talk about this data house in order is you have to get your data unified into one physical place in order to let AI solve problems like what's up with Mallory, what's up with Amith, what's the latest with each of them. You have to get all your data in one place.
[00:14:20:11 - 00:14:32:11]
Amith
Platforms like SAP and HubSpot as well are saying we're not going to let you do that easily. We're going to monitor, meter and ultimately block in the case of SAP your ability to do this at any scale.
[00:14:34:00 - 00:14:36:00]
Amith
One quick comment before I comment about Salesforce.
[00:14:37:22 - 00:15:37:04]
Amith
The move there I think for the association is to start synchronizing and replicating their data out of all of their key systems into an environment they control. We've also been talking about this probably ad nauseam for our loyal and common listeners here who are with us all the time but we talk about the need for a data platform. We have one which is totally free and open source. It's called MemberJunction but it doesn't really matter ultimately what you do. It just means get your data out of those systems. Not that you stop using them by the way. There's a big distinction between that and saying throw away your AMS or LMS. We're not suggesting that at all. What we're saying is get a bulk export out of there and then after that incrementally sync the data so that even if there are very high API limits, sorry very severe API limits, things like that, you can still incrementally sync your data on a daily basis out of those systems and then you're in a place where you have all your data replicated and you can do your agentic workloads in an environment that you own and you control.
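The "bulk export, then incrementally sync" pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how MemberJunction or any vendor API actually works: `SourceSystem`, `Replica`, `fetch_changed_since`, and the `modified` field are all hypothetical names, and the sketch assumes every record carries a unique last-modified timestamp.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SourceSystem:
    """Hypothetical stand-in for a vendor system (AMS, LMS, CRM)."""
    records: list
    calls_made: int = 0

    def fetch_changed_since(self, since: Optional[datetime], limit: int) -> list:
        """Return up to `limit` records modified after `since`, oldest first."""
        self.calls_made += 1  # every call counts against the API budget
        changed = [r for r in self.records
                   if since is None or r["modified"] > since]
        changed.sort(key=lambda r: r["modified"])
        return changed[:limit]


@dataclass
class Replica:
    """The copy of the data in an environment the association controls."""
    rows: dict = field(default_factory=dict)
    watermark: Optional[datetime] = None  # newest change copied so far


def sync(source: SourceSystem, replica: Replica,
         page_size: int = 2, max_calls: int = 10) -> int:
    """Copy changed records page by page without exceeding the call budget.

    The first run (watermark is None) acts as the bulk export; later runs
    only pull what changed. Assumes unique timestamps: with ties, a record
    sharing the watermark's timestamp could be missed across a page boundary.
    """
    copied = 0
    for _ in range(max_calls):
        page = source.fetch_changed_since(replica.watermark, page_size)
        if not page:
            break
        for record in page:
            replica.rows[record["id"]] = dict(record)  # upsert by id
            replica.watermark = record["modified"]
            copied += 1
    return copied


def day(n: int) -> datetime:
    return datetime(2026, 5, n, tzinfo=timezone.utc)


if __name__ == "__main__":
    source = SourceSystem(records=[
        {"id": 1, "name": "Member A", "modified": day(1)},
        {"id": 2, "name": "Member B", "modified": day(2)},
        {"id": 3, "name": "Member C", "modified": day(3)},
    ])
    replica = Replica()
    print("initial copy:", sync(source, replica))
    source.records[0].update(name="Member A (renewed)", modified=day(4))
    print("next daily sync:", sync(source, replica))
```

Because the replica remembers its watermark, a run that hits `max_calls` simply resumes on the next run, which is what makes the approach workable under strict rate limits.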
[00:15:38:13 - 00:15:49:15]
Amith
That's my general comment. I think this is one of the most critical things that faces every organization and certainly associations probably have an even greater version of this because they are dependent on so many different vendors.
[00:15:51:05 - 00:16:08:16]
Amith
That's my thought on SAP. I think they're making a giant mistake long term. Time will tell but I wouldn't be surprised if they stick with this position. That's kind of the beginning of their decline and that's a big prediction to make. They're extremely powerful and well positioned in the Fortune 500 but this is a big, big problem.
[00:16:09:17 - 00:17:26:02]
Amith
In comparison, Mallory, I think Salesforce is doing something super cool. I think that they're basically taking a bold bet. They're basically saying the core franchise, which is definitely driven by people having to open a CRM tab, we're going to move with you. We're going to work with you. We're going to expose every single piece of the functionality so that we continue to be relevant and help our customers. I'm excited about that. I know a lot of associations are using Salesforce sometimes directly, sometimes through association specific layers on top but that's really good news for associations that are using Salesforce. I still would say, by the way, just as a little quick side note, that doesn't mean you shouldn't pull your data out of Salesforce. You definitely should because you don't want to depend upon the next CEO coming in. Right now they still have deep founder involvement and founders tend to be a lot more dynamic, a lot more focused on the entrepreneurial pathway but at some point that changes. Do you really want to be dependent upon the whims of a policy that you have no control over? Even though that's the case, I would still pull my data out of Salesforce. I'd still use it for everything but I would sync my data out of Salesforce even though that's the policy. I just think it's smarter to do that because it means their system is part of the solution rather than a blockade to it.
[00:17:27:02 - 00:17:46:02]
Mallory
I'm curious, as someone actively building AI agents for the association space, I'm not asking you to name names by any means, but are you seeing that lockdown approach with any vendors in the association space? Kind of like a trickle down. Are you seeing it difficult to get data out of certain vendors or is it generally pretty open?
[00:17:46:02 - 00:18:34:13]
Amith
Totally. I mean, that's been the case for decades, right? So there have been efforts by a lot of people over a lot of years to try to create like standard data formats. Our good friend Reggie Henry of ASAE for 30 years has been championing the idea of a standardized data format for a variety of different types of association data. That's come and it's gone and it's come and it's gone. But the main reason that it never took off and back when I was in the AMS game, I was strongly in support of that. I thought it was a great idea. But the reason it never went is none of the vendors could agree on the format and that ultimately actually I really feel like the motivation was to try to kill the idea from quite a few of the vendors because the more open those systems are in theory, at least in the protectionist mindset, in theory, the more likely your customers are able to leave easily if their data can leave easily.
[00:18:35:14 - 00:19:38:10]
Amith
You know, in comparison, my old company from literally the first day that it existed in the 1990s had an open API for literally every single piece of data in it. And it was one of the main reasons people picked our platform in the 90s and in the 2000s and beyond because it was open from the very beginning. It wasn't an open source system, but it was open data and so there was always an API for it. And I think it's a really important thing to really zoom in on. It's like, what's the coverage of the API? Is it 20% or 50% of the system? Or is it literally every single object, every single piece of data? Can you do everything that you can do through the user interface, through the API or now through MCP or CLI? So really important questions to be asking if you're in a vendor selection process looking for a new LMS or AMS, you need to look at that. I'd also go so far as to say, make sure your contract says what the API terms of use are and lock that in. And that way, if they try to change their policy and if you have a contract that says you get unlimited use of the API for a flat fee or whatever the case may be, you at least have a little bit of recourse there.
[00:19:39:11 - 00:19:49:09]
Mallory
I feel like Salesforce is essentially saying the interface is no longer the product, which is a big philosophical shift for enterprise software. Do you buy that? Do you think that's where everything's headed?
[00:19:51:02 - 00:20:11:13]
Amith
You know, with enterprise software systems, what I've always found is that you have a core group of users. So if you have 200 employees at your association, you might have 20 to 50 of them that are in the AMS every day and they're doing their business, they're processing transactions, they're doing deeper and deeper things in there.
[00:20:12:14 - 00:20:20:00]
Amith
But then you have the rest of the organization and coming to the AMS is an unnatural act, right? They're not living in the AMS. They don't have that tab open all day.
[00:20:21:01 - 00:20:48:06]
Amith
And for those folks, it's always been a challenge because their natural workflow, their habitat, if you will, is not the AMS. And so if you just kind of translate that to just generally these enterprise systems, if you make it so that people can work where they want to work in a flow that makes sense to them, it's way, way, way lower friction for everyone. It's one of the reasons that agents should be portable, agents should be able to communicate with you anywhere.
[00:20:49:08 - 00:21:55:12]
Amith
So I'll go back to MemberJunction just for a minute. Again, that's our totally free thing. So we're not pitching it to anyone. It's just it's available if anyone wants it. But that open source system has an agent layer to it and you can interact with those agents through the MemberJunction user interface if you want. But you can also interact with them through MCP, through API, through CLI. That's been the case since day one. But also recently we introduced native Teams and Slack integration. So you can talk to any agent built on MJ inside Slack or Teams. And so the whole point of it is, how do you make these things as flexible, as malleable as possible so that they can appear anywhere, interact with other agents or obviously with any humans that want to interact with them. But yeah, I think the Salesforce bet is really smart because 80% of the users, in fact, some of the people that have the biggest votes in whether Salesforce stays or goes don't use Salesforce every day, right? You know, a typical thing like you might have a chief revenue officer that ultimately gets to choose whether to use Salesforce or switch to something else. That person may be in Salesforce some.
[00:21:56:22 - 00:22:07:01]
Amith
They're not living in it the way a rep would. And perhaps also at the next level up, the CEO would tend not to be in Salesforce much at all. That's typically true for any of these enterprise systems.
[00:22:07:01 - 00:22:42:17]
Mallory
Hmm. That's a good point. I want to wrap up this part of the episode with this question posed by co-founder Parker Harris. Why should you ever log into Salesforce again? And I want to zoom out from the enterprise for a moment and talk about individuals or your members who will eventually or maybe already do have personal AI agents who go out and gather information on their behalf. And so to reframe that Salesforce question, why should anyone ever log into your member portal again? Amith, what do you think about that? I know that's like a big question to think about, but what's your take?
[00:22:42:17 - 00:23:58:07]
Amith
I love that question, Mallory. I think that's really insightful to be thinking about it in the context of the product that the association delivers. And I think the exact same thing applies. I think that your association needs to expose its services in a way that's agent era friendly. So you should be thinking about whatever agents you have or are building, are those available through API, through MCP? Can people connect them into their workflows? And it doesn't mean that you necessarily do this for free either, by the way. There may be more value and, you know, they might be at a higher tier of membership or it might be an à la carte service. A good example of this is the association's knowledge base. So if you have an AI agent that's an expert in knowledge, and we obviously have one of those in our family, a lot of you know about Betty, but if you have that kind of thing set up, it's great. You can send people to your website and they can ask literally any question about not just the association's events and policies and membership, but rather the domain of content that you have expertise in. You know, the hundreds of thousands of articles that you have in your archives and all of the proceedings from all of your conferences. So tools like that will learn all of that content and be the greatest expert on the planet for that content, which is awesome.
[00:23:59:07 - 00:24:38:17]
Amith
Now, let's say I am an association of accountants. I have members who are doing their work day to day in their accounting firms or in their corporate accounting jobs, and they're not on my website all the time. So just like with the Salesforce quote, coming back to your question, why should they ever have to log into the association's website again to look something up, to ask a question of the AI agent? And I would argue that they shouldn't. And in fact, if their firm or their company decides that they want to pay for a subscription, that AI agent, that knowledge agent should be available through Slack, through Teams, through an API, through whatever they want. They should be able to weave it into their own internal infrastructure.
[00:24:39:22 - 00:25:37:03]
Amith
And the value creation is still enormous. It's just a different surface area. I think people cling to business models of the past in order to try to hang on to revenue models basically. And things are changing. And so I think companies that are willing to go out there and take some risks and experiment are learning in real time. Companies that are blockading these things, not only are they pissing everyone off, frankly, but they're also not learning anything. So SAP is not going to be good at interop with everybody else because they have chosen to create a very restricted universe. They might be great at AI internally, but they're not good at interacting with everyone else. Whereas Salesforce is going to have hundreds of thousands of reps of doing this with every kind of thing you can imagine. So that's what you want to look for in vendors is that openness, that mindset. It is a little bit bold, perhaps, to say things like that. But that's exactly what you want out of a partner that you're working with, is someone who's willing to take those risks and put the customer up front or customer up first, I should say.
[00:25:37:03 - 00:25:52:08]
Mallory
And I can sympathize with all of you listening thinking, "Oh gosh, we just embarked on this huge redesign of our member portal." I don't know that necessarily everything's changing now, but it's certainly, if this is kind of where the major software vendors are going, it's certainly something you should be thinking about.
[00:25:52:08 - 00:27:16:10]
Amith
I mean, it's kind of like, you know, the information that you want is dependent upon the context and the timing too. So like, if you're going to the airport, you care an awful lot about, you know, where is your gate that you need to fly out of? Where is security? Where's the bathroom? Where can I refill my bottle of water? Perhaps where are some restaurants if I have time to grab a bite? But when you leave the airport, you don't care about that anymore. How often do you log into your airport's website to look up the map? Like never, right? But if you're at the airport, what do you want? You want to have that information in, quote unquote, the user experience of the airport, right? Like walking around in the maps that you see or just the signs. You want that to be relevant and helpful. And so in a similar sense, if people are having a deeper experience of the association, like coming to an event or they're going through a certification process, by all means they're likely to come to your website, your digital airport, your digital home, and do much more with you. If you have great educational content, they're more likely to spend more time with you. Associations know that well. But a lot of times people are just casually looking to get an answer. They're looking to get a problem solved and they don't want to come to another website to solve the problem. So no matter how slick and how cool and how up to date and how fast your website is, it doesn't solve the problem. You're not solving the problem the user has, which is just to work in their work stream. So it's not either or, it's and.
[00:27:16:10 - 00:27:25:23]
Mallory
Yes, that's a great point. I mean, context is important. When I'm at the Atlanta airport, I live on that website and, as you said, never go back to it until I'm back at the Atlanta airport.
[00:27:27:03 - 00:29:07:04]
Mallory
Moving to topic two for today, we're going to be talking about diffusion LLMs six months later. Back in November, we covered diffusion language models for the first time. Quick refresher, because it's been a couple of months, but every major large language model in production today, Claude, GPT, Gemini, generates text one word at a time, left to right, where each word has to wait for all the previous words before it can come out. Diffusion models work completely differently, and they borrow the approach from image generators like Midjourney. So instead of writing word by word, they start with a rough sketch of the entire response and refine the whole thing in parallel in a few quick passes. The result is text that arrives much faster, all at once, instead of streaming one word at a time. When we talked about this in November, it was more of a research bet from a startup called Inception Labs, with one commercial product called Mercury and Elon Musk publicly predicting diffusion would eventually replace the current architecture. Six months later, the picture has shifted from interesting research bet to production-deployed enterprise infrastructure. Around that episode back in November, Inception closed a $50 million funding round led by Menlo Ventures, and the investor list is the part worth dwelling on. Microsoft's venture fund participated, Nvidia's venture arm participated, Snowflake Ventures and Databricks Investment both put money in. Andrew Ng and Andrej Karpathy, two of the most influential researchers in the field, came in as angel investors. The companies most invested in the current word-by-word architecture, the one diffusion would eventually disrupt, are now hedged into the alternative, and that is a meaningful signal.
[00:29:08:09 - 00:29:47:00]
Mallory
Back in February of this year, Inception launched Mercury 2, which they're calling the world's first reasoning diffusion large language model and the fastest reasoning model on the market. The headline from independent benchmarks: Mercury 2 generates output roughly 10x faster than comparable models from Anthropic and OpenAI while matching them on quality. Production customers are using it for voice agents, real-time coding tools, and enterprise search, workloads where pauses break the user experience. So, Amith, I believe recently you met with one of the founders from Inception Labs. What did you learn from that conversation, maybe, that you didn't already know?
[00:29:47:00 - 00:30:36:23]
Amith
Yeah, you know, so when I talk to people who are doing this kind of work, I ask them a lot of different questions about the journey and so on. And I'll share a little bit of that in a minute. But the first questions I actually asked were very practical. Whenever I talk to someone who has some new whiz-bang technology or some new inference capability, I always ask them, first of all, tell me where your data centers are located. I want to know where they are actually running their inference workloads. And I want to know if they're retaining logs of the API calls that people make. And they answered both of those questions well. All their data centers are in the US, and they don't retain logs if you don't want them to. They can, but you can turn that off entirely. So it's what we call ZDR, or zero data retention. It's a really important concept.
[00:30:38:03 - 00:32:01:05]
Amith
And before going anywhere beyond that, I wanted to make sure that those were handled correctly. Our customers across Blue Cypress are very concerned about where their data goes. They don't want their data going offshore necessarily, but they certainly want to make sure that the calls that are being made to these APIs are not being logged. So just to quickly double click on that for our listeners who aren't super familiar with that, when you make an API call, it's kind of like making a phone call from one computer to another. It's very structured. It has a certain format, and there's information that's obviously transmitted in each direction. So you might send across a request that has, you know, a bunch of text. It might be a few hundred or a few thousand characters or more, or an image or something like that. And then you get back text and images or video or audio. So that's what's happening, essentially. As for the logging, they all log how much of their service you're using. That's how they bill you. But what we don't want them logging is what you're sending, so what we call the input tokens or the input content. We also don't want them logging the output, because the output is essentially what the model responds with. Both the inputs and the outputs can potentially contain sensitive data: data that represents part of the association's vast content corpus, possibly member data, possibly PII, possibly HIPAA-protected data, and so on. So that's a really, really important concept.
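To make the distinction between usage logging and content logging concrete, here is a minimal Python sketch. It is an illustration only, not how Inception or any other vendor implements zero data retention: the `ApiCall` class, the `usage_record` function, and the sample text and token counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ApiCall:
    """Hypothetical record of one request/response pair (not a real vendor type)."""
    input_text: str      # what you sent: may contain member data or PII
    output_text: str     # what the model responded with: may also be sensitive
    input_tokens: int    # size of the request, used for billing
    output_tokens: int   # size of the response, used for billing


def usage_record(call: ApiCall) -> dict:
    """Keep only what billing needs: token counts, never the content itself."""
    return {
        "input_tokens": call.input_tokens,
        "output_tokens": call.output_tokens,
    }


# Made-up example call, for illustration only.
call = ApiCall(
    input_text="A member asked about continuing education credit rules.",
    output_text="Here is a summary of the credit rules.",
    input_tokens=12,
    output_tokens=9,
)
record = usage_record(call)
print(record)
```

The point of the sketch is that the billing record can be built without ever storing `input_text` or `output_text`, which is the behavior described above.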
[00:32:02:10 - 00:33:38:14]
Amith
Anyway, they passed with flying colors. They do all that stuff right, which is what I expected. But what I'm interested in with these guys is the speed and the intelligence and the cost. Right. It's the combination of those three things. It's how smart is the model, how fast is the model, and what does it cost? And they really are kind of in a bit of a Goldilocks zone, because their model is considerably faster and considerably cheaper at the same level of intelligence as some of the smaller models from the big labs. So on a comparative basis, it's similar to the GPT Nano series, the Gemini 3.1 Flash-Lite model, which is Google's smallest commercial model, and probably comparable to Haiku from the Claude folks. So that size of model doesn't get a lot of the headlines, but it's actually maybe the most important type of model out there. We've talked about smaller models in this pod a bunch of times, and we've talked about them in the context of open source models you can run yourself. But this is kind of like the smallest model that is still proprietary that's out there, put out by these frontier labs. The reason it's so important is most of the workloads we do these days can run on Inception or on Gemini Flash-Lite. You don't need the most novel intelligence, like from Opus 4.7 or GPT 5.5 on extra high. You don't need that for most of the problems. Like now, if you're writing a blog post for the Sidecar blog, I don't think you need PhD-level intelligence across 40 disciplines, which is what Opus 4.7 gives you, right?
[00:33:38:14 - 00:33:40:16]
Mallory
I might like it, but I don't need it.
[00:33:40:16 - 00:37:07:18]
Amith
Yeah, you might like it. And if you're carrying like eight drafts in parallel and then you're going to have something review them, the eight drafts, maybe they're drafted by a competent writer, but maybe the review is done by that PhD-level intelligence. But the idea is that for every token that needs to go to the absolute highest-end model, there's probably 10 or even 100 tokens that need to go to a model that's considerably less intelligent, and that's fine. We're not talking about a model that's going to make mistakes. We're not talking about models that are dumb. I'll give you some examples: classifiers. So one of the workhorses in AI data processing is classifying unstructured content. So imagine your association is sitting on a million documents from the last 50 years of your history and you say, you know, we've got this taxonomy. We can never keep up with the taxonomy. And in addition, we just can't tag all the content, right? It's just too much work. And the taxonomy actually needs to be kind of a living, breathing organism rather than a static fossil from the past, right? Because if you're in a discipline that is changing, which is basically almost everything, you need to have a taxonomy that reflects the current state of the art while also still having, you know, elements in it that reflect prior content. So just maintaining the taxonomy is a challenge, and then tagging the content relative to the current taxonomy is hard. But imagine, every time the taxonomy changes, going back and reclassifying all the old content to fit the new taxonomy. And because this is so hard, what do associations do? First of all, they don't implement taxonomies most of the time. Then those who do, do it partially, and they don't actually fully implement it across all their content. 
And then those who do that, which is a small percentage and which is a big, big investment, don't keep the taxonomy up to date, because they have effectively the equivalent of technology debt from the taxonomical choices they made, meaning they have, you know, 500 tags or 500 categories or whatever in a hierarchy, and changing that would mean they have to go and reclassify everything that's ever come before, which is obviously not the easiest of jobs. Right. But AI makes that trivial. This is the short version, right? You can do literally everything I just said with AI, and you can do it with GPT-4-class intelligence with accuracy above that of most humans. And if you use Opus 4.7, you're just wasting your money. It does not make any sense to use the most advanced models to do this task. Even when you have a lot of content and a lot of nuance to your professional taxonomy, you really don't need the most advanced frontier model for any of the professions I'm aware of, which is a bunch of different, fairly diverse professions. But you do need the ability to run these models fairly frequently and to run them across a wide array of content. You might have a hundred billion tokens that you need to process across all of your content, or some massive number. And you need to do that like every six months. It matters a lot how fast and how costly these models are. So what's exciting about this diffusion model is it's a new architectural paradigm. It has the potential to have actually higher intelligence long term, for a couple of reasons I'll come back to in a sec. But it's right now really good, and it's also super fast and cheap. So those are the things that got our attention. We're testing it right now, and the results have actually been pretty much what they advertise. It's roughly on par, intelligence-wise, with the models I already mentioned.
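A rough back-of-the-envelope calculation shows why cost per token matters at the scale mentioned above (a hundred billion tokens, reprocessed about every six months). The two prices in this Python sketch are made-up placeholders, not quotes for Mercury 2, Opus, or any other real model, and `run_cost` is a hypothetical helper.

```python
def run_cost(total_tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in USD of pushing total_tokens through a model at a flat per-token price."""
    return total_tokens / 1_000_000 * usd_per_million_tokens


TOTAL_TOKENS = 100_000_000_000  # "a hundred billion tokens", from the discussion above
RUNS_PER_YEAR = 2               # "like every six months"

# ASSUMPTION: illustrative prices only, chosen to show a 20x gap.
SMALL_MODEL_PRICE = 0.25     # USD per million tokens (hypothetical small model)
FRONTIER_MODEL_PRICE = 5.00  # USD per million tokens (hypothetical frontier model)

small_per_year = run_cost(TOTAL_TOKENS, SMALL_MODEL_PRICE) * RUNS_PER_YEAR
frontier_per_year = run_cost(TOTAL_TOKENS, FRONTIER_MODEL_PRICE) * RUNS_PER_YEAR

print(f"small model:    ${small_per_year:,.0f} per year")
print(f"frontier model: ${frontier_per_year:,.0f} per year")
```

Under these assumed prices the same reclassification job costs 20 times more on the frontier model, which is the gap the "you're just wasting your money" comment points at.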
[00:37:09:01 - 00:37:29:21]
Mallory
I mean, a few notes there. One thing is I feel like when it comes to you and AI, I know you pretty well just because of the podcast. We spend, you know, upwards of an hour together every week talking about it. I actually did not know those were your first two questions when you were meeting with these companies, about the API logs and data centers. I don't think I've ever heard you delve into that. So I thought that was interesting. Now I know. Noted.
[00:37:31:01 - 00:37:54:22]
Mallory
Another thing I want to talk about is so you said you're testing it out with some products, agents in the Blue Cypress family. Which scenarios do you find that 10x faster output really matters the most? I know I mentioned voice agents. I'm assuming maybe like the member service agent loops component. That would be really helpful to have 10x speed. Where are you testing it out?
[00:37:55:23 - 00:38:29:18]
Amith
Yes. So there's a few different things. So, these different rates of speed. There's two metrics that matter with models. One is the tokens per X, which is like tokens per minute typically, as it's measured, or TPM. And then you have the ability to measure what's called TTFT, which is another fun acronym here in Techland, but it's called Time to First Token. So if I give you tokens per minute, Mallory, let's say a big number like 1000, 2000 tokens per minute, that means that the amount of content I can get to you on an ongoing basis per minute or per second is very, very high.
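One way to see how the two metrics combine is a simple estimate: total response time is roughly the time to first token plus the number of tokens divided by throughput. The Python sketch below uses tokens per second rather than tokens per minute for convenience, and the two model profiles are invented numbers for illustration, not measurements of any real model.

```python
def response_seconds(ttft_s: float, tokens_per_second: float, n_tokens: int) -> float:
    """Rough end-to-end time: wait for the first token, then stream the rest."""
    return ttft_s + n_tokens / tokens_per_second


# ASSUMPTION: two hypothetical model profiles, not real benchmarks.
low_latency = {"ttft_s": 0.3, "tokens_per_second": 100.0}          # quick start, modest throughput
high_throughput = {"ttft_s": 60.0, "tokens_per_second": 1_000_000.0}  # slow start, huge throughput

voice_reply = 150            # tokens in a short spoken answer
batch_job = 100_000_000      # tokens in a bulk content-analysis run

voice_fast_start = response_seconds(**low_latency, n_tokens=voice_reply)
voice_slow_start = response_seconds(**high_throughput, n_tokens=voice_reply)
batch_fast_start = response_seconds(**low_latency, n_tokens=batch_job)
batch_slow_start = response_seconds(**high_throughput, n_tokens=batch_job)

# Voice: the low-TTFT profile answers in about 2 seconds; the other waits a minute.
print(f"voice reply: {voice_fast_start:.1f}s vs {voice_slow_start:.1f}s")
# Batch: the high-throughput profile finishes in minutes; the other takes days.
print(f"batch job:   {batch_fast_start:,.0f}s vs {batch_slow_start:,.0f}s")
```

The estimate ignores queuing, parallel requests, and network overhead, so it is only meant to show why a real-time use case cares most about TTFT while a batch job cares most about throughput.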
[00:38:30:19 - 00:39:18:16]
Amith
And then the time to first token, on the other hand, is how long does it take to get that very first response, that first letter, essentially the first word of the response. That's actually really, really important for latency. So say I have batch processing where I'm going to take a whole bunch of content and I'm running it through, and I have a million tokens per second or some crazy number like that. If it took me a minute to get the first token, I couldn't use it for voice or video in real time, but I could use it to do batch processing of historical audio content really efficiently. So there's a couple of things to keep in mind. Ultimately, I think for the kinds of things we're focused on, we have a lot of people using our tools for, you know, bulk content analysis, things like this. And so the classifier tool inside MJ that I was talking about earlier, which does all that stuff that I just mentioned,
[00:39:19:19 - 00:40:15:09]
Amith
that tool is a workhorse in terms of using LLMs. So we're going to test Mercury 2 there. I think it will perform great. We ran an initial test and it was both fast and really cheap. Is it dramatically better than Gemini 3.1 Flash-Lite in terms of cost or speed? It is somewhat faster. It does seem to be about the same level of intelligence. And I believe it's a little bit less expensive. But to me, what I'm most excited about is, can this technology compound its advantage when there are perhaps hybrid architectures, which people are experimenting with, as well as later this year when there is some accelerated compute offered by NVIDIA? They already announced this, but when they get into production broadly with the Vera Rubin architecture, which will include the LPU3, which is the thing they acquired from our friends at Groq, Groq with a Q.
[00:40:16:13 - 00:40:34:20]
Amith
That's going to be a game changer. And if Mercury stuff is running on that architecture at scale, that could be really, really interesting. So I think there's a lot of upside to this. We're exploring it now. I don't know that it would necessarily for us change anything dramatically this second. But I think they're heading in a really interesting direction.
[00:40:34:20 - 00:40:55:21]
Mallory
I think it's a good point that you brought up the time to first token TTFT, because looking back at our November episode, we talked about first word delay. So it sounds like diffusion models at large are faster. I guess tokens per minute, they're faster. But it might take a little bit of time to get that first token. Is that correct? Or to get that first word?
[00:40:57:00 - 00:42:01:18]
Amith
Sort of. I mean, it is true that the inference process is like, if you think about an image, it's kind of pixelated and then it kind of comes through. But you can't really do anything with it until you have the final rendered image. What's happening in the process is that ultimately the math happens all at once, in a sense, in terms of the diffusion process. But then after that, when you do the decode, which is to actually translate it back into text, that can be streamed. And so there's a little bit of a speed-up there. But diffusion is so much faster that, in the net, it's actually similar. And when they say there's also reasoning on top: essentially, reasoning models are still inferencing effectively the same way in the world of Transformers, autoregressive next token. But what they're doing is iterating over their own results and saying, hey, does this make sense? And so there's an element of that. I don't know how adaptive their reasoning is. With the workloads we've thrown at Mercury so far, which are very limited, we have not tested its need to reason, because we're giving it fairly simple tasks. But I think it's quite exciting that they have a reasoning layer to their architecture.
[00:42:01:18 - 00:42:09:13]
Mallory
Mm hmm. So if everything is moving toward diffusion, let's just say hypothetically in terms of architecture for these large language models,
[00:42:10:14 - 00:42:21:13]
Mallory
do association leaders need to be keeping anything in mind? Do they need to be planning for that? Or is it going to be as easy as plug and play? Architecture changes, the models improve, and you just use the new one.
[00:42:21:13 - 00:43:15:15]
Amith
This will not affect anyone's anything, really, in the association market right now, to use a very technical description. It's just purely something you should be informed on. It actually underscores the first topic from today, in terms of making your bets more about not specific vendors or specific models, but more about the way you set yourself up for success. It's more about how you do your foundational planning now so that you have the ability to plug in models, whether it's Inception or, in the last 30 days, we've also had new models from DeepSeek, from Alibaba's Qwen model series, from Kimi, from a whole bunch of people, not to mention all the things that are happening with frontier labs. GPT 5.5 Instant was announced in the last 24 hours. There's so many things happening, you know, and actually this is not to speak of the other Grok, the Grok with the K. They have a really good new model called 4.3,
[00:43:16:16 - 00:44:31:15]
Amith
both reasoning and non-reasoning models. They have some really interesting stuff with voice. There's so much happening that the most important thing is to realize that and to say, like, I can't make the quote unquote best choice. You're not picking an AMS where you're going to like live with it for 10 or 20 years. You're picking something, maybe you will use that particular vendor or model for a while, but most likely something cooler is going to come out, you know, in a month or in a day. And so you want to have optionality. So what does this all boil down to for you as the association? One, stay aware. Know what's coming so that you have your options in your mind, saying, hey, there are better tools for this. If you didn't know, for example, that voice was as advanced as it is, you wouldn't be thinking strategically about where you should include voice in your association's engagement strategy. But by knowing that voice is in fact actually already quite good and getting dramatically better by, you know, nearly the minute, you as a user of that technology are empowered to make better strategic decisions, to make better choices in terms of infrastructure as well. So that's really what I would urge people to remember here: if you're, you know, if you're nerding out on this stuff and you think it's cool, it's awesome. Check it out. Go check out Inception Labs. Really cool story.
[00:44:32:18 - 00:46:31:02]
Amith
Really interesting people. But bottom line is that most people probably don't need to do that. Most people here who are listening need to be informed and aware, and they need to make sure that they're not marrying a specific vendor. So this is where I get really focused in saying, you know, there are great platform choices out there from people like Google Enterprise. Anthropic has a variety of tools. OpenAI just announced new tools for building agents and things with them. But every single one of these approaches is vendor specific. It is lock-in. It means that you will not benefit from any advances from any other vendor if your whole ecosystem is trapped within the OpenAI bubble or the Google bubble or anybody else's. And this is not an indictment of their business models. These guys have to do this because they know very well that the model economics are commoditizing. They already are commoditized at a certain level. Everything in AI in terms of models will be effectively, you know, it's like buying gas. You're not going to pay dramatically more to go to Shell versus Chevron or whatever. Even if you like the brand for some reason better, you're going to get the most, you know, cost-effective thing for you. And that's true for any commodity. So they have to go up the stack, which is why you see people like Anthropic building all these apps like Cowork and Claude Code, going into industries, building services arms. Really smart moves. But you as the association leader have to think about the future of your business. And don't be too excited about any of these announcements. They're all actually very exciting. But don't be excited in the sense that you're going to shift your strategy, change your approach. You have to have something now that anticipates this degree of change and be ready to plug in whatever. And there's a lot of ways to do this, right? I talk about MJ a lot. It's what I know. I've been super focused on that for years. 
That's one way to do what I'm talking about. You can also use tools like LangGraph. You can use tools like CrewAI. You can use other agent-building tools. All of them have their pros and cons. But the essence of this is: don't build directly on top of one specific company's stack in the world of AI, no matter how cool they might seem.
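The idea of not building directly on one company's stack can be sketched as a thin interface that application code depends on, with each vendor hidden behind it. This Python sketch is only an illustration under stated assumptions: `TextModel`, `StubVendorA`, `StubVendorB`, `tag_document`, and `get_model` are hypothetical names, the stubs return canned strings instead of calling any real API, and it does not represent how MemberJunction, LangGraph, or CrewAI work internally.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only thing application code is allowed to know about a model."""

    def complete(self, prompt: str) -> str: ...


class StubVendorA:
    """Stand-in for one vendor's client; a real adapter would call that vendor's API."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-a] {prompt}"


class StubVendorB:
    """Stand-in for a second vendor, interchangeable with the first."""

    def complete(self, prompt: str) -> str:
        return f"[vendor-b] {prompt}"


# Swapping vendors becomes a configuration change, not an application rewrite.
MODELS: dict[str, TextModel] = {
    "vendor-a": StubVendorA(),
    "vendor-b": StubVendorB(),
}


def get_model(name: str) -> TextModel:
    """Look up a model adapter by its configured name."""
    return MODELS[name]


def tag_document(model: TextModel, text: str) -> str:
    """Application logic written against the interface, not against a vendor."""
    return model.complete(f"Classify: {text}")


result_a = tag_document(get_model("vendor-a"), "2019 annual meeting keynote")
result_b = tag_document(get_model("vendor-b"), "2019 annual meeting keynote")
print(result_a)
print(result_b)
```

When a new model comes out, only a new adapter class and one entry in `MODELS` would be needed; `tag_document` and everything built on it stay the same.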
[00:46:32:11 - 00:47:05:10]
Mallory
Yep. That was going to be my takeaway too from this episode, Amith: build a solid foundation, build on bedrock, not sand. Especially, one, with your data, to make sure you have unfettered access to it, because it's yours. Get your data warehouse in order so you're not depending on the whims of companies like SAP and HubSpot and Salesforce. And then on the model side, exactly that: making sure that as new architectures come out, new models, new ways of inference are out there, you can take advantage of them. So with that, thank you for tuning into today's episode, and we will see you all next week.
[00:47:06:24 - 00:47:24:20]
Amith
Thanks for tuning into the Sidecar Sync podcast. If you want to dive deeper into anything mentioned in this episode, please check out the links in our show notes. And if you're looking for more in-depth AI education for you, your entire team or your members, head to sidecar.ai.