
Summary:

In this episode of Sidecar Sync, Mallory Mejias and Amith Nagarajan dive into three high-octane AI developments reshaping the landscape of innovation. They kick things off with Claude Haiku 4.5, Anthropic’s blazing-fast small model offering near-frontier performance at a fraction of the cost. Then, the hosts explore diffusion models—an architectural shake-up that could revolutionize how AI generates language. Finally, they venture into orbit with Google’s ambitious Project Suncatcher, a plan to power machine learning with solar satellites in space. Along the way, they unpack what these advancements mean for associations and why now is the time to set bold, moonshot goals.

Timestamps:

00:00 - Welcome Back from digitalNow 2025
03:57 - Speed, Efficiency & Infrastructure: Today’s AI Agenda
04:29 - Claude Haiku 4.5: Fast, Affordable, and Powerful
10:53 - Practical Applications for Associations
13:08 - AI Policy Musts: What Leaders Should Do Now
15:48 - The Risk of Unchecked Automation
21:34 - Transformers vs. Diffusion: The Architectural Battle
26:11 - Why Diffusion Models Might Be a Game-Changer
29:29 - The Future of Small Models and On-Device AI
33:31 - Google's Project Suncatcher: AI Compute in Space
39:22 - Why Associations Need Their Own Moonshot Goals
45:04 - Closing Thoughts

 

 

👥Provide comprehensive AI education for your team

https://learn.sidecar.ai/teams

📅 Find out more about digitalNow 2026:

https://digitalnow.sidecar.ai/digitalnow2026

🤖 Join the AI Mastermind:

https://sidecar.ai/association-ai-mas...

🔎 Check out Sidecar's AI Learning Hub and get your Association AI Professional (AAiP) certification:

https://learn.sidecar.ai/

📕 Download ‘Ascend 3rd Edition: Unlocking the Power of AI for Associations’ for FREE

https://sidecar.ai/ai

🛠 AI Tools and Resources Mentioned in This Episode:

Claude Haiku 4.5 ➔ https://www.anthropic.com/news/claude-haiku-4-5

Comet AI Browser ➔ https://www.perplexity.ai

Claude for Chrome ➔ https://claude.ai/chrome

Inception Labs ➔ https://www.inceptionlabs.ai/#our-models

NotebookLM ➔ https://notebooklm.google

Project Suncatcher ➔ https://shorturl.at/sVh5B

👍 Please Like & Subscribe!

https://www.linkedin.com/company/sidecar-global

https://twitter.com/sidecarglobal

https://www.youtube.com/@SidecarSync

Follow Sidecar on LinkedIn

⚙️ Other Resources from Sidecar: 

More about Your Hosts:

Amith Nagarajan is the Chairman of Blue Cypress 🔗 https://BlueCypress.io, a family of purpose-driven companies and proud practitioners of Conscious Capitalism. The Blue Cypress companies focus on helping associations, non-profits, and other purpose-driven organizations achieve long-term success. Amith is also an active early-stage investor in B2B SaaS companies. He’s had the good fortune of nearly three decades of success as an entrepreneur and enjoys helping others in their journey.

📣 Follow Amith on LinkedIn:
https://linkedin.com/in/amithnagarajan

Mallory Mejias is passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space.

📣 Follow Mallory on Linkedin:
https://linkedin.com/in/mallorymejias

Read the Transcript

🤖 Please note this transcript was generated using (you guessed it) AI, so please excuse any errors 🤖

    [00:00:00:14 - 00:00:09:17]
    Amith
     Welcome to the Sidecar Sync Podcast, your home for all things innovation, artificial intelligence and associations.

    [00:00:09:17 - 00:00:25:02]
    Amith
     My name is Amith Nagarajan.

    [00:00:25:02 - 00:00:26:21]
    Mallory
     And my name is Mallory Mejias.

    [00:00:27:23 - 00:00:40:00]
    Amith
     And we are your hosts. And today we have an episode on three super interesting, super exciting, and a little bit technical topics. But you'll get the hang of it. Don't worry about that part. It's going to be fun, as it always is.

    [00:00:41:07 - 00:00:54:06]
    Amith
     We just got back from digitalNow. This is the first live episode we're recording after returning home from an awesome conference in Chicago. What did you think, Mallory? This was your third digitalNow, I think, or--

    [00:00:54:06 - 00:01:59:16]
    Mallory
     This was, I think, my fourth. My fourth digitalNow. I posted on LinkedIn, which is kind of funny, but not a joke, that digitalNow is like my Super Bowl every year. Because the framework in my mind is always like, that's near digitalNow. That's before digitalNow. That's after digitalNow. So I was so excited to go in. And last year I was more hands-on with the organization of the conference, but this year less so. So I actually was attending kind of like an attendee. I was surprised by some of the content I was seeing, and, as always, so inspired by the association leaders there and the ideas that they had. You could just see the passion with which some people listened to the keynotes, taking notes, taking pictures of the slides. And that, for me, is so exciting. Because we often say on the podcast, it's me and you, Amith. And someone at the conference actually said we have a great rapport, which I appreciate and I agree with. But sometimes it feels like a bubble. And so getting out of that bubble and seeing how excited people are about innovation and how much they care about their members, that to me just can't be compared to anything else, for sure.

    [00:01:59:16 - 00:02:44:16]
    Amith
     Yeah, absolutely. That's the highlight for me every year, is hearing real stories from association folks who are doing cool things. And I think the trend line is year after year, as we run this conference, we first ran it in 2021. So digitalNow, for those who aren't familiar, has actually existed since 2000. We only took over the conference in 2021. And we've run it now since then. And essentially, every year, there is a pretty significant leap ahead in terms of the practical things association leaders, just like you, are doing in their associations day in and day out. And it's super exciting to see. So that's my favorite part, is hearing what associations are up to. I thought we had some amazing keynotes. I thought that everyone seemed to say that they enjoyed the food as well, which is always a highlight.

    [00:02:46:05 - 00:02:50:21]
    Mallory
     I said that. I'm like, man, the food for me was so good. And I'm sure the content was also great.

    [00:02:50:21 - 00:03:56:24]
    Amith
     I heard about the food before anything else from probably 50 people. We had record attendance. We had close to 300 people at the event. We expect it to be even larger in DC next year. We will be at the Hilton, a brand new Hilton hotel in Rosslyn, Virginia, just across the Key Bridge from Georgetown, October 25th through 28th in 2026. We just announced it at digitalNow this year. We had an amazing time at the Loews Hotel in Chicago a couple weeks ago. We really are grateful to the wonderful team there for doing an amazing job. And we're super excited about this new property, this brand new Hilton that just opened up in DC that we're going to be taking over in late October next year. And it'll be here before you know it. And I would say your guess is as good as mine, probably, Mallory, in terms of where AI will be at that point in time. It's pretty hard to keep up with this stuff, right? And today's three topics are aligned with exactly that theme. There's a lot of relentless progress happening in the world of AI. And I think the applications to associations are abundant as well.

    [00:03:56:24 - 00:04:28:18]
    Mallory
     For sure. Today, we are talking about speed, efficiency, and infrastructure. And we think they're all kind of getting at the same question, which is how do we meet the massive computational demands of AI's future? We're going to start with a model that delivers frontier performance at a fraction of the cost. Seems like a trend we talk about all the time on the podcast. And then we'll be exploring a potential architectural revolution and how AI models generate output. Then we'll finish with Google's audacious plan to put data centers in space.

    [00:04:29:20 - 00:05:12:15]
    Mallory
     So starting off with Claude Haiku 4.5. It's Anthropic's latest small model that was just released in October of this year, 2025. What's remarkable about this release is that it delivers performance that would have been considered state of the art frontier level just five months ago or so with Claude Sonnet 4. But it's now one third the cost and more than twice the speed. Anthropic has made Haiku 4.5 available to all free users on Claude.ai, which means anyone can access near frontier intelligence without paying anything. The paid API pricing is $1 per million input tokens and $5 per million output tokens, significantly cheaper than mid-tier models.

    [00:05:13:16 - 00:05:58:14]
    Mallory
     The model supports a 200,000-token context window and can generate up to 64,000 output tokens per response. It also has a new extended thinking mode that provides transparency into the model's, quote unquote, "reasoning process" with a visual chain of thought. On coding benchmarks, Haiku 4.5 scores about a 73.3% on SWE-bench Verified, making it one of the world's best coding models despite being in the small model category. It runs four to five times faster than Sonnet 4.5 while achieving 90% of Sonnet 4.5's performance in agentic coding evaluations. So Amith, what do you think, what's your take on little old Haiku 4.5?
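For anyone who wants to poke at these numbers directly, here is a minimal sketch of what calling Haiku 4.5 through Anthropic's Python SDK might look like, plus the back-of-the-envelope cost math at the pricing mentioned above. The exact model ID string is an assumption based on Anthropic's naming pattern, so check the current API docs before relying on it.

```python
# Minimal sketch: calling Claude Haiku 4.5 via the Anthropic Python SDK.
# The model ID "claude-haiku-4-5" is an assumption; confirm it in Anthropic's docs.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-haiku-4-5",   # assumed model ID for Haiku 4.5
    max_tokens=1024,            # well under the ~64K output ceiling mentioned above
    messages=[{"role": "user",
               "content": "Summarize our volunteer application process in three bullets."}],
)
print(response.content[0].text)

# Back-of-the-envelope cost at the stated pricing ($1 / M input tokens, $5 / M output tokens):
input_tokens, output_tokens = 2_000, 500
cost = input_tokens / 1e6 * 1.00 + output_tokens / 1e6 * 5.00
print(f"Estimated cost: ${cost:.4f}")  # roughly $0.0045 for this example call
```

Swapping the model ID is typically all it takes to compare Haiku against a larger Claude model on the same prompt.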

    [00:05:59:17 - 00:06:04:24]
    Amith
     Fantastic model. I use it pretty regularly. It is incredibly fast and really good.

    [00:06:06:01 - 00:06:37:20]
    Amith
     So the Claude, or I should say the Anthropic team, they put out an amazing series of models with Claude. And I don't think they get quite as much attention as the OpenAI models. They released Haiku 4.5 kind of quietly. It's more their style, I think. But it's something that I use regularly in a couple of ways I think could be really interesting for folks. So the first one is, I'm not sure how many people have access to it now, but Claude has a plugin for Chrome. And have you used that, Mallory, the Claude plugin?

    [00:06:37:20 - 00:06:38:11]
    Mallory
     I have not, no.

    [00:06:38:11 - 00:09:49:00]
    Amith
     So I got access to it maybe three or four weeks ago. And essentially what it allows you to do is, it's an agent for Chrome where there's just a little Claude icon in your Chrome toolbar and you click on it. And then Claude can perform tasks in your browser for you. So you've heard about all these AI browsers. There's Comet from the folks over at Perplexity, and OpenAI released, with a lot of noise, their own browser. And by the way, they're all forks of Chromium, which is the open source underpinning of Chrome, and then they've taken different parts of that underlying open source stuff and layered on top of it, which is great. That's part of the incredible value of open source. But in any event, the question is, is it a brand new browser that you need, or do you need a plugin? I'm just using the plugin for Chrome. And the way I use Haiku 4.5 is in this plugin, you can control your browser. So you click the little icon, and then you can see the little, it's not like the full Claude experience, but you have like a little window where you can chat with Claude and you can ask it to do things. So what I did the other day was, and Haiku 4.5, because it's so much faster, is really useful for a lot of things, so I use it for some testing. So I spend a lot of my time working with our software development teams across Blue Cypress. And right now we are getting ready to launch the latest generation of Skip, which is our analytics product. And we have this AI testing system built into our platform where we can say, hey, here's 20 or 30 or 50 or however many evals. And an eval, if you've heard that term in the AI landscape, basically means a test. So we're saying, hey, we're gonna take an AI system and we're gonna give it a prompt or a series of prompts and we're gonna test the output. We wanna see what the output looks like. And evals are super, super important because you wanna be able to have some degree of measurability around the performance of not just an underlying model, but an overall system. And you wanna be able to test that over time as changes happen, for quality, for regressions, and of course for progress. And so what happens is in the Skip system, when we do evals, we'll run 20, 30, 40 questions, which is like a human user going to Skip and saying, hey, give me a member retention diagram that allows you to drill down and blah, blah, blah, all these kinds of things, right? And instead of running them one at a time as a user, you just press a button and they just automatically fire off, right? And the question is then, how do you evaluate the outcome? How do you want to look at these charts and graphs? And it's not just looking at a picture of them, but interacting with them, clicking on them, making sure that the pie chart slices can be clicked on and you can drill down and you can then interact with them in certain ways. You can filter and sort, et cetera, right? And so what we do is we generate these things, and then I need something, could be a person, it could be, now in this case, Claude, to go in and basically click around this user interface and then see the results. So it'll see, oh, here is this report that was generated, click on the report, does it load? Does it load quickly? Does it look correct? Let me click on the different parts of it, make sure they're all functional. And it's able to give me feedback in a structured output, like a CSV file kind of a thing.
And I can download this and say, okay, for the 20 evals, this is how each of them performed.
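To make the eval idea concrete, here is a generic sketch of a batch eval runner in the spirit of what Amith describes. It is not Skip's actual testing system; run_agent and passes_check are hypothetical placeholders for whatever AI system and grading logic (rules, a judge model, or a human rubric) you want to evaluate.

```python
# Generic batch-eval sketch: fire a list of prompts, time each one, and log results to CSV.
import csv
import time

EVALS = [
    "Give me a member retention chart I can drill into by join year.",
    "Show event registrations by month for the last two years.",
    # ...typically 20-50 prompts, one per scenario you care about
]

def run_agent(prompt: str) -> str:
    # Placeholder: call your AI system here and return its output.
    return f"(stub output for: {prompt})"

def passes_check(prompt: str, output: str) -> bool:
    # Placeholder: grade the output; here we only check that something came back.
    return len(output) > 0

with open("eval_results.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["prompt", "seconds", "passed"])
    for prompt in EVALS:
        start = time.time()
        output = run_agent(prompt)
        writer.writerow([prompt, round(time.time() - start, 2), passes_check(prompt, output)])
```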

    [00:09:50:03 - 00:10:52:19]
    Amith
     And so Claude Sonnet, sorry, not Sonnet, but Claude Haiku 4.5 is quite competent at being a really good driver for this type of use case. And I would zoom out and say, well, how does this apply to our friends that are listening from their chairs in the association world? And many times you want to do repeated work with a website, like maybe you have a bunch of applications that came in on a legacy software tool, like you have an older browser-based tool that allows you to download all the applications you got for a volunteer position or something like that, right? And you have to go click a button, download it, save it to a file, click another button, right? It's like a repetitive task. And let's just say that older legacy software doesn't have an API where you can automate it through an API approach. This is where a computer use agent and Haiku 4.5 could be a really powerful tool. So I would encourage people to check that out because it's fast. When you use Sonnet 4.5, it's a little bit smarter for sure, but it's really slow. And Haiku 4.5 rips through these tasks.

    [00:10:52:19 - 00:11:22:09]
    Mallory
     I'm definitely gonna have to try that out. We talked about computer use, I guess, maybe a month, a month and a half ago at this point on the podcast. And when we talked about computer use at that time, I think it was scoring like 60% ish accuracy on tasks, which doesn't sound so good, but to your point, Amith, on that episode, you said, well, in some tasks it might be near 100%, and in some tasks it might be lower than that. That's the average. What percent accuracy anecdotally would you give to the computer use in the experiment you ran?

    [00:11:22:09 - 00:12:49:05]
    Amith
     So I did a very basic test, right? I didn't give it a whole lot of instruction. So the prompt, of course, matters; your AI's output a lot of times is only as good as the prompting. And so the question would be, like, how repetitive and how consistent does the user interface look? If it's what I described earlier, where there's a webpage and it has a list of applications that came in, and you need to click on each item in the list, open the page, click the download button and save as, that kind of super repetitive thing, I think it would nail it. I think it would do really, really well with that. Benchmarks are only as good as the overlap with your actual use cases. So you hear about how amazing or how bad models are. You might hear about, like, a model like GPT-OSS 20B, which is the small open source model from OpenAI, and its benchmark performance is nowhere near as good as GPT-5 or Claude Haiku or whatever, but it might be really good at what you need done. And so I think part of it is, we don't all need an 18 wheeler to carry, like, a single pen from coast to coast, right? There are better ways of taking care of small workloads than using the most massive, most powerful thing, right? Or you don't load up a jumbo jet with a single pen that you wanna deliver to one customer. You'd use other methods of transport. Same thing with models. But that's what's beautiful about stuff like Haiku 4.5: you don't need to throw the highest end model at everything. You might actually be better served by using smaller models for certain tasks like this.
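One way to picture the "right-sized model" point is a simple router that sends easy, repetitive jobs to a small, fast model and saves the frontier model for genuinely hard ones. The sketch below is illustrative only; the model IDs and the keyword heuristic are assumptions, not a production recipe.

```python
# Illustrative "right-sized model" router; model IDs and the heuristic are assumptions.
SMALL_MODEL = "claude-haiku-4-5"   # fast, cheap; fine for the "fat middle" of tasks
LARGE_MODEL = "claude-sonnet-4-5"  # slower, smarter; reserve for genuinely hard work

def pick_model(task: str) -> str:
    hard_signals = ["analyze", "strategy", "multi-step", "legal", "financial model"]
    return LARGE_MODEL if any(s in task.lower() for s in hard_signals) else SMALL_MODEL

print(pick_model("Download each volunteer application and save it to the shared drive"))  # small model
print(pick_model("Analyze five years of retention data and propose a dues strategy"))     # large model
```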

    [00:12:50:13 - 00:13:07:11]
    Mallory
     Sticking with the idea of computer use, because if you have it, I'm gonna go try it. It sounds like it's pretty much available to the masses, to the public at this point. Do you think association leaders listening to this need to have some sort of policy on computer use ASAP for like which types of things it can be used for or not?

    [00:13:08:12 - 00:13:59:22]
    Amith
     I think associations and everyone else for that matter should have an AI policy. And this should be very clear about what they allow, what they don't allow. I do think they should have provisions where people can easily get approval for testing new tools out. The way I would approach it is to say, look, these five tools are approved and we pay for them and they're all allowed and you can do these things, but you have to use our account, like our business account to make sure that the policies with each vendor are set up correctly in terms of retention of information, connectivity to other accounts. That's really, really important stuff. And with computer use, yes, to your point, it's really important to get ahead of this and say, this is here, this is definitely coming for the masses. Let's make sure we're using tools from vendors that we trust. Let's make sure that they're done through our accounts so that we can control the policies that we accept or decline in terms of, because there's lots of options, right? When you set the use of these tools.

    [00:14:01:07 - 00:15:47:22]
    Amith
     So yes, it's very, very important. At the same time, I'm a big experiment-and-learn-quickly kind of person. And so I do think what we should allow people to do is to have experimental access to tools that you don't yet consider mainstream. Make sure that those folks have a little bit of extra training, they're a little bit savvier about understanding what they're doing, and give them provisional access. Tell them, hey, try this out for three months and report back. Your job is to give us your feedback, and we want to understand these three or four things about Claude's computer use so we can consider if we should add it to our main policy document. It takes time. It doesn't take a lot of time, though. And by the way, policy documents, a lot of association leaders I talk to are like, I don't want to do that. That sounds like doing a by-law or something, right? And by-laws are the bane of existence for most association leaders because it basically looks like reading a bill from Congress or something. It's just horribly complex and really, really difficult to parse. That's not what this is. That would be the most useless AI policy ever because nobody would read it. What you want to do with an AI policy is make it simple, make it readable, and make it a living document. You want to update it regularly. You want to communicate when those updates are there. Ideally what you do is you have a course where you say, hey, all employees thou shalt complete this 30 minute course, which is our AI policies brought to life. Or be creative and take your policy and throw it into NotebookLM and create an eight minute podcast and say, hey, you have to listen to it. Every time you update it, give NotebookLM the old policy and the new policy and say, hey, create an update episode, and just make sure your people are engaging with the content so they understand it. But yes, simpler is better than complex, but you got to do this stuff. If you're not doing this, you're either basically believing that you've locked things down and nobody's using this, which of course is not true. Or you've kind of just said, yeah, do whatever you want, which is actually not helpful because most people want some direction.

    [00:15:47:22 - 00:16:42:06]
    Mallory
     A hundred percent. And in terms of simplicity, something that we've recommended at Sidecar with our staff AI usage guidelines template is having a green light, red light, yellow light system. So kind of quickly, easily categorizing things that you should and can use AI for, red light being, hey, you might want to ask permission through this specific route that we've outlined. I just think, for me, Amith, with computer use, I feel like there's a lot of potential, which is exciting, but I don't know. This is one where I'm like, if this isn't used well... Like, I could think of going into our CRM, HubSpot, and being like, oh, I'm doing something with this data, I'm going to have computer use just handle it for me. And then, I don't know, wreaking havoc on our HubSpot. I wouldn't do that, but I'm just saying, if you're a leader listening to this, you probably need to educate your staff on what computer use is and just make sure you have some system that is clearly communicated.

    [00:16:42:06 - 00:16:55:24]
    Amith
     Totally. Mallory, I know you've used a lot of marketing tools and you've probably used bulk emailing tools like in HubSpot or MailChimp or whatever, where you've sent out bulk emails. And have you ever gotten kind of nervous right before you hit the send button? Cause you're about to email thousands of people or whatever?

    [00:16:55:24 - 00:17:05:12]
    Mallory
     Yes, I mean, I send myself the test email like 10 times and go, okay, if I were another person, what would I see? It's very nerve-wracking, for sure.

    [00:17:05:12 - 00:19:49:00]
    Amith
     Or like if you're about to email someone an attachment, do you open it at least once? And sometimes I'll admit I opened it two or three times. So I think that same level of caution is appropriate with AI tools, but not everyone works that way, right? There are people out there in our world who are wonderful people who don't check the attachment more than once, or they even check it zero times. And so those folks, maybe they need a little bit of extra guidance to ensure that they understand the power of these things. With computer use, this is also where selecting vendors that you're cool with, and that you know focus on safety, is a really important thing. Like with Anthropic, they've approached it where there's a whole bunch of settings you have that are very clearly labeled in their agent, where you can say, is it totally autonomous? Do you want it to ask for permission? Like, what are you allowing it to do? That will not be true for everything. It's not actually difficult to build a browser-based computer use agent. So the way you would do that is you can build a Chrome extension, which, by the way, Claude Code can very easily build for you. And what you essentially do in these computer use agents, it's actually quite simple. You take a prompt and then the LLM essentially says, okay, let me take a screenshot of what's in the browser. And then that gets fed in along with the prompt and it says, okay, what should I do next? And then it's able to control your computer by positioning the cursor, essentially moving the mouse, right? It's not physically moving your mouse, but it's moving the mouse on the screen. Hopefully it doesn't physically move the mouse. That'd be a little bit creepy. And then it clicks, and it also is able to type. And then what it does is it says, okay, let me click on this button. Because, from the screenshot, it's like, oh, there's a button over there, and the user's asking me to click on the button to download the file. So it figures out where the button is through its reasoning over the image, and then it clicks on it. And then it waits, you know, a few hundred milliseconds, right? And it says, okay, what happened? Let me take another screenshot, which is kind of what we're doing in our brains. It's just, we have a lot more frames per second than the agent does. So it moves a little bit slow. That's going to speed up. But that's basically what happens. It takes the next screenshot and says, did I complete the task or not? So what I'm describing is actually very simple, and there are open source vision language models that can do what I'm describing quite competently. And people are putting together computer use toolkits. There's gonna be an explosion of these things. So much like my favorite thing to pick on, meeting note takers, where you have these, you know, these things that are very similar to malware that just show up in your meeting. And you've never heard of these companies, but yet you have allowed this meeting note taker into a boardroom call or whatever, right? Where you're talking about fairly sensitive stuff, not thinking about it. The same thing is going to happen with this stuff. And it's a potential vector for a lot of malware. So you have to be very, very careful. And even independent of, like, bad actors publishing malware in the form of computer use agents, there are plenty of vendors who just don't care that much about things like privacy and safety.
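Here is a simplified sketch of the screenshot-then-act loop Amith walks through. This is not Anthropic's actual agent; pyautogui stands in for the mouse and keyboard, and ask_model_for_next_action is a hypothetical placeholder for the vision-language model call that would do the real reasoning.

```python
# Simplified "screenshot -> decide -> act -> repeat" loop for a computer-use agent sketch.
import time
import pyautogui  # pip install pyautogui

def ask_model_for_next_action(screenshot, goal: str) -> dict:
    # Placeholder for a vision-language model call: send the screenshot plus the goal,
    # get back an action like {"type": "click", "x": 420, "y": 180} or {"type": "done"}.
    return {"type": "done"}  # stub so the sketch runs; a real agent would reason here

goal = "Open each application in the list and download the attached PDF."
for _ in range(50):                    # hard cap so the loop cannot run away
    shot = pyautogui.screenshot()      # what the agent "sees" this frame
    action = ask_model_for_next_action(shot, goal)
    if action["type"] == "done":
        break
    elif action["type"] == "click":
        pyautogui.click(action["x"], action["y"])
    elif action["type"] == "type":
        pyautogui.write(action["text"])
    time.sleep(0.5)                    # give the page a moment, then look again
```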

    [00:19:50:06 - 00:19:58:17]
    Mallory
     Or your staff members who will try something out, not realizing, hmm, maybe that's not the best use case. So yes, something to keep an eye on. We'll certainly keep talking about it on the pod, I'm sure.

    [00:19:58:17 - 00:20:41:23]
    Amith
     That being said, these things are super powerful. And coming back to the topic of Haiku 4.5, it is one that I would recommend testing out. Get a handful of your people testing it out. OpenAI's operator agent is also worth checking out. That works with a remote browser, but same concept. And then finally, Comet, and I forget OpenAI's browser name. It's probably called something that needs to be spelled with like the number pad or whatever, but like in any event, OpenAI has a browser that has the computer use agent built in. These are things worth checking out. And Haiku 4.5, the bottom line on that is, highly recommend checking it out. It's a smart model. It has reasoning capabilities and extended thinking. And ultimately it can solve a lot of your day-to-day problems probably three or four times as fast as Sonnet 4.5.

    [00:20:41:23 - 00:21:52:05]
    Mallory
     Shifting gears to diffusion models. For the past several years, transformer architecture has been the dominant approach for large language models. GPT, Claude, and Gemini all use transformers. But there is an architectural approach that is gaining momentum: diffusion models. You may have heard of diffusion models for image generation. That's how tools like Stable Diffusion, Midjourney, and DALL-E work. They start with random noise and gradually refine it into a coherent image. Now that same approach is being applied to text generation. So here's the fundamental difference in how they work. Transformer models are auto-regressive, meaning they generate text sequentially, one word, or let's say token, at a time, left to right, where each word depends on all the words that came before it. Diffusion models work differently. They generate all the words at once in a noisy, scrambled state, and then iteratively refine the entire output in parallel through multiple denoising steps until it becomes coherent text. So instead of building a sentence word by word, you're sketching out kind of a blurry version of the entire sentence first and then sharpening it bit by bit.

    [00:21:53:06 - 00:22:07:08]
    Mallory
     Because diffusion models work on the entire output simultaneously rather than waiting for each token to generate before moving to the next one, they can be dramatically faster at inference time, the time it takes to actually generate a response once the model is trained.
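A toy script can make the difference in control flow concrete. Nothing below is a real model; a fixed target sentence and random choices stand in for the network's predictions, so the only thing being illustrated is sequential generation versus parallel refinement of the whole output.

```python
# Toy, purely conceptual contrast: autoregressive generation vs. diffusion-style denoising.
import random
random.seed(0)

TARGET = "members join events because they value community".split()
VOCAB = list(set(TARGET)) + ["noise", "static", "blur"]

# Autoregressive (transformer-style): produce one token at a time, left to right.
autoregressive = []
for position in range(len(TARGET)):
    autoregressive.append(TARGET[position])        # stand-in for "predict the next token"
print("autoregressive:", " ".join(autoregressive))

# Diffusion-style: start with a full-length noisy draft, then refine every position
# a little on each pass until the whole thing is coherent.
draft = [random.choice(VOCAB) for _ in TARGET]     # pure "noise" to begin with
print("diffusion t=0: ", " ".join(draft))
for step in range(1, 4):
    for i in range(len(draft)):
        if random.random() < step / 3:             # later passes clean up more aggressively
            draft[i] = TARGET[i]                   # stand-in for "denoise this position"
    print(f"diffusion t={step}: ", " ".join(draft))
```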

    [00:22:08:10 - 00:22:36:23]
    Mallory
     Elon Musk recently stated he believes the majority of AI workloads will shift to diffusion models. He emphasizes diffusion models can work on any bit stream, text, images, audio, video, but there is a UX trade-off. Diffusion models have a delay before the first word appears while the entire output denoises. But then the complete response appears much faster. So Musk questions whether this first sentence delay or first word delay is worth the speed gain since humans read sequentially.

    [00:22:37:23 - 00:23:08:16]
    Mallory
     Last note here is a startup called Inception Labs, founded by Stanford, UCLA, and Cornell researchers, including some of the pioneers behind Stable Diffusion, Midjourney, and OpenAI's Sora. They have created Mercury, what they're calling the world's first commercial-scale diffusion language model. That's a mouthful. And they're claiming speeds up to 10 times faster than traditional LLMs. So, Amith, what is your thought on the great debate of transformers versus diffusion models?

    [00:23:09:20 - 00:23:58:01]
    Amith
     Well, I think they're both gonna have their place. So transformers are in this class of models, as you just mentioned, that they go sequentially. So they're predicting the next word or the next token. And for those that aren't super familiar with that, the basic idea is that for every word that's come in a sequence, the next word is predicted based on essentially statistics. And so that concept is fairly easy to understand. I think it's kind of logical for how people process information, at least the way we think we do at some level, but it also is very linear in nature. It's sequential fundamentally. So there's some significant disadvantages to it. You can get to that first word nearly instantly because that's just the first token or the first sequence of words that you start reading. And you can stream that out of the model into the user interface. And that's how you start seeing stuff appear so quickly.

    [00:23:59:02 - 00:24:21:11]
    Amith
     Whereas, as you said, with diffusion models, it does generate the whole output at once. The overall output is faster, but it takes a moment to see the first word. I don't think it's gonna matter. Like that particular difference probably will ultimately be irrelevant unless you're dealing with a massive request where it's like, generate 10,000 words and it takes five seconds, 10 seconds to do that. But inference speeds are increasing across the board.

    [00:24:22:12 - 00:25:19:11]
    Amith
     We've had a lot of conversations on the pod about companies like Groq, with a Q, and Cerebras, and also the work that Google has been doing with their more recent TPU architectures, and all of these things are speeding up the way that we do inference. And so that's gonna benefit all the architectures. I think the bottom line is not yet known in terms of which model type will necessarily prevail, but I also think these things learn from each other, in the sense that diffusion is the technique that's been mostly in the image and video world and it's now being applied to text. That's really exciting, as it gives us other angles to pursue. I think the limitations that come from everyone pursuing the same idea means you're kind of in a little box, and there's a big world out there and there's a lot of ideas. So it's cool to see something come to scale that has a lot of promise. I mean, the benchmarks are pretty impressive for this thing. And that's just one of many, many people that are working on these DLMs, on the diffusion language models.

    [00:25:19:11 - 00:25:25:21]
    Mallory
     So am I hearing it right that you think there's a place for both or do you think one will prevail over the other in the long term?

    [00:25:25:21 - 00:27:20:17]
    Amith
     You know, over the long term, I think you're gonna have things that are probably very different looking than either of these architectures. I think that each architecture has its pros and cons and we're learning at a really rapid pace. And so I think there are novel techniques that are in the research lab that are now coming to fruition. There's things that are also going to, like, come up in the next five, 10, 15 years that we have no idea about. So these models still have some significant limitations in terms of their kind of fixed training set. So once you train the model, it is what it is sort of thing. And you can do post-training and additional rounds of that, but it requires a computer science PhD and a lot of GPUs to run that kind of stuff. So the models are kind of fixed in time. That doesn't change with diffusion models. They just have a different approach to the way the algorithm essentially generates its predictions. What I'm excited about is really the idea that there could be something with good enough output for a lot of basic use cases that's incredibly efficient. So yes, we want to push the boundaries of what frontier AI models can do. So GPT-5.1 is supposed to come out in the next couple of weeks. And that's supposed to have the next set of benchmarks in that world. I'm sure the folks at Anthropic have new things coming out. Gemini 3.0 should be imminent. These frontier, leading edge models are always exciting. And so if a diffusion model is able to compete at that level, that's awesome. What I'm more excited about is kind of thematically aligned with when I talk about small models. When I say, hey, you know, GPT-OSS 20B, which is the small variant of OpenAI's open source model, you can run this on your Mac. You can run this actually on a lot of PCs. It's actually about as good as GPT-4 Turbo, the very first GPT-4 Turbo that we got. Not the first GPT-4, better than that, but like the GPT-4 Turbo we had in, like, the fall of 2023.

    [00:27:21:19 - 00:28:35:16]
    Amith
     And what's really impressive about that is it's tiny and it's instant and it can run on your phone or on your computer, right? Or pretty close to running on a phone, I should say. So with a DLM, the idea would be like, okay, is this another step change in efficiency, where the power of good reasoning, good language models and basic reasoning, is available in an even smaller form factor? Can we get a model that works for the workloads we need to operate with a tiny amount of energy and a tiny device? That would be really exciting because, you know, the fat middle of the workloads, if you think about the distribution of, like, a classic bell curve, where you see, kind of, you know, the deviations on the right and the left falling off pretty quickly, but the fat middle of any of these curves really is, you know, anywhere from 60 to 80% of the need. And so if we can put that on a smaller model, it gets really exciting. What this means for associations ultimately is lower cost and more flexibility in where they run these workloads. So if you think, hey, I have a lot of sensitive data, I've got PII or I've got healthcare data or I've got, you know, proprietary secrets and I just can't put them somewhere else, you're gonna have more and more options available to run these workloads, you know, at your will essentially.

    [00:28:37:18 - 00:28:53:22]
    Mallory
     As someone actively developing AI products for associations, if you, Amith, were to switch one of the models that you use for Skip, let's say, to a diffusion model, would that change anything on your end, assuming it matches capability? Or does it not matter, is what I'm asking?

    [00:28:53:22 - 00:29:15:14]
    Amith
     I mean, from our perspective at the application development layer, when you're working with lots of different models, what you need to do is very carefully test whenever you switch a model. So think of it as like, you know, the engine in a car. If I say, hey, Mallory, you know how to drive a car. I don't know if you've driven an electric vehicle, but if you haven't, and I say, hey, you need new training to drive an electric car, well, not really,

    [00:29:16:16 - 00:30:05:12]
    Amith
     it still works pretty much the same way. You have to think slightly differently about it in terms of the way it accelerates and brakes, particularly regenerative braking, kind of like you take your foot off the accelerator and it starts slowing down a lot faster than a gas vehicle because it's using that cycle to basically generate power. And it's just kind of weird initially. It doesn't change how you drive really, but you just have to be aware of it. So there are subtleties like that that might come up in using a DLM, but that's also true when you move from one transformer-based language model to another transformer-based language model. You ultimately have to be thoughtful about your prompts and testing and stuff like that. But the short version of the answer is: not much. We haven't tested Mercury yet. We plan to in the next couple of weeks to see how it performs. But from the application development layer, which is where associations need to be thinking, you have the option to use these things pretty easily.

    [00:30:06:20 - 00:30:12:21]
    Mallory
     So the key takeaway here then, as you said, is lower cost potentially and more flexibility, which are both great things.

    [00:30:12:21 - 00:30:59:11]
    Amith
     Yeah, it's just on that trend line that we keep talking about, that AI is becoming less expensive, faster, more accessible. And it is worth noting that this is a totally different approach. It's still a neural network. It works with some of the same principles that we've talked about a lot on the pod, but it uses a different approach to its fundamental prediction algorithm. The way it does prediction is what you described really well earlier. It basically looks at the entire picture, so to speak. In this case, it's text, but the entire picture starts off with random noise, and then it basically makes guesses with each pass, trying to get to the outcome it's trying to go for. So each time it changes the pixels around in an image, it's saying, "Hey, am I closer to or further away from my goal?" And it keeps trying to, like, denoise it through that process. And that's what it's doing with text as well,

    [00:31:00:12 - 00:31:27:15]
    Amith
     which sounds totally crazy to people compared to the idea of like, "What's the next word?" in a sentence. You know, well, for example, people say that one of the classic examples of teaching people how transformers work is you say once upon a, and then everyone says time, because that's very obvious that that's very high probability gonna be the next word. That's kind of how we think about it. And diffusion models don't work that way. It's kind of foreign to us. It's almost alien, but it's pretty damn cool.

    [00:31:27:15 - 00:31:58:19]
    Mallory
     I was trying to think of an analogy for this just for my own benefit, and I don't know if it makes sense, so I'll share it with the audience and you can be the judge of that. But I've been into, like, thrift shopping lately, like going to antique stores and stuff. So I was thinking of walking in, maybe you have the goal of decorating your living room or furnishing your living room or something, and you see all this noise, and then you're putting pieces together, iteratively seeing how they fit. You might put something back that doesn't really match with the vibe. I don't know, that's like the best analogy I could come up with, Amith, but it helps me think about it.

    [00:31:58:19 - 00:32:38:10]
    Amith
     Yeah, and I think that helps. It's like, and think about like a teenager's bedroom or something like that. I have an 18 and a 16 year old. And so that is kind of like random noise when you look into their bedroom and you're like, I just turn around and leave. That's my approach to it. But ultimately, they start organizing it. Like what's the end state they're going for? And each change that's made, is it closer to or further away from the goal? So when we talk about like, how do we get closer to finding the best next token in the sequence predictor, or in this case, like the end state we're trying to predict, we're trying to determine are we better off or worse off from having taken the step? And I think you can do that in your example as well.

    [00:32:40:05 - 00:33:16:07]
    Mallory
     Shifting gears a little bit, not really, but we're talking about a moonshot here. A moonshot from Google, which is Project Suncatcher. Maybe it's a sunshot. In November of this year, 2025, Google announced Project Suncatcher, a research initiative to build scalable machine learning compute systems in space, using constellations of solar powered satellites equipped with Google's tensor processing units or TPUs. The vision, move computation into orbit where satellites can harness nearly constant sunlight for power and operate as a distributed data center in space.

    [00:33:18:00 - 00:34:22:10]
    Mallory
     In a post on LinkedIn, Google CEO Sundar Pichai noted that the sun emits more power than a hundred trillion times humanity's total electricity production. He described Project Suncatcher as inspired by their history of moonshots, from quantum computing to autonomous driving, emphasizing that, like any moonshot, it's going to require us to solve a lot of complex engineering challenges. And the engineering challenges are significant. Google tested their latest generation TPUs in a particle accelerator to simulate low Earth orbit radiation levels, and they survived without damage at levels equivalent to up to five years in orbit. But major challenges remain: thermal management (how do you cool processors in a vacuum with no air convection?), radiation shielding over longer timescales, on-orbit system reliability, and efficient ground communication for data transfer. Google plans to launch two prototype satellites in partnership with Planet by early 2027 as the next major milestone. And Pichai was clear about the experimental nature: more testing and breakthroughs will be needed as we count down to the launch.
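As a rough sanity check on that hundred-trillion figure, here is a quick back-of-the-envelope calculation using approximate public values for the sun's total power output and the world's average rate of electricity generation; both numbers below are rounded assumptions, not figures from the episode.

```python
# Back-of-the-envelope check (rounded, approximate values).
sun_output_watts = 3.8e26          # total solar luminosity, roughly 3.8 x 10^26 W
world_electricity_watts = 3.4e12   # ~30,000 TWh/year of generation, averaged over a year
ratio = sun_output_watts / world_electricity_watts
print(f"Sun output vs. world electricity: {ratio:.1e}x")  # ~1e14, on the order of 100 trillion
```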

    [00:34:23:10 - 00:34:51:13]
    Mallory
     This project represents a response to one of AI's biggest bottlenecks: energy consumption and data center capacity. She actually talked about it a bit at digitalNow just last week. The scale of compute needed for frontier AI models is growing exponentially, and terrestrial data centers face constraints around power availability, cooling requirements, and physical space. Moving compute to space leverages an abundant, constant energy source and eliminates many earth-bound infrastructure limits.

    [00:34:52:15 - 00:34:59:11]
    Mallory
     Amith, what do you think this moonshot goal from Google is telling us right now about energy and infrastructure constraints for AI?

    [00:35:00:20 - 00:36:12:12]
    Amith
     Well, I mean, clearly there's a large amount of demand, and that's growing at a pace none of us can really grapple with or even contemplate. The demand for, not compute, but the outcome of that compute, right? The intelligence that we all seek to weave into our businesses and our daily lives is really stunning, because when you think about it, there's really no limit to the things that we can do with this commodity of intelligence. When you think about it that way, it's kind of an abundant, unlimited need. And so the question is, how do you solve for that demand? Well, at the moment, the way we do that is throwing enormous amounts of capital and energy and compute at it. And ultimately it's really, really, really inefficient the way we do it right now. It's amazing what we can do, but it's really inefficient. So I think two things are gonna happen. One is there's gonna continue to be this geometric progression in demand, which is gonna lead to people making massive investments and finding new energy sources, new ways to deploy the compute into space, into the deep ocean, into all these different places that have advantages and disadvantages, obviously. At the same time, as we just discussed in our second topic on different approaches to inference, there will be massive progression in efficiency.

    [00:36:13:23 - 00:38:30:11]
    Amith
     There's lots of fundamental science happening in AI and in other areas. We talk about material science at times on this pod in terms of energy storage and energy distribution and being able to do things at a level, in a way, that's not really understandable right now. Like, how does this stuff all work? We live in the world of bits, but obviously the world of atoms powers the world of bits. And so that's where we have a lot of constraints. And we're all just learning, right? We have to have some humility here and realize that all of science is a fairly new endeavor, and computer science is an extremely new endeavor, and artificial intelligence as a branch, kind of at the intersection of computer science and neuroscience and biology and all those other interesting fields, is a really new thing. It's not 10 years young, by the way, like a lot of people think. It's been around about 70 years, but still, it's a very new branch of the field. And so because of that, we have to remember, we're just starting out. So here's why I share all that. We should not assume that there is this nonlinear progression, this exciting exponential growth in demand, while at the same time the constraints of how we deliver to meet that demand stay the same, meaning the same kind of architecture. We're gonna have major leaps there. At the same time, I think there's gonna be orders and orders of magnitude of growth in demand. So we're gonna have to figure out how to deal with the power, and what I'm describing doesn't exist yet in terms of this type of radical efficiency gain. So I think we need all of it, is the short version, and I'm really excited to see people like Google experimenting with this. SpaceX, which is known best for their launch vehicles, putting satellites into orbit and all the different crazy things that they've done, has really ushered in a new era of space exploration and private space as an industry. And there's tons of companies doing launches and building satellites. And SpaceX has the Starlink internet access capability, which is based on low-Earth orbit satellites. They're dramatically improving the next generation of those devices, which will also include compute capability. So I think a lot of people are gonna pursue this. And I'm super pumped about it because it'd be great to get those workloads off of our planet and to utilize totally carbon-neutral, free energy that's available through that 100-trillion-times-our-needs figure that was described.

    [00:38:30:11 - 00:38:45:22]
    Mallory
     I'm also, I'm excited about Project Suncatcher. And on the topic of space, Amith, I was hoping you could talk a little bit zooming out to kind of leadership and strategy about the importance of having a moonshot goal or goals for associations and what that might look like.

    [00:38:45:22 - 00:41:21:20]
    Amith
     Well, I think it literally is what we were talking about in this particular topic, putting stuff in orbit. And you think about what enables that. Well, up until really the last 10, 15, 18 years, it took a crazy idea and a relentless entrepreneur and ultimately a whole industry of people to pursue making this affordable. Because rewind in time and say, if everything else in the world was the same but we were 20 years behind where we are today in terms of launch vehicles and the cost of putting a kilogram into space, right? That one metric has gone down by, I think, a thousandfold in the last 15 years. So it enables so much more innovation. And that requires people being willing to completely break the way that they've thought about, you know, this idea of how do you get something into space? In the case of SpaceX, the idea of a fully reusable launch vehicle, which is now like the norm, was not a thing. You know, every single rocket was thrown away. Like the entire thing was chucked, and it's like saying, oh yeah, we used this laptop to produce a really cool spreadsheet, and at the end of the day, I'm gonna throw that laptop in the trash. Not exactly the way you'd probably wanna do it, right? Pretty inefficient, but that's how the space industry worked, and they're moving towards much higher reusability. And it's a really hard engineering challenge. My point would be that we are stacking exponential improvements on top of each other. And so the moonshot idea, coming back to your question about how it applies in association land, to me, it is about being willing to take a shot and being willing to say, like, what would it look like if we could do X? X being some kind of a project that you think is totally unattainable. You know, Sundar Pichai and the team at Google, they have a lot of money and a lot of brilliant people and a lot of other resources. They do not know how to do this. Suncatcher does not have a proven, known path. It may fail, it probably will fail. They know that, right? That's their whole point: they're gonna figure out these engineering challenges and some science challenges along the way. And that pays off over time if you do enough of them and if you're relentless about it, and if you're also smart about when to kill these things. Google's had plenty of failed moonshots. And I think associations can be inspired by this, but also learn from it. And your moonshots might not be a literal put-something-in-orbit kind of thing. Most associations probably won't find an application for that, at least right now. But you can talk about how there can be total transformation of your engagement model, ways of doing things at a scale that you might think totally unattainable, right? Stuff that you do not know how to do. If you know how to do it, it's not a moonshot.

    [00:41:21:20 - 00:41:31:10]
    Mallory
     And I know we talk about BHAGs as well, I mean, big, hairy, audacious goals. So the difference between those is a BHAG is more attainable, like in the near term-ish?

    [00:41:32:16 - 00:42:05:16]
    Amith
     I think of a moonshot and a BHAG being pretty similar. I mean, they're both, you know, classically 10 to 20 year goals. So they're goals that are like within our lifetime kind of thing. But they are intended to be, you know, perceived by most people as unattainable. There has to be some fundamental capabilities that you do not possess. And fundamentally you have to be inspired by it, but also not know how to do it. If it's more of like a three year window of time and you're saying, "Hey, like we wanna paint this picture of what the next three years will look like." Oftentimes that's a little bit more practical, a little bit more like extending the current model.

    [00:42:06:16 - 00:42:34:05]
    Amith
     But also in AI timescales, you know, BHAGs I think are compressing as well. Like, you know, our stated mission for Sidecar of educating a million people around the globe on AI in our vertical is a big number. And it probably represents a third of the workforce of the global association employee community. And we wanna go after that. And we don't know exactly how to do it. We have some ideas, but that's kind of BHAG-ish because we don't know exactly what the mechanics will be to get to that audience and help them in their journey.

    [00:42:36:00 - 00:43:31:22]
    Amith
     I don't know that I'd classify that as a moonshot. It's probably too small of a goal, and it's too clear to me; like, there are some pathways we're going down, we're just not exactly sure how to do it. It's not like saying, "Hey, what's our business plan for 2026 to achieve X, Y, and Z," where we have a very clear set of priorities, we have OKRs. You know, this is a little bit further out, but to me a moonshot would be something even bigger than that, right? So broadly speaking, for the Blue Cypress family of companies, our goal by the end of the decade is to make associations as powerful as the biggest companies on earth. So we define that as powerful as the Fortune 500. And to us, that's transformative, it's inspiring. It's essentially saying, "Hey, we want this community to no longer be the underdog. We want David to rise up against Goliath kind of thing." Not that corporations are anyone's enemy from association land, but the idea is associations have for far too long said, "We don't have the resources to go and do what Amazon and Netflix and Microsoft do." And now you do, so go do it.

    [00:43:31:22 - 00:43:36:22]
    Mallory
     I actually don't think I've heard the Blue Cypress end of the decade goal. I mean, that's awesome.

    [00:43:38:00 - 00:43:46:03]
    Amith
     Yeah, we're pretty excited about it. It's part of the strategic planning process we've been doing this year, but the idea of associations being as powerful as the Fortune 500,

    [00:43:47:04 - 00:44:11:19]
    Amith
     it's somewhat striking to most people because they're like, "What are you talking about? Like my association has a $7 million annual budget, not, you know, 70 billion. So how are we going to do that?" And then the leveling function is AI, right? We have the ability to remove the constraints that have to do with the dollar amount and enable you with your creative mind and your willingness to experiment, to take moonshots and become that powerful. It is possible to do that in the world we live in today.

    [00:44:12:22 - 00:44:32:07]
    Mallory
     I love that. Well, everyone, these three topics today connect. We've got Haiku 4.5 showing frontier capabilities becoming accessible through optimization, diffusion models showing how architectural innovation could deliver massive speed improvements, and Project Suncatcher showing how far companies are willing to go to solve this infrastructure challenge.

    [00:44:32:07 - 00:44:37:18]
     (Music Playing)

    [00:44:48:11 - 00:45:05:10]
    Mallory
     Thanks for tuning into the Sidecar Sync podcast. If you want to dive deeper into anything mentioned in this episode, please check out the links in our show notes. And if you're looking for more in-depth AI education for you, your entire team, or your members, head to sidecar.ai.

    [00:45:05:10 - 00:45:08:16]
     (Music Playing)
Mallory Mejias
Post by Mallory Mejias
November 21, 2025
Mallory Mejias is passionate about creating opportunities for association professionals to learn, grow, and better serve their members using artificial intelligence. She enjoys blending creativity and innovation to produce fresh, meaningful content for the association space. Mallory co-hosts and produces the Sidecar Sync podcast, where she delves into the latest trends in AI and technology, translating them into actionable insights.