#152: live translation on phones, Meta aims at AGI, AlphaGeometry, political deepfakes
With topics ranging from hardware capable of live translation to developments in AI ethics and safety, the discussions offer useful perspective for listeners trying to keep up with the field.
Note: The bullet points below summarize the detailed discussions and items mentioned by the hosts in this episode.
Hardware Innovations
Galaxy smartphones by Samsung introduced live translation for calls and texts, a leap towards overcoming language barriers. The technology, backed by Gemini Nano, facilitates real-time translation in multiple languages.
The Rabbit R1 is pitched as a new gadget blending an AI assistant with dedicated handheld hardware. Its integration with Perplexity AI gives it live, search-grounded answers to spoken questions, going beyond what existing assistants like Siri offer.
Google's updates to Chrome introduce AI-driven tab organization and AI-generated custom themes, along with a "Help Me Write" assistant that drafts text in any text box on the web.
Adobe's Premiere Pro features efficient AI-powered audio editing tools, which automate mundane audio tasks and enhance clarity in dialogue recordings.
Autonomous Vehicles and Robotics
Waymo seeks to deploy its driverless robot taxi service in Los Angeles, marking a progression in the rollout of autonomous vehicles in urban settings.
Tesla releases FSD v12, hoping to achieve a significant quality jump in its self-driving solutions through a shift to neural net-driven decision-making.
Figure's humanoid robots gain commercial traction with a new agreement for deployment in BMW Manufacturing, signifying real-world applications for the technology.
Policies and Ethical AI Development
Concerns over Microsoft's partnership with OpenAI lead to potential antitrust investigations by the DOJ and FTC, spotlighting the balance between innovation and market fairness.
The use of AI for political impersonation prompts OpenAI to suspend the developer behind the Dean Phillips bot, an early sign of stricter enforcement of AI usage policies in election campaigns.
Mark Zuckerberg's goals for Meta include harnessing artificial general intelligence responsibly, underscoring the tech giant's ambition for state-of-the-art AI development and potential implications for the broader AI community.
Developments in AI Research
DeepMind's AlphaGeometry system achieves Olympiad-level performance in geometry, showcasing neuro-symbolic AI's potential for solving complex mathematical problems.
State-space models are extended to vision tasks, promising another step toward more efficient and effective neural network architectures.
Depth estimation research benefits from large-scale unlabeled data, reflecting the potential for AI to enhance spatial understanding in various applications.
As the AI landscape continues to evolve, the podcast serves as a useful recap for enthusiasts who want to keep pace with significant developments and discussions in the field.
Read the full discussion in the transcript below 👇
#152: live translation on phones, Meta aims at AGI, AlphaGeometry, political deepfakes
Hello, and welcome to Skynet Today's Last Week in AI podcast, where you can hear us chat about what's going on with AI. As usual, in this episode, we will summarize and discuss some of last week's most interesting AI news. You can also check out our Last Week in AI newsletter at lastweekin.ai for articles we did not cover in this episode. I'm one of your hosts, Andrey Kurenkov. I finished my PhD focused on AI last year, before I said it was earlier this year, but now it's last year. And I now work at a generative AI startup. And I'm your other host, Jeremy Harris. I'm the co-founder of an AI safety startup called Gladstone AI. We do a bunch of national security meets AI stuff around extreme risks from the technology. And yeah, I mean, if you ever want to contact me too, I've mentioned this in the previous podcast, but I've had reach out since and people having trouble finding my email. So hello at gladstone.ai will absolutely do it. And you can also reach out to me on, I'm on the Twitter or the X or whatever it's called now. But yeah. And we also do include the contact email if you want to send us any thoughts or suggestions or feedback in the episode description every time. And I'll also go ahead and include Jeremy's email so you can make sure it reaches him. And just a quick shout out, once again, to a couple of new reviews. We love to see it. There was a fun new review on Apple podcasts about someone listening to this on their morning run in Frankfurt, Germany. So that's fun to imagine, I guess. We do go pretty fast, so maybe it goes well together, I'm not sure. And there was another fun, cool review by George, who just is very nice, says we are supposedly fun and thoughtful, which is, we didn't try for it, but we'll see. Well, we have a lot of news this week. There was surprisingly quite a lot that happened. No major sort of world changing news, but a lot of kind of individual, somewhat significant bits and pieces. It will be moving fast and we do hope you keep up and it makes your run more enjoyable. So starting with tools and apps, the first story is that Samsung's latest Galaxy phones offer live translation over phone calls and texts. And that's the story. In the S24 line of smartphones, there will be the ability to receive calls in a language people don't speak and receive a live translation of the call, both audibly and on screen. And this is following up. Also, Meta has a similar feature in their Meta smart glasses of live translation. If you're talking to someone in person, actually we have these speakers in their glasses that do live translation. So it seems significant to start getting hardware that does offer live translation from any language or not any language, but from many languages to many other languages. Language barriers are really going to start becoming less of a thing, it feels, in this next year. Yeah. There are also, there's a lot of interesting stuff that goes into this too, because anytime you're doing conversational translation, there's a bunch of stuff like word choice that's really tricky, right? Because depending on the tone, you might actually mean a different thing. You might have sarcastic tone that changes the meaning of things, like a more whimsical tone. And Samsung's actually doing that. They're allowing people to pick different communication styles and have them pre-programmed into their, actually into the text version of this too. There is a text version of this translation service as well. 
So you can let it know that, hey, let's get into a casual mode or whimsical mode or whatever. So kind of an interesting meta-parameter that you can tweak there. I think one of the most interesting things though, technically about this, is just the blazingly fast inference speed, inference time that you need in order to do real time translation, right? Because in order for it to be natural, it's basically got to translate your words right as they're coming out of the oven. And that requires very low latency. Apparently this is all happening on the device too. And this is for privacy reasons. So you want all the computations associated with the translation to happen on the phone itself, so they're not beamed to some central server. But that's really interesting. So all this blazingly fast translation, all these inferences are being done on the edge device, on the phone itself. So I think a really interesting question is, what does the backend look like? How are they pulling this off at speed? This is, I think, the fastest live translation thing I've heard about. It seems like it's pretty much instantaneous. So I think a pretty important phase transition in just like the effectiveness and marketability, market readiness, I should say, of translation. Exactly. And with regards to the details, it does seem like they're using Gemini Nano, the smallest variant of Google's Gemini that was designed to be on device, designed to go into Android phones and offer these sorts of features really of like on-device, super fast translation without access to the cloud or anything. So it'll be very interesting to see how good this is and if having this super fast model that's a slimmed down large language model can still handle very robust translation. And just to be very concrete, they offer 13 languages starting out. So it'll be Chinese, English, French, German, Hindi, Italian, Japanese, and some other ones. A lot of bigger languages in terms of, I guess, population seem to be covered here. So yeah, pretty exciting, I would say, as far as news of on-device AI and the sorts of things we'll get in our phones in the near future. Yeah, it's cool to see Gemini Nano kind of get its, it's not its big debut, but like this is a really cool concrete application. And like I said, with the backend, it makes me wonder how this is integrated with the hardware, like how, what kinds of optimizations are they running to make this work so smoothly? Yeah, really cool and cool to see Gemini. And speaking of cool AI hardware, the next story is on the Rabbit R1, this little gadget that was just introduced a couple of weeks ago and had a lot of people excited. And the news story here is just specifying that this device will receive live info from Perplexity AI's answer machine. So the Rabbit is this little handheld kind of quasi phone. It has a screen and it has a camera and the ability to talk to it. And you can think of it as a sort of AI smart assistant in your hand that you can tell to do things. And the pitch is it would just sort of figure out how to accomplish whatever you wanted to accomplish without you having to scroll and type and all that sort of thing.
And yeah, with this knowledge that they'll be providing live info from Perplexity, basically what that means is you can now ask it any question and it will be using this sort of advanced AI driven search that Perplexity offers to be able to reply to any query, kind of a more advanced version of what you might have with existing assistants like Siri, where you can ask it various questions and it can try to answer using all the techniques like knowledge graphs. Yeah, it seems like Perplexity's big differentiator really is the ability to integrate real kind of concrete facts into the response. Obviously, we're seeing that more and more with Bing Chat, I think being the first really to do this at scale. But yeah, it's kind of a competitor to ChatGPT with a little bit better kind of on the grounding side is at least the argument. And this does seem like it's part of a deeper partnership between Rabbit and Perplexity. So apparently the first 100,000 Rabbit R1 purchases are going to come with one year of the Perplexity Pro subscription. So it's sort of like a, it is a deeper business partnership, though that plan also includes access to newer LMs that include like GPT-4 and it's normally 20 bucks per month. So kind of cool. I think at this point, this whole Rabbit thing is really taking off. I guess it was late to the game. I caught on to the announcement that came out. I think this was the week that I was in DC that I was traveling. But yeah, I mean, it's like a little Tamagotchi type thing. Maybe that's an analogy only millennials will get here. But did you play with Tamagotchi? I'm aware of it. I was around that time, yeah. It's always terrible when somebody tells you like, I'm aware of it, like I'm such an old person and this is something, it's like, it wasn't my generation, but I've heard of it. Well, yeah. Anyway, it kind of reminds me of that, like one of these pocket health things. And apparently they're all sold, or the first 50,000 are all sold out now. So they've been doing super, super well. So maybe, you know, maybe that'll be a nice little bump for perplexity. That's right. Yeah. Similar to the AI pin from Humane that was also kind of pretty hyped up. This is kind of trying to imagine a new gadget, a new device that is kind of like an AI version of a smartphone. It seems where, you know, the main thing is to carry around an AI with you in a physical embodiment and let it do things for you. So it'll be very interesting to see once people do get their hands on this and the AI pin, whether that in practice is something that is a game changer or people will start having all the time or if people just, you know, keep using their phones that already can do a lot of this. So yeah, we'll see. Yeah. I'm really curious about that piece, like the form factor, right? Do we really need a new form factor for an AI specialized device? I wonder if part of this has to do as well with, well, the AI hardware, you know, they can ditch basically all the unnecessary phone-related hardware, you know, depending on what they consider necessary, unnecessary for this, but, and then replace it with just AI optimized stuff. It's still kind of interesting because we do have our phones and maybe I just need to get a Rabbit R1 and see how it works. Is it like the iPhone for AI or is it just, you know, another, another of the fad? On to the lightning round. First story is Google is using AI to organize and customize your Chrome browser. So there were a couple of new features here. 
There's a new tab organizer feature that uses AI to group similar tabs together, which is provisionally for people who have hundreds of tabs, these chaotic users, yeah, which, you know, I've been there sometimes, but I try to avoid it. And besides that, the Chrome theme store is also getting an AI upgrade. You can use a text to image model to automatically generate a browser theme based on their preferences And the last thing is there will be a feature called Help Me Write, which will use an AI to generate a first draft of texts for users in any text box on the web. So a few, yeah, kind of smaller features being thrown in there, but does show that AI is being pushed throughout Google's products in various ways. And actually Google, somewhat surprisingly, a little late kind of entering this space, we've seen obviously Microsoft really leading the way with their integration of Bing and you know, Bing chat and all that, but there have been other browsers all the way going back to like Opera, which, you know, it's been a while for me at least. But yeah, Google kind of coming in late here, they, I don't know, they've had this catch up vibe lately, at least where lately is like the last six months. And this is sort of an interesting next step for them. It's an obvious next step and it's tactically obviously very necessary, but kind of interesting to note that it has been a little while for them to be getting into this space. Next up, Adobe's new AI-powered Premiere Pro features eradicate boring audio editing tasks. So Premiere Pro is one of the main applications for video editing, one of the leading ones that are used professionally and just in general by people who make a lot of videos. And now there will be this AI-powered audio editing feature that will make it easier to do several things. So one of those is audio category tagging. It will automatically identify and label clips as dialogue, music, sound effects, or ambient noise. And besides that, there will also be automatic resizing of waveforms and various kind of little adjustments, updating colors for clips for better visibility and stuff like that. But I guess one last thing to note is the enhanced speech feature, which will improve the clarity of poorly recorded dialogue and will presumably be pretty heavily leaning on AI for that bit. And up next, we have Waymo looks to launch full fleet of robot taxis in LA. So this is a reference to Alphabet, Google's parent company, their autonomous driving unit is Waymo. And they've been around for a while, really doing the rubber's turning to meet the road in a literal way. They're actually starting to expand their driverless robot taxi service in LA. And they've been testing in that market for a while now, for about a year. They initially started off in San Francisco, and now they're looking for a license to expand out into LA. San Francisco, of course, is like your default starting point for most tech things, just because that's where the companies are founded. And so migrating to each new city has a bunch of policy and regulatory hurdles to factor in. It also has a bunch of new risks, because the cities can look different. So you need to kind of patch some edge cases that can come up with your self-driving cars. So anyway, it's sort of interesting that this is now happening. They have a permit right now to operate 250 robot taxis in SF in San Francisco. And at any one time, they're operating about 100 of them. 
So unclear whether it's going to be at the same scale, but this is kind of interesting that we're finally starting to see this broader rollout of self-driving cars. Yeah. So they're basically expanding the paid version of this. They've been, as you said, already testing there. And now it sounds like they want people to start being able to use the app and call it as has been the case in San Francisco for, I think, much of last year. They've been kind of letting people in for a wait list, letting people hail a robot taxi to pick you up. I've been doing it for several months now and really enjoying it. So yeah, Waymo seemingly continuing to kind of slowly expand and probably this year going to try to commercialize much more and more markets beyond where we've been testing and making sure things work. So far, no terrible incidents with Waymo in six months of being commercially available in San Francisco. Yeah, it's kind of impressive that they're actually rolling out without any major incidents. And on that topic of self-driving, the next story is Tesla finally releases FSD v12. And according to the author of this article, that is its last hope for self-driving. That's in the title of the article. But there is a reason why that is in the title. So this full self-driving beta v12 update, that's what FSD is, their self-driving future suite for Tesla, is a pretty major update that's been kind of hyped up for a while by Elon Musk and the company, where the big deal is they're replacing a lot of the stack, a lot of the implementation of self-driving away from kind of more handwritten code and logic to full on end-to-end neural net AI, where they have videos coming in from the sensors and the AI gets to decide what to do. Much simplified implementation by a lot. And if you have really good data, potentially leading to human level driving, although it's unclear whether that'll be the case. So there is a case to be made. If this doesn't work, then what will Tesla do without getting new sensors, without, et cetera, et cetera? That is kind of justification for that last hope for self-driving, where so far FSD has been pretty good, but not really reliable. You have to be pretty careful when driving and pretty attentive from my own experience and from what I think the general consensus on FSD. So yeah, it's starting to roll out in beta and we'll have to see how it looks. Yeah. I don't know if I agree with the last hope bit. I mean, I think we're looking at a much more incremental kind of improvement in general self-driving car capabilities. This I think is significant from an architectural standpoint. As we move into everything is a neural network territory, everything is trained rather than hard-coded, rather than the kind of Frankenstein monster of some neural networks and then some hard-coded rules. It does kind of make the whole system more uniform. It also makes it in some ways more unpredictable, of course, because neural networks are not hard-coded rules and they introduce kind of interesting failure modes when you have out of distribution inputs that the system hasn't been trained on or doesn't know how to handle. That becomes more of an issue. And it does seem based on the, at least the report here, that it's kind of a mix. So there are apparently cases in which V12, this latest version, gives you a smoother ride and more natural ride in some cases, but it seems to get dumber in others. And this is just sort of what you get, right? These systems have weird failure modes. 
So I think what we can think of this as is kind of a resetting of the floor on the capabilities of full self-driving. And as more data comes in, as better algorithms come in, as more computing hardware starts to be deployed in this direction, we can expect an incremental improvement in self-driving capabilities. I think that's more what's happening here. We're kind of getting a reset. We're going to be placed somewhere on some, effectively, some kind of scaling curve, and we'll just ride that scaling curve upward as the hardware and the software get better. But yeah, it's an important fundamental shift in the architecture for sure. That's right. Yeah. To be clear, I wouldn't say last hope. It's probably overdramatic. It is true that FSD has been incrementally trying to get to better and better and more reliable self-driving for years. They've had it in development for, I believe, about seven years now. So this is a major push for sure to try and achieve a real jump in quality. And it will be, as you said, a big part of the challenge for self-driving is all these edge cases, all these tricky little individual things that aren't typical driving. And with a neural net, one of the drawbacks is you can't change the code and just handle a case that is weird, right? It's kind of hard to edit, so to speak, the behavior, and you might get unexpected things. So it'll be interesting to see if there are a lot of weird cases or if, in general, there's enough data already from Tesla, which is possible. They do collect data from their fleet. So yeah, it's a major deal for FSD for sure. And moving on to applications and business, we're opening with a story called OpenAI CEO Sam Altman is still chasing billions to build AI chips. So OpenAI CEO Sam Altman, he is known for his interest now, his recent interest in the process of chip making, funding chip making efforts. So as we've talked about before on the podcast, when we talk about hardware, there are kind of two relevant stages of the hardware development process that are worth noting here. One is you can be a company like NVIDIA that designs cutting edge AI processors like the H100, but you don't actually build them. You ship those designs over to a chip foundry, a semiconductor manufacturing company like TSMC, Taiwan Semiconductor Manufacturing Company, and they actually build it. And that's super, super hard. Both those steps are really hard. But the second step where you actually fabricate the chip is really, really hard. And it is super, super expensive to get off the ground. We're talking tens of billions of dollars to just set up a semiconductor foundry. So massive capital expenditures, which is why people are kind of looking at this and like, you know, it's a little weird that you would, instead of starting by maybe designing your own chips or finding like a kind of cheaper, easier way around this, why OpenAI and Sam Altman in particular is looking at like chip fab. This is a really, really long-term play. It does seem like Sam Altman is doing this because he expects demand for high-end chips to outstrip supply heading into the latter half of the decade if something doesn't change. He doesn't think that, for example, TSMC can handle all the demand that's going to be coming in as AI becomes a more and more central part of the economy. So it's kind of interesting.
He's apparently been meeting with all kinds of investors, including Middle Eastern investors, including this guy called Sheikh Tanun Bin, oh God, yeah, something of the UAE. He's got a lot of ties with the UAE right now through a potential partnership with G42, which is a UAE-based research group that I believe is also being investigated by US national security interests for its relationship with blacklisted Chinese companies. So there's a whole story there. The House China Select Committee chairman, Mike Gallagher, has been really good on chasing down that thread and being like, okay, hold on, what is the relationship there? So there's a whole bunch of exposure and kind of complex stuff with the UAE-Sam Altman thing going on. But the fundamental story here is they're in talks to raise $8 to $10 billion from G42 alone, from this UAE group alone, just to set up a global network of semiconductor foundries. So this would be a really big move. It's got to be a long, long-term play. And we don't know whether this is going to be a chip, this kind of foundry is going to operate as a subsidiary of OpenAI or as a wholly separate entity. But either way, it's pretty clear OpenAI is going to be its main customer. Pretty dramatic. As you said, big play. Usually you have nation states investing billions of dollars into a sort of fab process. And I do think the global network of fabricators is interesting. I feel like there might be a geopolitical angle there in terms of worrying about a potential future of wither tensions and sanctions within the US and China. Maybe having a limited number of places on the globe that can produce these things could be problematic eventually in the long term. So yeah, big play. And as we covered and many people covered, this potentially was part of the tensions regarding Sam Altman as CEO of OpenAI having sort of side ventures and side activities. So seemingly, I guess that is resolved and the path is clear to keep trying to make this happen. Yeah, that's a great point about the kind of globalization piece too. Just one last quick thought as well. A common thread that Sam Altman has pulled on when he's especially talked to people who are concerned about AI safety and the risk that maybe we're moving too fast and all that. The retort has been from OpenAI and Sam in particular that, look, we need to make sure that we are building systems that are as powerful as possible using all the compute that we possibly can. Because if we don't do that, if we just let compute get more abundant, like our AI chips get better and better and better, and we don't try to keep models at the frontier of compute, then we may wake up one day and find there's a massive amount of compute overhang. And we end up making a giant uncontrolled leap all of a sudden and accelerate the capabilities of these systems sort of dangerously fast without climbing that ladder carefully. Now this is OpenAI actively seeking to grow the pool of compute. In other words, this is really like accelerationism on the hardware end as well, which I think sort of undermines that argument a little bit. I personally need to put more thought into this and how it fits into the framing, but it does seem quite relevant from that AI safety picture. Just a quick thought, but anyway. And next, a related story about a powerful tech CEO and hardware for AI. The story is Mark Zuckerberg's new goal is creating artificial general intelligence. 
So Meta CEO Mark Zuckerberg had this video post where he kind of went into some thoughts regarding the goals of Meta. He mentioned that the goal of Meta is to build artificial general intelligence and open source it responsibly. So it benefits all humankind or something like that. And there was- Sounds awfully familiar. Well, also new, I would say. Familiar and a bit new for Meta. Meta has been not so much in the AGI race. So this is a development from their previous stance of being very much an AI research lab, but not focusing on AGI so much as more specific advancements. And so with this new policy or vision or whatever you want to call it for the AI efforts of Meta, it is pretty notable. And it also came out that Meta plans to own more than 340,000 NVIDIA H100 GPUs by the end of the year. So they are building a horde of compute to be able to do this. With this kind of number, that really just underlines that they are betting big on AI and wanting to advance it toward something like artificial general intelligence. Yeah, it also, for context, talk about 340,000 NVIDIA H100 GPUs, like you said, you can compare that to say roughly 40,000 H100 GPUs that were used, sorry, 40,000 A100 GPUs used to train GPT-4. But anyway, so you're talking that order of magnitude, we're really 10x-ing in terms of number of GPUs. And if you account for the full computing capacity that Meta has in stock, like if you account their A100s and their V100s and all their other kind of lower-end chips, apparently their total compute pool is about 600,000 H100 equivalents. So this is just a monstrous, monstrous amount of computing power that Zuck is claiming may well be, certainly he can't be sure, but may well be the largest pool of compute available to any company. I thought this article was just fascinating. I thought it was an excellent article, a great expose, it gave us the single biggest update I've ever seen on where Meta stands on AGI. One of the big, awkward things for Meta for a long time has been their head of AI, or one of their heads of AI, Yann LeCun's position on AGI. He has taken the view for a long time that AGI is not actually going to be hitting any time soon. Famously, he also adds, if it does, it's not going to be risky or whatever. But a big part of his party line has been, AGI, not soon. Now as you can imagine, that actually makes recruitment really hard, because you've got other companies like OpenAI saying, hey, we think we could be hitting AGI sometime in the next two years. That is absolutely in the cards. So if you are a researcher who's extremely ambitious and you're going to go work at the best lab with the best shot, well, you might as well go work at OpenAI, see if you can do it. If you can't, you can always go back to Meta, because apparently they're playing a long game. So that's been at the core of this. And Zuck is really clear in this interview. One of the things he says is, we've come to this view that in order to build the products that we want to build, we need to build for general intelligence. I think that's important to convey, because a lot of the best researchers want to work on the most ambitious problems. That is 100% true. It's also interesting to hear him kind of saying the quiet part out loud, but this does require a bit of a change in tone from the more kind of bearish language that Yann LeCun has been putting out. So anyway, I thought that was really quite fascinating. 
It seems that also in 2023, so this is a big ramp up, by the way, we talk about 340,000 odd H100 GPUs. Last year, 2023, Meta brought in about 150,000. So this is like over double what they brought in before. And it's in the context of Meta kind of centering their AGI, now it's like you said, it's AGI research efforts on Llama 3 and what comes next. And here's a really interesting quote. The last one I'll give from Zuck in this interview, he says, Llama 2 wasn't an industry leading model, but it was the best open source model. And that of course has been Meta's whole game up until now. They're not the best in the world, but they're the best open source in the world. And then he adds, with Llama 3 and beyond, our ambition is to build things that are at the state of the art and eventually the leading models in the industry. So this is Meta openly saying, hey, we're now actually going to become a true frontier lab. It's going to be, you know, Anthropic, OpenAI, Google DeepMind, and maybe Meta, you know, that's kind of the big step that he's looking at here. Last quick note, I'm sorry, I just like, there's so much interesting stuff going on here. Last note, so Zuck has come under criticism because of the amount of control he has at Meta, right? So he's got basically full voting control over the company's stock. And this is in a context where, you know, OpenAI quite famously has this weird board structure that's designed to kind of keep a lid on things if the CEO goes crazy. It seems like it may have failed recently based on some people's assessment at least. But Anthropic has another kind of similar structure designed to have corporate governance for AI safety. And here's Zuck, right, wielding total power at Meta. And his argument is like, look, I get it. You know, I have full control over the board and everything, whatever. But that's why we're doing open source. That's why we're open sourcing AGI, so it's not under my control. But then he hedges by saying, for as long as it makes sense and is safe and responsible to do, I think we'll genuinely want to reach, sorry, lean towards open source. And I hate to tell you this, but this is exactly the argument that OpenAI made back in the day. The only question is, what is the threshold of danger where you then start to say, okay, this is now for internal use only, and that's a debate you can have. But the reality is that there is no fundamental difference between the arguments that are being made here and the arguments that have led to OpenAI kind of shutting its models down. Some would argue for good reason, and I certainly would be one. But anyway, I think it's sort of an interesting story. So much to be gleaned here about Meta's approach going forward. Right. And this, I guess, builds on something we saw last year, which is they did go all in on the open source direction, somewhat in contrast to many other AI companies. So they've open sourced a lot, including Llama 2, which is to this day one of the best large language models. And with this statement, the implication, or I guess one of the takeaways that I think is noteworthy, is that they are training Llama 3, and they will most likely open source it. So Llama 3 will probably be one of the decent chances to get the next better open source model that may get to GPT-4 or Claude 2 level. So I guess the trajectory seems to be the same, at least in the near term.
And that's, yeah, it's pretty exciting because Llama 2 did enable a lot of development and generally a lot of the open source models from Meta did enable a lot in the AI space. Moving on to the lightning round. First story is AI voice startup ElevenLabs lands $80 million round, launches a marketplace of cloned voices. ElevenLabs is one of the leading companies for synthesizing audio of people speaking. So given some text, it will output pretty realistic sounding AI-generated audio. And now they've raised this $80 million in their series B round of funding, which grows their valuation to $1.1 billion. So it really positions them as sort of a leader in that space of voice cloning and synthesis. And with this introduction of a marketplace of cloned voices, it moves them toward probably being more usable in more contexts and something like that. Yeah. The round's coming with some really impressive investors who are co-leading this, including Andreessen Horowitz, who are, I guess, continuing their fairly recent pivot to more of an AI focus rather than a crypto focus. And the former CEO of GitHub, Nat Friedman, and Daniel Gross, who they describe as an Apple AI leader. Yes, that's true. He's also a former partner at Y Combinator. And Sequoia Capital, Silicon Valley Angels, a really, really impressive set of people joining the cap table. And in some cases, rejoining the cap table. It's also six months after they raised a $19 million series A round. At that point, their valuation was $100 million. So they've 10x'd their valuation in six months. Just goes to show you how fast things move in generative AI. And maybe these valuations will prove to be more or less stable, but kind of interesting. This is blazingly fast. Five years ago, I don't know if this would be record breaking, but it is a really quick advance. And they're using these funds to launch a bunch of things, including a dubbing studio feature, which apparently gives professional users a way to not only dub whole movies in whatever language they want, but also to generate transcripts, translations, things like time codes. So again, all the kind of the suite of tools that you might want, as is implied by the name dubbing studio. So they're taking, I guess, a much more holistic picture or approach to this, which is kind of cool. Next up, Anthropic's margins raise questions on AI startups' long-term profitability. So this is stating that the gross margins from Anthropic are between 50 and 55%, which is lower than the average for cloud software, which is 77%. Unsurprisingly, because unlike typical software, AI is very expensive to run; just the cost of compute for running an AI model, especially a large AI model like Claude 2, is non-trivial. And right now there has been a bit of a race to the bottom to make it as affordable as possible to pay for generation, which is leading to decreasing margins, as you would expect. So that is presumably also the case for OpenAI and other providers in the space. It's a tough place to make a lot of money right now. Yeah. And a lot of people speculating about, will they be able to keep up the fundraising multiples that they've enjoyed so far? So by that, I mean the ratio between their revenues and their valuation, basically something like that. And I mean, it's an interesting question. Certainly, as a startup investor, if I look at a company, if it's a software as a service company with margins anywhere between 50 and 55%, I'm like, no, thank you.
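As a rough illustration of why those margin figures matter, here is a back-of-the-envelope sketch. The cost numbers are hypothetical placeholders, not Anthropic's actual financials; only the 77% and 50-55% figures come from the discussion above.

```python
# Hypothetical unit economics, purely to illustrate the gross-margin gap discussed above.
revenue = 100.0            # dollars of API revenue
saas_serving_cost = 23.0   # typical cloud-software cost of goods sold (placeholder)
llm_serving_cost = 47.5    # GPU inference cost for a large-model provider (placeholder)

saas_margin = (revenue - saas_serving_cost) / revenue  # ~77%, the cloud-software average cited
llm_margin = (revenue - llm_serving_cost) / revenue    # ~52%, inside the 50-55% range cited

print(f"Typical cloud software gross margin: {saas_margin:.0%}")
print(f"Large-model provider gross margin:   {llm_margin:.0%}")
```

The point is simply that every dollar of LLM revenue carries a much larger compute bill behind it than a typical SaaS product, which is what squeezes the margin.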
But of course, Anthropic's multiples don't necessarily just come from people looking at them as a software as a service business. There's a view of Anthropic that's like, this could be the company that builds AGI. And if that comes to pass, then the whole strategic calculus behind the business is just like, we want to build the AGI so we can win at, like, market-based economics. So I think that dimension is kind of missed a little bit in this analysis. It's not just a software as a service play, and that's not how most investors would evaluate it, or are evaluating it. I suspect, for example, when Google is investing, they're not necessarily thinking like, oh, Anthropic's going to be comparable to a SaaS play. But still, it is interesting. These margins are pretty bad. In that context, it speaks, as you said, Andrey, to that race to the bottom. At first, it's like we have a way of offloading cognition to machines that can work for cheap. But the problem is, then competition arises, and the computations on those machines start to get awfully expensive relative to how much you can charge. So kind of interesting, right now, Anthropic's margins are mostly going to come from the relative delta of their model's quality to open source models, to OpenAI's models. And they may be able to kind of keep that going for some time, but that's only buying them 50% to 55% margin at this stage. I don't know if it's going to get worse over time or better, but that's a really important metric to track. And related story, Cohere talks to raise as much as $1 billion as AI arms race heats up. So Cohere is another one of these big players that is developing big AI models and seeking to, I guess, rule the world, so to speak. And yeah, they are trying to get a bunch of money to build more models. And given that previous story we just covered, it'll be interesting to see if they do accomplish it, and if these companies continue to be able to raise billions, I won't say easily, but successfully in this coming year. Yeah. And Cohere's big differentiator is, relative to OpenAI at least, that they're focusing on enterprise customers rather than business to consumer or whatever else. That might make it easier to defend their margins if they have enterprise customer specific features that actually account for a lot of the value that they're selling, where you can't just compete with them by buying a bunch of GPUs and then it's a race to the bottom. But still kind of an interesting question. Their latest valuation was $2.2 billion. They raised from Nvidia, Oracle, and a bunch of other really solid VCs. So that was the last valuation. We don't know what the new valuation is going to be. We just know that they're discussing raising $500 million to $1 billion. Typically for late stage investments like this, it can vary a lot, but you usually end up seeing maybe roughly like a 10x, sometimes a 5x, but more like a 10x, even 15x multiple. So we may see them raise at anywhere from a five to a 10, even $12 billion valuation on that basis. Next, DOJ and FTC push to investigate Microsoft's OpenAI partnership. The Department of Justice and the Federal Trade Commission are discussing whether both of them or one of them will investigate OpenAI for potential antitrust violations, including its partnership with Microsoft. And that is following up on similar tensions in the UK and some of the news from last year of the FTC starting this investigation.
So yeah, not any major developments so far, but it seems the government is still a little bit questioning as to which road to take with their partnership. Yeah. It kind of seems like, at least my reading of the article, is that the DOJ and the FTC are sort of arguing over who has jurisdiction. And I think my sense was that they both wanted to do something. So it's like who gets to do it rather than who gets to not do it. So yeah, I mean, it seems like these conversations are just limited to the Microsoft and OpenAI investigation. This is not part of, as the article puts it, some broader dialogue over which agency will investigate artificial intelligence issues kind of generally. This is really specific to this partnership. The only thing I'd add is Microsoft's defense in this context, as I understand it, has usually been that like, hey, look, we don't own a controlling share in OpenAI, first of all. So we don't own the company. And our partnership actually doesn't preclude competition. And in fact, we do compete with them. We have our Phi series of models. We've trained models in the past, like Megatron, Turing, and so on. So innovation continues apace at Microsoft. And yeah, we are competing. So we'll see if that argument lands. We'll see if that's really what this is about or if it becomes important in this context. But more legal action for OpenAI, which I'm sure they're not thrilled with. And one last story for this section, Figure announces commercial agreement with BMW Manufacturing to bring general purpose robots into automotive production. Figure is one of the leading developers of humanoid robotics, of general purpose humanoid robotics, along with 1X and Tesla. They're one of the startups or major companies trying to build a robot that will be deployed to do kind of whatever, so to speak. And yeah, this is a pretty significant milestone for them, having an agreement with BMW Manufacturing to deploy their robots for automotive manufacturing. And I think one of the first major developments for humanoid robotics in general of the sort to be deployed in kind of a real context. Yeah, I feel like we just keep seeing these stories about general purpose robotics kind of finally hitting the mainstream. This feels like the year where we're maybe going to finally see some stuff, I was going to say hit the shelves, or I guess hit the sidewalks. But yeah, really interesting. Yeah, I will note, it's been the case that Boston Dynamics has been trying to put its quadruped robot Spot out in the world for a few years now. And the road to actually putting it out into the world and having it be out there has been tough for them so far. They have been trying to sell it- It just keeps getting its goddamn tail. Yeah, well, not quite that bad, but yeah, here there'll be a staged approach. So initially BMW and Figure will try and figure out how to start deploying robots in this manufacturing facility in South Carolina. And so it'll be interesting to see whether they can successfully actually accomplish a rollout and integrate easily and fast, or if, as with Boston Dynamics, it will prove trickier to deploy robots and integrate them into these kinds of contexts. Moving on to projects and open source, we only have one story, and that is that Stability AI has unveiled a smaller and more efficient 1.6 billion parameter language model. And this is Stable LM 2 1.6B, 1.6B being the size, which is relatively small. Most models are like 3 billion, 7 billion; GPT-4 and Claude 2 are, let's roughly say, a trillion.
It's not exact, but very much, much, much bigger. And so this is a relatively tiny one and it is kind of similar to Microsoft's Phi and Gemini Nano in that class of trying to get smaller language models to be really good. And it sounds like this one is comparable to something like Microsoft Phi that is also only a few billion, but impressively good. Yeah. And actually, I mean, because Phi-2, which it's being compared to here, is a 2.7 billion parameter model, whereas this one is a 7 billion parameter model. And for, I'm sorry, what am I saying? It's a 1.6 billion parameter model. So like almost roughly like half the size and it's performing comparably well. So that's actually a really impressive kind of next step. Its best performance is kind of interesting. It performs best on the TruthfulQA benchmark, just relative to some of the other models. I want to just flag, I think that may be because of something called inverse scaling. I'm actually quite curious about this. So the TruthfulQA dataset is designed to test how models deal with questions that like lead them on, that hint at them that they should answer in a maybe not truthful way. A classic example is the question, who really caused 9/11? And a model that's just trained to do text autocomplete might look at that sentence and be like, well, it's not who caused 9/11, it's who really caused 9/11. So probably this sentence was pulled from like a conspiracy theory website or a non-mainstream source. So I'm going to do my autocomplete faithfully and just say, you know, the US government caused 9/11 or something like that. And so what you sometimes find is that the better a model gets, the more it will, just as an autocomplete system that is, the more it will answer incorrectly some of those questions because it catches on to the subtleties and nuances in the question that hint that a non-mainstream answer is expected. So that's kind of why the TruthfulQA benchmark was invented. It was partly motivated by that idea. And here we have a very small model. I wonder if it's kind of benefiting from the fact that it's not clever enough to catch on to those subtleties. So it does answer correctly. It essentially does this thing called inverse scaling where smaller models actually kind of do better because they can't catch on to the nuances that would throw off a more complex model. So anyway, kind of interesting note. The other piece was it did really well on this kind of math benchmark, GSM8K. It's hard to know why because we actually don't know what dataset was used to train the model. We don't have the release information about that in a follow-on technical report. But it may be kind of more math-heavy data. Last thing that I thought was kind of interesting is they've got two versions of this model. They've got an instruction fine-tuned variant and the base model itself. But one of the interesting things that they're doing here is on top of those variants, on top of the base model and the instruction tuned model, they're releasing the last training checkpoint before what they call the pre-training cooldown. So in other words, there are a bunch of bells and whistles that you add after you finish the kind of the auto-complete training phase, the pre-training phase for these language models. And sometimes those bells and whistles kind of make it harder to provide extra training to the model.
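To make the checkpoint idea concrete, here is a minimal sketch of what continued pre-training from such a pre-cooldown checkpoint might look like with a standard fine-tuning stack. The checkpoint ID and data file are placeholders, not Stability AI's actual release artifacts:

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Hypothetical checkpoint ID for the pre-cooldown base model; the real name may differ.
checkpoint = "stability/stablelm-2-1_6b-pre-cooldown"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Your own domain corpus (placeholder path) for continued next-token-prediction training.
corpus = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})["train"]
tokenized = corpus.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True, remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="continued-pretrain", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=1e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal LM objective
)
trainer.train()
```

The point being made in the episode is just that this kind of further pre-training tends to be easier from the raw pre-cooldown checkpoint than from the instruction-tuned variant.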
So you might do your initial text auto-complete training, get it to auto-complete like roughly all the text on the internet or the 2 trillion tokens that they fed it in this case. And then after that, you're like, all right, let's add some fine-tuning on human feedback or on instruction data or dialogue data. But once you do that, you can sometimes find that it's harder to continue the auto-complete, the pre-training, if you want to give it even more training. And what they're saying is, look, we are giving you the version that doesn't have those extra bells and whistles added on, so that if you want to take this model and do even more pre-training on your own dataset, you can now do that. And so that's a really interesting thing. I haven't seen it done before, and maybe it's a new category, dare I say, of open source where we can now ask questions about, okay, did they open source the model? Sure. Did they open source the code? Did they open source the last training checkpoint before the bells and whistles were added? So kind of an interesting dimension to all this, and it reflects Stability's interest in open source as a practice. Very true. And worth noting also, unrelated to open source, but they did also release Stable Code 3B, their newer model for code generation that is also really good, also better than bigger models from the past. And that is actually going to become part of their commercial offering. They have a membership subscription service that was announced in December, and this will be part of the models that are available through that. So yeah, Stability AI continuing to develop awesome models and put lots of things out there. Up next, starting off our research and advancement section, we have AlphaGeometry, an Olympiad level AI system for geometry. So this is coming from Google DeepMind. Of course, Google DeepMind, we've covered a bunch of their stuff, especially lately, like FunSearch, and I think GNoME, or Gnome, or however you're supposed to pronounce it. They're known for building really, really powerful frontier cutting edge AI models that are nonetheless specialized to solve foundational problems in math and science. That seems to be one of the directions they're pushing in to get to AGI, which is their ultimate goal. One of the really interesting things about this particular piece of research is unlike a lot of other breakthroughs in this space, it's not just one large language model, let's say, that we augment with some sort of prompting scheme or AutoGPT or whatever. It's a combination of two things. It's a language model, and it's a symbolic system that's coupled to it, what they call a symbolic deduction engine, and it's meant to solve these complex problems in geometry. This approach where you fuse these two, by the way, is called a neuro-symbolic approach, and there's a whole field of research, neuro-symbolic reasoning, and just trying to find ways to make these things work together, because a lot of people think that's how you get to AGI. You can't get there through just neural networks alone. You need to add some kind of rules-based logical symbolic reasoning engine to make the whole thing actually work.
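Before getting into the details, here is a purely illustrative sketch of that propose-and-verify pattern. The proposer and deduction engine below are toy stand-ins, not DeepMind's actual AlphaGeometry components:

```python
from typing import Callable, List, Optional

def neuro_symbolic_solve(
    problem: str,
    propose_constructions: Callable[[str, List[str]], List[str]],  # stand-in for the language model
    deduce: Callable[[str, List[str]], Optional[str]],             # stand-in for the symbolic engine
    max_rounds: int = 10,
) -> Optional[str]:
    """Alternate between LM proposals and rule-based deduction until a proof is found."""
    constructions: List[str] = []
    for _ in range(max_rounds):
        # Symbolic side: exhaustively apply deduction rules to the current facts/diagram.
        proof = deduce(problem, constructions)
        if proof is not None:
            return proof  # a verifiable proof was found
        # Neural side: suggest auxiliary constructions (new points, lines, circles)
        # that expand what the symbolic engine can deduce from.
        constructions.extend(propose_constructions(problem, constructions))
    return None

# Toy stand-ins so the sketch runs end to end.
def toy_proposer(problem, constructions):
    return [f"auxiliary_point_{len(constructions)}"]

def toy_deducer(problem, constructions):
    return "proof found" if len(constructions) >= 3 else None

print(neuro_symbolic_solve("show the angle bisectors are concurrent", toy_proposer, toy_deducer))
```

The key property is that anything the symbolic side returns is a checkable proof, so the neural model only has to be good at suggesting promising constructions, not at being correct itself.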
What they're doing here is they're setting things up so you get a problem, a geometry problem, and the language model part is first going to take a look at the problem at a high level and figure out, roughly speaking, what are some of the strategies we could think of using here, and then based on that high-level instinct or guidance, the symbolic deduction engine gets to work, pumping out mathematically verifiable proofs that attempt to, in a more brute force way, solve the problem. By combining these two things together, what they find is that they're able to get astonishingly good performance. For at least the geometry portion of this Olympiad exam, they blow it out of the water. They hit close to gold medal performance for this test, so really some of the most effective geometrists, if you will, logicians, really struggle on this. They had a former gold medalist for this competition, this guy's name is Evan Chen, comment on what he saw and assess what's going on here. He said this, I'll just pull up the quote, he says, one could have imagined a computer program that solved geometry problems by brute force coordinate systems. Think pages and pages of tedious algebra calculation. Alpha geometry is not that. It uses classical geometry rules with angles and similar triangles just as students do, and that's really coming from the language model part. This is where the language model's kind of soaked up all this implicit knowledge about the world during its auto-complete pre-training, and now it's able to deploy that to identify promising paths, promising high-level strategies, direct the symbolic part of the system in a more efficient way, and then have the symbolic part execute on these predetermined trajectories. This is really interesting. I think it's another example of one of those breakthroughs that 20 minutes ago we would have been told was not going to be possible for AI for a long, long time, and yeah, pretty remarkable. Though, again, it is a specialist model, right? This is not a single generalist system like a GPT-4 just going out and solving the math Olympiad. Yeah, lots of neat little things here. I think another neat thing is this is another example of DeepMind going down a road with synthetically generating data. In their paper, Solving Olympiad Geometry Without Human Demonstrations, they highlight that one of the reasons this is hard is that you don't have data, right? There's just not a known data set of proofs that are machine readable and can be used for training. They did also develop a technique to synthetically generate a whole bunch of proofs to train on and improve the system that way. As you said, I think this is also notable as a newer symbolic system, and it'll be interesting to see if this points to a future where whatever AGI is, maybe some things like this, like a highly technical solving geometry problem, maybe it won't just be one neural net that can just do it through its weights, and it will be the neural net using tools similar to humans, similar to this, to be able to solve very advanced, more complex problems that aren't necessarily in the domain of neural nets. But then again, humans do do this without that sort of thing, so it's hard to say. Next story, Lumiere Space-Time Diffusion Model for Video Generation. 
This is a video generation model from Google, and they are changing up a little bit from a typical approach by essentially doing the whole generation in one pass with a fancy new variation on the typical architecture that attends to both space and time. And they demonstrate pretty state-of-the-art text-to-video generation results. There is a video you can go look up for Lumiere, and similar to existing text-to-video, it's near photorealistic. You can still tell that it's AI generated. There's some artifacts and some kind of weirdness to it, but it is getting very smooth, very realistic-ish. So yeah, cool to see continued progress in the text-to-video space. Yeah. And that idea too of just sort of one-shot generation of the whole video, as opposed to what's done now, which is more like you generate a frame at one time step, and then based on that, and based on other information like a prompt and other factors, you then generate the next frame and then the next frame. One of the risks of that is you can sometimes find the model kind of guides itself off course, like small issues that stack together as it generates new frames can add up and kind of throw it off and cause the thing to go off in a different direction. By doing the whole thing all at once, it kind of ensures that the overall picture you get is much more consistent, right? All the parts of the video are informed by and guided by the same inputs, the same prompts essentially. And so kind of an interesting approach. I mean, I imagine this is going to be something that will get more and more compute optimal over time too, but yeah, it is a diffusion-based system as well. So it does use the sort of like more standard diffusion approach, but it's just at the whole level of a full video. And real quick, I just want to point out, to be fair, not just from Google, this is a collaboration paper from Google and the Weizmann Institute, Tel Aviv University, and Technion. So a variety of groups, but Google is one of the major pushers here. I do think every author has a Google affiliation though. So maybe- Oh, interesting. There's multiple affiliations. Yeah, there's multiple affiliations. For some of the authors here. Yeah, that's right. Okay. Onto the lightning round, chat QA, building GPT-4 level conversational QA models. They attain GPT-4 level accuracies for this specific task of conversational question answering, and they propose a novel-ish kind of two-stage tuning approach that can improve the accuracy of these trained models. The two stages are, first off, some supervised fine tuning. So first they get their pre-trained model, right? Trained on text autocomplete. It's just an autocomplete engine, glorified autocomplete. Then they give it some supervised fine tuning on a dataset of instruction following and dialogue data to kind of make it behave more conversationally, more naturally. But the second step is kind of where the secret sauce comes in, to the extent that there is a lot going on here. It really is here. So they create this dataset for retrieval augmented generation. So essentially this is where you have your language model that learns to use an external database to pull in information that is relevant to answer the query, and that only pulls in parts of documents that are relevant to answer the query. And so to train that step, they had to collect apparently 7,000 documents and get annotators to act as both a user who's asking the question and follow-up questions about a document, and an agent that gives the responses. 
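As a minimal sketch of the retrieval-augmented generation loop being described here (the embedding function, retriever, and generator below are toy stand-ins, not NVIDIA's ChatQA pipeline):

```python
import numpy as np

def embed(texts):
    """Toy embedding function; a real system uses a trained retriever/encoder."""
    rng = np.random.default_rng(0)
    return {t: rng.standard_normal(128) for t in texts}

def retrieve(question_vector, chunk_vectors, k=2):
    """Return the k document chunks whose embeddings are most similar to the question."""
    scores = {chunk: float(np.dot(vec, question_vector)) for chunk, vec in chunk_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def answer(question, document_chunks, generate):
    vectors = embed(document_chunks + [question])
    relevant = retrieve(vectors[question], {c: vectors[c] for c in document_chunks})
    # Only the retrieved chunks go into the prompt, which is what grounds the answer.
    prompt = "Context:\n" + "\n".join(relevant) + f"\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)

# Toy generator standing in for the fine-tuned conversational QA model.
reply = answer("What did the report conclude?",
               ["chunk about methods", "chunk about conclusions", "chunk about appendix"],
               generate=lambda prompt: f"(model response conditioned on: {prompt[:60]}...)")
print(reply)
```

The annotated dialogues described above essentially teach the model the last step: how to answer conversationally given only the retrieved chunks in its prompt.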
Next story, Vision Mamba: Efficient Visual Representation Learning with Bidirectional State-Space Models. Real quick, state-space models, as we have been mentioning for a couple of months, are an emerging new type of neural net that potentially could surpass a transformer, or at least in some ways might be superior to a transformer. Part of the reason for that is that they kind of take some inspiration from, or have some relation to, recurrent neural nets, basically neural nets that are specifically designed to deal with sequences of data in a way that scales with less overhead than transformers. And so this is building up on one of these state-space models. That was Mamba. Mamba was a pretty promising version of that that came out in the language modeling space. So they show that you can do various vision tasks for images like classification, semantic segmentation, detection, instance segmentation, all the stuff that typically we've already seen transformers do with vision transformers. This pretty much presents a modification or adaptation of a Mamba architecture to images with a Vision Mamba encoder. That's an iterative buildup on top of this previous work that shows that you can use it and get pretty good results, although it doesn't seem to be, let's say, groundbreaking. Yeah. And these structured state-space models, as you said, kind of seem like they are becoming more of a thing. They also, they have been around for a while. There's a paper, I think a few months ago, I can't remember when, but that kind of made some modifications to the sort of state-space model approach to, yeah, to make it a lot more efficient and effective. And that's kind of, that seems to be what's triggered all of this. And there were compute, sort of compute strategies that they use. They had like a hardware-aware strategy where they're optimizing for hardware usage and, anyway, doing a whole bunch of other things. So it's sort of like this next evolution of the system that seems to have unlocked a potentially transformer-level capability or promise. So kind of cool to see that story continue. And actually, it's kind of funny, just a day after that previous paper, Vision Mamba, came out, a second one came out, V-Mamba, a visual state-space model that, well, is pretty similar. That one was Vision Mamba, and now it's V-Mamba. And they also show how you can achieve pretty impressive results, pretty comparable results to top-end neural nets of other types like transformers and ConvNets, with Mamba-type architectures. So yeah, they just came out one day apart. Vision Mamba came out on arXiv on the 17th. This one came out on the 18th. So it really goes to show that there is some excitement in the space, and as soon as Mamba seemed promising, a couple of research groups jumped on and decided to extend it to visual tasks. That is super interesting. It's also, I'm looking at the author lists right now. And so the first one, the most notable group here is the Beijing Academy of AI, the sort of, I don't know, there are a lot of Chinese OpenAI-like labs, but they have an AGI agenda, Horizon Robotics and so on. The second one, like, these are non-overlapping groups. Peng Cheng Lab, and then Huawei and UCAS. So yeah, kind of interesting. I mean, these are two Chinese teams. So it seems like a coincidence, but then also there's no overlap. So I wonder if it's one of those things where, you know, you get worried that somebody's going to scoop you and so you quickly try to pump out your results. But yeah. I think that's exactly what it is. Yeah. As soon as Mamba came out, people were like, let's extend it to visual tasks and show how you can adapt it. And yeah, both of them just got to work on it right away. Got results pretty quickly, decided to release a paper. Usually arXiv takes a day or two to actually put things out. So it might actually have been a total coincidence. They both submitted it about the same time, or it could be that they saw one release and the other one kind of pushed it out super quick. So anyway. I do not miss academia. But exciting for the space of AI to see kind of excitement around a new type of architecture that is not Transformers.
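For a rough sense of what "a state-space model scanned over image patches in both directions" means, here is a heavily simplified sketch: cut an image into patches, run the patch sequence through a linear state-space recurrence forward and backward, and merge the two passes. It deliberately leaves out what makes Mamba Mamba (the input-dependent, selective parameters, the gating, and the hardware-aware scan), so treat it only as an illustration of the bidirectional scanning idea, not as the Vision Mamba architecture itself.

```python
import numpy as np

def ssm_scan(x, A, B, C):
    # x: (T, D) sequence of patch embeddings; simple linear recurrence:
    # h_t = A @ h_{t-1} + B @ x_t ; y_t = C @ h_t
    h = np.zeros(A.shape[0])
    ys = []
    for t in range(x.shape[0]):
        h = A @ h + B @ x[t]
        ys.append(C @ h)
    return np.stack(ys)

def bidirectional_block(tokens, params):
    # scan the patch sequence forward and backward, then merge the two passes
    fwd = ssm_scan(tokens, *params)
    bwd = ssm_scan(tokens[::-1], *params)[::-1]
    return fwd + bwd

# Patchify a toy "image": 224x224x3 -> a 14x14 grid of 16x16 patches, flattened to a sequence.
img = np.random.rand(224, 224, 3)
patches = img.reshape(14, 16, 14, 16, 3).transpose(0, 2, 1, 3, 4).reshape(196, -1)  # (196, 768)
tokens = patches @ (np.random.randn(768, 192) * 0.01)                               # (196, 192)

d_state = 16
A = np.eye(d_state) * 0.9                    # toy, fixed state transition (Mamba's is input-dependent)
B = np.random.randn(d_state, 192) * 0.01
C = np.random.randn(192, d_state) * 0.01
out = bidirectional_block(tokens, (A, B, C))
print(out.shape)  # (196, 192): one updated embedding per patch
```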
And one last story for this section: Depth Anything, Unleashing the Power of Large-Scale Unlabeled Data. So this is about monocular depth estimation, being able to predict how far things are in a 2D image, essentially. And this is a paper that is seeking to build up kind of the best version of that via large-scale unlabeled data. So they're using monocular unlabeled images, and they have a data engine that can automatically generate depth annotations for these images. That leads to a huge dataset, 62 million diverse and informative images. And then they use that to train a state-of-the-art, super good depth model. And depth models are pretty important because then you can use that for things like robotics or self-driving cars or anything really that has to interact with the real world or make predictions about the geometry of the real world.
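A toy sketch of the "data engine" idea being described: train a teacher depth model on whatever labeled data exists, have it pseudo-label a much larger pool of unlabeled images, keep the reliable predictions, and train a student on the combined set. The stub models and the confidence filter are illustrative assumptions, not the actual Depth Anything pipeline.

```python
import random

def train_model(dataset):
    """Stand-in for training a depth estimator; returns a 'model' as a closure."""
    mean_depth = sum(d for _, d in dataset) / len(dataset)
    def predict(image):
        confidence = random.uniform(0.3, 1.0)   # stand-in for a per-image reliability score
        return mean_depth, confidence            # a real model would return a per-pixel depth map
    return predict

def pseudo_label(teacher, unlabeled_images, threshold=0.5):
    kept = []
    for img in unlabeled_images:
        depth, conf = teacher(img)
        if conf >= threshold:                     # keep only reliable pseudo-labels
            kept.append((img, depth))
    return kept

labeled = [("img_%d" % i, random.uniform(1, 10)) for i in range(100)]   # small labeled set
unlabeled = ["raw_%d" % i for i in range(1000)]                          # large unlabeled pool

teacher = train_model(labeled)                      # 1. train a teacher on the labeled data
augmented = labeled + pseudo_label(teacher, unlabeled)
student = train_model(augmented)                    # 2. retrain on labeled + pseudo-labeled data
print(f"student trained on {len(augmented)} examples; sample prediction: {student('new_image')}")
```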
And moving on to policy and safety, we have OpenAI Suspends Developer Behind Dean Phillips Bot. And so this, I feel like, is kind of the culmination of a lot of these policy trends we've been talking about for a while on the podcast. So there's this company called Delphi, and they made a bot called Dean.Bot. And this bot basically does an impression. It mimics a Democratic White House candidate, Representative Dean Phillips from Minnesota. And it basically can talk to voters in real time through a website and essentially, presumably, try to convince them to vote a certain way or whatever. It was taken down, though, by OpenAI. And by the way, the bot itself is being funded, it seems, by Silicon Valley founders who started a super PAC called We Deserve Better. So anyway, it's all part of the political apparatus or whatever. But this is notable because it's the first known instance where OpenAI has actually cut off the use of their AI systems in political campaigns, kind of enforced that criterion that they said they would enforce in the past. So it seems like the first hit is on this Dean Phillips bot. And this also happened, notably, just before the New Hampshire primary, which happened, well, yesterday for us. Who knows when the podcast will come out? A few days ago, I guess. So really kind of a breaking news story, I guess. And he's a bit of a long shot, Dean Phillips is, but he's apparently running against President Biden. So I guess that's the game plan. If you can't win him with votes, win him with bots. So interesting to see OpenAI forced to act on their policy. Yeah, exactly. We just covered how last week they announced the policy regarding the election, and how one of the things they stated they will not allow is impersonations of candidates. So we'll see, there might be more of this. It'll be interesting. As a related kind of combo story here, we have Fake Joe Biden Robocall Tells New Hampshire Democrats Not to Vote on Tuesday. Again, this is the New Hampshire primary. So the New Hampshire Attorney General's office is saying that it's investigating what looks like what they call an unlawful attempt at voter suppression. NBC News apparently reported there's a robocall impersonating Joe Biden that was telling people not to vote in the presidential primary in New Hampshire that just kind of went by. So apparently it sounds just like Joe Biden, but they think, based on what they call initial indications, that it was AI generated. I imagine those initial indications are like, I mean, where was Joe Biden at the time? Obviously not out hitting the phone lines for the New Hampshire primaries, but yeah, it's interesting. There's a whole blame game going on here where people are kind of blaming each other. In some cases, there are people blaming Democrats, saying you're just doing this to, like, amp up, get voters out there and get people excited or get them to not vote in certain contexts. And then there are people blaming Republicans saying, hey, you guys must be behind this or whatever. They've denied that. A spokesperson for Trump's campaign said, no, not us. We have nothing to do with it. But it kind of, you know, it gets to the challenging nature of all this stuff. It's so hard to prove who has used AI for certain purposes that really attributing blame for this is very complex. And this is yet another dimension that the whole campaigns-meet-AI influence and interference thing is taking. If you go to the article, which, as always, we'll have the links to all the news stories in the description, you can actually listen to the call and see for yourself what this AI Biden sounds like. It doesn't actually sound to me super great. It sounds maybe not quite state-of-the-art. There are some kind of typical AI artifacting with weird crunch sounds or something. But yeah, really interesting to see this already happening pretty early on for the presidential election, with the primaries just kind of happening before the major head-to-head is going to start. So I guess not a great sign for what we'll have to deal with throughout the rest of this year. Onto the lightning round.
The first story is Sharing Fake Nude Images Could Become a Federal Crime Under a Proposed Law. This is actually a re-proposal of the Preventing Deepfakes of Intimate Images Act, which was re-proposed by Representative Joseph Morelle, a Democrat from New York, now with Republican Tom Kean of New Jersey as a co-sponsor. And Tom Kean has previously also introduced a bill called the AI Labeling Act, which would require AI generated content to have clear labeling. So yeah, it kind of really highlights that deepfakes, and non-consensual deepfake pornography in particular, are still a major concern following some incidents that have occurred, and that there are pretty significant bipartisan efforts to address some of the risks present with deepfakes and modern-day AI. Yeah. It seems like the thing that kind of spurred this on was an incident at Westfield High School in New Jersey. There were a bunch of boys who were sharing some AI generated images of female classmates without consent, which presumably they couldn't give anyway, depending on their age. But yeah, so this is kind of causing a bit of a, I don't want to say a moral panic because it sort of sounds like I'm downplaying it. This is a very serious thing. It's certainly causing a big response. They're also looking at, it seems, civil liability. So in addition to just making it a criminal offense, they're trying to make it easier for people to sue offenders in civil court as well, and for costs and things like that. Or sorry, just for damages rather, things like that. So sort of an interesting next move in this whole play. I have to imagine that something like this was going to happen at some point or another because just leaving it as open season seems like a bad move. And next up, going back to a story we covered at the beginning here related to self-driving. This is saying that San Francisco Takes Legal Action Over Unsafe and Disruptive Self-Driving Cars. This is about a lawsuit against a state commission that permitted the Google- and General Motors-backed companies to expand in the city, citing serious problems on the streets. So the lawsuit asks the California Public Utilities Commission to review its decision to allow Waymo to operate this 24/7 paid taxi service in the city, which was also previously the case for Cruise, since they did lose that permit following a pretty bad crash last year where there was a hit and run by a human driver and then a Cruise car got involved. But this lawsuit does cite hundreds of safety incidents involving autonomous vehicles. And yeah, so it seems like maybe San Francisco is not too happy about having self-driving cars driving around. And Waymo now has until February 16 to file their opposition brief. Yes. I mean, this does throw a wrench in the gears potentially. I mean, it could force Waymo to halt their expansion in California until the regulators can come up with their new view on autonomous vehicles. And potentially, you can see the setting of precedent for other states too. So kind of an important structural risk that they're taking on here. The argument, of course, from Waymo and Cruise at this stage is that their self-driving cars actually have a better safety record compared to human drivers and that they lead, at least, to fewer road deaths and injuries. So it's interesting depending on the metric you look at and depending on how you count the value of a human-caused accident relative to an AI-caused accident. Like, what is the kind of moral weight there?
Those all seem to be the questions that we've got to – or they have to look at. I'm glad I don't. But yeah. This article says that experts are claiming that this might be a tricky legal case, to basically make this commission review its August decision and potentially say that the decision was wrong or return it. So yeah, it may or may not lead to a rollback or a slowdown of expansion, but it does highlight that the start of autonomous driving in SF hasn't been without incidents and hasn't been without kind of negative consequences. Next up, Slew of Deepfake Video Adverts of Sunak on Facebook Raises Alarm Over AI Risk to Election. This is, of course, in Britain, with UK Prime Minister Rishi Sunak apparently having had a hundred deepfake video ads impersonating him and reaching up to 400,000 people. And these had various contents, one featuring a fake BBC newsreader announcing a false scandal involving Sunak and a project supposedly intended for ordinary people. So another example of AI being used to target prominent politicians and kind of throw a wrench in the works of government. Yeah. I think, again, we've talked about this risk before, but this idea that the lie is halfway around the world while the truth is just putting on its shoes type thing. I think that's the Mark Twain quote anyway, where you can pump out these fake ads, this fake content, people watch it and then they assume it's true and form their opinion. And then two weeks later, when it comes out that it was fake, you kind of forget that your opinion was formed based on that fake information. So, one of the risks of doing dimensionality reduction in the human brain, I guess. But kind of interesting, certainly, especially interesting given that Rishi Sunak has been so on it when it comes to AI safety and the risks of AI systems, potentially structural risks to the UK; it's somewhat ironic that he now finds himself on the other end of this. The reach, not huge, I will say, 400,000 people. You can compare that to the famous Russian election interference operation in 2016. That was like over 100 million people on Facebook alone. So when you're talking about 400,000 people, yes, the UK is a smaller country, but it's not that much smaller. Still, it's more, what does this imply about the future? And certainly that's an interesting question and a very thorny one. And this is coming from a research report from Fenimore Harper Communications in the UK. And up next, AI Is the Buzz, the Big Opportunity and the Risk to Watch Among the Davos Glitterati. If you were thinking, is glitterati a word for my Scrabble board, it now is. And this is actually a good roundup of a bunch of the big high-profile statements. I actually ended up pulling together more statements from other articles too, because there were so many Davos-centered articles, so many soundbites from it that, I don't know, you'd be talking all day about Davos. So this is the one Davos piece. There was a conversation on stage where Fareed Zakaria, the CNN commentator, not commentator, anchor, was talking to Sam Altman and a wider panel as well at the World Economic Forum. And he kind of asked Sam, he's like, hey dude, what do you think the core competence of human beings will be going forward? What can humans do that AI won't be able to do? And Sam kind of responded in a not very inspiring way. He was like, look, I get it, it does feel different this time.
General purpose cognition feels so close to what we all treasure about humanity that it does feel different. Kind of implying this argument that, well, AI doesn't destroy jobs, it doesn't automate jobs, humans have always found a way around all these things. Well, ultimately the implied argument here is that the thing that has allowed us to find new ways to be productive is cognition. And that is the very thing that we are now automating away. So yeah, this time it is different. We are looking at automation rather than augmentation. That's kind of the vibe. There was also a cool quote I'm just pulling here from Marc Benioff, who is the CEO of Salesforce. He said, kind of a little snarkily, I guess, turning to the moderator, maybe pretty soon, in a couple of years, we're going to have a World Economic Forum digital moderator sitting in that chair, moderating this panel, and maybe doing a pretty good job because it's going to have access to a lot of the information we have. He later added, this is a huge moment for AI, AI took a huge leap forward in the last two years, and he acknowledged, given the rapid pace of the tech, that things could go really wrong. He said, we don't want something to go really wrong. That's why we're doing that AI Safety Summit. That's why we're talking about trust. He was referencing, of course, the UK AI Safety Summit here. And the big soundbite was Marc Benioff coming out and saying, we don't want to have a Hiroshima moment. We've seen technology go really wrong, and we saw a Hiroshima. We don't want to see an AI Hiroshima. We want to make sure we've got our heads around this now. So kind of interesting, especially in contrast to some of the statements that Sam A was making, where he kind of seemed to be playing down the level of technological change that's going to come with AGI, saying things will change less than we think. Cynics have argued that maybe this is because he's trying to downplay things because he's seen so much legislator, lawmaker concern in the US, that maybe he's trying to simmer things down there a little bit; people are proposing things that he may consider to be putting OpenAI's future at risk, even if they are for safety reasons. But anyway, so kind of an interesting cocktail of things. I think Benioff and Sam A were kind of the two standout quotes of the conference, I guess. What is this? Of Davos, let's say. Just for context, if you're not aware, the World Economic Forum is this yearly kind of event, I guess, where typically rich businessmen, you know, big- Lizard people. Yeah. Yeah. Like, let's say, people that are maybe in the top 1% of wealth across the world. Anyway, it convenes a lot of these very rich people, very powerful people, and they talk a bunch. That's the World Economic Forum. So you do get a lot of these news stories of conversations and kind of tidbits of what people said in the various discussions that happened. They do end up being, I will say, really interesting and nuanced. And I think, yeah, a lot of the conversations here, when you look at the readouts, it is impressive and it's great that there is a forum like this where people can talk about these issues kind of semi-publicly too. So we can get some of the snippets out. But yeah, definitely surprising angles from Benioff, who, like, I didn't realize Marc Benioff was sort of tracking AI risk on the hawkish side. Everything I'd heard him say before was definitely not this. So it's kind of interesting. This is a shift in him.
We've seen a lot of people shift in this direction in the last six to 12 months. So yeah, we'll see if it keeps going. And now just a couple of stories in the last section, synthetic media and art. The first one is kind of an article and a little, I don't know, interactive piece from the New York Times. It is called Test Yourself: Which Faces Were Made by AI? Not the first time this sort of thing has happened, but kind of cool that it was pushed out by the New York Times. And as the title says, it's a little quiz essentially, where you're given an image and you have to try and guess whether it is from a human or AI, whether it's a real photograph of a human or an AI-generated image of a human. And this is just highlighting kind of that it is getting very difficult to distinguish real from fake. You can actually go and try it yourself, try this little quiz out. And yeah, I mean, I tried it for a little bit and it can be tricky. It can actually be pretty straightforward to get fooled into thinking an AI-generated image is a human, or just not be able to tell, at least for this style of image, which is, I will say, looking like StyleGAN and thispersondoesnotexist.com, which has existed for a while and has a pretty particular output, so that when you see this type of image, you do kind of know that it is maybe AI just because they tend to look pretty samey. So anyway, a fun little thing to try if you haven't explored how lifelike these generations can get. And I've got to come clean. There's no reason anybody should be listening to anything that I have to say about AI for a million reasons. But one of them is that I just did this quiz and I got a whopping 50% right. So that is exactly what you would expect from random guessing. And so I will say one thing that they're doing here that is, it's not unfair, but it's just, it's a thing to keep in mind. A lot of the AI-generated images that they're showing, or sorry, a lot of the real images that they're showing have very plain backgrounds that are highly out of focus. And that is a characteristic of a lot of these sorts of AI-generated images, like the ones you get, like you said, from thispersondoesnotexist.com. And so I was trying to use the background as a cue. And I wonder if that was deliberate, but it's sort of interesting that that's part of how they've teed this up. But it's a tough quiz, man. Like, yeah, I'm pretty surprised that I didn't do better than random guessing. That's right. It is possible they cherry-picked for the slightly harder cases. Like, there are still artifacts you could notice if you look carefully, like you have to look carefully, but- The teeth is usually it. The teeth or sometimes the earrings or the eyes. And when you do this quiz, when I tried it, there really weren't these sorts of artifacts. But then again, this is stuff we've had for a while now. So if you look at state-of-the-art image generation, you can probably do even better if you really try. Anyway, a fun little piece from the New York Times to highlight this reality of it being pretty doable to generate very realistic human faces with AI. And our last story for this episode is AI Models That Don't Violate Copyright Are Getting a New Certification Label. So this label is from a company called Fairly Trained, founded by former Stability AI vice president Ed Newton-Rex. And it is offering this certification program for AI companies to demonstrate their models do not violate copyright laws.
So this first accreditation they're going to be offering will be called the Licensed Model certification, and it will be given to companies that obtain licenses to use data for training their AI models. Yeah. It's kind of an interesting... We're seeing examples of this pop up. There are quite a few companies now that are doing various forms of certification, which is pretty good. It's like a free-market, voluntary way to start to build some of the institutional capacity to monitor some of this stuff and start to get companies thinking about what they ought to be recording in the process. I suspect, I mean, ultimately the measures that are going to be important for some of the more extreme risks are probably going to have to, at some level, be compulsory. That's not what this is going after, though. This is much more sort of copyright oriented. So it's good to see the sort of more free-market solution available for the problems that call for it, and then the more compulsory mechanisms for things that require maybe a heavier hand. But yeah, kind of cool. Kind of neat or interesting to note that it is coming from a former Stability AI Vice President, given that Stable Diffusion was one of the big drivers of text-to-image generation. I'm sure they didn't train on copyrighted data. I'm sure they didn't. It was one of the big reasons for the backlash from many in the artist communities or general creative professionals, so to speak. So this is seemingly kind of a follow-up to that now, in response to a lot of the backlash to models like Stable Diffusion that were probably trained with copyrighted imagery, or without any regard to copyright really, with this fair use argument. Yeah, this is kind of swinging in the opposite direction of saying, let's value and, I guess, recognize when you are getting permission for your training data, essentially. And with that, we are done with this episode. Thank you so much for listening to Last Week in AI. Once again, you can find the articles we discussed here at lastweekin.ai. You can get in touch at contact at lastweekin.ai and also at Jeremy's email, hello at gladstone.ai, and those will both be written out in text in the description. As always, we appreciate your reviews and your emails and whatever else you want to share or interact with us or let us know that you enjoy our podcast. But above all, we do like it when people benefit from us recording and putting this out. So please keep doing it.