Peggy K's Creator Weekly: AI Overview in Search, Ask Photos Chat, Veo text-to-video
This week was Google I/O, Google’s annual developer conference. And -- no surprise -- it was all about AI. There are new AI-powered features across Google products and tools to try. Not to be outdone, OpenAI also announced free-to-use GPT-4o, with splashy new features and a sexy (?) new voice interface.
Plus there are updates for video creators, bloggers and more.
Top news and updates this week
There are new generative AI tools to try!
More Gemini extensions for accessing your Keep, Calendar, Tasks and YouTube Music
LearnLM AI learning models with Gems, questions on YouTube, and more
Google AI Search Overview rolls out to all US users
New Google Search “Web” filter to limit search results to links
Veo, Google’s new text-to-video model, soon available in the experimental VideoFX tool
Imagen 3, Google’s latest text-to-image model, soon available in ImageFX
Ask Photos in Google Photos to help find your photos with naturally worded questions
Gemini in the side panel of Gmail, Docs, Drive and Sheets
Contextual Smart Replies in Gmail
Summarize Gmail email threads
Organize and track Gmail email attachments
OpenAI’s new GPT-4o with free access and a friendlier voice
Remix official music videos into YouTube Shorts
Twitch ending support for Twitch Studio software
X has migrated from the twitter.com domain. Mostly.
X Communities getting new features
Threads gets chronological search results
Reddit is bringing back awards
Reddit has new tools for running an AMA
Reddit has a deal with OpenAI that lets them use public content for AI training
Preview files and videos in Google Drive by hovering over the name or thumbnail
Read on for details and additional updates!
Creator Weekly Live 🔴
What do you think about this week’s updates? Join the live Creator Weekly on Sunday, 10:30AM Pacific time (5:30PM UTC).
#OnEBoardChat 💬
To Do & Try
To try out Google’s AI tools, sign up for Workspace Labs and AI Test Kitchen. Note that these may not be available in all countries.
Illuminate is an experimental tool from Google to “turn academic papers into AI-generated audio discussions”. Try it.
Learn About is an experimental learning companion from Google. Join the waitlist.
Teachers and educators can take a new course Generative AI for Educators to earn a certificate. The course was developed by Google and MIT RAISE.
Join the waitlist for VideoFX, a new text-to-video generator.
NotebookLM has been updated with Gemini 1.5. You can upload Docs, slide decks or PDF files, and have the information summarized or turned into a guide or quizzes. The new Audio Overview uses the uploaded material to create an audio discussion, and you can join in the conversation. Watch the demo.
Sign up for YouTube Podcasting 101. This session on May 29 is timed for Europe and the Middle East, but open to everyone. This is about the basics of podcasts (audio or video first) on YouTube and content strategy. Sign up here.
Google I/O 2024: AI Everywhere
One year ago, Google I/O 2023 was focused on generative AI tools and features. Most of those announced features have been launched (I just skimmed the list of last year's announcements, so there may be some that fell by the wayside).
No surprise, this year was no different. The pressure is on Google to create some excitement, with OpenAI and Microsoft rolling out competitive new generative AI features.
Google’s stated goal: Build AI to benefit people and society.
Is that the direction they are heading? It seems early days yet. It’s notable that one of their demo videos highlights a generated answer with bad information. It seems no AI-generated answer can be trusted unless you have the expertise to know which information is wrong.
Here’s what’s new with Gemini (gemini.google.com) and AI at Google:
Gemini Extensions Gemini uses extensions to access content such as Google Maps, YouTube videos or your Google Drive and Gmail emails. New extensions coming soon: YouTube Music and your Google Keep, Calendar and Tasks.
Gemini Advanced (free trial) now uses the Gemini 1.5 Pro model, which has an expanded context window with 1 million tokens. What does that mean? It can analyze documents up to 1500 pages, up to 100 emails, an hour of video content or 30,000 lines of code. You can upload documents and spreadsheets for analysis. It also is better at understanding images.
Gemini in Messages: Chat with Gemini in Google Messages on supported Android devices (Pixel 6 or later, Pixel Fold, Samsung Galaxy S22 or later, Samsung Galaxy Z Flip or Fold). This is available in English and French Canadian with RCS Chat enabled. Learn more.
Gemini Live lets you talk to Gemini. You can select a “natural sounding voice”. This will be available for Gemini Advanced subscribers “in the coming months”.
Gemini available in 35 additional languages: Available for Gemini for Enterprise, Gemini for Business and Gemini Advanced.
LearnLM: A “new family of models fine-tuned for learning, based on Gemini” that help power Google’s AI tools.
Gems learning coach and personal experts in Gemini chat (available “soon” for Gemini Advanced subscribers)
Ask questions about academic YouTube videos (available to “select Android users in the US”)
Adjust Google Search AI overviews to make them easier to understand
Solve math problems using Android Circle to Search
Gemini add-on for Google Workspace for Education: New paid add-ons for education accounts. This is only available to users aged 18+.
Project Astra is Google’s vision for an AI assistant that lives on your phone or smart glasses. Watch the demo.
Read on for an overview of AI in Google Search, Gemini in Google Workspace, and new Photo and Video features.
Will AI in Google Search Kill Websites?
Google’s core business is Search (with advertising), so it’s no surprise that it also has generative AI features. AI search summaries were launched as a Search Labs experiment at last year's Google I/O, and now they are launching to everyone (in the US anyway).
Here’s how it works: You perform a search and the Gemini-powered AI Search Overview summarizes information from the Google Search results. Referenced web pages are shown as cards at the bottom of the summary.
Additional features:
Adjust the summary by selecting “simplify” or “break it down” (for more details). This will be available with Search Labs in English in the US.
Ask complex questions with multiple steps. This will be available with Search Labs in English in the US.
Meal and trip planning lets Google generate a plan for you that you can edit and export to Google Docs. More categories will be added later this year. Available now with Search Labs in English in the US.
AI-organized search results for inspiration about dining and recipes. Available for English search in the US. It will later be available for search results for movies, music, books, hotels and shopping.
Searching with video lets you ask a question about something you see (that you’ve recorded a video of). This will soon be available with Search Labs in English in the US.
How can you turn AI Overviews off?
These new features add to the clutter of the Google Search results. What can you do if you hate it? If AI Overviews are enabled for you because you are in the US, there is no way to turn them off.
But, perhaps not coincidentally, Google also just launched a new Search filter. Run a search and click the Web filter button to see just a list of links.
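If you would rather skip the button, the Web filter can apparently be selected from the search URL itself. Here is a minimal Python sketch. Note the assumptions: the `udm=14` parameter is the value widely observed in the address bar when the Web filter is selected, it is not officially documented by Google, and it could change. The function name `web_filter_url` is my own, for illustration.

```python
from urllib.parse import urlencode


def web_filter_url(query: str) -> str:
    """Build a Google Search URL that should open with the Web filter selected."""
    # Assumption: udm=14 is the observed (undocumented) parameter for the Web filter.
    params = {"q": query, "udm": "14"}
    return "https://www.google.com/search?" + urlencode(params)


print(web_filter_url("sourdough starter"))
# prints https://www.google.com/search?q=sourdough+starter&udm=14
```

Some people use a URL like this as a custom search engine in their browser, so every search starts on the links-only view.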
How will this affect websites?
The big question is whether this will kill traffic to the websites that provided the information in the AI overview. If the information someone is looking for is summarized by Google, would they click a link to find out more?
Google says not to worry.
“And we see that the links included in AI Overviews get more clicks than if the page had appeared as a traditional web listing for that query.”
We’ll see if that holds true.
Can you remove your site from AI Overviews?
Barry Schwartz at Search Engine Roundtable reports on an experiment by Glenn Gabe to block Google from using your content in an overview. It requires blocking Google from using snippets from your site in the search result, which I suspect doesn’t help traffic to your site.
Note that this is different from opting out your content from Google’s AI training.
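For site owners curious whether a page already blocks snippets, here is a rough Python sketch that looks for the standard `nosnippet` robots directive in a page's HTML. Assumptions: I'm assuming the experiment relied on the `nosnippet` robots meta tag (or the equivalent `max-snippet:0`), both of which Google documents for controlling snippets. The class and function names are hypothetical, and this only checks meta tags, not HTTP headers or `data-nosnippet` attributes.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            # Directives are comma separated, e.g. "noindex, nosnippet"
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def blocks_snippets(html: str) -> bool:
    """Return True if the page asks search engines not to show snippets."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Assumption: max-snippet:0 is treated the same as nosnippet.
    return "nosnippet" in parser.directives or "max-snippet:0" in parser.directives


page = '<html><head><meta name="robots" content="nosnippet"></head></html>'
print(blocks_snippets(page))
```

Keep in mind that a page with this directive also loses its regular snippet in the normal search results, which is the tradeoff described above.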
Learn more
Announcement: Generative AI in Search: Let Google do the searching for you
Google Search Central: AI Overviews and Your Website
Sign up for Google Search Labs to try more AI tools.
AI-Powered Creative Tools
Using generative AI for creative work is controversial. Most current AI models are trained on photos and artwork and writing, in large part without permission from creators or content owners. Some generative AI fans loudly proclaim that AI tools will replace human writers and artists and some businesses have chosen generative AI as a cheap replacement for human labor. And there are missteps like Apple’s dystopian “Crush” ad that add to the negative vibe.
That all said, generative AI tools aren’t going to go away. It’s at least a small comfort that Google works with creatives to try to design tools that help support creative work, rather than replace it.
I will note that Google is not forthcoming on how their models are trained. The training data likely included images available on the web and YouTube videos, used without explicit permission.
Also, the demos very noticeably don’t include generated images or videos of people. Generating non-eerie looking people is difficult, but doing that well would have been quite notable.
Veo: Text-to-Video
Veo is Google’s video generation model. You can enter a text prompt to generate realistic video at 1080p in a wide range of styles.
It can add visuals to an input video
You can input an image along with the text prompt as a reference
It supports masked editing, allowing you to change a specific part of your video
It can extend videos to 60 seconds “and beyond”
It can “maintain visual consistency” across scenes
The output video is watermarked with SynthID
Some of this functionality will be added to the YouTube Shorts editor (eventually, there’s no announced time).
Imagen 3: Text-to-Image
Imagen 3 is Google’s latest text-to-image model. Google says it is better at understanding natural language prompts and now generates higher quality output.
One of the improvements over previous models is output of images with text. Instead of gibberish or oddly shaped letters, it can generate images with nice looking and accurate text.
Like the videos generated with Veo, images are watermarked with SynthID.
This model will eventually be incorporated across Google products, including the Gemini app, Google Workspace, and Google Ads.
Join the waitlist to try the ImageFX tool with Imagen 3. The current tool uses Imagen 2, an earlier model.
More Beats in MusicFX
Google’s MusicFX platform has a new DJ Mode that lets you combine instruments and musical genres. Google suggests using it to create music for your next video.
Gemini in Google Photos
Ask Photos is a version of the Gemini chatbot that lets you search your Google Photos library with naturally phrased questions. It supports complex searches -- one of Google’s examples is “Show me the best photo from each national park I’ve visited”, which is something that would be difficult to do manually.
But, it’s important to note that this is described as “experimental”, and the images in the announcement are labeled “Sequences simulated. Check responses for accuracy”. It may not work as well as expected.
Former Google Photos product manager John Nack is skeptical that people will actually use this, based on his experience that users usually just scroll through their library, even with search available.
It will be available “soon”.
What about privacy? Google says:
People will not review your Ask Photos conversations, “except in rare cases to address abuse or harm”.
Your personal data is not used to train Gemini or other AI products outside of Google Photos.
Your data in Google Photos is not used for ads.
More information:
Google I/O 2024: Introducing Veo and Imagen 3 generative AI tools
Introducing VideoFX, plus new features for ImageFX and MusicFX
Ask Photos: A new way to search your photos with Gemini
Gemini in Google Workspace: Summarize, Analyze, Write
Google Workspace includes Google’s productivity tools, including Gmail, Docs, Calendar, Meet and Chat. A lot of these sound useful, but they aren’t glamorous.
Ryan Broderick at Garbage Day sums it up:
While Google is just trying to repackage what Google already does and are calling it “AI” because no one would care if they said they were building Clippy 2.0. Yesterday, AI evangelists were losing their minds over Google’s new AI agent that can generate a spreadsheet of receipts in your Gmail inbox. I mean, do you hear yourselves?
Useful, a bit gee whiz, but not earth shaking. That said, if you do use Google Workspace apps, these updates should make your writing and other tasks a bit easier.
Note: If you have a personal Google Account, sign up for free Google Workspace Labs or a Google One AI Premium subscription to try out AI tools. Google Workspace customers need a Gemini for Workspace add-on.
Gemini in the side panel: Google Docs, Drive, Gmail, Sheets, and Slides all have a side panel where you can access connected add-ons. Gemini will now be there too, with the option to chat, summarize your document or emails, analyze your sheets, or provide suggestions. You’ll be able to use it to retrieve information across your account, such as having it find information in a Doc to insert into an email.
Availability: Rolling out now to Workspace Labs and Gemini for Workspace Alpha, available in June for Gemini for Workspace and Google One AI Premium.
Summarize Gmail email threads on mobile: There’s no side panel in the Workspace mobile apps, so Gemini integration works a bit differently. First up is a new “Summarize this email” button in the Gmail app, which can summarize long email threads.
Availability: Available “this month” in Workspace Labs, and in June to Gemini for Workspace and Google One AI Premium.
Gmail Contextual Smart Replies: When a short “Smart Reply” isn’t enough, the new Gemini-powered replies take entire email threads into consideration when generating a longer response. It will offer multiple “thoughtful, one tap options” to choose from.
Availability: Available in Workspace Labs on desktop and mobile starting in July.
Gmail Q&A: Ask Gemini to retrieve information from your emails. Ask questions like “When does Susan’s party start on Saturday?” or “Compare my recent roof repair bids by price and availability”, and get nicely formatted answers pulled from your email.
Availability: Available in Workspace Labs on desktop and mobile starting in July.
Organize and track email attachments: Gemini can recognize email attachments like receipt PDFs. It can organize those attachments into folders and add details to a spreadsheet where you can analyze them. Yes, this is what Ryan Broderick was referring to, and it does look pretty useful. But how well does it work if your emailed receipts are a mix of PDFs, image files and plain text in email messages? That’s what I want my AI assistant to handle.
Availability: Available to Workspace Labs “later this year”.
Help Me Write in Portuguese and Spanish: The “help me write” feature in Google Docs and Gmail initially launched in English. It is now available in Spanish and Portuguese as well.
Availability: Rolling out now to Gemini for Google Workspace and Google One AI Premium.
Create a virtual teammate in Google Chat to summarize projects and search conversations.
Availability: Google Workspace only. Not clear if this is currently available.
More
Announcement: New ways to engage with Gemini for Workspace
Announcement: Three new ways to stay productive with Gemini for Google Workspace
Learn how to use Gemini for business: Tips for marketers, sales research, writing good prompts
OpenAI announces GPT-4o
OpenAI chose to announce the new GPT-4o model the day before Google I/O, stealing some of Google’s thunder. The model is available for free with usage limits.
GPT-4 level “intelligence” but faster
Better at understanding images. (“For example, you can now take a picture of a menu in a different language and talk to GPT-4o to translate it, learn about the food's history and significance, and get recommendations.”)
Voice Mode with “natural, real-time voice conversation”. This is coming soon in alpha.
Supports more than 50 languages.
Friendlier, more conversational interface.
New desktop ChatGPT app for macOS.
When free users reach their message limit with GPT-4o, it will automatically switch to GPT-3.5 for further conversation.
That all sounds pretty dry, right? But OpenAI is getting some buzz for making the chat voice sound like a giggling, flirty young woman. As Zeeshan Aleem at MSNBC describes it: “OpenAI nurtures a creepy fantasy with its new AI chatbot, GPT-4o”. Ew.
Watch the demo and learn more.
Video Creator and Live Streaming Updates
YouTube CEO Neal Mohan wrote that video Creators should be considered for Emmy awards.
You can now Remix official music videos into YouTube Shorts.
Twitch launched nine new Partner Discords organized by language and region. The Discords are used by Twitch to share updates, get feedback from Partners, and open discussions. It’s also reported that Twitch is moving from permanent Partner Managers to Managers for Partners who meet specific eligibility requirements, with a review for eligibility twice a year.
Zach Bussey reports Twitch is ending support for its Twitch Studio streaming software, due to low usage.
Preview your videos in Google Drive by hovering your mouse over the video thumbnail. You can then click to open the video full screen.
Web Publishers and Search
WordPress Playground lets you develop WordPress sites and plugins in your browser. Now there is a Blueprint Library that lets developers share their setup.
Substack has a new guide for TikTok creators wanting to start on the platform.
I just saw a notice that X has updated the URL of their privacy policy. Why does this matter? Up until now it was a twitter.com URL, rather than an X.com link. Supposedly the site has completed its move to X.com, but it may not have completely shifted yet. As of this writing, I’m not seeing twitter.com redirect to X.com. Plus getting a post embed code is still on the “Twitter Publish” page (publish.twitter.com) and the embed code itself still uses a twitter.com URL. Speaking of which, I’m assuming the Twitter domain won’t be completely abandoned, as that would kill all older links and embeds from the site. (Maybe I shouldn’t assume?)
X added several new features to Communities: you will see recommended posts from Communities in your timeline, there are trending Community posts on the Community Explore tab, Communities are searchable by topic, and soon there will be better moderation tools, analytics for mods, and Community Spaces.
Threads now lets you filter your search results by “Recent”, which shows results in chronological order.
Instagram is expanding its Creator Marketplace to more countries. The Marketplace helps brands connect with creators. New countries include Argentina, Mexico, South Korea, Germany, Netherlands, France, Spain, Israel, Turkey, and Indonesia. That’s in addition to the US, Canada, Australia, New Zealand, Japan, India and Brazil.
Reddit is bringing back awards. There is once again an award button underneath posts and a variety of awards to select from. Awards will not be available for NSFW or mature content. To give an award you will have to pay real money to buy coins, which can then be used to purchase awards. They aren’t bringing back the occasional free awards for regular users. However, Redditors who had to quickly spend down their coin balance under the original system will be given free exclusive awards to pass out.
Reddit is expanding their Contributor Program to more countries. The program is open to Redditors with at least 1,000 gold/awards and 100 karma earned after receiving their first gold. Top awardees can earn real money.
Reddit has new tools for hosting an “Ask Me Anything” (AMA). Hosts can easily schedule and promote AMAs in advance and include guest collaborators. Participants can sort by Answered versus Unanswered questions.
Workplace from Meta will be shutting down in September 2025. This is basically a private Facebook for businesses. Data will be available for download through May 2026.
Communication and Collaboration
Google’s Project Starline “magic window” technology for almost-like-being-in-the-same-room video calls will be commercialized in 2025, with integration with Google Meet and Zoom.
Preview files in Google Drive by hovering over the file name. The preview will include the file type, file owner and when it was last updated. Video files can be previewed by hovering over the thumbnail.
Microsoft Teams is making it easier to create and join teams and channels and to archive outdated channels.
More AI Updates
ChatGPT now allows users to upload files from Google Drive or Microsoft OneDrive for analysis. This is available for ChatGPT Plus, Enterprise and Team users.
Prominent members of OpenAI’s team that works to ensure their AI won’t harm humanity have left the company. Vox has the story: “I lost trust”: Why the OpenAI team in charge of safeguarding humanity imploded
Reddit has done a deal with OpenAI. OpenAI will access Reddit’s Data API, and Reddit will add OpenAI-powered features. OpenAI is also advertising on the platform.
Meta’s Chief Product Officer Chris Cox says that Meta’s Emu text-to-image model is good because it trained on public Instagram photos. If you don’t want your photos used for future AI training, it seems the only option is to make them private.
More Reading (and watching)
The Markup has a graphical guide: How Your Attention Is Auctioned Off to Advertisers
Thanks for reading! 🌼