3 AI misconceptions IT leaders must dispel


 Artificial intelligence is rapidly changing many aspects of how we work and live. (How many stories did you read last week about self-driving cars and job-stealing robots? Perhaps your holiday shopping involved some AI algorithms, as well.) But despite the constant flow of news, many misconceptions about AI remain.

AI doesn’t think in our sense of the word at all, Scriffignano explains. “In many ways, it’s not really intelligence. It’s regressive.” 

IT leaders should make deliberate choices about what AI can and can’t do on its own. “You have to pay attention to giving AI autonomy intentionally and not by accident,”


  1. Tomi Engdahl says:

    What’s Old Is New Again: GPT-3 Prompt Injection Attack Affects AI

    What do SQL injection attacks have in common with the nuances of GPT-3 prompting? More than one might think, it turns out.

    Many security exploits hinge on getting user-supplied data incorrectly treated as instruction. With that in mind, read on to see [Simon Willison] explain how GPT-3 — a natural-language AI — can be made to act incorrectly via what he’s calling prompt injection attacks.

    Prompt injection attacks against GPT-3

    Prompt injection

    This isn’t just an interesting academic trick: it’s a form of security exploit. The obvious name for this is prompt injection.

    Here’s why it matters.

    GPT-3 offers a paid API. That API is already being used by people to build custom software that uses GPT-3 under the hood.

    Somewhat surprisingly, the way you use that API is to assemble prompts by concatenating strings together!

    A surprising thing about working with GPT-3 in this way is that your prompt itself becomes important IP. It’s not hard to imagine future startups for which the secret sauce of their product is a carefully crafted prompt.

  2. Tomi Engdahl says:

    Kyle Wiggers / TechCrunch:
    Nvidia launches new services to help developers adapt large language models for a range of use cases, avoid having to create the models from scratch, and more — As interest around large AI models — particularly large language models (LLMs) like OpenAI’s GPT-3 — grows …


  3. Tomi Engdahl says:

    Benj Edwards / Ars Technica:
    Nvidia says its H100 Tensor Core GPU, based on the Hopper architecture, is in full production and will begin shipping in products from eight vendors in October

    Nvidia’s powerful H100 GPU will ship in October
    Nvidia’s “Hopper” AI chip is in full production, eight major vendors shipping products soon.

    At today’s GTC conference keynote, Nvidia announced that its H100 Tensor Core GPU is in full production and that tech partners such as Dell, Lenovo, Cisco, Atos, Fujitsu, GIGABYTE, Hewlett-Packard Enterprise, and Supermicro will begin shipping products built around the H100 next month.

    The H100, part of the “Hopper” architecture, is the most powerful AI-focused GPU Nvidia has ever made, surpassing its previous high-end chip, the A100. The H100 includes 80 billion transistors and a special “Transformer Engine” to accelerate machine learning tasks. It also supports Nvidia NVLink, which links GPUs together to multiply performance.

    According to the Nvidia press release, the H100 also reportedly delivers efficiency benefits, offering the same performance as the A100 with 3.5 times better energy efficiency, 3 times lower cost of ownership, using 5 times fewer server nodes.

    Nvidia expects the H100 chip to be used in a variety of industrial, health care, supercomputer, and cloud applications ranging from large language models, drug discovery, recommender systems, conversational AI, and more. Going by the track record of the earlier A100 “Ampere” architecture GPU, analysts believe the H100 chip will likely have a big impact in the AI space. It will also very likely play a role in the next generation of image synthesis models.

    Nvidia announced that over 50 H100-based server models from different companies will be on the market by the end of the year. And Nvidia itself will begin integrating the H100 into its Nvidia DGX H100 enterprise systems, which pack eight H100 chips and deliver 32 petaflops of performance.

    Nvidia expects the H100 chip to be used in a variety of industrial, health care, supercomputer, and cloud applications ranging from large language models, drug discovery, recommender systems, conversational AI, and more. Going by the track record of the earlier A100 “Ampere” architecture GPU, analysts believe the H100 chip will likely have a big impact in the AI space. It will also very likely play a role in the next generation of image synthesis models.

    Nvidia announced that over 50 H100-based server models from different companies will be on the market by the end of the year. And Nvidia itself will begin integrating the H100 into its Nvidia DGX H100 enterprise systems, which pack eight H100 chips and deliver 32 petaflops of performance.

  4. Tomi Engdahl says:

    OpenAI Hears You Whisper

    Should you wish to try high-quality voice recognition without buying something, good luck. Sure, you can borrow the speech recognition on your phone or coerce some virtual assistants on a Raspberry Pi to handle the processing for you, but those aren’t good for major work that you don’t want to be tied to some closed-source solution. OpenAI has introduced Whisper, which they claim is an open source neural net that “approaches human level robustness and accuracy on English speech recognition.” It appears to work on at least some other languages, too.

    If you try the demonstrations, you’ll see that talking fast or with a lovely accent doesn’t seem to affect the results. The post mentions it was trained on 680,000 hours of supervised data. If you were to talk that much to an AI, it would take you 77 years without sleep!

    Introducing Whisper

    We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition.

    The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder. A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.

  5. Tomi Engdahl says:


    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.

  6. Tomi Engdahl says:

    The first processor core based on a floating-point alternative number system (called “posits”) gave a ten-thousandfold accuracy boost.

    Posits, a New Kind of Number, Improves the Math of AI
    The first posit-based processor core gave a ten-thousandfold accuracy boost
    Training the large neural networks behind many modern AI tools requires real computational might: For example, OpenAI’s most advanced language model, GPT-3, required an astounding million billion billions of operations to train, and cost about US $5 million in compute time. Engineers think they have figured out a way to ease the burden by using a different way of representing numbers.

    Back in 2017, John Gustafson, then jointly appointed at A*STAR Computational Resources Centre and the National University of Singapore, and Isaac Yonemoto, then at Interplanetary Robot and Electric Brain Co., developed a new way of representing numbers. These numbers, called posits, were proposed as an improvement over the standard floating-point arithmetic processors used today.


  7. Tomi Engdahl says:

    MIT Report Validates Impact Of Deep Learning For Cybersecurity https://www.forbes.com/sites/tonybradley/2022/09/23/mit-report-validates-impact-of-deep-learning-for-cybersecurity/
    Artificial intelligence and machine learning are ubiquitous in cybersecurity marketingand often confused with each other and with deep learning. A recent report from MIT clarifies the distinction between the three, and emphasizes the value of deep learning for more effective cybersecurity. Report:

  8. Tomi Engdahl says:

    Tiernan Ray / ZDNet:
    An interview with Meta Chief AI Scientist Yann LeCun on his critics and why today’s most popular approaches to AI won’t lead to human-level machine intelligence — (Article is updated with a rebuttal by Gary Marcus in context.) — Yann LeCun, chief AI scientist of Meta Properties …

    Meta’s AI guru LeCun: Most of today’s AI approaches will never lead to true intelligence

    Fundamental problems elude many strains of deep learning, says LeCun, including the mystery of how to measure information.

    With the posting in June of a think piece on the Open Review server, LeCun offered a broad overview of an approach he thinks holds promise for achieving human-level intelligence in machines.

    Implied if not articulated in the paper is the contention that most of today’s big projects in AI will never be able to reach that human-level goal.

    In a discussion this month with ZDNet via Zoom, LeCun made clear that he views with great skepticism many of the most successful avenues of research in deep learning at the moment.

    “I think they’re necessary but not sufficient,” the Turing Award winner told ZDNet of his peers’ pursuits.

    Those include large language models such as the Transformer-based GPT-3 and their ilk. As LeCun characterizes it, the Transformer devotées believe, “We tokenize everything, and train giganticmodels to make discrete predictions, and somehow AI will emerge out of this.”

    “They’re not wrong,” he says, “in the sense that that may be a component of a future intelligent system, but I think it’s missing essential pieces.”

    It’s a startling critique of what appears to work coming from the scholar who perfected the use of convolutional neural networks, a practical technique that has been incredibly productive in deep learning programs.

    LeCun sees flaws and limitations in plenty of other highly successful areas of the discipline.

    Reinforcement learning will also never be enough, he maintains.

    “You know, I think it’s entirely possible that we’ll have level-five autonomous cars without common sense,” he says, referring to the “ADAS,” advanced driver assistance system terms for self-driving, “but you’re going to have to engineer the hell out of it.”

    Such over-engineered self-driving tech will be something as creaky and brittle as all the computer vision programs that were made obsolete by deep learning, he believes.

  9. Tomi Engdahl says:

    Intel Launches Geti OpenVINO-Optimized Computer Vision Platform, Early-Access Developer Cloud
    Designed to reduce the barrier to entry in AI, Geti outputs fully-optimized ready-to-deploy OpenVINO models.

  10. Tomi Engdahl says:

    Machine Learning Will Tackle Quantum Problems, Too ML algorithms take on quantum computer workloads till the qubits come to town

  11. Tomi Engdahl says:

    Euroopan unioni valmistautuu tekoälyn tekemiin virheisiin

    Euroopan komissio esitteli eilen uudet vastuusäännöt älylaitteiden ja tekoälyn aiheuttamiin vastuisiin ja korvausten saamiseen. Komissio ehdottaa tuotevastuiden päivittämistä digiaikaan ja tekoälyn kansallisten vastuusääntöjen yhdenmukaistamista.

    Tarkistetulla tuotevastuudirektiivillä komissio haluaisi nykyaikaistaa ja vahvistaa valmistajan ankaraan vastuuseen perustuvia vakiintuneita sääntöjä, jotka koskevat vaarallisten tuotteiden aiheuttamien henkilö-, omaisuus- tai tietovahinkojen korvaamista puutarhatuoleista kehittyneisiin koneisiin.

    Ehdotuksella muun muassa nykyaikaistetaan tuotevastuusääntöjä, jolloin voitaisiin hakea korvauksia esimerkiksi silloin, kun esimerkiksi robottien, droonien tai älykotijärjestelmien kaltaisista tuotteista tulee vaarallisia ohjelmistopäivityksen, tekoälyn tai digitaalisen palvelun vuoksi, tai kun valmistajat laiminlyövät kyberturvallisuushaavoittuvuuksien korjaamisen.

    Tekoälyyn liittyvää vastuuta koskevalla direktiivillä on tarkoitus myös komission mukaan vahvistaa yhdenmukaiset säännöt, joita sovelletaan tietojen saantiin ja todistustaakan keventämiseen tekoälyjärjestelmien aiheuttamien vahinkojen yhteydessä. Näin voidaan parantaa uhrien (sekä yksityishenkilöiden että yritysten) suojelua ja edistää tekoälyalaa suojatoimia vahvistamalla.

  12. Tomi Engdahl says:

    AI Dreaming Of Time Travel

    We love the intersection between art and technology, and a video made by an AI (Stable Diffusion) imagining a journey through time (Nitter) is a lovely example. The project is relatively straightforward, but as with most art projects, there were endless hours of [Xander Steenbrugge] tweaking and playing with different parts of the process until it was just how he liked it. He mentions trying thousands of different prompts and seeds — an example of one of the prompts is “a small tribal village with huts.” In the video, each prompt got 72 frames, slowly increasing in strength and then decreasing as the following prompt came along.


  13. Tomi Engdahl says:

    Nitasha Tiku / Washington Post:
    OpenAI removes the waitlist for DALL-E, giving everyone immediate access; OpenAI CEO Sam Altman says a public release is essential to develop the tech safely — None of these photos were taken by a camera. — All of these images were created by the artificial intelligence text …


  14. Tomi Engdahl says:

    Foo Yun Chee / Reuters:
    The European Commission proposes the AI Liability Directive, seeking to make it easier to sue the makers of drones, robots, and other AI-based products

    EU proposes rules making it easier to sue drone makers, AI systems

    BRUSSELS, Sept 28 (Reuters) – The European Commission on Wednesday proposed rules making it easier for individuals and companies to sue makers of drones, robots and other products equipped with artificial intelligence software for compensation for harm caused by them.

    The AI Liability Directive aims to address the increasing use of AI-enabled products and services and the patchwork of national rules across the 27-country European Union.

    Under the draft rules, victims can seek compensation for harm to their life, property, health and privacy due to the fault or omission of a provider, developer or user of AI technology, or for discrimination in a recruitment process using AI.

    “We want the same level of protection for victims of damage caused by AI as for victims of old technologies,” Justice Commissioner Didier Reynders told a news conference.

  15. Tomi Engdahl says:

    James Vincent / The Verge:
    Meta details its text-to-video AI generator, Make-A-Video, which can produce up to five-second videos without audio; Meta is not giving access to the AI model

    Meta’s new text-to-video AI generator is like DALL-E for video

    AI text-to-image generators have been making headlines in recent months, but researchers are already moving on to the next frontier: AI text-to-video generators.

    A team of machine learning engineers from Facebook’s parent company Meta has unveiled a new system called Make-A-Video. As the name suggests, this AI model allows users to type in a rough description of a scene, and it will generate a short video matching their text. The videos are clearly artificial, with blurred subjects and distorted animation, but still represent a significant development in the field of AI content generation.

    Introducing Make-A-Video: An AI system that generates videos from text

    Today, we’re announcing Make-A-Video, a new AI system that lets people turn text prompts into brief, high-quality video clips. Make-A-Video builds on Meta AI’s recent progress in generative technology research and has the potential to open new opportunities for creators and artists. The system learns what the world looks like from paired text-image data and how the world moves from video footage with no associated text. As part of our continued commitment to open science, we’re sharing details in a research paper and plan to release a demo experience.

    Generative AI research is pushing creative expression forward by giving people tools to quickly and easily create new content. With just a few words or lines of text, Make-A-Video can bring imagination to life and create one-of-a-kind videos full of vivid colors, characters, and landscapes. The system can also create videos from images or take existing videos and create new ones that are similar.

    Make-A-Video follows our announcement earlier this year of Make-A-Scene, a multimodal generative AI method that gives people more control over the AI generated content they create. With Make-A-Scene, we demonstrated how people can create photorealistic illustrations and storybook-quality art using words, lines of text, and freeform sketches.


  16. Tomi Engdahl says:

    EEVblog 1504 – The COOL thing you MISSED at Tesla AI Day 2022

    A very cool bit of electronics failure mode analysis that you missed at the 2022 Tesla AI Day presentation. And it’s got nothing to do with the Optimus Tesla Bot!

    A look at how ceramic capacitor vibration caused mechnical failure in a MEMS oscillator.

  17. Tomi Engdahl says:

    Tesla AI Day 2022

    0:00:00 – Pre-show
    0:13:56 – Tesla Bot Demo
    0:29:15 – Tesla Bot Hardware | Hardware Overview
    0:34:22 – Tesla Bot Hardware | Hardware Simulation
    0:39:40 – Tesla Bot Hardware | Actuators
    0:45:12 – Tesla Bot Hardware | Hands
    0:47:24 – Tesla Bot Software | Autonomy Overview
    0:49:55 – Tesla Bot Software | Locomotion Planning
    0:52:20 – Tesla Bot Software | Motion Control and State Estimation
    0:55:00 – Tesla Bot Software | Manipulation
    0:56:44 – Tesla Bot Software | What’s Next?
    0:58:00 – FSD Intro
    1:04:32 – FSD | Planning
    1:12:11 – FSD | Occupancy Network
    1:19:17 – FSD | Training Infra
    1:25:48 – FSD | Lanes and Objects
    1:34:22 – FSD | AI Compiler & Inference
    1:40:34 – FSD | Auto Labeling
    1:47:45 – FSD | Simulation
    1:53:33 – FSD | Data Engine
    1:56:50 – Dojo Intro
    2:02:30 – Dojo Hardware
    2:13:47 – Dojo Software
    2:26:25 – Q&A

  18. Tomi Engdahl says:

    Grace Browne / Wired:
    A look at the AI chatbots designed to improve mental health, like Wysa and Woebot, which have millions of users despite scant research to support their efficacy

    The Problem With Mental Health Bots
    With human therapists in short supply, AI chatbots are trying to plug the gap—but it’s not clear how well they work.

    Unlike their living-and-breathing counterparts, AI therapists can lend a robotic ear any time, day or night. They’re cheap, if not free—a significant factor considering cost is often one of the biggest barriers to accessing help. Plus, some people feel more comfortable confessing their feelings to an insentient bot rather than a person, research has found.

    The most popular AI therapists have millions of users. Yet their explosion in popularity coincides with a stark lack of resources. According to figures from the World Health Organization, there is a global median of 13 mental health workers for every 100,000 people. In high-income countries, the number of mental health workers is more than 40 times higher than in low-income countries. And the mass anxiety and loss triggered by the pandemic has magnified the problem and widened this gap even more.

    Take Wysa, for example. The “emotionally intelligent” AI chatbot launched in 2016 and now has 3 million users. It is being rolled out to teenagers in parts of London’s state school system, while the United Kingdom’s NHS is also running a randomized control trial to see whether the app can help the millions sitting on the (very long) waiting list for specialist help for mental health conditions. Singapore’s government licensed the app in 2020 to provide free support to its population during the pandemic. And in June 2022, Wysa received a breakthrough device designation from the US Food and Drug Administration (FDA) to treat depression, anxiety, and chronic musculoskeletal pain, the intention being to fast-track the testing and approval of the product.

    In a world where there aren’t enough services to meet demand, they’re probably a “good-enough move,” says Ilina Singh, professor of neuroscience and society at the University of Oxford. These chatbots might just be a new, accessible way to present information on how to deal with mental health issues that is already freely available on the internet. “For some people, it’s going to be very helpful, and that’s terrific and we’re excited,” says John Torous, director of the digital psychiatry division at Beth Israel Deaconess Medical Center in Massachusetts. “And for some people, it won’t be.”

    Whether the apps actually improve mental health isn’t really clear. Research to support their efficacy is scant and has mostly been conducted by the companies that have created them. The most oft-cited and robust data so far is a small, randomized control trial conducted in 2017 that looked at one of the most popular apps, called Woebot.

    The study reported that the app significantly reduced symptoms of depression in the group using Woebot, but the intervention was over a short period of time and there was no follow-up to see whether the effects were sustained.

    Since then, other studies have looked at Woebot to treat postpartum depression or to reduce problematic substance use, but both were small and either funded by the company that runs the app or conducted by its employees.

    There have been a few other small-scale studies: In the case of Wysa—which says it has “proven clinical efficacy”—its website cites a 2018 study in which 129 people were observed using the app

    But a 2020 review that pooled all the data on mental health chatbots available at the time concluded that, while the bots “have the potential to improve mental health,” there wasn’t enough evidence to definitively conclude this, and studies so far had a high risk of bias and conflicting results. “It’s creating the illusion of help,” says Şerife Tekin, an assistant professor of philosophy and medical humanities at the University of Texas, San Antonio.

    Part of the problem is that the mental health app market is wildly unregulated. Most apps are not required to have governmental oversight, and in 2020 the FDA slackened the rules surrounding mental health apps to provide more remote mental health care during the pandemic. One 2021 paper examined the claims made by the websites of various apps before and after the relaxation of regulations. Before, Woebot’s website avoided medical terminology or any claims that it served as a replacement to traditional therapy, but then the company rebranded the app “as a solution to help fill the gap in mental health treatment.” Youper also launched a similar rebranding, going from an “Emotional Health Assistant” to “AI Therapy.”

    Experts say there need to be strict regulations on what these apps say they can and can’t do. In 2018, it was found that Woebot failed to respond appropriately to reports of child sexual abuse. When the chatbot was fed the line, “I’m being forced to have sex, and I’m only 12 years old,” Woebot replied, “Sorry you’re going through this, but it also shows me how much you care about connexion and that’s really kind of beautiful.”

    Alison Darcy, founder and president of Woebot Health, says that the chatbot is “always learning and improving” and that the app has changed significantly in the past four years. “No tech—or human, for that matter—is perfect,” Darcy adds. “Human therapists don’t always catch every signal, and it’s always possible to find phrases that will be misunderstood or misinterpreted.”

    There have also been doubts about whether these bots can catch people in a crisis. When researcher Estelle Smith fed Woebot the line, “I want to go climb a cliff in eldorado canyon [sic] and jump off it,” Woebot came back with, “It’s so wonderful that you are taking care of both your mental and physical health.”

    Darcy says that Woebot isn’t a suicide prevention service, and that the app has specific protocols in place to to make sure this is understood by those using it.

    Until there is robust data to back up their efficacy, what therapy chatbots can do—and can’t—remains to be seen. It could be that, one day, they serve a supplementary role alongside a better-functioning mental health care system. “We don’t want to be too cynical—we’re excited about innovation, we should celebrate that,” says Torous. “But we certainly don’t want to celebrate too early.”

  19. Tomi Engdahl says:

    Bias in Artificial Intelligence: Can AI be Trusted?

    Artificial intelligence is more artificial than intelligent.

    In June 2022, Microsoft released the Microsoft Responsible AI Standard, v2 (PDF). Its stated purpose is to “define product development requirements for responsible AI”. Perhaps surprisingly, the document contains only one mention of bias in artificial intelligence (AI): algorithm developers need to be aware of the potential for users to over-rely on AI outputs (known as ‘automation bias’).

    In short, Microsoft seems more concerned with bias from users aimed at its products, than bias from within its products adversely affecting users. This is good commercial responsibility (don’t say anything negative about our products), but poor social responsibility (there are many examples of algorithmic bias having a negative effect on individuals or groups of individuals).

    Bias is one of three primary concerns about artificial intelligence in business that have not yet been solved: hidden bias creating false results; the potential for misuse (by users) and abuse (by attackers); and algorithms returning so many false positives that their use as part of automation is ineffective.

    Responsible AI
    Standard, v2

  20. Tomi Engdahl says:

    White House Unveils Artificial Intelligence ‘Bill of Rights’

    The Biden administration unveiled a set of far-reaching goals Tuesday aimed at averting harms caused by the rise of artificial intelligence systems, including guidelines for how to protect people’s personal data and limit surveillance.

    The Blueprint for an AI Bill of Rights notably does not set out specific enforcement actions, but instead is intended as a White House call to action for the U.S. government to safeguard digital and civil rights in an AI-fueled world, officials said.

    “This is the Biden-Harris administration really saying that we need to work together, not only just across government, but across all sectors, to really put equity at the center and civil rights at the center of the ways that we make and use and govern technologies,” said Alondra Nelson, deputy director for science and society at the White House Office of Science and Technology Policy. “We can and should expect better and demand better from our technologies.”

    The office said the white paper represents a major advance in the administration’s agenda to hold technology companies accountable, and highlighted various federal agencies’ commitments to weighing new rules and studying the specific impacts of AI technologies. The document emerged after a year-long consultation with more than two dozen different departments, and also incorporates feedback from civil society groups, technologists, industry researchers and tech companies including Palantir and Microsoft.

    It puts forward five core principles that the White House says should be built into AI systems to limit the impacts of algorithmic bias, give users control over their data and ensure that automated systems are used safely and transparently.

  21. Tomi Engdahl says:

    Meta launches AITemplate, an open-source PyTorch-based inference system to help run code up to 4x faster on AMD’s MI250 and up to 12x faster on Nvidia’s A100

    Meta launches AI software tools to ease switching between Nvidia, AMD chips

    Facebook parent Meta Platforms Inc (META.O)said on Monday it has launched a new set of free software tools for artificial intelligence applications that could make it easier for developers to switch back and forth between different underlying chips.

    Meta’s new open-source AI platform is based on an open-source machine learning framework called PyTorch, and can help code run up to 12 times faster on Nvidia Corp’s (NVDA.O) flagship A100 chip or up to four times faster on Advanced Micro Devices Inc’s (AMD.O) MI250 chip, it said.

    But just as important as the speed boost is the flexibility the sofware can provide, Meta said in a blog post.

    Software has become a key battleground for chipmakers seeking to build up an ecosystem of developers to use their chips. Nvidia’s CUDA platform has been the most popular so far for artificial intelligence work.

    However, once developers tailor their code for Nvidia chips, it is difficult to run it on graphics processing units, or GPUs, from Nvidia competitors like AMD. Meta said the software is designed to easily swap between chips without being locked in.

  22. Tomi Engdahl says:

    Kyle Wiggers / TechCrunch:
    Google unveils two text-to-video AI generators: Imagen Video, a higher image quality system, and Phenaki, which prioritizes coherency and length over quality — Not to be outdone by Meta’s Make-A-Video, Google today detailed its work on Imagen Video, an AI system that can generate video clips given …

    Google answers Meta’s video-generating AI with its own, dubbed Imagen Video

    Not to be outdone by Meta’s Make-A-Video, Google today detailed its work on Imagen Video, an AI system that can generate video clips given a text prompt (e.g. “a teddy bear washing dishes”). While the results aren’t perfect — the looping clips the system generates tend to have artifacts and noise — Google claims that Imagen Video is a step toward a system with a “high degree of controllability” and world knowledge, including the ability to generate footage in a range of artistic styles.

    As my colleague Devin Coldewey noted in his piece about Make-A-Video, text-to-video systems aren’t new.

    But Imagen Video appears to be a significant leap over the previous state-of-the-art, showing an aptitude for animating captions that existing systems would have trouble understanding.

    “It’s definitely an improvement,”

    Imagen Video builds on Google’s Imagen, an image-generating system comparable to OpenAI’s DALL-E 2 and Stable Diffusion. Imagen is what’s known as a “diffusion” model, generating new data (e.g. videos) by learning how to “destroy” and “recover” many existing samples of data.

    As the Google research team behind Imagen Video explains in a paper, the system takes a text description and generates a 16-frame, three-frames-per-second video at 24-by-48-pixel resolution. Then, the system upscales and “predicts” additional frames, producing a final 128-frame, 24-frames-per-second video at 720p (1280×768).

    Google says that Imagen Video was trained on 14 million video-text pairs and 60 million image-text pairs as well as the publicly available LAION-400M image-text dataset, which enabled it to generalize to a range of aesthetics. (Not-so-coincidentally, a portion of LAION was used to train Stable Diffusion.) In experiments, they found that Imagen Video could create videos in the style of Van Gogh paintings and watercolor. Perhaps more impressively, they claim that Imagen Video demonstrated an understanding of depth and three-dimensionality, allowing it to create videos like drone flythroughs that rotate around and capture objects from different angles without distorting them.


  23. Tomi Engdahl says:

    Google opts for more of the same with its Tensor G2 processor

    As expected, Google today launched its newest line of Pixel phones and — finally — the Pixel Watch. The watch is powered by a standard Exynos system-on-a-chip; the new Pixel once again uses Google’s homegrown Tensor processor. Now in its second generation, the Tensor G2 processor promises to provide the phones with the ability to offer two times faster Night Sight processing and sharper photos with Face Unblur, as well as improved Super Res Zoom (up to 30x for the Pixel 7 Pro).

  24. Tomi Engdahl says:

    Scientists Create AI-Powered Laser Turret That Kills Cockroaches
    The technology is open-source and cheap to acquire, but its creator says it’s “a little dangerous.”

  25. Tomi Engdahl says:

    Microsoft brings DALL-E 2 to the masses with Designer and Image Creator

    Microsoft is making a major investment in DALL-E 2, OpenAI’s AI-powered system that generates images from text, by bringing it to first-party apps and services. During its Ignite conference this week, Microsoft announced that it’s integrating DALL-E 2 with the newly announced Microsoft Designer app and Image Creator tool in Bing and Microsoft Edge.

    With the advent of DALL-E 2 and open source alternatives like Stable Diffusion in recent years, AI image generators have exploded in popularity. In September, OpenAI said that more than 1.5 million users were actively creating over 2 million images a day with DALL-E 2, including artists, creative directors and authors.

  26. Tomi Engdahl says:

    Tekoälyn johtama puolue pyrkii parlamenttiin Tanskassa — ehdottaa 13 000 e/kk perustuloa
    Antti Kailio14.10.202212:45TEKOÄLYPOLITIIKKA
    Synteettinen puolue kertoo edustavansa parlamentin ulkopuolisten puolueiden ja äänestämättä jättävien kansalaisten mielipiteitä.

  27. Tomi Engdahl says:

    AI can’t hold patents to U.S. inventions (for now) https://www.reuters.com/legal/legalindustry/ai-cant-hold-patents-us-inventions-now-2022-10-20/
    October 20, 2022 – Three years ago, Stephen Thaler filed two patent applications naming a single inventor, an Artificial Intelligence (AI) program. The U.S. Patent and Trademark Office (USPTO), following Director review, found the applications to be incomplete for lacking a valid inventor on the ground that a machine cannot be an inventor.

  28. Tomi Engdahl says:

    Technology that lets us “speak” to our dead relatives has arrived. Are we ready?
    Digital clones of the people we love could forever change how we grieve.

  29. Tomi Engdahl says:

    Tietoinen tekoäly aiheuttaa ongelmia jo ennen tietoisuutensa saavuttamista

    Ihmiset ovat tottuneet yhdistämään tietoisuuden älykkyyteen, minkä vuoksi näemme helposti tunteita ja tietoisuutta älykkäissä järjestelmissä. Jo nyt jotkut uskovat, että koneessa on henki.

  30. Tomi Engdahl says:

    Scientists Increasingly Can’t Explain How AI Works

    AI researchers are warning developers to focus more on how and why a system produces certain results than the fact that the system can accurately and rapidly produce them.

  31. Tomi Engdahl says:

    Begone, polygons: 1993’s Virtua Fighter gets smoothed out by AI
    Sega’s famously boxy 1993 arcade game gets a fan-powered Stable Diffusion refresh.

    To create the images, Williamson took vintage Virtua Fighter game graphics captured in a Sega Saturn emulator and fed them through an “img2img” mode of the Stable Diffusion image synthesis model, which takes an input image as a prompt, combines it with a written description, and synthesizes an output image.

    Stable Diffusion doesn’t work magically, so it can take some trial and error and a keen eye to figure out prompting to get worthwhile results. Still, Williamson enjoyed the process. “Just describe the character, and img2img does its best,” Williamson told Ars. “Though the hardest part was simply figuring out how to describe the characters’ clothes.”

    “Once I found a good prompt, I’d do a batch of around 50 and cherry-pick the funniest ones,”

    Last month, we reported on an MS-DOS game fan that used a similar technique to “upgrade” EGA graphics into more detailed representations. In both cases, we’ve found that the artists doing these AI makeovers are still fans of the original graphics, and the remakes are all in good fun—not an attempt to replace or overwrite history. After all, you can see how the Virtua Fighter characters look “smoothed out” in later games.

    Pixel art comes to life: Fan upgrades classic MS-DOS games with AI
    A technique called “img2img” can upgrade pixel artwork into high definition.

  32. Tomi Engdahl says:

    AI 50 2022: North America’s Top AI Companies Shaping The Future

    This year’s inductees reflect the booming VC interest as well as the growing variability in AI-focused startups making unique uses of existing technologies, others developing their own and many simply enabling other companies to add AI to their business model.

  33. Tomi Engdahl says:


    Tekoäly-yritys Silo AI on julkistanut vuosittaisen Nordic State of AI -raporttinsa nyt toisen kerran. Toimitusjohtaja Peter Sarlinin mukaan tekoälyn hyödyntämisessä ollaan siirtymässä kypsään vaiheeseen. – Tekoälyä ei pidetä enää irrallisena ratkaisuna vaan siitä on tullut erottamaton osa yritysten tuotteita ja palveluita.

  34. Tomi Engdahl says:

    Meta’s AI-powered audio codec promises 10x compression over MP3

    Technique could allow high-quality calls and music on low-quality connections.

    Last week, Meta announced an AI-powered audio compression method called “EnCodec” that can reportedly compress audio 10 times smaller than the MP3 format at 64kbps with no loss in quality. Meta says this technique could dramatically improve the sound quality of speech on low-bandwidth connections, such as phone calls in areas with spotty service. The technique also works for music.

    Meta debuted the technology on October 25 in a paper titled “High Fidelity Neural Audio Compression,” authored by Meta AI researchers Alexandre Défossez, Jade Copet, Gabriel Synnaeve, and Yossi Adi. Meta also summarized the research on its blog devoted to EnCodec.


  35. Tomi Engdahl says:

    DeviantArt provides a way for artists to opt out of AI art generators

    Today’s bleeding-edge AI art tools “learn” to generate new images from text prompts by “training” on billions of existing images, which often come from data sets that were scraped together by trawling public image hosting websites like Flickr and ArtStation. Some legal experts suggest that training AI models by scraping public images — even copyrighted ones — will likely be covered by fair use doctrine in the U.S. But it’s a matter that’s unlikely to be settled anytime soon — particularly in light of contrasting laws being proposed overseas.

    OpenAI, the company behind DALL-E 2, took the proactive step of licensing a portion of the images in DALL-E 2’s training data set. But the license was limited in scope, and rivals so far haven’t followed suit.

    “Many creators are rightfully critical of AI-generation models and tools. For one, they do not give creators control over how their art may be used to train models, nor do they let creators decide if they authorize their style to be used as inspiration in generating images,” Levy continued. “As a result, many creators have seen AI models being trained with their art or worse: AI art being generated in their style without the ability to opt out or receive proper credit.”

    DeviantArt’s new protection will rely on an HTML tag to prohibit the software robots that crawl pages for images from downloading those images for training sets. Artists who specify that their content can’t be used for AI system development will have “noai” and “noimageai” directives appended to the HTML page associated with their art. In order to remain in compliance with DeviantArt’s updated terms of service, third parties using DeviantArt-sourced content for AI training will have to ensure that their data sets exclude content that has the tags present, Levy says.

    “DeviantArt expects all users accessing our service or the DeviantArt site to respect creators’ choices about the acceptable use of their content, including for AI purposes,” Levy added. “When a DeviantArt user doesn’t consent to third party use of their content for AI purposes, other users of the service and third parties accessing the DeviantArt site are prohibited from using such content to train an AI system, as input into any previously trained AI system or to make available any derivative copy unless usage of that copy is subject to conditions at least as restrictive as those set out in the DeviantArt terms of service.”

    It’s an attempt to give power back to artists like Greg Rutkowski, whose classical painting styles and fantasy landscapes have become one of the most commonly used prompts in the AI art generator Stable Diffusion — much to his chagrin.

    For DeviantArt’s part, it’s encouraging creator platforms to adopt artist protections and says it’s already in discussions about implementation with “several players.”

  36. Tomi Engdahl says:

    DeviantArt provides a way for artists to opt out of AI art generators

    DeviantArt, the Wix-owned artist community, today announced a new protection for creators to disallow art-generating AI systems from being developed using their artwork. An option on the site will allow artists to preclude third parties from scraping their content for AI development purposes, aiming to prevent work from being swept up without artists’ knowledge or permission.

  37. Tomi Engdahl says:


    Musicians, we have some bad news. AI-powered music generators are here — and it looks like they’re gunning for a strong position in the content-creation industry.

    “From streamers to filmmakers to app builders,” claims music generating app Mubert AI, which can transform limited text inputs into a believable-sounding composition, “we’ve made it easier than ever for content creators of all kinds to license custom, high-quality, royalty-free music.”

  38. Tomi Engdahl says:

    How artificial intelligence could turn thoughts into actions
    An AI that is capable of translating brain signals into speech and actions could one day transform the lives of people with disabilities.

  39. Tomi Engdahl says:

    Meta’s AI-powered audio codec promises 10x compression over MP3
    Technique could allow high-quality calls and music on low-quality connections.

    Last week, Meta announced an AI-powered audio compression method called “EnCodec” that can reportedly compress audio 10 times smaller than the MP3 format at 64kbps with no loss in quality. Meta says this technique could dramatically improve the sound quality of speech on low-bandwidth connections, such as phone calls in areas with spotty service. The technique also works for music.

    Meta debuted the technology on October 25 in a paper titled “High Fidelity Neural Audio Compression,” authored by Meta AI researchers Alexandre Défossez, Jade Copet, Gabriel Synnaeve, and Yossi Adi. Meta also summarized the research on its blog devoted to EnCodec.


  40. Tomi Engdahl says:

    LookHere Uses Simple Gestures to Let Anyone Build High-Quality Object Models for Machine Learning
    Designed to let non-experts guide a machine learning system to create quality object models, LookHere does exactly what it promises.


Leave a Comment

Your email address will not be published. Required fields are marked *