• Generally Intelligent #9: Drew Linsley, Brown, on inductive biases for vision and generalization

    RSS · Spotify · Apple Podcasts · Pocket Casts


    Drew Linsley (Google Scholar) (Website) is a Paul J. Salem senior research associate at Brown, advised by Thomas Serre. He is working on building computational models of the visual system that serve the dual purpose of (1) explaining biological function and (2) extending artificial vision. Prior to his work in the Serre lab, he completed a PhD in computational neuroscience at Boston College and a BA in Psychology at Hamilton College.

    His most recent paper at NeurIPS is Stable and expressive recurrent vision models. It presents “contractor recurrent back-propagation” (C-RBP), an alternative to back-propagation through time (BPTT) for recurrent vision models. C-RBP has O(1) memory complexity for an N-step model versus BPTT’s O(N), and it learns long-range spatial dependencies in cases where BPTT cannot.
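
    For intuition, here is a minimal PyTorch sketch (not the authors’ code: the toy cell, dimensions, and the one-step gradient approximation are illustrative assumptions) of the constant-memory idea behind C-RBP. The recurrent cell is iterated toward a fixed point without building a computation graph, and gradients are attached only through the final step; C-RBP itself computes the fixed-point gradient more carefully and constrains the cell to be contractive.

        import torch
        import torch.nn as nn

        class ContractiveCell(nn.Module):
            """Toy recurrent cell; the 0.5 * tanh keeps the update contractive."""
            def __init__(self, dim):
                super().__init__()
                self.linear = nn.Linear(dim, dim)

            def forward(self, h, x):
                return 0.5 * torch.tanh(self.linear(h)) + x

        def constant_memory_forward(cell, x, n_steps=50):
            h = torch.zeros_like(x)
            # Run all but the last step without a graph: memory stays O(1) in n_steps.
            with torch.no_grad():
                for _ in range(n_steps - 1):
                    h = cell(h, x)
            # Re-attach gradients through one final step, a crude stand-in for the
            # implicit fixed-point gradient that C-RBP computes.
            return cell(h.detach(), x)

        cell = ContractiveCell(dim=8)
        x = torch.randn(4, 8, requires_grad=True)
        h_star = constant_memory_forward(cell, x)
        h_star.sum().backward()  # backward cost does not grow with n_steps
        print(x.grad.shape)      # torch.Size([4, 8])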

    Drew is also organizing an ICLR 2021 workshop, Generalization Beyond the Training Distribution in Brains and Machines, on Friday, May 7th, 2021. You can find more details on the workshop website and @ICLR_brains.

    Lastly, Drew is looking to work with collaborators in robotics, so feel free to reach out!

    Highlights from our conversation:

    🧠 Building brain-inspired inductive biases into computer vision

    🖼 A learning algorithm to improve recurrent vision models (C-RBP)

    🤖 Creating new benchmarks to move towards generalization


    Below are the show notes and full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #8: Giancarlo Kerg, Mila, on approaching deep learning from mathematical foundations

    RSS · Spotify · Apple Podcasts · Pocket Casts


    Giancarlo Kerg (Google Scholar) is a PhD student at Mila, supervised by Yoshua Bengio and Guillaume Lajoie. He is working on out-of-distribution generalization and modularity in memory-augmented neural networks. Prior to his PhD, he studied pure mathematics at Cambridge and Université Libre de Bruxelles.

    His most recent paper at NeurIPS is Untangling tradeoffs between recurrence and self-attention in neural networks. It presents a proof for how self-attention mitigates the gradient vanishing problem when trying to capture long-term dependencies. Building on this, it proposes a way to scalably use sparse self-attention with recurrence, via a relevancy screening mechanism that mirrors the cognitive process of memory consolidation.
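
    As a quick refresher on the problem that proof addresses (this is the standard textbook argument, not the paper’s specific result): in a vanilla RNN, the gradient of a loss at step T with respect to the hidden state at an earlier step t is

        \frac{\partial \mathcal{L}_T}{\partial h_t}
          = \frac{\partial \mathcal{L}_T}{\partial h_T}
            \prod_{k=t+1}^{T} \frac{\partial h_k}{\partial h_{k-1}}

    a product of T - t Jacobians that tends to vanish (or explode) as the gap grows, whereas a self-attention connection from step t to step T contributes a gradient path of length one that bypasses this product entirely.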

    Highlights from our conversation:

    🧮 Pure math foundations as an approach to progress and structural understanding in deep learning research

    🧠 How a formal proof on the way self-attention mitigates gradient vanishing when capturing long-term dependencies in RNNs led to a relevancy screening mechanism resembling human memory consolidation

    🎯 Out-of-distribution generalization through modularity and inductive biases

    🏛 Working at Mila with Yoshua Bengio and other collaborators


    Below are the show notes and full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #7: Yujia Huang, Caltech, on neuro-inspired generative models

    RSS · Spotify · Apple Podcasts · Pocket Casts


    Yujia Huang (@YujiaHuangC) is a PhD student at Caltech, working at the intersection of deep learning and neuroscience. She worked on optics and biophotonics before venturing into machine learning. Now, she hopes to design “less artificial” artificial intelligence.

    Her most recent paper at NeurIPS is Neural Networks with Recurrent Generative Feedback, introducing Convolutional Neural Networks with Feedback (CNN-F).

    Yujia is open to working with collaborators from many areas, including neuroscience, signal processing, and control, as well as anyone interested in generative models for classification. Feel free to reach out to her!

    Generative modeling in human brains on a picture of a blurry cat
    CNN-F is inspired by a recurrent generative feedback model of human visual perception. Experiments suggest that humans use recurrent feedback to recognize challenging and obfuscated images more easily. Illustrated here, the brain classifies an image of a blurry cat, using a generative model to update its posterior beliefs.
    Results of CNN-F on adversarial robustness
    CNN-F-5 improves adversarial robustness of CNNs.

    Highlights from our conversation:

    🏗 How recurrent generative feedback, a neuro-inspired design, improves adversarial robustness and achieves similar performance with fewer labels

    🧠 Drawing inspiration from classical ML & neuroscience for new deep learning models

    📊 What a new Turing test for “less artificial” AI could look like

    💡 Tips for new machine learning researchers!


    Below are the show notes and full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • A PyTorch Implementation of Slot Attention

    TL;DR: We’re open sourcing a PyTorch implementation of Object-Centric Learning with Slot Attention, one of our favorite papers from NeurIPS 2020.

    Check out our implementation at https://github.com/untitled-ai/slot_attention.

    Also, we’re hiring! You can find our job postings here, and email us at jobs@generallyintelligent.ai if you find a role for you.

    Slot Attention Outputs
    Outputs of our slot attention model, demonstrating the model's ability to divide objects, or parts of objects, into slots.
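
    If you just want the gist of the module before reading the repo, here is a compact, hedged sketch of the Slot Attention update in PyTorch (single-head, following Locatello et al.; the hyperparameter values are illustrative, and our released implementation differs in details):

        import torch
        import torch.nn as nn

        class SlotAttention(nn.Module):
            def __init__(self, num_slots=7, dim=64, iters=3, eps=1e-8):
                super().__init__()
                self.num_slots, self.iters, self.eps = num_slots, iters, eps
                self.scale = dim ** -0.5
                # Slots are initialized from a learned Gaussian.
                self.slots_mu = nn.Parameter(torch.randn(1, 1, dim))
                self.slots_log_sigma = nn.Parameter(torch.zeros(1, 1, dim))
                self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
                self.gru = nn.GRUCell(dim, dim)
                self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
                self.norm_inputs, self.norm_slots, self.norm_mlp = (nn.LayerNorm(dim) for _ in range(3))

            def forward(self, inputs):  # inputs: (batch, num_inputs, dim)
                b, n, d = inputs.shape
                inputs = self.norm_inputs(inputs)
                k, v = self.to_k(inputs), self.to_v(inputs)
                slots = self.slots_mu + self.slots_log_sigma.exp() * torch.randn(
                    b, self.num_slots, d, device=inputs.device)
                for _ in range(self.iters):
                    slots_prev = slots
                    q = self.to_q(self.norm_slots(slots))
                    # Softmax over the slot axis: slots compete for each input location.
                    attn = torch.softmax(torch.einsum("bid,bjd->bij", q, k) * self.scale, dim=1)
                    attn = attn + self.eps
                    attn = attn / attn.sum(dim=-1, keepdim=True)  # weighted mean over inputs
                    updates = torch.einsum("bij,bjd->bid", attn, v)
                    slots = self.gru(updates.reshape(-1, d), slots_prev.reshape(-1, d)).reshape(b, -1, d)
                    slots = slots + self.mlp(self.norm_mlp(slots))
                return slots  # (batch, num_slots, dim)

    As in the paper, the inputs are CNN features with positional embeddings, and each slot is decoded into a per-slot image and mask that together reconstruct the scene.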

  • Generally Intelligent #6: Julian Chibane, MPI-INF, on 3D reconstruction using implicit functions

    RSS · Spotify · Apple Podcasts · Pocket Casts

    Our next guest, Julian Chibane, is a PhD student in the Real Virtual Humans group at the Max Planck Institute for Informatics in Germany.

    His recent work centers around implicit functions for 3D reconstruction, and his most recent paper at NeurIPS is Neural Unsigned Distance Fields for Implicit Function Learning. He also introduced Implicit Feature Networks (IF-Nets) in Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion. Julian is open to collaborators interested in similar work, so feel free to reach out!

    Results of Neural Unsigned Distance Fields (NDF)
    Neural Unsigned Distance Fields (NDF) can represent and reconstruct complex open surfaces. Given a sparse test point cloud (left), it generates a detailed, completed scene (right).
    Results of IF-Nets
    IF-Nets can reconstruct point clouds where a part is occluded, e.g. the human's back (left), with continuous outputs (right).

    Highlights from our conversation:

    🖼 How, surprisingly, the IF-Net architecture learned reasonable representations of humans & objects without priors

    🔢 A simple observation that led to Neural Unsigned Distance Fields, which handle 3D scenes without a clear inside vs. outside (most scenes!)

    📚 Navigating open questions in 3D representation, and the importance of focusing on what’s working


    Below is the full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Josh Albrecht: A New AI Research Lab

    Our CTO Josh Albrecht was interviewed on Machine Learning Engineered! Josh talks about pivoting from an AI recruiting startup to Generally Intelligent, touches on what we’re working on now, and discusses how to think about creating a great research environment.

    You can find show notes and links mentioned in the podcast on Charlie You’s blog post on this episode.

    Quotes we enjoyed:

    “I think that other primates or dolphins are actually extremely intelligent. And biologically, our brain is almost identical to that of many primates, though their language abilities are much more limited, and their ability to do symbolic reasoning is very limited. The interesting thing about intelligence is it’s 99% there in the monkey brain. I think neural networks can approximate huge swaths of the neocortex and get very good correlations of whole clusters of neurons.”

    “Machine learning is not really about automation. I think it’s much more about augmenting people in the tasks they’re actually doing, meeting people where they are, and considering the human side of the equation. That’s something that’s stuck with us now, and the way that we’re thinking about our current work. ML and AI should be viewed from the lens of how they are serving human values, and how they are helping people do the things that they want to do.”

    Yann LeCun’s cake analogy: “The cake itself is unsupervised learning, the frosting is supervised, and the cherry on top is reinforcement learning.”

    On how curated datasets don’t represent the real world: “None of those images are a picture of a blank wall, or a picture of the grass or something, because no one takes a picture of the wall, because it’s boring. But what do you see in real life? Most of the time you have a bunch of boring walls. The natural setting is, almost everything is boring, except this extremely long tail of things that are interesting.”


  • Generally Intelligent #5: Katja Schwarz, MPI-IS, on GANs, implicit functions, and 3D scene understanding

    RSS · Spotify · Apple Podcasts · Pocket Casts

    Katja Schwarz (Google Scholar) came to machine learning from physics, and is now working on 3D geometric scene understanding at the Max Planck Institute for Intelligent Systems. Her most recent work, “Generative Radiance Fields for 3D-Aware Image Synthesis,” revealed that radiance fields are a powerful representation for generative image synthesis, leading to 3D-consistent models that render with high fidelity.

    Highlights from our conversation:

    🥦 The role 3D generation plays in conceptual understanding

    📝 Tons of practical tips on GAN training

    〰 Continuous functions as representations for 3D objects


    Below is the full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #4: Joel Lehman, OpenAI, on evolving intelligence, open-endedness, and reinforcement learning

    RSS · Spotify · Apple Podcasts · Pocket Casts

    Our fourth episode features Joel Lehman (Google Scholar), previously a founding member at Uber AI Labs and assistant professor at the IT University of Copenhagen. He’s now a research scientist at OpenAI, where he focuses on open-endedness, reinforcement learning, and AI safety.

    Joel’s PhD dissertation introduced the novelty search algorithm. That work inspired him to write the popular science book “Why Greatness Cannot Be Planned” with his PhD advisor Ken Stanley; the book discusses what evolutionary algorithms imply for how individuals and society should think about objectives.

    Highlights from our conversation:

    • How discovering novelty search totally changed Joel’s philosophy of life
    • Whether you can sometimes reach your objective more quickly by not trying to reach it
    • Better ways to evolve intelligence
    • Why reinforcement learning is a natural framework for open-endedness


    Below is the full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #3: Cinjon Resnick, NYU, on activity and scene understanding

    RSS · Spotify · Apple Podcasts · Pocket Casts

    On this episode of Generally Intelligent, we interview Cinjon Resnick (Google Scholar), formerly of Google Brain and now doing his PhD at NYU, about why he believes scene understanding is critical to out-of-distribution generalization, and how his theses have evolved since he started his PhD.

    Highlights from our conversation:

    • How Cinjon started his research by trying to grow a baby through language and games, before running into a wall with this approach
    • How spending time at circuses 🎪 and with gymnasts 🤸🏽‍♂️ re-invigorated his research, and convinced him to focus on video, motion, and activity recognition
    • Why MetaSIM and MetaSIM II are underrated papers
    • Two research ideas Cinjon would like to see others work on


    Below is the full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #2: Sarah Jane Hong, Latent Space, on neural rendering & research process

    RSS · Spotify · Apple Podcasts · Pocket Casts

    Excited to release our second episode of Generally Intelligent! This time we’re featuring Sarah Jane Hong, co-founder of Latent Space, a startup building the first fully AI-rendered 3D engine in order to democratize creativity.

    Sarah co-authored Low Distortion Block-Resampling with Spatially Stochastic Networks from NeurIPS 2020, a very cool method that lets you realistically resample part of a generated image. For example, maybe you’ve generated a castle but want to change the style of a tower; you could then resample that tower until you get something you like.

    Highlights from our conversation:

    • What it was like taking classes under Geoff Hinton in 2013
    • How not to read papers
    • The downsides of only storing information in model weights
    • Why using natural language prompts to render a scene is much harder than you’d expect
    • Why a model’s ability to scale is more important than getting state-of-the-art results


    Below is the full transcript. As always, please feel free to reach out with feedback, ideas, and questions!


  • Generally Intelligent #1: Kelvin Guu, Google AI, on language models & overlooked problems

    Generally Intelligent is a podcast made for deep learning researchers (you can learn more about it here).

    RSS · Spotify · Pocket Casts

    Our first guest is Kelvin Guu, a senior research scientist at Google AI, where he develops new methods for machine learning and language understanding. Kelvin is the co-author of REALM: Retrieval-Augmented Language Model Pretraining. The conversation is a wide-ranging tour of language models, how computers interact with world knowledge, and much more.

    Highlights from our conversation:

    • Why language models like GPT-3 seem to generalize so well to tasks beyond just predicting words
    • How you can store knowledge in a database, in the weights of a model, or with a mix of both approaches
    • What interesting problems and data sets have been overlooked by the research community
    • Why cross-entropy might not be such a great objective function
    • Creative and impactful ways language and knowledge models might be used in the future


    Below is the full transcript. We love feedback, ideas, and questions, so please feel free to reach out!


  • Generally Intelligent #0: A podcast for deep learning researchers

    Over the past few years, we’ve been a part of countless conversations with various deep learning researchers about the hunches and processes that inform their work. These conversations happen as part of informal paper reading groups, lab discussions, or casual chats with friends, and they have proved critical to our own research.

    The intimacy and informality of these environments lend themselves to stimulating conversations. Yet it often felt like a shame that the deeper understandings and hard-earned lessons from research failures were shared with just a select few, when so many others could use this knowledge to advance the frontier.

    That’s why, today, we’re launching “Generally Intelligent,” a publicly available podcast made by deep learning researchers, for deep learning researchers.


  • Understanding self-supervised and contrastive learning with "Bootstrap Your Own Latent" (BYOL)

    Summary

    Unlike prior work such as SimCLR and MoCo, the recent paper Bootstrap Your Own Latent (BYOL) from DeepMind demonstrates a state-of-the-art method for self-supervised learning of image representations without an explicitly contrastive loss function. This simplifies training by removing the need for negative examples in the loss function. We highlight two surprising findings from our work on reproducing BYOL:

    (1) BYOL often performs no better than random when batch normalization is removed, and

    (2) the presence of batch normalization implicitly causes a form of contrastive learning.

    These findings highlight the importance of contrast between positive and negative examples when learning representations and help us gain a more fundamental understanding of how and why self-supervised learning works.

    The code used for this post can be found at https://github.com/untitled-ai/self_supervised.
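
    For readers skimming that repo, here is a hedged sketch of where batch normalization enters a BYOL-style projection/prediction MLP head (layer sizes are illustrative, not the exact values we used); removing the BatchNorm1d layer is the kind of ablation behind finding (1) above:

        import torch.nn as nn

        def mlp_head(in_dim=2048, hidden_dim=4096, out_dim=256, use_batch_norm=True):
            """BYOL-style projector/predictor head (sizes illustrative)."""
            layers = [nn.Linear(in_dim, hidden_dim)]
            if use_batch_norm:
                # Batch statistics couple the examples within a batch, which is
                # where the implicit contrastive effect described above comes from.
                layers.append(nn.BatchNorm1d(hidden_dim))
            layers += [nn.ReLU(inplace=True), nn.Linear(hidden_dim, out_dim)]
            return nn.Sequential(*layers)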


  • Appendix for "Understanding self-supervised and contrastive learning with 'Bootstrap Your Own Latent' (BYOL)"

    Appendix

    This post contains the additional data and details for our post “Understanding self-supervised and contrastive learning with ‘Bootstrap Your Own Latent’ (BYOL)”.