Hella New AI Papers – Aug 24, 2024
Read/listen to the Substack newsletter:
Support my learning journey by clicking the Join button above, becoming a Patreon member, or sending a one-time Venmo!
Discuss this stuff with other Tunadorks on Discord
All my other links
Timestamps:
00:00 Intro
01:09 Tree Attention – Topology-aware Decoding for Long-Context Attention on GPU clusters
02:46 MoFO – Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning
03:31 Multi-Meta-RAG – Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata
04:03 xGen-MM (BLIP-3) – A Family of Open Large Multimodal Models
04:39 Automated Design of Agentic Systems
06:15 KAN 2.0 – Kolmogorov-Arnold Networks Meet Science
07:11 Solving a Rubik’s Cube Using its Local Graph Structure
07:49 Transfusion – Predict the Next Token and Diffuse Images with One Multi-Modal Model
08:55 Scaling Law with Learning Rate Annealing
09:49 Recurrent NNs Learn to Store and Generate Sequences using Non-Linear Representations
11:17 Learning Randomized Algorithms with Transformers
12:48 Beyond English-Centric LLMs – What Language Do Multilingual LMs Think in?
15:16 HMoE – Heterogeneous MoE for LMing
16:13 Strategist – Learning Strategic Skills by LLMs via Bi-Level Tree Search
17:00 Demystifying the Communication Characteristics for Distributed Transformers
18:05 The Exploration-Exploitation Dilemma Revisited – An Entropy Perspective
18:53 Performance Law of LLMs
19:47 Importance Weighting Can Help LLMs Self-Improve
20:46 ML with Physics Knowledge for Prediction – A Survey
21:12 Faster Adaptive Decentralized Learning Algorithms
21:38 AdapMoE – Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference
22:45 Acquiring Bidirectionality via Large and Small LMs
24:31 Attention is a smoothed cubic spline
25:28 Latent Causal Probing – A Formal Perspective on Probing with Causal Models of Data
26:40 From pixels to planning – scale-free active inference
28:14 Critique-out-Loud Reward Models
29:20 FocusLLM – Scaling LLM’s Context by Parallel Decoding
30:55 Memorization In In-Context Learning
32:32 First Activations Matter – Training-Free Methods for Dynamic Activation in LLMs
33:01 Empirical Equilibria in Agent-Based Economic Systems with Learning Agents
34:23 Matmul or No Matmul in the Era of 1-bit LLMs
35:13 Scaling Laws with Vocabulary – Larger Models Deserve Larger Vocabularies
35:50 LLM Pruning and Distillation in Practice – The Minitron Approach
36:20 Mission: Impossible LMs
38:39 Let Me Speak Freely – A Study on the Impact of Format Restrictions on Performance of LLMs
39:42 Controllable Text Generation for LLMs – A Survey
40:25 Jamba-1.5 – Hybrid Transformer-Mamba Models at Scale
41:59 Not All Samples Should Be Utilized Equally – Towards Understanding and Improving Dataset Distillation
43:04 Search-Based LLMs for Code Optimization
43:35 Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
44:17 Loss of plasticity in deep continual learning
45:50 What’s Really Going On in ML? Some Minimal Models
46:18 The graphical brain – Belief propagation and active inference
46:59 Outro
Not too many crazy banger papers here this time. More like loads of small progressions.
Small request: if anything interesting in Bayesian program learning comes out, I'd kindly ask you to consider covering it.
I strongly believe the theories of BPL could fundamentally enhance LLMs more than any novel solution.
Another nothingburger of a video 🔥
I like the lack of face tracking indication
"Although LLMs generate one token at a time, the entire sequence of past tokens must still be stored in memory"….."compute attention scores". Still gets me every time that this is what goes on in the background of massive LLMs when they're busy inferencing.
Thanks, very good content with excellent coverage. Saves me a lot of time.
Let's go!