https://github.com/cloneofsimo/insightful-nn-papers

These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning

Repository

Basic Info
  • Host: GitHub
  • Owner: cloneofsimo
  • Default Branch: main
  • Size: 4.88 KB
Statistics
  • Stars: 47
  • Watchers: 8
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme

README.md

Insightful Neural Network Papers

Great job! You've finished your linear algebra, calculus, maybe a good amount of probability theory, statistics, and maybe even a course on optimization. You've gone through your assignments, written some code (perhaps with PyTorch), and done some projects.

Now what? Before you jump onto hype trains like LLaMA, Stable Diffusion, NeRFs, and so on, you should be aware that the field is changing rapidly. I've curated some of the unique & foundational papers that you should read to understand the field better. By foundational, I mean insightful papers that explore ideas that are generally applicable to many different real-life problems. Deep learning and neural networks are exciting and have far more interesting literature than just theory (I do think theory is incredibly important, though).

You might even consider this a survey of surveys on neural networks, as much of the literature I will mention here is extremely unique.

[WIP. I WILL ADD THEM AS I FIND MORE TIME]

Generalization

We need to rethink generalization

Tendency to find low-rank solutions

Tendency to find smoother solutions

Tendency to find low-frequency solutions

Learning in High Dimension Always Amounts to Extrapolation:

Deep Learning without Poor Local Minima:

Do neural networks memorize the training data?

What happens when you scale up the neural network?

Scaling Law from dimensionality of the data: $\alpha \sim 4/d$, where $d$ is the intrinsic dimensionality of the data. This is a very interesting result, as it shows that the scaling exponent of a neural network has a fascinating connection with an intrinsic property of the data.
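As a rough illustration of what $\alpha \sim 4/d$ would imply, the sketch below assumes the usual power-law form where the loss falls as $N^{-\alpha}$ in the parameter count $N$. That form, and the helper names `scaling_exponent` and `loss_ratio`, are assumptions made for illustration and are not spelled out in this README.

```python
def scaling_exponent(intrinsic_dim: float) -> float:
    """Predicted scaling exponent alpha ~ 4/d for data of intrinsic dimension d."""
    if intrinsic_dim <= 0:
        raise ValueError("intrinsic_dim must be positive")
    return 4.0 / intrinsic_dim


def loss_ratio(param_multiplier: float, intrinsic_dim: float) -> float:
    """Factor by which the loss changes when the parameter count is multiplied
    by `param_multiplier`, ASSUMING loss is proportional to N ** (-alpha)."""
    return param_multiplier ** (-scaling_exponent(intrinsic_dim))


if __name__ == "__main__":
    # Compare how much a 10x larger model helps at different intrinsic dimensions.
    for d in (4, 8, 16, 32):
        print(d, scaling_exponent(d), loss_ratio(10.0, d))
```

Under these assumptions, a higher intrinsic dimension gives a smaller exponent, so the same increase in parameters buys a smaller reduction in loss.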

Pruning Data to improve scaling: Quality of the data matters, even in the large-scale setting!

Scaling Reward Model : Reward modeling + RL is a promising approach in deep learning, popularized by ChatGPT (InstructGPT). Scaling helps, and we should limit the KL divergence during optimization.
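One common way to limit the KL divergence is to subtract a KL penalty from the reward-model score. The sketch below is a toy version over small discrete distributions. The coefficient `beta`, the function names, and the choice of KL(policy || reference) are assumptions for illustration, not details given in this README.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same outcomes."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # a term with p_i = 0 contributes nothing
        if q_i == 0.0:
            raise ValueError("q must be positive wherever p is positive")
        total += p_i * math.log(p_i / q_i)
    return total


def kl_penalized_reward(reward, policy_probs, reference_probs, beta=0.1):
    """Reward-model score minus beta * KL(policy || reference).

    `beta` is a hypothetical penalty coefficient chosen for this example.
    """
    return reward - beta * kl_divergence(policy_probs, reference_probs)


if __name__ == "__main__":
    reference = [0.5, 0.5]
    # A policy that stays at the reference pays no penalty.
    print(kl_penalized_reward(1.0, [0.5, 0.5], reference))
    # A policy that collapses onto one outcome is penalized.
    print(kl_penalized_reward(1.0, [1.0, 0.0], reference, beta=0.5))
```

The further the policy drifts from the reference distribution, the more of the raw reward is cancelled by the penalty.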

Ensemble instead of scaling?

Batch size and learning rate: what you should know about them

In-batch variance : the smaller the better?

How large should your batch be?

Larger batch size, larger learning rate?

Emergent Capabilities vs Inverse scaling :

Fascinating Aspects of Neural Networks

Shortcut learning, Gradient Starvation : Neural networks tend to "cheat in learning" when they have the chance.

Dataset Distillation : Did you know that you can train the dataset itself, in reverse of the usual setup, so that a neural network can learn from it faster? The field has grown a lot.

Localization and Edit : Maybe this is too narrow, but the way they use causal tracing to find which layer is responsible for a certain output is very generally applicable.

Information Bottleneck :

Double Descent :

Grokking :

Bootstrapping, self-distillation, ensemble... Learning from itself? How does that even make sense? :

Adversarial Examples Are Not Bugs, They Are Features:

Lottery ticket hypothesis:

Neural Collapse:

These might provide some alternative insights

Infinite width Neural Networks : Of course, we see that neural networks work well in practice, especially in the large-scale setting. But since their training dynamics are analytically intractable, we can't really say much about them. Infinite-width neural networks, by contrast, are much easier to work with. NNGP, NTK, and Tensor Programs are some of the most fundamental papers in this field. They may be a bit too math-heavy, so I recommend you read this blog by Lilian Weng (as always) first.

Infinite Matrix Factorizations : Alternatively, the training dynamics of matrix factorization actually give you a very good grasp of what might be happening in a neural network.
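To get a feel for these dynamics, here is a minimal pure-Python sketch of gradient descent on a two-factor product `W2 @ W1` fitted to a fixed target matrix. The diagonal target, the small scaled-identity initialization, the learning rate, and all function names are assumptions chosen to keep the toy example easy to follow. They are not taken from any specific paper in this list.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def scaled_identity(n, scale):
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


def train_factorization(target, steps, lr=0.1, init_scale=0.01):
    """Gradient descent on 0.5 * ||W2 @ W1 - target||_F^2 for a square target.

    Returns the learned product W2 @ W1 after `steps` updates.
    """
    n = len(target)
    w1 = scaled_identity(n, init_scale)  # small init (an assumption of this sketch)
    w2 = scaled_identity(n, init_scale)
    for _ in range(steps):
        product = matmul(w2, w1)
        residual = [[product[i][j] - target[i][j] for j in range(n)]
                    for i in range(n)]
        grad_w2 = matmul(residual, transpose(w1))  # dL/dW2 = R @ W1^T
        grad_w1 = matmul(transpose(w2), residual)  # dL/dW1 = W2^T @ R
        w2 = [[w2[i][j] - lr * grad_w2[i][j] for j in range(n)] for i in range(n)]
        w1 = [[w1[i][j] - lr * grad_w1[i][j] for j in range(n)] for i in range(n)]
    return matmul(w2, w1)


if __name__ == "__main__":
    target = [[3.0, 0.0], [0.0, 1.0]]
    print(train_factorization(target, steps=30))   # early in training
    print(train_factorization(target, steps=500))  # after convergence
```

In this toy setting, the component with the larger singular value is fitted first while the smaller one is still close to zero, so the intermediate solution is nearly low-rank. Given enough steps, both components are recovered.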

Neural ODE

Diffusion & Score Matching

Common variable trick (I made this term up):

Reparameterizations:

Gradient estimation:

Mechanistic interpretability (CNN, Transformer):

Specific to Reinforcement Learning

Why not just learn from expert data?

Do we really need deep learning for RL?

Features of the MLP are not that great when it comes to RL

Owner

  • Name: Simo Ryu
  • Login: cloneofsimo
  • Kind: user
  • Company: Corca AI

Cats are Turing machines cloneofsimo@gmail.com
