https://github.com/cloneofsimo/insightful-nn-papers
These papers will provide unique insightful concepts that will broaden your perspective on neural networks and deep learning
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: cloneofsimo
- Default Branch: main
- Size: 4.88 KB
Statistics
- Stars: 47
- Watchers: 8
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Insightful Neural Network Papers
Great job! You've finished your linear algebra, calculus, maybe a good amount of probability theory and statistics, and maybe even a course on optimization. You've gone through your assignments, written some code (perhaps with PyTorch), and done some projects.
Now what? Before you jump on hype trains like LLaMA, Stable Diffusion, NeRFs, etc., you should be aware: the field is changing rapidly. I've curated some of the unique and foundational papers that you should read to understand the field better. By foundational, I mean insightful papers that explore ideas that are generally applicable to many different real-life problems. Deep learning and neural networks are exciting and have far more interesting literature than just theory (I do think theory is incredibly important, though).
You might even consider this a survey of surveys on neural networks, as many of the papers I mention here are truly unique.
[WIP. I WILL ADD THEM AS I FIND MORE TIME]
Generalization
We need to rethink generalization
Tendency to find low-rank solution
Tendency to find smoother solution
Tendency to find low-frequency solution
Learning in High Dimension Always Amounts to Extrapolation:
Deep Learning without Poor Local Minima:
Does neural network memorize the training data?
What happens when you scale up the neural network?
Scaling law from the dimensionality of the data: $\alpha \sim 4/d$, where $d$ is the dimensionality of the data. This is a very interesting result, as it shows that the scaling exponent $\alpha$ of a neural network has a fascinating connection with an intrinsic property of the data.
Pruning Data to improve scaling: Quality of the data matters, even in the large-scale setting!
Scaling Reward Model : Reward modeling + RL is a promising approach in deep learning, popularized by ChatGPT (InstructGPT). Scaling helps, and we should limit the KL divergence during optimization.
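To make the $\alpha \sim 4/d$ item above a bit more concrete, here is a minimal sketch. It assumes that $\alpha$ is the exponent of a power law of the form loss ≈ c · N^(−α) in some size variable N, and it fits that exponent on synthetic, noise-free data. The function names, the constant 2.5, and the chosen sizes are all made up for illustration and do not come from any of the papers.

```python
import math

def predicted_exponent(intrinsic_dim):
    """Exponent suggested by the alpha ~ 4/d rule of thumb."""
    return 4.0 / intrinsic_dim

def fit_power_law_exponent(sizes, losses):
    """Least-squares fit of log(loss) = log(c) - alpha * log(size); returns alpha."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    return -slope

# Synthetic losses that follow the assumed power law exactly (illustrative only).
intrinsic_dim = 8
sizes = [10 ** k for k in range(3, 8)]
losses = [2.5 * n ** (-predicted_exponent(intrinsic_dim)) for n in sizes]

alpha_hat = fit_power_law_exponent(sizes, losses)
print(f"predicted alpha = {predicted_exponent(intrinsic_dim):.3f}, fitted alpha = {alpha_hat:.3f}")
```

With real measurements the points will not sit exactly on a line in log-log space, so the fitted exponent is only an estimate, and comparing it against $4/d$ also requires a separate estimate of $d$.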
Batchsize and Learning rate: what you should know about them
In-batch variance : smaller the better?
How large should your batch be?
Larger batch size, larger learning rate?
Emergent Capabilities vs Inverse scaling :
Fascinating Aspects of Neural Network
Shortcut learning, Gradient Starvation : Neural networks tend to "cheat" during learning whenever they get the chance.
Dataset Distillation : Did you know that you can train the dataset itself, so that a neural network can learn faster from it? The field has grown a lot.
Localization and Edit : Maybe this is too narrow, but the way they use causal tracing to find which layer is responsible for a certain output is very generally applicable.
Grokking :
Bootstrapping, self-distillation, ensemble... Learning from itself? How does that even make sense? :
Adversarial Examples Are Not Bugs, They Are Features:
These might provide some alternative insights
Infinite width Neural Networks : Of course, we see that neural networks work well in practice, especially in the large-scale setting. But since their training dynamics are analytically intractable, we can't really say much about them. Infinite-width neural networks, in contrast, are much easier to work with. NNGP, NTK, and Tensor Programs are some of the most fundamental papers in this field. They may be a bit too math-heavy, so I recommend you first read this blog by Lilian Weng (as always).
Infinite Matrix Factorizations : Alternatively, the training dynamics of matrix factorization actually give you a very good grasp of what might be happening in a neural network.
Common variable trick (I made this term up):
Mechanistic interpretability (CNN, Transformer):
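As a rough illustration of the matrix factorization item above, the sketch below runs plain gradient descent on a two-factor factorization `W = U @ V` of a fully observed rank-2 target, starting from a small random initialization. Everything here (the 10x10 size, the target singular values 3 and 1, the learning rate, the step count, and the variable names) is an assumption chosen for the demo and is not taken from any specific paper. It records when each leading singular value of `W` first reaches half of its target value.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10
target_sv = np.array([3.0, 1.0])

# Rank-2 target M = P diag(3, 1) Q^T with orthonormal columns in P and Q.
P, _ = np.linalg.qr(rng.standard_normal((n, 2)))
Q, _ = np.linalg.qr(rng.standard_normal((n, 2)))
M = P @ np.diag(target_sv) @ Q.T

# Factorization W = U @ V with a small random initialization.
U = 1e-3 * rng.standard_normal((n, n))
V = 1e-3 * rng.standard_normal((n, n))

lr, steps = 0.05, 2000
first_hit = [None, None]  # step at which each leading singular value passes half its target
for step in range(steps):
    E = U @ V - M  # residual of the loss 0.5 * ||U V - M||_F^2
    # Simultaneous gradient step on both factors (right-hand side uses the old U and V).
    U, V = U - lr * E @ V.T, V - lr * U.T @ E
    sv = np.linalg.svd(U @ V, compute_uv=False)
    for k in range(2):
        if first_hit[k] is None and sv[k] > 0.5 * target_sv[k]:
            first_hit[k] = step

W = U @ V
final_sv = np.linalg.svd(W, compute_uv=False)
print("steps at which each mode emerged:", first_hit)
print("leading singular values of W:", np.round(final_sv[:4], 4))
```

In this toy setting the larger singular value is picked up well before the smaller one, and the remaining singular values stay close to zero, which is the kind of stepwise, low-rank-first behavior that makes matrix factorization a useful mental model. It is only a sketch on one hand-picked problem, not evidence about real networks.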
Specific to Reinforcement Learning
Why not just, learn from expert data?
Owner
- Name: Simo Ryu
- Login: cloneofsimo
- Kind: user
- Company: Corca AI
- Website: https://fb.com/MLPaperFetchingCat
- Twitter: cloneofsimo
- Repositories: 10
- Profile: https://github.com/cloneofsimo
Cats are Turing machines cloneofsimo@gmail.com
GitHub Events
Total
- Watch event: 4
Last Year
- Watch event: 4