[D] How do you get better at reading proofs in ML papers with only a CS background?

Hi everyone, as the title says, how do you get better at reading proofs in ML papers? The ML papers I mean are those in adversarial ML, e.g. Certified Adversarial Robustness via Randomized Smoothing. For context, I have basic knowledge of calculus and linear algebra, but most of the time when reading a proof, I sometimes feel that one line just come...

Tue May 14, 2024 15:42
Mamba discussion [D]

I have been given a task: research and modify where required (or create from scratch) a training script that can train the model in the provided repository. Finally, package this training application into a container and deploy it in a cloud environment of your choice for 3 use cases: distributed training, hyperparameter tuning, training...

Tue May 14, 2024 15:42
[P] Time series forecasting

Hi everyone, I am trying my first forecasting project. I have a time series covering one year, made up of daily user check-ins at a physical center located in a single country. I want to produce synthetic data for forecasting and simulations. Now I would like to understand whether I need to use an ML algorithm or just pick up...

Tue May 14, 2024 15:42
[D] The usefulness of the last linear layer of each transformer layer

This is pretty obvious. I recently noticed that the last linear layer of each transformer layer is kind of a waste of parameters. A transformer model is a stack of many transformer layers. Each layer starts with three linear transformations (Q, K, V) and ends with an FFN, which consists of two linear layers. The last one costs (d_model * d_dim_feedforward) parameter...
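To put a rough number on the parameter cost mentioned in the post, here is a minimal sketch that counts the weights in one transformer layer. It makes several assumptions the post does not state: biases and layer norms are ignored, an attention output projection of size d_model x d_model is included, and the sizes 512 and 2048 are illustrative values only. The function name `layer_param_counts` is hypothetical.

```python
def layer_param_counts(d_model: int, d_ff: int) -> dict:
    """Weight counts for one transformer layer (biases and norms ignored)."""
    return {
        "qkv": 3 * d_model * d_model,   # three Q, K, V projections
        "attn_out": d_model * d_model,  # assumed attention output projection
        "ffn_in": d_model * d_ff,       # first FFN linear layer
        "ffn_out": d_ff * d_model,      # last FFN linear layer (the one discussed)
    }


# Illustrative sizes, not taken from the post.
counts = layer_param_counts(d_model=512, d_ff=2048)
total = sum(counts.values())
share = counts["ffn_out"] / total

print(f"last linear layer: {counts['ffn_out']} of {total} weights ({share:.1%})")
```

Under these assumptions, and with d_ff = 4 * d_model, the last linear layer holds about a third of the layer's weights. Whether that share is wasted is the post's claim, not something this count can settle.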

Tue May 14, 2024 15:42
Need help with RAG chatbot [Project]

I'm building a RAG chatbot that gives you contextual information on the documents uploaded into the database connected to the chatbot. Now I'm trying to implement a feature where the user can use a hash (#) to point the bot at a specific document within a DB and ask questions about that specific doc. Please help me on how to implement...
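One possible way to approach the hash feature described above is to parse the tag out of the user message and restrict retrieval to chunks from that document. The sketch below is only an illustration: the tag syntax (`#name` with letters, digits, dots, dashes, and underscores), the chunk layout with a `doc_id` key, and the helper names `split_doc_tag` and `filter_chunks` are all assumptions, not part of any particular RAG library.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed tag syntax: '#' followed by letters, digits, '_', '.', or '-'.
TAG = re.compile(r"#([\w.\-]+)")


def split_doc_tag(message: str) -> Tuple[Optional[str], str]:
    """Return (document id or None, question text with the tag removed)."""
    match = TAG.search(message)
    if match is None:
        return None, message.strip()
    remainder = message[:match.start()] + message[match.end():]
    return match.group(1), " ".join(remainder.split())


def filter_chunks(chunks: List[Dict[str, str]], doc_id: Optional[str]) -> List[Dict[str, str]]:
    """Keep only chunks from the tagged document; keep all if there is no tag."""
    if doc_id is None:
        return list(chunks)
    return [chunk for chunk in chunks if chunk["doc_id"] == doc_id]


# Hypothetical in-memory store standing in for the real database.
store = [
    {"doc_id": "report.pdf", "text": "Revenue grew in Q3."},
    {"doc_id": "notes.txt", "text": "Meeting moved to Friday."},
]

doc_id, question = split_doc_tag("#report.pdf what is the revenue?")
candidates = filter_chunks(store, doc_id)
```

In a real system the filter would more likely be a metadata condition passed to the vector store query than a Python list comprehension; the parsing step stays the same.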

Tue May 14, 2024 12:42
[R] How Well Can Transformers Emulate In-context Newton's Method?

Paper: https://arxiv.org/abs/2403.03183 Code: https://anonymous.4open.science/r/transformer_higher_order-B80B/ Abstract: Transformer-based models have demonstrated remarkable in-context learning capabilities, prompting extensive research into its underlying mechanisms. Recent studies have suggested that Transformers can implement first-order optimization...

Tue May 14, 2024 12:42
