[Research] xLSTM: Extended Long Short-Term Memory

Abstract: In the 1990s, the constant error carousel and gating were introduced as the central ideas of the Long Short-Term Memory (LSTM). Since then, LSTMs have stood the test of time and contributed to numerous deep learning success stories, in particular they constituted the first Large Language Models (LLMs). However, the advent of the Transformer...

Wed May 8, 2024 09:39
Non Technical ML Podcasts? [D]

Hey everyone. For context, I’m a recent CS graduate and current entry level Data Engineer, and I’ve always loved learning about ML models and techniques and how to implement, deploy, and scale them. I’m looking for a good podcast to keep my knowledge of ML trends up to date, but the challenge is that I don’t really like listening to podcasts that are...

Wed May 8, 2024 09:39
[D] PEFT techniques actually used in the industry

A lot of work on parameter-efficient fine-tuning of transformers is coming out, but how much of it is actually being applied? I'm also curious which techniques you normally use in industry. submitted by /u/Inner_Programmer_329

Wed May 8, 2024 09:39
[D] weighted pruning question

Hi, I'm doing weighted pruning, but I have one issue. Say I have a tensor in which most of the values are close to zero, so I set those to zero, and now nearly 40 percent of the values are zero. Does that mean my matrix is sparse, or is it still dense? If it's not a sparse matrix, the computation will be the same, right, all row and column...
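As a rough illustration of the question above (not an answer from the thread), the sketch below zeroes small entries in a matrix that is still stored densely, then compares the number of multiplications in a plain dense matrix-vector product with a product over a compressed sparse row (CSR) copy. The helper names, the threshold, and the toy values are all assumptions made for this example, and it is pure Python rather than any particular framework's kernels.

```python
def prune_by_magnitude(matrix, threshold):
    """Zero out entries with |value| < threshold. The result is still a dense list of lists."""
    return [[0.0 if abs(v) < threshold else v for v in row] for row in matrix]


def to_csr(matrix):
    """Convert a dense matrix to CSR arrays: nonzero values, their column indices, row pointers."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0.0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def dense_matvec(matrix, x):
    """Dense product: every stored entry is multiplied, zeros included."""
    y, mults = [], 0
    for row in matrix:
        acc = 0.0
        for v, xj in zip(row, x):
            acc += v * xj
            mults += 1
        y.append(acc)
    return y, mults


def csr_matvec(csr, x):
    """CSR product: only the stored nonzero entries are multiplied."""
    values, col_idx, row_ptr = csr
    y, mults = [], 0
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
            mults += 1
        y.append(acc)
    return y, mults


# Hypothetical toy weights; the threshold 0.05 is an arbitrary choice for the example.
weights = [[0.9, 0.01, -0.02], [0.003, -0.7, 0.0], [0.02, 0.5, -0.004]]
pruned = prune_by_magnitude(weights, threshold=0.05)
x = [1.0, 2.0, 3.0]

y_dense, dense_mults = dense_matvec(pruned, x)
y_sparse, sparse_mults = csr_matvec(to_csr(pruned), x)
```

In this sketch the dense product performs the same number of multiplications before and after pruning, because the zeros are still stored and still visited. Savings appear only once the matrix is converted to a sparse layout and multiplied with a routine that skips the zeros. Whether that is faster in practice depends on the sparsity level, the hardware, and the library, since sparse formats carry indexing overhead of their own.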

Wed May 8, 2024 09:39
[D] Can anyone with the expertise speak to the overlap, or not, between Nvidia's hardware and Apple's hardware?

I'm curious to understand how much realistic potential there is for Apple to compete with Nvidia IF we assume that they're starting with what we know about the M series chips. Could they pull some of this IP to make purpose-built "AI" chips that might compete? Context: Rumors that Apple might try to do this. submitted by /u/playstation3d...

Wed May 8, 2024 09:39
[P] Skyrim - Open-source model zoo for Large Weather Models

GitHub link Hey all, I'm Efe from Secondlaw AI. We are building physics-informed large AI models. Currently, we are focusing on weather modelling. To benchmark SOTA, we had to build forecasting infra for all available large weather models, and we could not find solid tooling to do so, so we built Skyrim. Within <5 mins and <5 LOC you can run...

Wed May 8, 2024 09:39
