What inputs to attention are recoverable? Say you were doing cross-attention across two text embeddings (one is the Query, the other is the Key/Value), and you passed the attention outputs through a linear layer, and the target label was the text used to generate the Query — would you be able to achieve near-zero loss? What about if the target was...
32m
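For concreteness, here is a minimal numpy sketch of the setup the question describes — single-head cross-attention with no masking; the weight matrices `Wq`/`Wk`/`Wv` and all shapes are illustrative assumptions, not anything from the question:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_emb, kv_emb, Wq, Wk, Wv):
    """Single-head cross-attention: q_emb (Tq, d) attends over kv_emb (Tk, d)."""
    Q, K, V = q_emb @ Wq, kv_emb @ Wk, kv_emb @ Wv
    weights = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))  # (Tq, Tk); each row sums to 1
    return weights @ V                                 # (Tq, d)
```

Note that each output row is a convex combination of the Value rows, so the Query enters the output only through the (Tq, Tk) weight matrix — a linear head on the output can at best recover whatever the attention weights preserve about the Query, which is why near-zero-loss recovery of the Query text is not guaranteed in general.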
At DocuStack, our goal is to help organisations turn their internal knowledge into an answering bot, saving time and increasing efficiency for teams. If you're familiar with the challenges of managing internal knowledge and answering repetitive questions, I believe you will find DocuStack to be a valuable tool. With features such as a searchable database,...
1h
Hi all, I'm trying to fine-tune Whisper to transcribe Albanian speech to text, but I don't know what the dataset for training a Whisper model should look like. I already have voice audio files and the transcript for each audio file, but I need to know how to reformat them into a valid dataset for training Whisper. Thanks in advance!...
1h
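One common convention (the Hugging Face fine-tuning workflow, not an official Whisper requirement) is a JSON Lines manifest pairing each audio path with its transcript. A minimal sketch, assuming a hypothetical layout of paired `clip.wav`/`clip.txt` files in one directory:

```python
import json
import pathlib

def build_manifest(data_dir, out_path="train.jsonl"):
    """Write one JSON line per clip: {"audio": <wav path>, "sentence": <transcript>}."""
    data_dir = pathlib.Path(data_dir)
    with open(out_path, "w", encoding="utf-8") as f:
        for wav in sorted(data_dir.glob("*.wav")):
            txt = wav.with_suffix(".txt")  # transcript assumed alongside the audio
            entry = {
                "audio": str(wav),
                "sentence": txt.read_text(encoding="utf-8").strip(),
            }
            f.write(json.dumps(entry, ensure_ascii=False) + "\n")
```

With the Hugging Face `datasets` library you could then load it via `load_dataset("json", data_files="train.jsonl")` and cast the audio column with `cast_column("audio", Audio(sampling_rate=16000))`, since Whisper expects 16 kHz input.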
Hello everyone! I have a really big dataset of trajectories that are categorized by two target values, let's say (Y1, Y2). Instead of having a "vanilla" LSTM-based network that learns these huge trajectories and works as a regressor, I first train a VAE, and then for all trajectories I extract the latent output (the means), which I use after...
4h
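The two-stage setup described above can be sketched as follows; `encode_mean` is a hypothetical linear stand-in for the trained VAE encoder's mean head, and a closed-form least-squares fit stands in for whatever downstream regressor is used:

```python
import numpy as np

def encode_mean(trajectory, W_enc):
    # Stage 1 stand-in: map a flattened trajectory (D,) to a latent mean (z_dim,).
    # In practice this would be the mean output of the trained VAE encoder.
    return trajectory @ W_enc

def fit_latent_regressor(trajectories, targets, W_enc):
    # Stage 2: regress the (Y1, Y2) targets from the frozen latent means.
    Z = np.stack([encode_mean(t, W_enc) for t in trajectories])  # (N, z_dim)
    coef, *_ = np.linalg.lstsq(Z, targets, rcond=None)           # (z_dim, 2)
    return coef
```

Working in the latent space this way trades the cost of unrolling an LSTM over very long sequences for a fixed-size representation per trajectory, at the price of whatever information the VAE bottleneck discards.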
This may be a silly question for those familiar with the field, but do machine learning researchers still see any prospects for traditional methods (by "traditional" I mean anything other than deep learning)? I feel that most of the time when people talk about machine learning today, they are referring to deep learning, but is this the same...
4h
hey, I've been looking at this paper from DeepMind https://arxiv.org/pdf/1807.01281.pdf where they train agents to play capture the flag from only visual input. What I'm curious about is whether there are any tricks going on here. Is the AI looking at a "screen" the same way a human would and then encoding its observations after? Or is it just looking...
4h
TLDR: Poorly informed people have misleading discussions about "AI" in the media. Where can I find people who know what they're talking about? I've never had high expectations when any kind of tech was covered by general media, but even tech media is failing at properly informing people about what ChatGPT is even after months of continuous discussions...
4h
MusicLM is a model generating high-fidelity music from text descriptions such as "a calming violin melody backed by a distorted guitar riff". MusicLM casts the process of conditional music generation as a hierarchical sequence-to-sequence modeling task, and it generates music at 24 kHz that remains consistent over several minutes. MusicLM can be conditioned...
8h
Doing a survey of object detection papers with plausible application to pose-estimation tasks. Came across the paper "You Only Learn One Representation" and, while the theory seems interesting, I want to hear people's opinions before doing a deep dive. submitted by /u/answersareallyouneed
16h
https://peltarion.com/blog/data-science/towards-a-token-free-future-in-nlp submitted by /u/EducationalCicada
18h