[D]: Transformer Keys, Queries, Values Intuitions outside NLP

Usually K = V; if Q ≠ K it is cross-attention, otherwise it is self-attention. Each Transformer block basically enriches the V (value) vectors with context. In my case I have Q ≠ K ≠ V, and while this is mathematically fine, I haven't come across an application that does this. I want a behavior where, when K = Q, V_new = Transformer...
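For readers who want to see the "mathematically fine" part concretely, here is a minimal NumPy sketch of scaled dot-product attention with three separate inputs. It is not the poster's model: the function names, shapes, and random inputs are assumptions chosen for illustration. The only constraints are that queries and keys share a feature dimension and that keys and values share a sequence length.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)       # (n_queries, n_keys)
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights         # (n_queries, value_dim)

# Hypothetical inputs: Q, K and V come from three different sources.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 8))  # 3 queries, feature dim 8
k = rng.normal(size=(5, 8))  # 5 keys, same feature dim as the queries
v = rng.normal(size=(5, 4))  # 5 values, one per key; value dim may differ
out, weights = attention(q, k, v)
```

Each output row is a convex combination of the rows of `v`, weighted by how well the corresponding query matches each key, so nothing in the computation requires K and V to be the same tensor.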

Mon Jun 3, 2024 05:24
[D] Where is https://ai.papers.bar/papers/weekly

This site used to provide weekly hot papers! Okay, they discontinued this project: https://labml.ai/#discontinued (submitted by /u/Realistic_Thanks3282)

Mon Jun 3, 2024 05:24
[D] Is there any way to perform encoding a bit faster when creating FAISS indexes?

I'm currently training a text embedding model that I'm evaluating using benchmarks like MTEB or MIRACL. Most of the code that I've referenced is using FAISS indexes to search results, which makes sense. The problem is that when building FAISS indexes, the encoding of text is taking way too long. I'm currently using a single machine with four A6000 GPU...

Mon Jun 3, 2024 05:24
[Discussion] Why next token prediction doesn't work for Recommender System? (or am I wrong?)

I'm working on a research project that aims at applying next-token-prediction models to build/improve recommender systems. As a feasibility assessment study, I built and trained a GPT model to predict the next product to buy using the Instacart dataset. To be more specific, I treated each product_id as a "word", each order as a "sentence" and each user's...
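As a rough illustration of the framing described above, the sketch below maps product IDs to token IDs and turns a single order (a "sentence") into next-token training pairs. It is only a sketch under assumptions: the helper names, the `<eos>` token, and the sample product IDs are hypothetical, and because the excerpt is cut off before it says how a user's orders are grouped, the example works on individual orders only.

```python
def build_vocab(orders):
    """Assign a token ID to every product_id; ID 0 is a hypothetical end-of-order token."""
    vocab = {"<eos>": 0}
    for order in orders:
        for product_id in order:
            vocab.setdefault(product_id, len(vocab))
    return vocab

def next_token_pairs(order, vocab):
    """Turn one order into (context, target) pairs for next-token prediction."""
    tokens = [vocab[p] for p in order] + [vocab["<eos>"]]
    # For each position i, the model sees tokens[:i] and must predict tokens[i].
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

# Made-up product IDs, not taken from the Instacart dataset.
orders = [[101, 205, 33], [205, 7]]
vocab = build_vocab(orders)
pairs = next_token_pairs(orders[0], vocab)
```

A GPT-style model would be trained to predict each target from its context, exactly as it would predict the next word in a sentence.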

Mon Jun 3, 2024 02:24
[D] An Open Source model that creates 2D Image to 3D video?

I'm currently searching for an open-source model that can create a short 3D video from a 2D image. The video would just show a zoom in and zoom out or something like that, as for example Immersity AI does, which unfortunately costs quite a lot of money. Does anybody know of anything that's free? It would be best if it's an open-source model. I have...

Mon Jun 3, 2024 02:24
[R] The Challenges of Building Effective LLM Benchmarks: A 5 minute deep-dive 🧠

With the field moving fast and models being released every day, there's a need for comprehensive benchmarks. With trustworthy evaluation you and I can know which LLM to choose for our task: coding, instruction following, translation, problem solving, etc. TL;DR: The article dives into the challenges of evaluating large language models (LLMs). 🔍 From...

Mon Jun 3, 2024 02:24
