[D] Teacher student training strategy

I am planning on using an LLM (say Llama 3) to extract training data via a prompt, and then training a smaller model with a CLS token to try to match the accuracy of the LLM. Suppose that I can run the prompt on 1M+ examples (although I suspect I won't need that many). Prompt: Does the following sentence contain apples or oranges: Examples:...

Sun Jun 2, 2024 08:22
[D] What are your real-world production use cases for LLMs?

I think we should share more production use cases for LLMs instead of just theoretical best practices. Can you share the use cases you've seen/built in production? It should include the following details: the problem it solves; the implementation details (models, infrastructure, etc.); the business impact it had. submitted by /u/madredditscientist...

Sun Jun 2, 2024 08:22
[D] Is it a good idea to combine 3 datasets into one unique dataset, knowing that the 3 are related to the same topic?

They all cover the same topic and have the same labels; the only difference is the dataset itself. The goal is to differentiate the images of the first dataset (which would be merged into one dataset from the other three) from another dataset that will be created from scratch in conjunction with my research colleagues submitted...

Sun Jun 2, 2024 08:22
[D] Implementing "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" paper for open source models

I recently came across an interesting paper titled "Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet" which explores using sparse autoencoders to extract interpretable features from the activations of a large language model. The methodology seems promising for gaining insights into the model's internal representations...
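
As a rough illustration of the sparse autoencoder idea mentioned in this post, here is a minimal sketch of a forward pass and loss, assuming a plain ReLU encoder, a linear decoder, and an L1 sparsity penalty on the feature activations. The function names, dimensions, and the `l1_coeff` value are hypothetical and chosen only for illustration; random vectors stand in for real model activations, and this is not claimed to match the paper's exact setup.

```python
import numpy as np

def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode activations into non-negative features, then reconstruct them."""
    f = np.maximum(0.0, x @ W_enc + b_enc)   # ReLU feature activations
    x_hat = f @ W_dec + b_dec                # linear reconstruction
    return f, x_hat

def sae_loss(x, f, x_hat, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparse features."""
    recon = np.mean(np.sum((x - x_hat) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(f), axis=1))
    return recon + l1_coeff * sparsity

# Hypothetical sizes: 16-dim activations expanded into 64 candidate features.
rng = np.random.default_rng(0)
d_model, d_feat, batch = 16, 64, 8
x = rng.normal(size=(batch, d_model))        # stand-in for LLM activations
W_enc = rng.normal(scale=0.1, size=(d_model, d_feat))
b_enc = np.zeros(d_feat)
W_dec = rng.normal(scale=0.1, size=(d_feat, d_model))
b_dec = np.zeros(d_model)

features, reconstruction = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, features, reconstruction)
```

In practice the weights would be trained by gradient descent on this loss over activations collected from the target model; the sketch only shows the objective being minimized.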

Sun Jun 2, 2024 08:22
[D] Alternatives to the LAION Aesthetics dataset?

So the LAION dataset is currently closed for a safety review. In the meantime, are there any large datasets that could stand in for their aesthetics subset? I'm looking for "aesthetically pleasing" images, like artwork, drawings, nice photos, etc. submitted by /u/thehomelessman0

Sun Jun 2, 2024 02:23
[D] large size transaction classification problem

Hi, I have a general question that I'm seeking advice on: my product is based around being able to do good transaction classifications. Here is an example of our input: "AMZN", "20231123", "$24.09". Respectively, these refer to the company name, date string, and amount paid. We have millions of transactions a day! Put simply, when a user makes a transaction,...

Sun Jun 2, 2024 02:23
