For example comparing LLaMA-3 to GPT-4. To me it doesn't make sense because it feels like comparing a model to an actual product. A product being a model with a bunch of other things that have been applied (e.g., pre-/post-processing techniques). submitted by /u/Seankala [link] [comments]
I have a dataframe with 30 rows and 10 columns. 5 of the columns are input features and the other 5 are output/target columns. The target columns contain classes represented as 0, 1, 2. I want to split the dataset into train and test such that, in the train set, for each output column, the proportion of class 1 is between 0.15 and 0.3. (I am not bothered...
Hi, I'm interested in leveraging the Stable Diffusion v1 model as a feature extractor for a downstream task. Specifically, I want to do this without involving the text prompt mechanism. To achieve this, I'm considering: Eliminating the text encoder module Removing the cross-attention layers in the UNet Is this a feasible approach? Has anyone experimented...
If I can get a good deal on a Siena Epyc CPU 32 or 64 core with say 256GB RAM is that a decent starting point to get into a local setup to run models? I need the lanes for the GPUs. submitted by /u/Kltpzyxmm [link] [comments]
Hi all! I am currently working on a model that combines CNNs and MultiHeadAttention (PyTorch) to analyze speech. I am having some issues with residual connections. Would somebody be able to help me? I can share code on GitHub. If interested, send a DM over here.🤗 Thank you! submitted by /u/NeatFox5866 [link] [comments]
I submitted to the ACL rolling review in June and found the reviewers' evaluation scores very subjective. Although the ACL committee has an instruction on some basic reviewing guidelines, there lacks of a preliminary test for the reviewers to explicitly show the evaluation standards. Maybe we should provide some paper-score examples to prompt the reviewers...
Build your own newsfeed
Ready to give it a go?
Start a 14-day trial, no credit card required.