• 8 Posts
  • 51 Comments
Joined 5 years ago
Cake day: June 30th, 2020



  • It really depends on how you quantize both the model and the K/V cache. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) in my 3090’s 24 GB of VRAM with a reasonable context size. If you need a much larger context window, for example to feed in large documents, then you’d have to either go smaller on the model (14b, 27b etc.), get a multi-GPU setup, or use something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
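For a rough feel of the numbers behind that, here is a back-of-the-envelope sketch. It is not how the linked calculator works, just the usual approximation: weights take about params × bits / 8 bytes, and the K/V cache takes 2 × layers × KV heads × head dim × context tokens × bytes per element. The layer and head counts in the example are hypothetical placeholders for a 32b-class model with grouped-query attention, and 4.5 bits per weight is an assumed average for a 4-bit quant. Real usage is higher because of compute buffers and runtime overhead.

```python
GIB = 2**30  # bytes per GiB


def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_tokens: int, bytes_per_elem: float = 2.0) -> float:
    """Approximate K/V cache size in GiB.

    The leading 2 accounts for storing both keys and values.
    bytes_per_elem is 2.0 for an fp16 cache, about 1.0 for an 8-bit cache.
    """
    return (2 * n_layers * n_kv_heads * head_dim
            * context_tokens * bytes_per_elem / GIB)


if __name__ == "__main__":
    # Hypothetical 32b-class architecture; check your model's config
    # for the real values.
    w = weights_gib(params_billion=32, bits_per_weight=4.5)
    kv = kv_cache_gib(n_layers=64, n_kv_heads=8, head_dim=128,
                      context_tokens=16384, bytes_per_elem=2.0)
    print(f"weights ~{w:.1f} GiB, K/V cache ~{kv:.1f} GiB, "
          f"total ~{w + kv:.1f} GiB before overhead")
```

Doubling the context doubles the K/V cache term, which is why a long context pushes you toward a smaller model or a quantized cache on a 24 GB card.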
















  • The US government’s position on this can be summed up as “massive unaccountable US tech firms having all of your data and manipulating public opinion via their black box algorithms is okay, but Chinese companies doing that is a national security concern”. I call BS. The degree to which China is actually a US adversary is being massively overstated by the US government as they see this as a threat to US geopolitical hegemony and America’s ability to propagandize its own citizens. I have spent some time on RedNote (Xiaohongshu) and all I have seen is friendly cross-cultural exchange and discussion between these supposed ‘adversaries’.






  • Okay, can I hijack this thread to ask fellow coffee enthusiasts a question: do decaf beans actually tend to suck? I would be interested in a decaf or half-caff blend, but I’m curious what the connoisseurs think… and sorry, no, I haven’t searched for a post on this in the community, so feel free to downvote the crap out of me / take mercy on me and link one if it exists…