Oh and I typically get 16-20 tok/s running a 32b model on Ollama using Open WebUI. Also, I have experienced issues with 4-bit quantization for the K/V cache on some models myself, so just FYI.
It really depends on how you quantize the model and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you’re going to need a much larger context window to input large documents etc. then you’d need to go smaller with the model size (14b, 27b etc.), get a multi-GPU setup, or get something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
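For a rough feel of the numbers without the calculator, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the model shape (64 layers, 8 KV heads, head dimension 128) is a hypothetical 32b-class config, roughly 4.5 bits per weight stands in for a 4-bit quant, the 1 GiB overhead is a guess, and the per-block scale overhead of real quantized K/V cache formats is ignored. Treat the output as a ballpark, not what your runtime will actually allocate.

```python
def weights_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights, in GiB."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bits_per_element: float = 16) -> float:
    """Approximate K/V cache size in GiB for a single sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * context_len * bits_per_element / 8)
    return total_bytes / 2**30


def estimate_vram_gib(n_params_billion: float, bits_per_weight: float,
                      n_layers: int, n_kv_heads: int, head_dim: int,
                      context_len: int, kv_bits: float = 16,
                      overhead_gib: float = 1.0) -> float:
    """Weights + K/V cache + a flat overhead guess (assumed, not measured)."""
    return (weights_gib(n_params_billion, bits_per_weight)
            + kv_cache_gib(n_layers, n_kv_heads, head_dim, context_len, kv_bits)
            + overhead_gib)


if __name__ == "__main__":
    # Hypothetical 32b-class model shape; check your model's actual config.
    shape = dict(n_layers=64, n_kv_heads=8, head_dim=128)
    for ctx in (8192, 32768):
        for kv_bits in (16, 8):
            total = estimate_vram_gib(32, 4.5, context_len=ctx,
                                      kv_bits=kv_bits, **shape)
            print(f"context={ctx:>6}  kv_bits={kv_bits:>2}  ~{total:.1f} GiB")
```

With these assumed numbers, an 8k context fits in 24 GB with room to spare, while a 32k context with a 16-bit K/V cache does not. That is the point where quantizing the K/V cache or dropping to a smaller model starts to matter.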
FrankLaskey@lemmy.ml to Piracy@lemmy.ml • I want to be able to stream live sports on my jellyfin server • English • 1 · 2 months ago
This would be a great potential improvement in UX for streaming sports feeds for sure - not having to navigate web pages and start / manage streams manually etc. Does anyone know if this is possible for sites serving these streams like FirstRowSports or StreamEast etc?
FrankLaskey@lemmy.ml to Cool Guides@lemmy.ca • A cool guide to the minimum wage by state in the USA (source: reddit) • English • 4 · 2 months ago
It would be more interesting to see this with a cost of living figure for each state as well.
FrankLaskey@lemmy.ml to Technology@lemmy.ml • Nvidia teams up with DeepSeek for R1 optimizations on Blackwell, boosting revenue by 25x • English • 31 · 2 months ago
Hopefully these improvements will become available to other Nvidia GPU architectures like Ada and Ampere in the future as well.
FrankLaskey@lemmy.ml to Technology@beehaw.org • Apple Maps May Soon Feature Ads, But Not Everyone's Onboard - gHacks Tech News • English • 1 · 2 months ago
Is it possible to use StreetComplete on iOS?
FrankLaskey@lemmy.ml to Technology@lemmy.world • Perplexity open sources R1 1776, a version of the DeepSeek R1 model that CEO Aravind Srinivas says has been “post-trained to remove the China censorship”. • English • 246 · 2 months ago
I think we can all agree that modifications to these models which remove censorship and propaganda on behalf of one particular country or party are valuable for the sake of accuracy and impartiality, but reading some of the example responses for the new model I honestly find myself wondering if they haven’t gone a bit further than that by replacing some of the old non-responses and positive portrayals of China and the CPC with a highly critical perspective typified by western governments which are hostile to China (in particular the US). Even the name of the model certainly doesn’t make it sound like neutrality and accuracy are their primary aim here.
FrankLaskey@lemmy.ml to Technology@lemmy.world • Linux's Sole Wireless/WiFi Driver Maintainer Is Stepping Down - Phoronix • English • 1730 · 2 months ago
I used to daily drive Ubuntu some years ago for work/personal use but have been back on Win 10 primarily for the last 4-5 years. I was considering trying to go back due to how much Windows sucks (despite some proprietary software only being available on it) but remembering the trouble I had with some networking/printer drivers and troubleshooting those issues and then seeing this article is definitely making me reconsider…
FrankLaskey@lemmy.ml (OP) to Lemmy Support@lemmy.ml • Is it possible to set a minimum number of upvotes for a post to be shown in my feed? • English • 5 · 2 months ago
Yeah I use voyager pretty much exclusively on my iPhone so maybe I should request a feature like that there? Seems like it would be something that many people would appreciate. Not sure why I end up seeing posts with -10, -15 votes… Those are generally trash haha
Is this what the Leonard Cohen song is about?
FrankLaskey@lemmy.ml to You Should Know@lemmy.world • YSK there's a tool to check US non-profit compensation • English • 1 · 3 months ago
It would be cool if they would provide some useful statistics about the aggregated data as well. Maybe something like showing the percentile for pay to the ED/CEO or for the total compensation compared to other organizations in the sector.
I didn’t scour the site so maybe this does exist.
FrankLaskey@lemmy.ml to News@lemmy.world • Chinese TikTok alternative RedNote could pose greater security risks, experts say • English • 3913 · 3 months ago
The US government’s position on this can be summed up as “massive unaccountable US tech firms having all of your data and manipulating public opinion via their black box algorithms is okay, but Chinese companies doing that is a national security concern”. I call BS. The degree to which China is actually a US adversary is being massively overstated by the US government as they see this as a threat to US geopolitical hegemony and America’s ability to propagandize its own citizens. I have spent some time on RedNote (Xiaohongshu) and all I have seen is friendly cross-cultural exchange and discussion between these supposed ‘adversaries’.
First time I’ve seen y’all refer to each other as Beeple and I’m just dropping in here to say I love the term.
FrankLaskey@lemmy.ml to Privacy@lemmy.ml • Organic Maps Turns 4 Years: The Privacy-Focused Alternative to Google Maps • English • 1 · 4 months ago
True; I think I used LineageOS or similar back when I was still on Android, but if you’re not in the 0.01% who do have a custom Android OS installed, it seems like a privacy-focused map app is still potentially of limited use.
FrankLaskey@lemmy.ml to Privacy@lemmy.ml • Organic Maps Turns 4 Years: The Privacy-Focused Alternative to Google Maps • English • 3 · 4 months ago
This looks like it has come a long way, but since this is a privacy community I have to ask: Realistically, whether you are on iOS or Android, isn’t it likely Google or Apple are still tracking your location much of the time directly from the OS?
Okay, can I hijack this thread to ask a question of some fellow coffee enthusiasts: Do decaf beans actually tend to suck? I would be interested in a decaf or half-caff blend but curious what the connoisseurs think… and sorry, no, I haven’t searched for a post on this in the community so feel free to downvote the crap out of me / take mercy on me and link one if it exists…
FrankLaskey@lemmy.ml to Lemmy Shitpost@lemmy.world • Interesting. It's a constant reminder • English • 5 · 5 months ago
Based on the differences in color for each handle it makes me wonder if the one for not washing your hands is a different material. Maybe an antimicrobial metal like a copper alloy.
Looks like it now has Docling Content Extraction Support for RAG. Has anyone used Docling much?