Is there a Ben Eater's Bread Board Computer/6502 type of content creator for home networks?

𞋴𝛂𝛋𝛆@lemmy.world · 3 months ago

Arduino tit…? Sounds… Interesting. Is that analog or digital read?

𞋴𝛂𝛋𝛆@lemmy.world · 6 months ago

What is involved with passing this through a whitelist firewall filter?

𞋴𝛂𝛋𝛆@lemmy.world · 7 months ago

I don't commute or ride in traffic any more. I have no margin left. My last hit was in early 2014. Bosch drive e-bikes became retail available around the summer of 2013 in south Orange County California, and were not present in substantial numbers until around 2018.

Now, drivers are much more aware of faster bikes in bike lanes. In all the crashes I was in between 2009 and 2014, I was even faster than most e-bikes are now, but I was an extreme anomaly in that respect. Bikes were not super rare on the road, but racers on general roads commuting have always been rare. Like if you’re going to train, it is not on the surface streets. Several of my crashes were from a time when I rode a 33 mile route each way to and from work 5-6 days a week. I’m one of the most hardcore all-weather, nothing-stops-me roadies you’ll ever meet. Like I ride home with broken bones just to say I made it.

Anyways, I’m on a tangent. On the road, around unpredictable drivers, my rather rare speed led to crashes. I had hundreds, if not thousands, of near misses. I had 6 crashes from cars in 150k miles of riding and have had none since. I am at around 250k now. I’m a lot slower by average speed, and I never ride around traffic like I did back then. Both of my bad crashes were from someone making an illegal u-turn. That is the one event where intuition lies and there is nothing a person can do to escape.

It looks exactly like all of the hundreds of times when someone has pulled out in front of you and cut you off. So you instinctively swerve, but as you do so, the car keeps going and closes the escape route faster than the brain will reprocess the inputs.

It is no different for a driver in a passing car. The worst scenario is being on a bike, right behind that passing car, and being as fast as the cars on a slight downhill when someone pulls a sudden u-turn into a passing SUV. That is what got me. The car in front of me was doing 35mph and never braked. It was a Jeep Grand Cherokee t-boning a Mitsubishi Montero. I know all about it from court stuff, but I went black retroactively to the moment I merged behind the Jeep until I was in the ICU 3 hours later. I braked according to witnesses, but my Garmin GPS computer showed I made contact at 29.7mph. I was folded in half backwards.

All but one of my crashes were like that, where it was absolutely due to errors of dumb drivers. All were also in the most southern parts of Orange County CA, in smaller areas with poor infrastructure. At the time, I rode mostly in more developed areas of the city with better infrastructure, and those are generally much safer. I had a lot of close calls in those areas, but they are usually avoidable within the space available, unlike with people who get lost or are dopey on the fringes where there is no proper infrastructure.

𞋴𝛂𝛋𝛆@lemmy.world · 7 months ago

I’ve been hit by 7 cars in 6 crashes. Three caused only a few scratches and bruises, one made a wheel taco, one left the bike frame in two pieces, and the last cost me 8.5 of 9 lives to fight and total two SUVs. I can’t say that I recommend any, but I will say definitely don’t fight two at once.

𞋴𝛂𝛋𝛆@lemmy.world · 8 months ago

deleted by creator

𞋴𝛂𝛋𝛆@lemmy.world · 8 months ago

Trick is to find a gem with a few bedrooms and sublet. I got a little lucky I guess in that I liked to paint cars and that business has a massive overhead cost. I was never in the black or even green really, but rent was less significant against my business expenses. I could leverage my skills to stretch my paint supplies if I had a roommate come up short. Plus the used car lots I worked for exposed me to a lot of people that needed a place close by and often lacked transportation in general. I used them as runners for bigger jobs I did from a home shop in the back yard.

Getting started with a business like that is hard. I was given a few thousand dollars to start off, but I mostly had to deal with expanding over time where I would buy some new thing or products when a job I had could pay for the tool or thing.

Still the recession of 2007 - 2008 killed the used car market for several months in a row, even at the most scammy ‘buy-here pay-here’ lots, and took me out.

𞋴𝛂𝛋𝛆@lemmy.world · 8 months ago

I don’t know about that one. Get a place of your own and be a little bit social. In my experience, the girl thing just happened like that. It’s like they can smell house vibes.

𞋴𝛂𝛋𝛆@lemmy.world · 8 months ago

I haven’t looked into the issue of PCIe lanes and the GPU.

I don’t think it should matter with a smaller PCIe bus, in theory, if I understand correctly (unlikely). The only time a lot of data is transferred is when the model layers are initially loaded. Like with Oobabooga when I load a model, most of the time my desktop RAM monitor widget does not even have the time to refresh and tell me how much memory was used on the CPU side. What is loaded in the GPU is around 90% static. I have a script that monitors this so that I can tune the maximum number of layers. I leave overhead room for the context to build up over time but there are no major changes happening aside from initial loading. One just sets the number of layers to offload on the GPU and loads the model. However many seconds that takes is irrelevant startup delay that only happens once when initiating the server.
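A monitoring script along those lines could look roughly like this. It is only a sketch, not the actual script: the function names and the 1024 MiB reserve are made up for illustration, and it assumes an Nvidia card with `nvidia-smi` on the PATH.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB lines (one per GPU) into a list of (used, total) tuples."""
    readings = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        readings.append((used, total))
    return readings


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every GPU, in MiB."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


def headroom_mib(used, total, reserve_mib=1024):
    """MiB left after holding back a reserve for the context to grow into.

    reserve_mib is an illustrative default, not a measured value.
    """
    return total - used - reserve_mib
```

Calling `query_gpu_memory()` right after the model loads and again after a long chat shows how much of the reserve the context actually eats. If the headroom goes negative, offload fewer layers.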

So assuming the kernel modules and hardware support the narrower bandwidth, it should work… I think. There are laptops that have options for an external Thunderbolt GPU too, so I don’t think the PCIe bus is too baked in.

𞋴𝛂𝛋𝛆@lemmy.world · edit-2 8 months ago

Anything under 16 GB of VRAM is a no go. Your number of CPU cores is important. Use Oobabooga Textgen for an advanced llama.cpp setup that splits between the CPU and GPU. You'll need at least 64 GB of RAM or be willing to offload layers using the NVMe with DeepSpeed. I can run up to a 72b model with 4 bit quantization in GGUF on a 12700 laptop with a mobile 3080Ti, which has 16 GB of VRAM (mobile is like that).
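For a rough feel of why the RAM matters, the weights-only arithmetic can be sketched like this. The function names and the 4 GB overhead figure are assumptions for illustration. Real 4 bit GGUF files come out somewhat larger than this estimate because of scales and mixed-precision layers, and the context needs memory on top.

```python
def estimate_weights_gb(params_billion, bits_per_weight):
    """Approximate size of the weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(params_billion, bits_per_weight, ram_gb, vram_gb, overhead_gb=4.0):
    """True if weights plus an assumed overhead fit in RAM and VRAM combined.

    overhead_gb is a placeholder for context, buffers, and the OS, not a measured number.
    """
    needed = estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= ram_gb + vram_gb
```

With these numbers a 72b model at 4 bits is about 36 GB of weights, so it fits when split across 64 GB of RAM and 16 GB of VRAM but not across 16 GB of each.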

I prefer to run an 8×7b mixture of experts model because only 2 of the 8 are ever running at the same time. I am running that in 4 bit quantized GGUF and it takes 56 GB total to load. Once loaded it is about like a 13b model for speed but has ~90% of the capabilities of a 70b. The streaming speed is faster than my fastest reading pace.

A 70b model streams at my slowest tenable reading pace.

Both of these options are far more capable than any of the smaller model sizes, even if you screw around with training. Unfortunately, this streaming speed is still pretty slow for most advanced agentic stuff. Maybe if I had 24 to 48 GB it would be different, I cannot say. If I was building now, I would be looking at which hardware options have the largest L1 cache and the most cores that include the most advanced AVX instructions. Generally, anything with efficiency cores drops the advanced AVX instructions, and because the CPU schedulers in kernels are usually unable to handle this asymmetry, consumer junk has poor AVX support. It is quite likely that all the problems Intel has had in recent years have been due to how they tried to block consumer stuff from accessing the advanced P-core instructions that were only blocked in microcode. It requires disabling the e-cores or setting up a CPU set isolation in Linux or BSD distros.
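As a lighter-weight stand-in for a full CPU set isolation, a process can at least pin itself to chosen cores on Linux. This is a sketch using plain process affinity, not a real cpuset setup, and `pin_to_cpus` is a made-up helper name. Which logical CPU numbers are the P-cores differs per machine, so check with something like `lscpu --extended` first.

```python
import os


def pin_to_cpus(wanted):
    """Restrict the current process to the given logical CPUs (Linux only).

    Only CPUs the process is already allowed to use are kept.
    Returns the affinity set that is actually in effect afterwards.
    """
    available = os.sched_getaffinity(0)
    usable = set(wanted) & available
    if not usable:
        raise ValueError("none of the requested CPUs are available to this process")
    os.sched_setaffinity(0, usable)
    return os.sched_getaffinity(0)
```

Calling this at the top of the script that launches the model keeps its threads off the cores you leave out. Child processes inherit the affinity.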

You need good Linux support even if you run Windows. Most good and advanced stuff with AI will be done with WSL if you haven’t ditched doz for whatever reason. Use https://linux-hardware.org/ to see support for devices.

The reason I mentioned avoid consumer e-cores is because there have been some articles popping up lately about all p-core hardware.

The main constraint for the CPU is the L2 to L1 cache bus width. Researching this deeply may be beneficial.

Splitting the load between multiple GPUs may be an option too. As of a year ago, the cheapest option for a 16 GB GPU in a machine was a second hand 12th gen Intel laptop with a 3080Ti by a considerable margin when all of it is added up. It is noisy, gets hot, and I hate it many times, wishing I had gotten a server like setup for AI, but I have something and that is what matters.

𞋴𝛂𝛋𝛆@lemmy.world · 9 months ago

An abstract solution for content recognition with a bot on a server is not a platform specific issue. The dev is skilled and likely on Matrix too.

𞋴𝛂𝛋𝛆@lemmy.world · 9 months ago

It is a bot that identifies CSAM images. They are a very skilled dev. The problem is content recognition on a server. So in abstract, it is the same problem.

𞋴𝛂𝛋𝛆@lemmy.world · 9 months ago

Search for posts or contact db0. IIRC they worked with LW admin and others to create a filter for this using a very small AI model. It should be on their Git.

𞋴𝛂𝛋𝛆@lemmy.world · 9 months ago

Plan 9

𞋴𝛂𝛋𝛆@lemmy.world · 9 months ago

Need max AVX instructions. Anything with P/E cores is junk. Only enterprise P cores have the max AVX instructions. When P/E are mixed the advanced AVX is disabled in microcode because the CPU scheduler is unable to determine if a process thread contains an AVX instruction and there is no asymmetrical scheduler that handles this. Prior to early 12k series Intel, the microcode for P enterprise could allegedly run if swapped manually. This was “fused off” to prevent it, probably because Linux could easily be adapted to asymmetrical scheduling but Windows would probably not. The whole reason W11 had to be made was because of the E-cores and the way the scheduler and spin up of idle cores works, at least according to someone on Linux Plumbers for the CPU scheduler ~2020. There are already asymmetric schedulers in Android ARM.
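To see which AVX variants a given Linux box actually exposes, the `flags` line in `/proc/cpuinfo` can be filtered. A minimal sketch, assuming an x86 machine (ARM uses a different field name), with `avx_flags` being a made-up helper:

```python
def avx_flags(cpuinfo_text):
    """Return the sorted AVX-related flags found in /proc/cpuinfo style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        # On x86 each logical CPU has one line like "flags : fpu vme ... avx2 ..."
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            found.update(flag for flag in value.split() if flag.startswith("avx"))
    return sorted(found)
```

Feed it the file contents, for example `avx_flags(open("/proc/cpuinfo").read())`. If no `avx512` entries come back, the kernel does not see those instructions on that machine.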

Anyways, I think it was on Gamers Nexus in the last week or two that Intel was doing some all P core consumer stuff. I’d look at that. According to Chips and Cheese, the primary CPU bottleneck for tensors is the bus width and clock management of the L2 to L1 cache.

I do alright with my laptop, but haven’t tried R1 stuff yet. The 70B llama2 stuff that I ran was untenable CPU-only on a 12700. It is a little slower than my reading pace when split with a 16 GB GPU, and that was running a 4 bit quantization version.

𞋴𝛂𝛋𝛆@lemmy.world · 10 months ago

Not unless an http port is open too. If the only port is https, you have to have the certificate. Like with my AI stuff it acts like the host is down if I try to connect with http. You have to have the certificate to decrypt anything at all from the host.
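A quick way to confirm that only the TLS port answers is to try a plain TCP connection to each port. This is a sketch with a made-up helper name, and it only tells you whether something accepts connections, not what protocol is spoken there.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

On a setup like the one described, checking the host's port 80 should come back `False` and port 443 `True`.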

𞋴𝛂𝛋𝛆@lemmy.world · 10 months ago

Sorta, you have to install your certificate authority into the browser and it might complain about verifying that but it will still connect with the encryption.

𞋴𝛂𝛋𝛆@lemmy.world · 10 months ago

deleted by creator

𞋴𝛂𝛋𝛆@lemmy.world · 10 months ago

I mean more like a self signed TLS certificate with your own host manually set in the browser. Then only make the TLS port available, or something like that. If you have access to both (all) devices, you should be able to fully encrypt by brute force and without registering the certificate with anyone. That is what I do with AI at home.

𞋴𝛂𝛋𝛆@lemmy.world · 10 months ago

I’ve half-assed thought about this but have never tried to actually self host. If you have access to all devices, why not just use your own self signed certificates to encrypt everything and require the certificate for all connections? Then there is never a way to log in or connect, right? The only reason for any authentication is to make it possible to use any connection to dial into your server. So is that a bug or a feature? Maybe I’m missing something fundamental in this abstract concept that someone will tell me?
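Requiring a certificate from every client is usually called mutual TLS. With Python's standard `ssl` module the server side might be set up roughly like this. It is a sketch only: the function names are made up, and the three file paths passed to `load_identity` are whatever you generated yourself.

```python
import ssl


def new_mtls_server_context():
    """Server-side TLS context that rejects clients without a trusted certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED on a server context means the client must present a certificate.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def load_identity(ctx, server_cert, server_key, client_ca):
    """Load the server's own certificate and the CA used to check client certificates.

    All three arguments are paths to PEM files you created (hypothetical here).
    """
    ctx.load_cert_chain(certfile=server_cert, keyfile=server_key)
    ctx.load_verify_locations(cafile=client_ca)
    return ctx
```

The context then wraps a listening socket with `ctx.wrap_socket(sock, server_side=True)`. Any device without a certificate signed by that CA fails the handshake before it reaches a login page.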

𞋴𝛂𝛋𝛆@lemmy.world · 11 months ago

I’ve tried 3 times so far in Python/gradio/Oobabooga and never managed to get certs to work or found a complete visual reference guide that demonstrates a complete working example like what I am looking for in a home network. (Only really commenting to subscribe to watch this post develop, and solicit advice:)

𞋴𝛂𝛋𝛆@lemmy.world · 1 year ago

Is there a Ben Eater's Bread Board Computer/6502 type of content creator for home networks?

𞋴𝛂𝛋𝛆@lemmy.world · 2 years ago

Is there a way to run old bare metal hardware on LAN for a dedicated computing task like AI?

𞋴𝛂𝛋𝛆@lemmy.world · 2 years ago

Is it practical to use containers on an OS like Silverblue only for Nvidia GPU stuff while using the APU for a Wayland only desktop?

𞋴𝛂𝛋𝛆@lemmy.world · 2 years ago

What's the best open hardware cheap DIY NAS to toss on a router like OpenWRT?

𞋴𝛂𝛋𝛆@lemmy.world · edit-2 2 years ago

Is there a go-to conference talk about ActivityPub, Fediverse, and Lemmy yet?

𞋴𝛂𝛋𝛆